HTTP protocol details

 

HTTP Hypertext Transfer Protocol

 

HTTP uses the connection-oriented TCP as its transport layer protocol. HTTP itself is connectionless: the protocol does not define a connection of its own.

  • Request message

    CRLF is carriage return and line feed
     
    Request message with GET method

    Request message with POST method
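The layout of a request message can be sketched in code. The following is a minimal sketch, not a full client: the host example.com, the paths and the form fields are placeholders chosen for illustration. It assembles the request line, the header lines, an empty line and the optional body, with every line terminated by CRLF.

```python
CRLF = "\r\n"  # carriage return + line feed


def build_request(method, path, headers, body=""):
    """Assemble an HTTP/1.1 request message as text.

    Layout: request line, header lines, an empty line, then the optional body.
    """
    request_line = f"{method} {path} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join([request_line, *header_lines, "", body])


# Hypothetical host, paths and form fields, used only for illustration.
get_request = build_request("GET", "/index.html", {"Host": "example.com"})

form_data = "name=alice&lang=en"
post_request = build_request(
    "POST",
    "/login",
    {
        "Host": "example.com",
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(form_data.encode("ascii"))),
    },
    form_data,
)

# repr() makes the CRLF sequences visible.
print(repr(get_request))
print(repr(post_request))
```

A GET request normally has no body, so the message ends with the empty line. In the POST request the form data follows the empty line.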

Request methods

 

  • OPTIONS: Asks the server to return all HTTP request methods supported by the resource. Sending an OPTIONS request with '*' in place of the resource name tests whether the web server itself is functioning normally.

  • HEAD: Like GET, it sends a request for the specified resource to the server, except that the server does not return the body of the resource. The advantage of this method is that information about the resource (meta information, or metadata) can be obtained without transferring the whole content.

  • GET: Requests that the specified resource be returned for display. GET should only be used to read data and should not be used for operations that produce "side effects", for example in web applications. One reason is that GET requests may be issued arbitrarily by web crawlers. See safe methods.

  • POST: Submits data to the specified resource and asks the server to process it (for example, submitting a form or uploading a file). The data is carried in the request body. This request may create a new resource, modify an existing resource, or both.

  • PUT: Uploads the latest content to the specified resource location.

  • DELETE: Requests that the server delete the resource identified by the Request-URI.

  • TRACE: Echoes back the request as received by the server, mainly for testing or diagnosis.

  • CONNECT: Reserved in HTTP/1.1 for proxy servers that can switch the connection to a tunnel. It is usually used to reach SSL-encrypted servers through an unencrypted HTTP proxy server.

    Although there are eight HTTP request methods, in practice we mostly use GET and POST. The other request methods can also be realized indirectly through these two.

URL

 

A URL generally has the form <protocol>://<host>:<port number>/<path>

  • Protocol

http -- Hypertext Transfer Protocol resources

https -- Hypertext Transfer Protocol over Secure Sockets Layer

ftp -- File Transfer Protocol

mailto -- e-mail address

ldap -- Lightweight Directory Access Protocol search

file -- file on the local computer or on a network share

news -- Usenet newsgroup

gopher -- Gopher protocol

telnet -- Telnet protocol

  • Host: the domain name of the server on the Internet

  • Port: can be omitted, in which case the default port of the protocol is used (80 for http, 443 for https)

  • Path

    An absolute URL gives the full path of the file, which means that where the absolute URL itself appears has no bearing on the location of the file it references.

    A relative URL uses the location of the folder containing the URL itself as a reference point to describe the location of the target.

    If the URL omits the path, it refers to the home page of the site.

The first URL omits the path and refers to the home page of Baidu Knows (Baidu Zhidao).

The second includes the path of the file 1742817.html, indicating its location on the server.

Both use the https protocol and omit the port number.
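The parts of a URL can be pulled apart with Python's standard urllib.parse module. This is a small sketch under stated assumptions: example.com is a placeholder host, 1742818.html is a made-up sibling file, and the DEFAULT_PORTS table and describe helper are illustrative names rather than part of any library.

```python
from urllib.parse import urljoin, urlsplit

# Default ports used when the URL omits the port number.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe(url):
    """Split a URL into protocol, host, port and path."""
    parts = urlsplit(url)
    # When the port is omitted, fall back to the protocol's default port.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    # When the path is omitted, the request targets the site's home page.
    path = parts.path or "/"
    return parts.scheme, parts.hostname, port, path


# example.com is a placeholder host.
home = describe("https://example.com")
page = describe("https://example.com/question/1742817.html")
custom = describe("http://example.com:8080/index.html")
print(home, page, custom, sep="\n")

# A relative URL is resolved against the folder of the document containing it.
base = "https://example.com/question/1742817.html"
resolved = urljoin(base, "1742818.html")  # hypothetical sibling file
print(resolved)
```

The last call shows why a relative URL only makes sense together with the location of the document that contains it: the same relative reference resolves differently from a different base.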

Version number

 

The protocol in use has been upgraded from HTTP/1.0 to HTTP/1.1. What is the difference between the two?

  • With HTTP/1.0, the time required to request a World Wide Web document is 2 * RTT + document transmission time. Establishing a TCP connection with the server takes a three-way handshake. The third segment of the handshake carries the HTTP request, and the server then returns the HTTP response message. That is four one-way transfers in total, or 2 * RTT. There are other costs as well: a web server has to serve a large number of clients, and with the non-persistent connections (short connections) of HTTP/1.0 every request needs a new connection, so the server is heavily burdened. HTTP/1.1 uses persistent connections (long connections): the server keeps the connection open after sending the response.

    Persistent connections come in two modes, pipelined and non-pipelined. In non-pipelined mode, the client can only send the next request after it has received the previous response. In pipelined mode, the client can send the next request without waiting for a response, and the server can send its responses back to back as the requests arrive, which saves time.

  • The persistent connections of HTTP/1.1 also rely on new request headers.

    For example, when the value of the Connection request header is Keep-Alive, the client is asking the server to keep the connection open after returning the result of this request; when the value is close, the client is asking the server to close the connection after returning the result of this request.

  • HTTP/1.1 also provides request headers and response headers for mechanisms such as identity authentication, state management and caching.
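The cost difference between the connection modes can be made concrete with a little arithmetic. The functions below are a deliberately simplified model and an assumption of this sketch, not a measurement: they use the 2 * RTT + transmission time figure from the text, assume every document has the same transmission time, and ignore server processing time and TCP slow start. The numbers (10 documents, RTT of 100 ms, 20 ms per document) are made up.

```python
def non_persistent_time(n_docs, rtt, transmit):
    """HTTP/1.0 style: every document pays 2 * RTT plus its transmission time."""
    return n_docs * (2 * rtt + transmit)


def persistent_time(n_docs, rtt, transmit):
    """Persistent, non-pipelined: TCP is set up once (1 RTT), then each
    request/response exchange costs 1 RTT plus the transmission time."""
    return rtt + n_docs * (rtt + transmit)


def pipelined_time(n_docs, rtt, transmit):
    """Persistent, pipelined: TCP is set up once (1 RTT), the requests are sent
    back to back so roughly 1 RTT covers them all, then the documents stream."""
    return rtt + rtt + n_docs * transmit


# Made-up example values, all in milliseconds.
n_docs, rtt, transmit = 10, 100, 20
print("non-persistent:", non_persistent_time(n_docs, rtt, transmit), "ms")
print("persistent:    ", persistent_time(n_docs, rtt, transmit), "ms")
print("pipelined:     ", pipelined_time(n_docs, rtt, transmit), "ms")
```

Under these assumptions the non-persistent case is dominated by the repeated 2 * RTT cost, which is the overhead that persistent connections remove.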

HTTP message header field

 

As the messages above show, HTTP has four types of header fields: general header fields, request header fields, response header fields and entity header fields.

  • General header field: header used by both request message and response message.

  • Request header field: header used when sending request message from client to server.

  • Response header field: the header used when the response message is returned from the server to the client.

  • Entity header field: header used for the entity part of request message and response message.

HTTP/1.1 header field

  • General header field

    Header field name    Description
    Cache-Control        Controls caching behavior
    Connection           Hop-by-hop headers and connection management
    Date                 Date and time the message was created
    Pragma               Message directives
    Trailer              List of headers at the end of the message
    Transfer-Encoding    Transfer coding applied to the message body
    Upgrade              Upgrade to another protocol
    Via                  Information about proxy servers
    Warning              Error notification

  • Request header field

    Header field name      Description
    Accept                 Media types that the user agent can handle
    Accept-Charset         Preferred character sets
    Accept-Encoding        Preferred content encodings
    Accept-Language        Preferred (natural) languages
    Authorization          Web authentication information
    Expect                 Specific server behavior expected by the client
    From                   User's e-mail address
    Host                   Server on which the requested resource resides
    If-Match               Compares entity tags (ETag)
    If-Modified-Since      Compares the update time of the resource
    If-None-Match          Compares entity tags (the opposite of If-Match)
    If-Range               Sends a byte-range request for the entity if the resource has not been updated
    If-Unmodified-Since    Compares the update time of the resource (the opposite of If-Modified-Since)
    Max-Forwards           Maximum number of hops
    Proxy-Authorization    Client authentication information required by the proxy server
    Range                  Byte range requested for the entity
    Referer                Original source of the URI in the request
    TE                     Preferred transfer codings
    User-Agent             Information about the HTTP client program

  • Response header field

    Header field name     Description
    Accept-Ranges         Whether byte-range requests are accepted
    Age                   Estimated time elapsed since the resource was created
    ETag                  Matching information for the resource
    Location              Redirects the client to the specified URI
    Proxy-Authenticate    Authentication information from the proxy server to the client
    Retry-After           When the client should reissue the request
    Server                Information about the HTTP server installation
    Vary                  Cache management information for proxy servers
    WWW-Authenticate      Authentication information from the server to the client

  • Entity header field

    Header field name    Description
    Allow                HTTP methods supported by the resource
    Content-Encoding     Content encoding applied to the entity body
    Content-Language     Natural language of the entity body
    Content-Length       Size of the entity body (unit: bytes)
    Content-Location     Alternate URI for the corresponding resource
    Content-MD5          Message digest of the entity body
    Content-Range        Position range of the entity body
    Content-Type         Media type of the entity body
    Expires              Date and time when the entity body expires
    Last-Modified        Date and time when the resource was last modified
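The four categories can be turned into a small lookup. The sketch below is only an illustration: the CATEGORIES table restates the HTTP/1.1 header fields listed above, the helper names are invented here, and the sample header block (including the server name ExampleServer/1.0) is made up. Header field names are case-insensitive, so they are compared in lower case.

```python
CATEGORIES = {
    "general": {"cache-control", "connection", "date", "pragma", "trailer",
                "transfer-encoding", "upgrade", "via", "warning"},
    "request": {"accept", "accept-charset", "accept-encoding", "accept-language",
                "authorization", "expect", "from", "host", "if-match",
                "if-modified-since", "if-none-match", "if-range",
                "if-unmodified-since", "max-forwards", "proxy-authorization",
                "range", "referer", "te", "user-agent"},
    "response": {"accept-ranges", "age", "etag", "location", "proxy-authenticate",
                 "retry-after", "server", "vary", "www-authenticate"},
    "entity": {"allow", "content-encoding", "content-language", "content-length",
               "content-location", "content-md5", "content-range", "content-type",
               "expires", "last-modified"},
}


def classify(name):
    """Return the category of a header field, or "unknown" for anything else."""
    key = name.lower()
    for category, names in CATEGORIES.items():
        if key in names:
            return category
    return "unknown"


def parse_header_block(block):
    """Split a CRLF-separated header block into (name, value) pairs."""
    headers = []
    for line in block.split("\r\n"):
        if not line:  # the empty line ends the header section
            break
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return headers


# Made-up response headers, for illustration only.
sample = (
    "Date: Mon, 01 Jan 2024 00:00:00 GMT\r\n"
    "Server: ExampleServer/1.0\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 1024\r\n"
    "Connection: close\r\n"
    "\r\n"
)

grouped = {}
for name, value in parse_header_block(sample):
    grouped.setdefault(classify(name), []).append(name)
print(grouped)
```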

HTTP operation procedure

 

HTTP is a transaction-oriented application layer protocol. Each World Wide Web site has a server process that constantly listens on TCP port 80 so that it can detect when a browser sends it a connection request. Once the connection is established, the browser sends a page request to the web server. The browser and server must follow a specified format and certain rules, which together make up the Hypertext Transfer Protocol, HTTP.

The following uses HTTP/1.0 to describe what happens after the user issues a browsing request (by entering a URL in the browser's address bar or clicking a link, after which the browser automatically locates the page to connect to).

1. The browser parses the URL.

2. The browser asks DNS to resolve the IP address of the domain name.

3. DNS returns the IP address.

4. The browser establishes a TCP connection with the server (IP address + port number).

5. The browser issues the command to fetch the file, such as GET /question/1742817.html for the URL above.

6. The server responds by sending 1742817.html to the browser.

7. The TCP connection is released.

8. The browser displays the text in the HTML file.
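The procedure above can be sketched with plain sockets. This is a minimal illustration under stated assumptions, not how a real browser is implemented: fetch and DemoHandler are names invented here, only http URLs are handled, and a tiny stand-in server on 127.0.0.1 replaces a real web site so that the sketch runs without Internet access.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit


def fetch(url):
    """Fetch a document the HTTP/1.0 way: one TCP connection per request."""
    # Parse the URL into host, port and path.
    parts = urlsplit(url)
    host, port, path = parts.hostname, parts.port or 80, parts.path or "/"
    # Ask DNS for the IP address of the host.
    ip_address = socket.gethostbyname(host)
    # Establish a TCP connection (IP address + port number).
    with socket.create_connection((ip_address, port), timeout=5) as conn:
        # Issue the command that fetches the file.
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        conn.sendall(request.encode("ascii"))
        # Read the response until the server closes the connection.
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Leaving the "with" block releases the TCP connection.
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    # A browser would now render the body; here it is simply returned.
    return status_line, body


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in web server that answers every GET with one page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


# Port 0 lets the operating system pick a free port for the stand-in server.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/question/1742817.html"
status_line, body = fetch(url)
server.shutdown()
server.server_close()

print(status_line)
print(body.decode("ascii"))
```

Because the request is sent as HTTP/1.0 without a keep-alive header, the server closes the connection after the response, which is what lets the read loop terminate.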

  • Response message
     

Status codes and phrases

 

1xx: Informational - the request has been received and processing continues.

2xx: Success - the request has been successfully received, understood and accepted.

3xx: Redirection - further action is required to complete the request.

4xx: Client error - the request has a syntax error or cannot be fulfilled.

5xx: Server error - the server failed to fulfill a valid request.

Common status codes and status descriptions are as follows.

200 OK: The client request is successful.  

400 Bad Request: The client request has syntax errors and cannot be understood by the server.  

401 Unauthorized: The request is unauthorized. This status code must be used together with the WWW-Authenticate header field.

403 Forbidden: The server received the request but refused to provide the service.  

404 Not Found: The requested resource does not exist. For example, an incorrect URL was entered.

500 Internal Server Error: Unexpected error occurred on the server.  

503 Service Unavailable: The server is currently unable to process client requests and may recover after a period of time.

An example status line: HTTP/1.1 200 OK (CRLF).
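A status line can be taken apart and its code mapped to one of the five classes by looking at the first digit. The sketch below is illustrative only; parse_status_line, status_class and the STATUS_CLASSES table are names chosen for this example.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def parse_status_line(line):
    """Split a status line into HTTP version, status code and reason phrase."""
    # The reason phrase may contain spaces, so split at most twice.
    version, code, phrase = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), phrase


def status_class(code):
    """Map a status code to its class using the first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


for line in ("HTTP/1.1 200 OK\r\n",
             "HTTP/1.1 404 Not Found\r\n",
             "HTTP/1.1 500 Internal Server Error\r\n"):
    version, code, phrase = parse_status_line(line)
    print(version, code, phrase, "->", status_class(code))
```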

Difference between GET method and POST method

 

1. GET submission: the request data is appended to the URL (that is, the data is placed in the request line of the HTTP message). A ? separates the URL from the data, and multiple parameters are joined with &, for example login.action?name=hyddd&password=idontknow&verify=%E4%BD%A0%E5%A5%BD. Letters and digits are sent as they are. A space is converted to +. Chinese and other characters are percent-encoded: each byte of the encoded character is written as %XX, where XX is the value of the byte in hexadecimal, which gives %E4%BD%A0%E5%A5%BD in the example.

POST submission: the submitted data is placed in the body of the HTTP message (the request body). In the GET example above, the data actually transmitted is the part after the ?.

Therefore, data submitted with GET is displayed in the address bar, while a POST submission does not change the address bar.
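The encoding rules can be checked with Python's standard urllib.parse module. This is a small sketch reusing the parameter names from the example above; the value of verify is the two Chinese characters whose UTF-8 bytes are E4 BD A0 E5 A5 BD, and the get_url and post_body variables are illustrative names.

```python
from urllib.parse import parse_qs, urlencode

params = {"name": "hyddd", "password": "idontknow", "verify": "你好"}

# Letters and digits pass through; other characters become %XX per UTF-8 byte.
query = urlencode(params)

# GET: the encoded data travels in the URL, after the "?".
get_url = "login.action?" + query

# POST: the same encoded data travels in the request body instead.
post_body = query.encode("ascii")

# A space is converted to "+".
spaced = urlencode({"q": "hello world"})

# The receiving side reverses the encoding.
decoded = parse_qs(query)

print(get_url)
print(post_body)
print(spaced)
print(decoded)
```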

2. Size of transmitted data:

First of all, the HTTP protocol does not limit the size of the transmitted data, and the HTTP specification does not limit the URL length either. In actual development, however, the main limitations are:

GET: Certain browsers and servers restrict the URL length. For example, IE limits the URL to 2083 bytes (2K+35). Other browsers, such as Netscape and Firefox, have no length limit in theory; their limit depends on what the operating system supports.

Therefore, when GET is submitted, the transmission data will be limited by the URL length.

POST: Since the values are not passed through the URL, the data size is not limited in theory. In practice, however, every web server limits the size of POST data, and Apache and IIS6 each have their own configuration for this.

3. Security:
POST is more secure than GET. Note: the security meant here is not the same concept as the "safe" mentioned for the GET method above. There, "safe" only means that no data is modified, while here it means security in the real sense. For example, when data is submitted through GET, the user name and password appear in clear text in the URL. Because (1) the login page may be cached by the browser and (2) other people can view the browser's history, others can obtain your account and password.

