0. Additional Basic Knowledge
OSI (Open Systems Interconnection) Reference Model:
- Application Layer
Provides network communication interfaces for applications
- Presentation Layer
- Session Layer
- Transport Layer
Data transmission unit is the segment (TCP) or datagram (UDP), often loosely called a message
- Network Layer
Data transmission unit is packet
- Data Link Layer
Data transmission unit is frame
- Physical Layer
Data transmission unit is bit
The 7-layer OSI model is a theoretical model used for study. It is too cumbersome to apply in practice, which is why the TCP/IP reference model exists:
P.S. TCP/IP here refers to the protocol suite (protocol family), which contains many network protocols such as SNMP, ICMP, UDP, and DNS.
- Application Layer
Such as HTTP, FTP, DNS
- Transport Layer
Such as TCP, UDP
- Network Layer
Such as IP
- Data Link Layer
HTTP is an application-layer protocol. Underneath, it uses TCP at the transport layer and IP at the network layer, so an HTTP exchange still needs TCP's three-way handshake to open the connection and the four-way handshake to close it.
Establishing and releasing connections therefore has a cost. As a performance optimization, HTTP/1.1 uses persistent connections by default, while HTTP/1.0 does not.
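The effect of persistent connections can be observed with a small local experiment. The sketch below is only an illustration built on Python's standard library: the helper name `count_connections` is hypothetical, and it relies on `http.server` calling the handler's `setup()` once per accepted TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def count_connections(protocol_version, requests=2):
    """Send `requests` GETs to a local server; return how many TCP connections it saw."""
    opened = []

    class Handler(BaseHTTPRequestHandler):
        def setup(self):
            super().setup()
            # setup() runs once for every accepted TCP connection.
            opened.append(self.client_address)

        def do_GET(self):
            body = b"hello"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the demo quiet

    # "HTTP/1.1" makes the server keep the connection open between requests.
    Handler.protocol_version = protocol_version
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        for _ in range(requests):
            conn.request("GET", "/")
            conn.getresponse().read()  # the client reconnects if the server closed
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return len(opened)


if __name__ == "__main__":
    print("HTTP/1.1 connections:", count_connections("HTTP/1.1"))
    print("HTTP/1.0 connections:", count_connections("HTTP/1.0"))
```

With an HTTP/1.1 server both requests should travel over one TCP connection, while an HTTP/1.0 server closes the connection after each response, so the client has to open a second one and pay for another handshake.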
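In addition, a few concepts need to be briefly distinguished:
- Proxy
An intermediary application that forwards requests from the client to the server and responses from the server back to the client
- Gateway
A server that forwards communication data from other servers; to the client it looks like the origin server that owns the resource
- Tunnel
An application that relays the communication connection between Client and Server without interpreting the data passing through it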
I. URI and URL
- URI (Uniform Resource Identifier): A string used to identify specific internet resources
- URL (Uniform Resource Locator): A string used to identify the location of specific internet resources
A URL must include a Scheme, a Host, and a URL Path
So URLs are a subset of URIs: every URL is a URI, but not every URI is a URL.
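As a quick illustration of the Scheme, Host, and Path components, here is a minimal sketch using Python's `urllib.parse`. The helper name `describe` and the example addresses are made up for this note; the `urn:` example is an assumption added only to show a URI that identifies a resource without locating it.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Split a URI into the components discussed above."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,  # None when the URI has no authority part
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


if __name__ == "__main__":
    # A URL: it says where the resource lives and how to reach it.
    print(describe("http://example.com:8080/docs/index.html?lang=en#top"))
    # A URI that is not a URL: it names a resource but gives no host.
    print(describe("urn:isbn:0451450523"))
```

The second call returns no host at all, which is one way to see that a URI only has to identify a resource, while a URL also has to say where it is.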
II. HTTP Methods
- GET: Get resources
Supported in 1.0/1.1, commonly used
- POST: Transmit entity body
Supported in 1.0/1.1, commonly used
- PUT: Transmit files
Supported in 1.0/1.1. Websites generally do not enable this method because PUT itself has no authentication mechanism: anyone could upload files, which is a security risk.
- HEAD: Get message header
Supported in 1.0/1.1, commonly used
- DELETE: Delete files
Supported in 1.0/1.1, has the same problems as PUT method
- OPTIONS: Ask about supported methods
Supported in 1.1
- TRACE: Trace path
Supported in 1.1, used with the Max-Forwards header field to check the path a request takes
- CONNECT: Ask a proxy to establish a tunnel
Supported in 1.1, mainly used to carry SSL (Secure Sockets Layer) or TLS (Transport Layer Security) encrypted traffic over a TCP tunnel through the proxy
- (LINK): Establish connection with resources
Supported in 1.0, deprecated in 1.1
- (UNLINK): Disconnect connection
Supported in 1.0, deprecated in 1.1
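To make the difference between GET, HEAD, and OPTIONS concrete, the following sketch runs a throwaway local server and sends one request per method. Everything here (`DemoHandler`, `try_methods`, the `hello` body, the advertised `Allow` list) is hypothetical demo code built on Python's standard `http.server` and `http.client`, not how any particular website behaves.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hello"


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)  # GET: headers plus entity body

    def do_HEAD(self):
        self._send_headers()  # HEAD: same headers, no body

    def do_OPTIONS(self):
        self.send_response(200)
        self.send_header("Allow", "GET, HEAD, OPTIONS")  # methods this demo supports
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def try_methods():
    """Return {method: (status, Content-Length, Allow, body)} for each request."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = {}
    try:
        for method in ("GET", "HEAD", "OPTIONS"):
            conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
            conn.request(method, "/")
            resp = conn.getresponse()
            results[method] = (
                resp.status,
                resp.getheader("Content-Length"),
                resp.getheader("Allow"),
                resp.read(),
            )
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    for method, result in try_methods().items():
        print(method, result)
```

HEAD reports the same Content-Length as GET but delivers an empty body, and OPTIONS answers with the list of supported methods in the Allow header.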
III. HTTP Status Codes
- 1XX: Informational
Request is being processed
- 101: Switching Protocols, used with the Upgrade header field when the server agrees to switch protocols, not very common
- 2XX: Success
Request processed normally
- 200: OK, request processed normally
- 204: No Content, request processed successfully but no resources to return (response body is empty)
- 206: Partial Content, a range request was fulfilled and only part of the resource is returned; Content-Range in the response header indicates which range
- 3XX: Redirection
Additional operations needed to complete request
- 301: Moved Permanently, permanent redirection; the browser should update bookmarks automatically
- 302: Found, temporary redirection; bookmarks should not be updated
- 303: See Other, similar to 302, but the new URL must be fetched with the GET method
Note: The original specification requires the request method to stay unchanged on 301 and 302 (a POST should be repeated as a POST), but the *de facto standard* is that almost all browsers switch POST to GET on 301 and 302, just as they do on 303, which does not conform to that specification
- 304: Not Modified, returned for a conditional request (If-Match, If-Modified-Since, If-None-Match, If-Range, If-Unmodified-Since) when the resource was found but the condition was not met; the response carries no body and the client keeps using its cached copy
Note: Although 304 belongs to 3XX, it *has nothing to do with redirection*
- 307: Temporary Redirect, temporary redirection, same meaning as 302
This status code was introduced to correct the de facto standard: the request method must not change when following it. Older browsers did not always honor this, so it is worth testing rather than assuming
- 4XX: Client Error
Server cannot process request
- 400: Bad Request, request message has syntax errors
- 401: Unauthorized, indicates authentication needed (BASIC or DIGEST authentication) or authentication failed
- 403: Forbidden, access request to specified resource is rejected
- 404: Not Found, server cannot find requested resource, can replace 403 response when not wanting to explain reason
- 405: Method Not Allowed, method not supported, not very common
- 412: Precondition Failed, not very common
- 417: Expectation Failed, not very common
- 5XX: Server Error
The server failed while processing the request
- 500: Internal Server Error, an error occurred while the server was executing the request
- 503: Service Unavailable, the server is overloaded or under maintenance
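The classes above, and the redirect behavior described in the 3XX notes, can be summarized in a few lines. This is a simplified sketch: `status_class`, `describe`, and `method_after_redirect` are hypothetical helpers, and the redirect rule only models the POST-to-GET rewriting mentioned in the notes, not everything a real browser does.

```python
from http import HTTPStatus

CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Map a status code to its class using the first digit."""
    return CLASSES.get(code // 100, "Unknown")


def describe(code):
    """Return 'code reason-phrase (class)' using the standard library's table."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"
    return f"{code} {phrase} ({status_class(code)})"


def method_after_redirect(status, method):
    """Method a typical browser uses for the follow-up request (simplified)."""
    if status == 303 and method != "HEAD":
        return "GET"  # 303 explicitly asks for GET
    if status in (301, 302) and method == "POST":
        return "GET"  # de facto behavior, not what the original spec asked for
    return method  # 307 (and everything else here) keeps the method


if __name__ == "__main__":
    for code in (101, 204, 302, 404, 503):
        print(describe(code))
    print(method_after_redirect(302, "POST"), method_after_redirect(307, "POST"))
```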
IV. HTTP Message Header
HTTP Message = Message Header + Empty Line (CR+LF) + Message Body
= Start Line (Request Line/Status Line) + Header Fields + Empty Line + Message Body
= Start Line + Request/Response Header Fields + General Header Fields + Entity Header Fields + Empty Line + Message Body
P.S. Don't underestimate this Empty Line (CR+LF). Because CR+LF separates header fields from each other and the header from the body, HTTP header injection attacks work by smuggling extra CR+LF sequences into a header value.
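The role of CR+LF in the message layout, and why it enables header injection, can be sketched as follows. This is an illustration under assumptions, not a production parser: `split_message`, `build_redirect`, and `safe_header_value` are hypothetical helpers, and the cookie value is invented for the demo.

```python
CRLF = "\r\n"


def split_message(raw):
    """Split a raw HTTP message into start line, header fields, and body."""
    head, _, body = raw.partition(CRLF + CRLF)  # the empty line ends the header
    start_line, *header_lines = head.split(CRLF)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


def build_redirect(location):
    """Naively build a 302 response from an untrusted Location value."""
    lines = ["HTTP/1.1 302 Found", f"Location: {location}", "Content-Length: 0"]
    return CRLF.join(lines) + CRLF + CRLF


def safe_header_value(value):
    """Reject values that would let the caller start a new header line."""
    if "\r" in value or "\n" in value:
        raise ValueError("CR/LF is not allowed in a header value")
    return value


if __name__ == "__main__":
    # The attacker hides CR+LF inside what should be a single header value.
    malicious = "/home" + CRLF + "Set-Cookie: sid=attacker"
    start_line, headers, body = split_message(build_redirect(malicious))
    print(start_line)
    print(headers)  # now contains an injected set-cookie field
```

Calling `safe_header_value` on user-supplied input before building the response is one simple way to close this particular hole.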
1. Start Line
Start line is divided into request line and status line (corresponding to HTTP request message and response message respectively):
- Request Line: States the method, the request URI, and the HTTP version used for the request
- Status Line: States the HTTP version, the status code, and the reason phrase of the returned response
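Both kinds of start line are three space-separated fields, which a short sketch can pull apart. The function names are hypothetical and the parsing is deliberately lenient; a real implementation would validate each field.

```python
def parse_request_line(line):
    """'GET /index.html HTTP/1.1' -> method, URI, version."""
    method, uri, version = line.split(" ", 2)
    return {"method": method, "uri": uri, "version": version}


def parse_status_line(line):
    """'HTTP/1.1 404 Not Found' -> version, status code, reason phrase."""
    version, code, *reason = line.split(" ", 2)
    return {
        "version": version,
        "status": int(code),
        # The reason phrase may contain spaces, or be missing entirely.
        "reason": reason[0] if reason else "",
    }


if __name__ == "__main__":
    print(parse_request_line("GET /index.html HTTP/1.1"))
    print(parse_status_line("HTTP/1.1 404 Not Found"))
```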
2. Header Fields
Request header fields:

| Header Field Name | Description |
|---|---|
| Accept | Media types user agent can handle |
| Accept-Charset | Preferred character set |
| Accept-Encoding | Preferred content encoding |
| Accept-Language | Preferred language (natural language) |
| Authorization | Web authentication information |
| Expect | Expect specific behavior from server |
| From | User's email address |
| Host | Server where requested resource is located |
| If-Match | Compare entity tag (ETag) |
| If-Modified-Since | Compare resource update time |
| If-None-Match | Compare entity tag (opposite of If-Match) |
| If-Range | If the resource has not changed, send the requested byte range; otherwise send the whole entity |
| If-Unmodified-Since | Compare resource update time (opposite of If-Modified-Since) |
| Max-Forwards | Maximum transmission hop count |
| Proxy-Authorization | Authentication information proxy server requires from client |
| Range | Entity byte range request |
| *Referer* | URI of the page from which the request originated |
| TE | Transfer encoding priority |
| User-Agent | HTTP client program information |

Response header fields:

| Header Field Name | Description |
|---|---|
| Accept-Ranges | Whether byte range requests are accepted |
| Age | Estimated time elapsed since the response was generated by the origin server |
| ETag | Resource matching information |
| Location | Redirect client to specified URI |
| Proxy-Authenticate | Proxy server authentication information for client |
| Retry-After | Timing requirement for initiating request again |
| Server | Information about the HTTP server software |
| Vary | Proxy server cache management information |
| WWW-Authenticate | Server authentication information for client |

General header fields:

| Header Field Name | Description |
|---|---|
| Cache-Control | Control cache behavior |
| Connection | Hop-by-hop header handling, connection management |
| Date | Date and time message created |
| Pragma | Message instructions |
| Trailer | Header list at message end |
| Transfer-Encoding | Specify transmission encoding method for message body |
| Upgrade | Upgrade to other protocol |
| Via | Proxy server related information |
| Warning | Error notification |

Entity header fields:

| Header Field Name | Description |
|---|---|
| Allow | HTTP methods resource supports |
| Content-Encoding | Encoding method applicable to entity body |
| Content-Language | Natural language of entity body |
| Content-Length | Size of entity body (unit: bytes) |
| Content-Location | Alternate URI for the corresponding resource |
| Content-MD5 | Message digest of entity body |
| Content-Range | Position range of entity body |
| Content-Type | Media type of entity body |
| Expires | Expiration date and time of entity body |
| Last-Modified | Last modification date and time of resource |
P.S. For other header fields and more detailed header field information, please check Cnblogs: HTTP Message
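Several of the fields above work together with the status codes from section III. The sketch below shows, under simplifying assumptions, how a server might combine If-None-Match with ETag (yielding 304) and Range with Content-Range (yielding 206). The function `respond` is hypothetical; it handles only a single `bytes=first-last` or `bytes=first-` range and falls back to a full 200 response for anything else.

```python
def respond(request_headers, body, etag):
    """Return (status, response_headers, payload) for a simplified GET."""
    # Conditional request: the client's cached copy is still current.
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""

    # Range request: only "bytes=first-last" and "bytes=first-" are handled.
    range_value = request_headers.get("Range", "")
    if range_value.startswith("bytes="):
        first, _, last = range_value[len("bytes="):].partition("-")
        if first.isdigit() and (last == "" or last.isdigit()):
            start = int(first)
            end = min(int(last) if last else len(body) - 1, len(body) - 1)
            if start <= end:
                part = body[start:end + 1]  # byte ranges are inclusive
                headers = {
                    "Content-Range": f"bytes {start}-{end}/{len(body)}",
                    "Content-Length": str(len(part)),
                    "ETag": etag,
                }
                return 206, headers, part

    # Anything else gets the whole entity.
    headers = {
        "Accept-Ranges": "bytes",
        "Content-Length": str(len(body)),
        "ETag": etag,
    }
    return 200, headers, body


if __name__ == "__main__":
    data = b"0123456789"
    print(respond({}, data, '"v1"'))
    print(respond({"If-None-Match": '"v1"'}, data, '"v1"'))
    print(respond({"Range": "bytes=2-5"}, data, '"v1"'))
```

A real server would also answer unsatisfiable ranges with an error status and support the other conditional headers; those cases are left out to keep the example short.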
References:
- "Illustrated HTTP"