I previously reprinted an HTTP-related article: 《Reprint and accumulation series - In-depth understanding of the HTTP protocol》.
As the cost of setting up an HTTPS site has dropped, most websites now use the HTTPS protocol. Everyone knows that HTTPS is more secure than HTTP, and most people have heard of related concepts such as SSL, asymmetric encryption, and CA certificates. Still, the following three soul-searching questions may be hard to answer:
1. Why is it safe to use HTTPS?
2. How is the underlying principle of HTTPS implemented?
3. Is using HTTPS guaranteed to be safe?
This article digs into the principles behind the security of HTTPS.
You may have heard that HTTPS is secure because it encrypts the transmitted data, and that this encryption is done with asymmetric encryption. In fact, HTTPS uses symmetric encryption for the content being transmitted; asymmetric encryption is only used in the certificate verification phase.
The overall HTTPS process is divided into certificate verification and data transmission. The specific interaction is as follows:
① Certificate validation phase
The browser initiates an HTTPS request
The server returns its HTTPS certificate
The client verifies whether the certificate is valid; if it is not, it shows a warning
② Data transmission phase
Why is symmetric encryption used for data transmission? First, asymmetric encryption is very inefficient, and HTTP application scenarios usually involve a large amount of interaction between the two ends, so the performance cost of asymmetric encryption is unacceptable.
In addition, in the HTTPS scenario only the server holds the private key, and a single public/private key pair can only provide encryption and decryption in one direction. Therefore content transmission in HTTPS uses symmetric encryption, not asymmetric encryption.
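The division of labour between the two kinds of encryption can be sketched with a toy example. This is a minimal illustration, not real TLS: it assumes textbook RSA with tiny primes for the asymmetric step and a SHA-256-derived XOR keystream as a stand-in symmetric cipher, and all names (`rsa_encrypt`, `xor_cipher`, and so on) are made up for this sketch. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import hashlib
import secrets

# Toy RSA key pair built from tiny primes. Illustration only, NOT secure.
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, held by the server only


def rsa_encrypt(m: int) -> int:
    """Anyone holding the public key (N, E) can encrypt."""
    return pow(m, E, N)


def rsa_decrypt(c: int) -> int:
    """Only the private-key holder (the server) can decrypt."""
    return pow(c, D, N)


def xor_cipher(key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a keystream derived from the key.

    The same call both encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(f"{key}:{counter}".encode()).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Asymmetric step: the client picks a random number and sends it
# encrypted with the server's public key.
client_secret = secrets.randbelow(N - 2) + 2
wire_value = rsa_encrypt(client_secret)
server_secret = rsa_decrypt(wire_value)

# Symmetric step: both sides now share the secret and use it in both directions.
ciphertext = xor_cipher(client_secret, b"GET /index.html")
plaintext = xor_cipher(server_secret, ciphertext)
```

The expensive public-key operation happens once, while every later message only pays for the cheap symmetric step, which works in both directions with the same key.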
HTTP is considered insecure because the transmission can easily be eavesdropped on by a listener or intercepted by a fake server; the HTTPS protocol mainly solves this network transmission security problem.
First, assume there were no certificate authority and anyone could make a certificate. The resulting security risk is the classic "man-in-the-middle attack".
The specific process of a man-in-the-middle attack is as follows:
How it works:
1. The local request is hijacked (for example by DNS hijacking), and all requests are sent to the middleman's server
2. The middleman's server returns the middleman's own certificate
3. The client creates a random number, encrypts it with the public key from the middleman's certificate, and sends it to the middleman; it then uses the random number to construct the symmetric encryption for the transmitted content
4. Because the middleman holds the client's random number, it can decrypt the content with the symmetric encryption algorithm
5. The middleman sends a request to the legitimate website using the client's request content
6. Because the communication between the middleman and the server is itself legitimate, the legitimate website returns encrypted data over the established secure channel
7. The middleman decrypts the content with the symmetric encryption algorithm established with the legitimate website
8. The middleman encrypts the data returned by the legitimate website with the symmetric encryption algorithm established with the client, and forwards it
9. The client decrypts the returned data with the symmetric encryption algorithm established with the middleman
Because certificates are not verified, the client has no idea that its connection has been intercepted even though it initiated an HTTPS request, and the transmitted content is stolen in full by the middleman.
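For comparison, here is a minimal sketch of what "verifying the certificate" looks like on the client side, using Python's standard `ssl` module as an assumed example client (the article itself talks about browsers). It only inspects the settings and does not open a network connection.

```python
import ssl

# The default client context enables the two checks that stop the attack above:
# the certificate chain must lead to a trusted CA, and the certificate must
# match the host name that was requested.
context = ssl.create_default_context()
chain_is_verified = context.verify_mode == ssl.CERT_REQUIRED
hostname_is_checked = context.check_hostname

# Turning the checks off recreates the unverified scenario described above.
# Shown only for contrast; do not do this in real code.
insecure = ssl.create_default_context()
insecure.check_hostname = False       # must be disabled before verify_mode
insecure.verify_mode = ssl.CERT_NONE
```

With the `insecure` context, a client would accept the middleman's self-made certificate in step 2 without complaint.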
An HTTP message is the block of data sent and received when the browser and server communicate.
The browser requests data from the server by sending a request message; the server returns data to the browser in a response message.
A message is mainly divided into two parts:
The header, which contains attributes: additional information (cookies, cache information, etc.) and the rules related to caching are all contained in the header
The body, which contains the data: what the HTTP request actually wants to transfer
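The header/body split can be seen directly in the raw bytes of a message. The following is a small sketch; the function name `split_http_message` and the sample response are made up for illustration, and it assumes a simple HTTP/1.x message without repeated header names.

```python
def split_http_message(raw: bytes):
    """Split a raw HTTP/1.x message into start line, headers and body."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Cache-Control: max-age=60\r\n"
    b"\r\n"
    b"hello"
)
start_line, headers, body = split_http_message(raw_response)
```

Here the caching rule (`Cache-Control`) travels in the header, while `hello` is the body.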
For ease of understanding, assume the browser has a cache database for storing cached information.
The first time the client requests data, the cache database has no corresponding entry, so the client must request the server and, once the server responds, store the data in the cache database.
HTTP has many caching rules. Classified by whether a request needs to be sent to the server again, I divide them into two categories: forced cache and comparison cache.
Before going into detail about these two rules, let's get a basic understanding of them through sequence diagrams.
When cached data already exists and only the forced cache is used, the process of requesting data is as follows:
When cached data already exists and only the comparison cache is used, the process of requesting data is as follows:
Readers who are not familiar with the caching mechanism may ask: with the comparison cache, a request is sent to the server whether or not the cache is used, so what is the point of caching?
Let's set this question aside for now. The answer will become clear when each caching rule is introduced in detail.
We can see the difference between the two types of caching rules: if the forced cache takes effect, there is no further interaction with the server, whereas the comparison cache needs to interact with the server whether it takes effect or not.
The two types of caching rules can exist at the same time, and the forced cache has priority over the comparison cache. In other words, when the forced-cache rule is evaluated and the cache is still valid, the cache is used directly and the comparison-cache rule is not evaluated.
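The priority between the two rules can be sketched as a small decision function. This is a simplified model under stated assumptions, not how any particular browser is implemented: `CacheEntry`, `fetch`, and the `revalidate`/`download` callbacks are hypothetical names standing in for the cache database and the network.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CacheEntry:
    body: bytes
    expires_at: float          # forced-cache deadline (epoch seconds)
    validator: Optional[str]   # cache identifier used by the comparison cache


def fetch(entry: Optional[CacheEntry],
          revalidate: Callable[[str], bool],
          download: Callable[[], bytes],
          now: Optional[float] = None) -> Tuple[str, bytes]:
    """Return (how the data was obtained, data)."""
    now = time.time() if now is None else now
    if entry is None:
        return "network", download()
    # 1. Forced cache: still valid, so no request is sent at all.
    if now < entry.expires_at:
        return "forced-cache", entry.body
    # 2. Comparison cache: ask the server whether the copy is still good.
    if entry.validator is not None and revalidate(entry.validator):
        return "comparison-cache", entry.body   # server answered 304
    # 3. Otherwise download the full resource again.
    return "network", download()
```

For example, `fetch(entry, lambda v: True, lambda: b"new", now=50.0)` with an entry that expires at `100.0` never calls `revalidate`, because the forced cache answers first.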
As described above, with the forced cache the cached data can be used directly as long as it has not expired. So how does the browser determine whether the cached data has expired?
When there is no cached data and the browser requests data from the server, the server returns the caching rules together with the data. The caching rules are contained in the response header.
For the forced cache, two fields in the response header indicate the expiration rule: Expires and Cache-Control.
With Chrome Developer Tools, it is easy to see what the network request looks like when the forced cache is in effect.
The value of Expires is the expiration time returned by the server. On the next request, if the request time is earlier than that expiration time, the cached data is used directly.
However, Expires is an HTTP 1.0 mechanism. Browsers now use HTTP 1.1 by default, where Cache-Control takes precedence when both are present, so Expires has largely been superseded.
Another problem is that the expiration time is generated by the server, but the client clock and the server clock may disagree, which leads to incorrect cache hits.
Therefore HTTP 1.1 uses Cache-Control instead.
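The clock problem with Expires can be shown in a few lines. This is a minimal sketch; the function name `is_fresh_by_expires` and the sample dates are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, client_now: datetime) -> bool:
    """Forced-cache check based on Expires: valid while now < expiry time."""
    return client_now < parsedate_to_datetime(expires_header)


expires = "Wed, 21 Oct 2015 07:28:00 GMT"

# By the server's clock the copy expired two minutes ago.
server_now = datetime(2015, 10, 21, 7, 30, tzinfo=timezone.utc)

# A client whose clock runs five minutes slow still treats the copy as valid.
slow_client_now = server_now - timedelta(minutes=5)
```

Because the comparison uses an absolute timestamp, any disagreement between the two clocks shifts the moment at which the client stops using its cached copy.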
Cache-Control is the most important rule. Common values are private, public, no-cache, max-age, and no-store; the default is private.
private: the client can cache the response
public: both the client and proxy servers can cache the response (for front-end developers, public and private can be treated as the same)
max-age=xxx: the cached content expires after xxx seconds
no-cache: the comparison cache must be used to validate the cached data (covered later)
no-store: nothing is cached; neither the forced cache nor the comparison cache is triggered (for front-end development, the more caching the better, so you can basically say goodbye to this one)
For example:
In the figure, Cache-Control only specifies max-age, so the default private applies and the cache lifetime is 31536000 seconds (365 days).
In other words, if this data is requested again within 365 days, it is taken directly from the cache database and used as is.
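A rough sketch of how a client might evaluate these directives follows. The function names are hypothetical and the parsing is deliberately simplified: it ignores quoted commas and treats a missing max-age as "not valid for the forced cache".

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control header into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def is_fresh_by_max_age(cache_control: str, age_seconds: float) -> bool:
    """The forced cache applies while the stored copy is younger than max-age."""
    directives = parse_cache_control(cache_control)
    # no-store forbids caching; no-cache forces validation via the comparison cache.
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    return max_age is not None and age_seconds < int(max_age)


ONE_DAY = 24 * 3600
still_cached = is_fresh_by_max_age("max-age=31536000", 364 * ONE_DAY)
expired = not is_fresh_by_max_age("max-age=31536000", 366 * ONE_DAY)
```

Unlike Expires, max-age is a relative duration, so the check depends on how long the client has held the copy rather than on the two clocks agreeing.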
The comparison cache, as the name suggests, requires a comparison to determine whether the cache can be used.
The first time a browser requests data, the server returns a cache identifier together with the data, and the client stores both in the cache database.
When requesting the data again, the client sends the stored cache identifier to the server. The server checks the identifier and, if the check succeeds, returns a 304 status code to tell the client that the comparison succeeded and the cached data can be used.
First visit:
Comparing the two figures, we can clearly see that when the comparison cache takes effect the status code is 304, and both the message size and the request time are greatly reduced.
The reason is that after comparing the identifier, the server returns only the header and uses the status code to tell the client to use its cache; it no longer needs to send the message body to the client.
For the comparison cache, the key point to understand is how the cache identifier is passed. It travels between the request header and the response header, and there are two kinds of identifier. Let's introduce them separately.
Last-Modified: when the server responds to a request, it tells the browser when the resource was last modified.
If-Modified-Since: when requesting the server again, the client uses this field to send back the last modification time that the server returned with the previous response.
When the server receives a request and finds the If-Modified-Since header, it compares that value with the last modification time of the requested resource.
If the last modification time of the resource is later than If-Modified-Since, the resource has been changed, so the server responds with the whole resource and status code 200. If the last modification time is earlier than or equal to If-Modified-Since, the resource has not changed, so the server responds with HTTP 304 to tell the browser to keep using its saved cache.
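The server-side comparison described above can be sketched as follows. The function name and the sample timestamps are assumptions for illustration, and the sketch expects a timezone-aware UTC modification time.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple


def respond_if_modified_since(last_modified: datetime,
                              if_modified_since: Optional[str],
                              body: bytes) -> Tuple[int, dict, bytes]:
    """Return (status code, response headers, response body)."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        # HTTP dates have one-second resolution, so drop microseconds first.
        resource_time = last_modified.replace(microsecond=0)
        if resource_time <= parsedate_to_datetime(if_modified_since):
            return 304, headers, b""     # unchanged: header only, no body
    return 200, headers, body            # changed or first visit: full resource


modified = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

# First visit: no If-Modified-Since, the full resource is returned.
first = respond_if_modified_since(modified, None, b"data")

# Second visit: the client echoes the Last-Modified value it was given.
second = respond_if_modified_since(modified, first[1]["Last-Modified"], b"data")
```

The 304 response carries an empty body, which is why the message size drops so sharply when the comparison cache takes effect.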
ETag: when the server responds to a request, it tells the browser the unique identifier of the current resource on the server (the generation rule is decided by the server).
If-None-Match: when requesting the server again, the client uses this field to tell the server the unique identifier of its cached data.
When the server receives a request and finds the If-None-Match header, it compares that value with the unique identifier of the requested resource. If they differ, the resource has been changed, so the server responds with the whole resource and status code 200. If they are the same, the resource has not changed, so the server responds with HTTP 304 to tell the browser to keep using its saved cache.
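A comparable sketch for the identifier-based check is shown below. Hashing the content is just one possible generation rule, chosen here as an assumption; the function names are hypothetical, and weak validators (the `W/` prefix) are not handled.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """One possible generation rule: a quoted, truncated hash of the content."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond_if_none_match(body: bytes,
                          if_none_match: Optional[str]) -> Tuple[int, dict, bytes]:
    """Return (status code, response headers, response body)."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates or "*" in candidates:
            return 304, headers, b""     # identical: tell the client to use its cache
    return 200, headers, body            # different: send the whole resource


# First visit returns the resource together with its identifier.
status1, headers1, body1 = respond_if_none_match(b"version 1", None)

# Revisit with the stored identifier while the resource is unchanged.
status2, _, body2 = respond_if_none_match(b"version 1", headers1["ETag"])

# Revisit after the resource changed on the server.
status3, _, body3 = respond_if_none_match(b"version 2", headers1["ETag"])
```

Because the identifier is derived from the content rather than from a timestamp, this check also catches changes that happen within the same second.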