Learning note: using nginx's reverse proxy and caching to improve tornado's throughput

Tianyuan prodigal son 2020-11-13 09:53:01


In B/S applications, page caching is an important means of improving service capacity. Page caches fall into two categories, browser caches and server caches; this article discusses only the Nginx server-side page cache. The basic principle of Nginx caching is to keep a local copy of each resource a client requests. When any user requests the same resource again within a reasonable period, the Nginx server does not forward the request to the back-end server again; it replies directly with the cached copy. Caching therefore significantly reduces the load on the back-end servers, lightens network transmission, and greatly improves response speed.

1. tornado's throughput

Let's use the simplest possible example to test tornado's throughput:

# -*- coding: utf-8 -*-
import tornado.web
import tornado.ioloop
import tornado.httpserver
from tornado.options import parse_command_line

class Handler(tornado.web.RequestHandler):
    def get(self):
        self.write("I am tornado, I'm fast enough!")

class Application(tornado.web.Application):
    def __init__(self):
        handlers = [
            (r"/", Handler)
        ]
        settings = dict(
            title='Stress test',
            debug=True,
        )
        tornado.web.Application.__init__(self, handlers, **settings)

parse_command_line()
http_server = tornado.httpserver.HTTPServer(Application(), xheaders=True, max_buffer_size=504857600)
http_server.listen(80)
print('Web server is started')
tornado.ioloop.IOLoop.instance().start()

After starting the script, visit 127.0.0.1 in a browser; the page shows "I am tornado, I'm fast enough!". This example involves no time-consuming operations such as file or database I/O, so it better reflects tornado's own throughput.

Stress testing usually uses ab.exe, which ships with Apache. Its usage is:

ab -n <number of requests> -c <concurrency> <URL>

Here is a stress test with concurrency 10 and a total of 100 requests:

ab -n 100 -c 10 http://127.0.0.1/
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 127.0.0.1 (be patient).....done
Server Software: TornadoServer/6.0.3
Server Hostname: 127.0.0.1
Server Port: 9001
Document Path: /
Document Length: 22 bytes
Concurrency Level: 10
Time taken for tests: 0.107 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 21700 bytes
HTML transferred: 2200 bytes
Requests per second: 937.09 [#/sec] (mean)
Time per request: 10.671 [ms] (mean)
Time per request: 1.067 [ms] (mean, across all concurrent requests)
Transfer rate: 198.58 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   0.6      0       2
Processing:     4    9   3.0      9      18
Waiting:        2    9   3.2      8      18
Total:          4   10   3.1      9      19
WARNING: The median and mean for the initial connection time are not within a normal deviation
These results are probably not that reliable.
Percentage of the requests served within a certain time (ms)
50% 9
66% 10
75% 13
80% 14
90% 15
95% 15
98% 18
99% 19
100% 19 (longest request)

Its output contains the following key information:

  • Concurrency Level: the concurrency specified with the -c parameter
  • Time taken for tests: total time the test took
  • Complete requests: number of completed requests
  • Failed requests: number of failed requests
  • Total transferred: total amount of data transferred
  • HTML transferred: amount of page data transferred
  • Requests per second: average number of requests per second (throughput)
  • Time per request (mean): average time a user waits for a request [ms]
  • Time per request (mean, across all concurrent requests): average server processing time per request [ms]
  • Transfer rate: rate of data received

We send 10,000 requests with different concurrency levels and test several times. The results are as follows:

Concurrency    Requests/sec (throughput)    Mean user wait time [ms]    Mean server time per request [ms]
10             1220.87                        8.191                     0.819
50             1294.02                       38.639                     0.773
80             1302.62                       61.415                     0.768
90             1267.33                       71.016                     0.789
100            1305.69                       76.588                     0.766
110            1244.36                       88.399                     0.804
120            1290.97                       92.954                     0.775
150             495.69                      302.606                     2.017
200             504.87                      396.144                     1.981
300             532.26                      563.632                     1.879
500             505.32                      989.473                     1.979
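The sweep above was run by hand with ab at each concurrency level. As a small aid, here is a sketch (the helper name is hypothetical) of a parser that pulls the headline figures out of an ab report, following the ApacheBench 2.3 output format shown earlier:

```python
import re

# Hypothetical helper: extract throughput and latency figures from the
# text report that `ab` prints (ApacheBench 2.3 field names).
def parse_ab_report(report: str) -> dict:
    patterns = {
        "requests_per_second": r"Requests per second:\s+([\d.]+)",
        "time_per_request_ms": r"Time per request:\s+([\d.]+) \[ms\] \(mean\)",
        "failed_requests": r"Failed requests:\s+(\d+)",
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, report)
        if match:
            result[key] = float(match.group(1))
    return result

# Excerpt of the ab output shown above, used as sample input.
sample = """\
Failed requests:        0
Requests per second:    937.09 [#/sec] (mean)
Time per request:       10.671 [ms] (mean)
Time per request:       1.067 [ms] (mean, across all concurrent requests)
"""
print(parse_ab_report(sample))
```

Feeding each run's output through such a parser makes it easy to tabulate results like the table above.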

As the data shows, as concurrency increases, both the server's average processing time and the user's average waiting time grow. Below a concurrency of 100, the server is not yet saturated and throughput still increases; above 100, the server's processing capacity begins to suffer and throughput starts to drop.

I tested on the Windows platform; under my test conditions, tornado peaked at 1305 requests per second. On Linux, tornado performs much better than on Windows.

2. nginx's reverse proxy

A proxy server is an intermediate server between a client and a server. What we usually call a proxy is a forward proxy. A forward proxy is the client's exit: the client sends its request to the forward proxy, tells it which server it wants to reach, and the proxy forwards the request to the target server and returns the response to the client. From the server's perspective, it does not know which client sent the real request, or how many clients there are; it only accepts requests from the proxy server.

A reverse proxy is the opposite: it is the server's entrance. The client does not know which real server it is talking to, or how many servers there are; it only knows the reverse proxy. It sends requests to the reverse proxy, which chooses one of the servers to forward each request to, and returns that server's response to the client.

A reverse proxy turns one server into many, providing a unified entry point for multiple servers. It can forward each request to the least-loaded server according to each server's current load; this is load balancing.
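As a rough illustration of the least-connections idea used later in this article (not nginx's actual implementation), here is a minimal sketch in Python; the class name and backend addresses are illustrative:

```python
# Minimal sketch of least-connections load balancing: route each request
# to whichever backend currently has the fewest in-flight connections.
class LeastConnBalancer:
    def __init__(self, backends):
        self.active = {backend: 0 for backend in backends}

    def acquire(self):
        # Pick the backend with the fewest active connections.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Called when the backend finishes serving the request.
        self.active[backend] -= 1

lb = LeastConnBalancer(["127.0.0.1:9001", "127.0.0.1:9002"])
first = lb.acquire()    # one backend takes the first request
second = lb.acquire()   # the other backend now has fewer connections
print(first != second)  # True: the two requests land on different backends
```

Real nginx also weights servers and handles failures, but the selection rule is the same in spirit.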

nginx is an excellent reverse proxy server. You can download the zip from the official website, unzip it, and use it directly.

First, let's change the server code so that it can start multiple processes at the same time:

# -*- coding: utf-8 -*-
import sys
import multiprocessing
import tornado.web
import tornado.ioloop
import tornado.httpserver
from tornado.options import parse_command_line

# Page handler
class Handler(tornado.web.RequestHandler):
    def get(self):
        self.write("I am tornado, I'm fast enough!")

class Application(tornado.web.Application):
    def __init__(self):
        handlers = [
            (r"/", Handler),
        ]
        settings = dict(
            title='Stress test',
            debug=True,
        )
        tornado.web.Application.__init__(self, handlers, **settings)

# Start the server
def start_web_server(port):
    parse_command_line()
    http_server = tornado.httpserver.HTTPServer(Application(), xheaders=True, max_buffer_size=504857600)
    http_server.listen(port)
    print('Web server is started on port %d.' % port)
    tornado.ioloop.IOLoop.instance().start()

if __name__ == "__main__":
    if len(sys.argv) == 1:
        start_web_server(80)
    else:
        try:
            # Ports given as a comma-separated list, e.g. 9001,9002
            ports = [int(port) for port in sys.argv[1].split(',')]
        except ValueError:
            try:
                # Or as a range, e.g. 9001-9002
                a, b = sys.argv[1].split('-')
                ports = range(int(a), int(b) + 1)
            except ValueError:
                ports = list()
                print('Parameter error.')
        multiprocessing.freeze_support()
        for port in ports:
            p = multiprocessing.Process(target=start_web_server, args=(port,))
            p.start()

On the command line, type the following command to start two server processes, each listening on a different port:

python server.py 9001-9002

Next, configure nginx. The configuration is not complicated: copy conf/nginx.conf and modify the copy.
In the http section, add an upstream block. The syntax is:

http {
    upstream <name> {
        <load balancing strategy>;
        server <IP address>:<port> <other parameters>;
    }
}

The optional load balancing strategies include:

  • ip_hash: maps each client IP to a fixed server. Its advantage is that it easily keeps sessions consistent; its disadvantage is that when a server is overloaded, its traffic cannot be diverted to other servers.
  • least_conn: picks the server with the fewest active connections; different requests from the same client may reach different servers.
  • least_time: picks the server with the shortest response time; different requests from the same client may reach different servers. (Note that least_time is only available in the commercial nginx Plus.)
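To make the ip_hash idea concrete, here is a toy sketch. Real nginx hashes only the leading octets of the client address; this simplified version (with a hypothetical `pick_backend` helper) hashes the whole string:

```python
import hashlib

# Sketch of the ip_hash idea: hash the client IP to a fixed backend so the
# same client always reaches the same server (session stickiness).
def pick_backend(client_ip, backends):
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return backends[int(digest, 16) % len(backends)]

backends = ["127.0.0.1:9001", "127.0.0.1:9002"]
# The same client IP always maps to the same backend:
print(pick_backend("10.0.0.7", backends) == pick_backend("10.0.0.7", backends))
```

This also shows the drawback mentioned above: the mapping is fixed, so an overloaded backend cannot shed its assigned clients.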

I choose least_conn; the configuration is as follows:

upstream serv {
    least_conn;
    server 127.0.0.1:9001;
    server 127.0.0.1:9002;
}

The content of the original location / block is modified as follows:

proxy_pass http://serv$request_uri;
# The following three lines pass the information the proxy received on to the real server
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

In proxy_pass http://serv$request_uri, serv is the name of the upstream block defined above.
After these modifications, with the comments in the original configuration file removed, the full configuration file is:

worker_processes 1;
events {
    worker_connections 1024;
}
http {
    sendfile on;
    keepalive_timeout 65;
    upstream serv {
        least_conn;
        server 127.0.0.1:9001;
        server 127.0.0.1:9002;
    }
    server {
        listen 80;
        server_name localhost;
        location / {
            proxy_pass http://serv$request_uri;
            # The following three lines pass the information the proxy received on to the real server
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}

Start tornado, then change to the configuration file's directory and start nginx with:

nginx -c nginx.conf

OK, the reverse proxy is configured. Running the ab stress test again, the result is not what we might expect: throughput does not multiply. This is because tornado's IO is already pushed nearly to its limit, almost on par with nginx; on the Windows platform, that is about the throughput a PC can reach. When tornado needs to perform time-consuming operations such as file or database I/O, the multi-process reverse proxy shows its advantage.

3. Using caching technology

Besides reverse proxying, nginx can also enable caching to further improve service capacity. When a client requests a url for the first time, nginx forwards the request to the server; after the server responds, nginx creates a local cache. Until the cache expires, nginx no longer forwards requests for that url but returns the cached content to the client directly. The server's load is shifted onto nginx, and nginx's performance is excellent.
Caching is configured in the nginx configuration file with the following syntax:

http {
    proxy_cache_path <cache path> keys_zone=<cache name>:<cache size> levels=<level-1 name length>:<level-2 name length> inactive=<inactive time> max_size=<maximum size>;
    server {
        location <url> {
            proxy_cache <cache name>;
            proxy_cache_min_uses <number of accesses before the url is cached>;
            proxy_cache_valid any <validity period>;
        }
    }
}

The modified nginx configuration file is:

worker_processes 1;
events {
    worker_connections 1024;
}
http {
    sendfile on;
    keepalive_timeout 65;
    upstream serv {
        least_conn;
        server 127.0.0.1:9001;
        server 127.0.0.1:9002;
        server 127.0.0.1:9003;
        server 127.0.0.1:9004;
    }
    # Set the cache path
    proxy_cache_path cache keys_zone=CACHE:10m levels=1:4 inactive=1m max_size=1g;
    server {
        listen 80;
        server_name localhost;
        location / {
            proxy_pass http://serv$request_uri;
            # The following three lines pass the information the proxy received on to the real server
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            # cache
            proxy_cache CACHE;
            proxy_cache_min_uses 1;
            proxy_cache_valid any 1m;
        }
    }
}

Restart nginx and visit 127.0.0.1 in a browser. On the first visit nginx has no cache, and the server prints an access log entry. On later visits the server no longer prints logs, showing that the nginx cache is working.
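The behavior just described can be modeled with a toy TTL cache; this is only an illustration of the principle, not nginx's implementation, and the names are illustrative:

```python
import time

# Toy model of what proxy_cache does: keep a local copy of each response
# and serve it without touching the backend until the validity time expires.
class PageCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}        # url -> (body, cached_at)
        self.backend_hits = 0  # how many times the backend was contacted

    def fetch(self, url, backend):
        entry = self.store.get(url)
        if entry and time.monotonic() - entry[1] < self.ttl:
            return entry[0]                  # cache hit: backend not contacted
        self.backend_hits += 1
        body = backend(url)                  # cache miss: ask the real server
        self.store[url] = (body, time.monotonic())
        return body

cache = PageCache(ttl_seconds=60)
backend = lambda url: "I am tornado, I'm fast enough!"
cache.fetch("/", backend)
cache.fetch("/", backend)
print(cache.backend_hits)  # 1: the second request was served from the cache
```

This is exactly why the server stops printing access logs after the first visit: within the validity period, the request never reaches it.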

We add a 100 ms sleep to the tornado server's code to simulate a database access, then stress-test with the cache disabled and enabled:

Concurrency    Throughput without cache    Throughput with cache
10             35.10                       1239.45
50             37.32                       1247.42
80             37.39                       1251.62
90             38.01                       1243.70
100            37.83                       1256.48
110            38.11                       1248.20
120            37.97                       1247.26
150            38.35                       1187.58
200            38.38                       1233.15
300            38.51                        620.97
500            38.52                        630.94

As you can see, caching is very effective at improving throughput!

4. Cache side effects and solutions

Cached content is, by definition, not the latest content. If a page's content changes quickly, caching will cause clients to receive stale results. Let's add a url that outputs the server's current time:

# -*- coding: utf-8 -*-
import sys
import time
import datetime
import multiprocessing
import tornado.web
import tornado.ioloop
import tornado.httpserver
from tornado.options import parse_command_line

# Page handlers
class StaticHandler(tornado.web.RequestHandler):
    def get(self):
        time.sleep(0.1)  # simulate a time-consuming operation
        self.write("I am tornado, I'm fast enough!")

class VariableHandler(tornado.web.RequestHandler):
    def get(self):
        now = datetime.datetime.now()
        self.write(now.strftime("%Y-%m-%d %H:%M:%S"))

class Application(tornado.web.Application):
    def __init__(self):
        handlers = [
            (r"/", StaticHandler),           # page that may be cached
            (r"/variable", VariableHandler), # page whose caching should be disabled
        ]
        settings = dict(
            title='Stress test',
            debug=True,
        )
        tornado.web.Application.__init__(self, handlers, **settings)

# Start the server
def start_web_server(port):
    parse_command_line()
    http_server = tornado.httpserver.HTTPServer(Application(), xheaders=True, max_buffer_size=504857600)
    http_server.listen(port)
    print('Web server is started on port %d.' % port)
    tornado.ioloop.IOLoop.instance().start()

if __name__ == "__main__":
    if len(sys.argv) == 1:
        start_web_server(80)
    else:
        try:
            # Ports given as a comma-separated list, e.g. 9001,9002
            ports = [int(port) for port in sys.argv[1].split(',')]
        except ValueError:
            try:
                # Or as a range, e.g. 9001-9004
                a, b = sys.argv[1].split('-')
                ports = range(int(a), int(b) + 1)
            except ValueError:
                ports = list()
                print('Parameter error.')
        multiprocessing.freeze_support()
        for port in ports:
            p = multiprocessing.Process(target=start_web_server, args=(port,))
            p.start()

Now, when the browser accesses 127.0.0.1/variable, the first visit returns the correct time, but within the next minute the time does not change; only after the cache expires in one minute does a new visit return a fresh time. To solve this, you can add more than one location block to the nginx configuration and specify for each whether caching is enabled:

worker_processes 1;
events {
    worker_connections 1024;
}
http {
    sendfile on;
    keepalive_timeout 65;
    upstream serv {
        least_conn;
        server 127.0.0.1:9001;
        server 127.0.0.1:9002;
        server 127.0.0.1:9003;
        server 127.0.0.1:9004;
    }
    proxy_cache_path cache keys_zone=CACHE:1m levels=1:2 inactive=1m max_size=1g;
    server {
        listen 80;
        server_name localhost;
        location / {
            proxy_pass http://serv$request_uri;
            # The following three lines pass the information the proxy received on to the real server
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            # cache
            proxy_cache CACHE;
            proxy_cache_min_uses 1;
            proxy_cache_valid any 1m;
        }
        # Only forward requests, no caching
        location /variable {
            proxy_pass http://serv$request_uri;
            # The following three lines pass the information the proxy received on to the real server
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
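Why does a request for /variable hit the second block when location / also matches it? For plain prefix locations, nginx picks the longest matching prefix. A minimal sketch of that selection rule (ignoring regex locations, which nginx evaluates separately):

```python
# Sketch of how nginx chooses between plain prefix locations: among all
# locations whose prefix matches the URI, the longest prefix wins, so
# /variable falls into the no-cache block even though / also matches.
def match_location(uri, locations):
    matches = [loc for loc in locations if uri.startswith(loc)]
    return max(matches, key=len) if matches else None

locations = {"/": "cached", "/variable": "not cached"}
print(match_location("/variable", locations))  # /variable
print(match_location("/", locations))          # /
```

So the no-cache block captures /variable and anything beneath it, while everything else still flows through the cached location /.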

After restarting nginx, revisiting 127.0.0.1/variable returns the latest time every time.

Copyright notice
This article was created by [Tianyuan prodigal son]. Please include a link to the original when reposting. Thank you.
