How to build a high-performance server (using nginx as an example)

Tan Yingzhi 2020-11-11 10:15:09


Software level

Increase CPU utilization
  • Use every CPU: set the number of worker processes equal to the number of CPU cores

  • Avoid useless switching between processes

  • Don't give up the CPU voluntarily while there is still work to do

  • Avoid CPU contention between worker processes

  • A context switch takes about 5 µs; if a large number of processes keep switching, the CPU wastes a lot of time on switching and does no useful work

  • Bind worker processes to CPUs

    pidstat -w shows how many context switches a process has made

  • Don't let other processes take resources away

  • Raise the process priority to get a larger CPU time slice

  • Reduce the number of other processes

  • Reduce the thundering herd effect

    Scenario: multiple worker processes accept on the same port

    • accept_mutex on (the default before nginx 1.11.3)

      Worker processes compete for a lock to accept connections; at any moment only one worker accepts new connections

    • accept_mutex off

      A connection request wakes up all worker processes, but only one of them gets the connection, so the thundering herd problem appears. With a small number of worker processes the impact is minor; there is less lock contention, and under high concurrency the system responds better

    • reuseport

      A feature of kernel 3.9+ for handling large numbers of concurrent connections. Once enabled, the kernel distributes connections to the worker processes; this gives the best performance

  • Improve the CPU cache hit rate

    Bind each worker to a specific CPU: worker_cpu_affinity cpumask ...;
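The cpumask arguments are easy to get wrong by hand, so here is a minimal Python sketch that generates them. It assumes the simplest layout, one worker per CPU with worker i pinned to CPU i; the helper names affinity_masks and affinity_directive are made up for this illustration and are not part of nginx.

```python
def affinity_masks(worker_count: int) -> list[str]:
    """Return one CPU bitmask per worker, pinning worker i to CPU i."""
    if worker_count < 1:
        raise ValueError("worker_count must be at least 1")
    # Bit i (counted from the right) selects CPU i, so worker i gets 1 << i.
    return [format(1 << i, f"0{worker_count}b") for i in range(worker_count)]


def affinity_directive(worker_count: int) -> str:
    """Render the masks as a worker_cpu_affinity configuration line."""
    return "worker_cpu_affinity " + " ".join(affinity_masks(worker_count)) + ";"


print(affinity_directive(4))  # worker_cpu_affinity 0001 0010 0100 1000;
```

Each mask has exactly one bit set, so no two workers share a CPU and each worker keeps its data in the same CPU cache.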

Increase memory utilization
  • Use tcmalloc

    Reduces memory fragmentation

    Scales better than glibc's allocator under concurrency: the higher the concurrency, the bigger the advantage (for small allocations)

    It has to be compiled into nginx manually

Increase IO utilization
  • Comparison of disk types

    Mechanical hard disk (HDD)

    • Low price
    • Large capacity
    • High BPS, suitable for sequential reads and writes
    • Low IOPS, not suitable for random reads and writes
    • Long lifespan

    Solid-state disk (SSD)

    • High price
    • Small capacity
    • High BPS
    • High IOPS
    • Limited write lifespan
  • Optimize reads

    • sendfile zero copy

      Data is transferred directly from the file to the socket inside the kernel

      location /video/ {
          sendfile on;
          aio on;
          directio 8m;
      }

    • gzip_static

      Compress files ahead of time so that gzip responses are returned faster

    • RAM disk / SSD

  • Reduce writes

    • empty_gif

      Returns a 1x1 blank image, which keeps the HTTP response short

    • AIO

      While the disk is reading or writing, the process can handle other work

      aio on|off|threads[=pool];

    • Direct IO: bypasses the cache, saving one copy on each read and write

      directio size|off; files larger than size use direct IO, which suits large files

    • Raise the error_log level

    • Write error.log to memory

      error_log memory:32m debug;

      The log is written to a 32 MB memory buffer that is reused cyclically, so only the most recent 32 MB of debug log is visible; this improves performance

    • Turn off access_log

    • Compress access_log

      access_log path [format] [gzip];

    • Decide whether to enable proxy buffering

    • Use syslog instead of local IO

      Logs are sent over UDP instead of being written to disk, which improves performance

  • Thread pool

    Use a thread pool when an IO operation would otherwise block

Increase network bandwidth utilization
  • SYN retry count

    net.ipv4.tcp_syn_retries = 6

  • Local port range

    net.ipv4.ip_local_port_range = 32768 60999

    The range can be widened

  • Connection timeout


  • Maximum number of half-open connections (SYN received, handshake not complete)

    net.ipv4.tcp_max_syn_backlog = 262144

    Can be increased as appropriate

  • Connections that have completed the handshake

    net.core.somaxconn: the system-wide maximum backlog queue length

  • When the queue overflows, a new connection can be answered with an RST directly


  • Length of the queue of packets not yet processed by the kernel


  • SYN/ACK retry count


  • Handle SYN attacks


    When the SYN queue is full, a new SYN is not queued. Instead, the server computes a cookie and returns it to the client; the client sends the cookie back, the server validates it, and the connection is established if validation passes. The side effect is that optional TCP features, such as window scaling and timestamps, may stop working

  • System-wide maximum number of file handles

    fs.file-max: the maximum number of file handles the operating system can use

    fs.file-nr shows allocated / allocated but unused / maximum

  • Per-user maximum number of file handles

    In /etc/security/limits.conf:

    root soft nofile 63535

    root hard nofile 65535

  • Per-process maximum number of file handles

    worker_rlimit_nofile number;

  • Per-process maximum number of connections

    worker_connections number;

  • TCP Fast Open

    When a client connects again, it carries a cookie so that data can be sent along with the SYN. This saves the RTT of one SYN/ACK exchange and establishes the TCP connection faster

    net.ipv4.tcp_fastopen = 0|1|2|3 (0 off, 1 client, 2 server, 3 both)

    listen address[:port] [fastopen=number];

    fastopen=number limits the maximum length of the TFO connection queue, which guards against SYN attacks

  • TCP buffer

    net.ipv4.tcp_rmem = 4096 87380 6291456

    net.ipv4.tcp_wmem = 4096 87380 6291456

    net.ipv4.tcp_mem = 1541646 2055528 3083292

    net.ipv4.tcp_moderate_rcvbuf = 1 enables automatic receive buffer tuning

    listen address[:port] [rcvbuf=size] [sndbuf=size];

    net.ipv4.tcp_adv_win_scale = 1

    Application cache = buffer / (2^tcp_adv_win_scale)

    Receive window = buffer - buffer / (2^tcp_adv_win_scale)

    BDP (bandwidth-delay product) = bandwidth * RTT
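To make the buffer arithmetic concrete, here is a small Python sketch. It applies the two buffer formulas above for a positive tcp_adv_win_scale and, as an assumption of this sketch, takes the bandwidth-delay product as bandwidth times round-trip time. The function names and the 100 Mbit/s, 40 ms link are illustrative values, not measurements.

```python
def split_buffer(buffer_bytes: int, adv_win_scale: int) -> tuple[int, int]:
    """Split a receive buffer into (application cache, receive window).

    Only covers tcp_adv_win_scale > 0, the case the formulas describe.
    """
    if adv_win_scale <= 0:
        raise ValueError("this sketch only covers tcp_adv_win_scale > 0")
    app_cache = buffer_bytes // (2 ** adv_win_scale)
    return app_cache, buffer_bytes - app_cache


def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes, assuming BDP = bandwidth x RTT."""
    return bandwidth_bits_per_s / 8 * rtt_s


# Maximum receive buffer from tcp_rmem, with tcp_adv_win_scale = 1.
app_cache, window = split_buffer(6291456, 1)
print(app_cache, window)  # 3145728 3145728

# Hypothetical link: 100 Mbit/s with a 40 ms round trip.
print(bdp_bytes(100e6, 0.040))
```

If the receive window is smaller than the BDP, the sender has to stop and wait for ACKs before the link is full, so the maximum buffer is usually sized with the BDP in mind.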


  • Nagle algorithm

    At most one small packet may be in flight on a connection without having been ACKed

    Purpose: avoid a large number of small packets on a connection and improve network utilization

    Throughput first: enable Nagle, tcp_nodelay off

    Low latency first: disable Nagle, tcp_nodelay on
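Underneath, this switch is the TCP_NODELAY socket option. The following Python sketch shows the same choice on a plain socket; it is only an illustration of the option, not how nginx is implemented, and the helper names set_nagle and nagle_enabled are made up here.

```python
import socket


def set_nagle(sock: socket.socket, enabled: bool) -> None:
    """Enable or disable the Nagle algorithm on a TCP socket."""
    # TCP_NODELAY = 1 disables Nagle (low latency first);
    # TCP_NODELAY = 0 leaves Nagle enabled (throughput first).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 0 if enabled else 1)


def nagle_enabled(sock: socket.socket) -> bool:
    """Report whether Nagle is currently active on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    set_nagle(sock, enabled=False)
    print(nagle_enabled(sock))  # False
```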

  • Congestion window

    Actual sending window = min(congestion window, receive window)

  • Slow start

    The congestion window grows exponentially: cwnd = cwnd*2 per RTT

  • Congestion avoidance

    Once the window exceeds the threshold, it grows linearly

  • When congestion occurs

    Packet loss is detected in one of two ways:

    RTO timeout: threshold = cwnd/2, cwnd = 1

    Fast Retransmit: cwnd = cwnd/2, threshold = cwnd

  • Fast recovery

    When Fast Retransmit happens, cwnd is set to threshold + 3*MSS

  • Optimize slow start

    Raise the initial cwnd to 10
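The rules above can be sketched as a toy model in Python. It counts cwnd in whole segments and advances once per RTT; real TCP stacks count bytes and update on every ACK, so this is a simplification. The floor of 2 segments on the threshold is an assumption borrowed from RFC 5681, and all function names are illustrative.

```python
def next_cwnd(cwnd: int, ssthresh: int) -> int:
    """Advance cwnd by one RTT (unit: segments)."""
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh)  # slow start: exponential growth
    return cwnd + 1                     # congestion avoidance: linear growth


def on_rto_timeout(cwnd: int) -> tuple[int, int]:
    """Return (cwnd, ssthresh) after an RTO timeout."""
    # Assumption: ssthresh never drops below 2 segments (as in RFC 5681).
    return 1, max(cwnd // 2, 2)


def on_fast_retransmit(cwnd: int) -> tuple[int, int]:
    """Return (cwnd, ssthresh) after fast retransmit and fast recovery."""
    ssthresh = max(cwnd // 2, 2)
    return ssthresh + 3, ssthresh       # threshold + 3 segments


def trace(initial_cwnd: int, ssthresh: int, rtts: int) -> list[int]:
    """List the cwnd value at the start of each of the first `rtts` RTTs."""
    values = [initial_cwnd]
    for _ in range(rtts - 1):
        values.append(next_cwnd(values[-1], ssthresh))
    return values


print(trace(1, 16, 7))         # [1, 2, 4, 8, 16, 17, 18]
print(trace(10, 16, 4))        # [10, 16, 17, 18]
print(on_rto_timeout(20))      # (1, 10)
print(on_fast_retransmit(20))  # (13, 10)
```

The two traces show why a larger initial cwnd helps short transfers: starting at 10 reaches the threshold after one RTT, while starting at 1 needs four.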

  • TCP keep-alive

    Enabling keepalive detects sockets whose connection has been lost and closes them promptly, which saves system resources

    net.ipv4.tcp_keepalive_time = 7200

    net.ipv4.tcp_keepalive_intvl = 75

    net.ipv4.tcp_keepalive_probes = 9
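As a rough worked example, the three settings combine into the time it takes to notice a dead peer: the connection must first stay idle for tcp_keepalive_time, then every probe has to go unanswered. The Python sketch below is an approximation under that reading of the settings, and the function name is made up for this illustration.

```python
def keepalive_detection_seconds(idle_time: int, interval: int, probes: int) -> int:
    """Approximate seconds until a dead peer is detected on an idle connection."""
    # Idle period before the first probe, plus one interval per unanswered probe.
    return idle_time + interval * probes


# Values from the settings above: 7200 s idle, 75 s interval, 9 probes.
print(keepalive_detection_seconds(7200, 75, 9))  # 7875
```

With these values a lost connection is only detected after a little over two hours, which is why busy servers often lower tcp_keepalive_time.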

  • TIME_WAIT

    net.ipv4.tcp_orphan_retries = 0

    net.ipv4.tcp_fin_timeout = 60

    net.ipv4.tcp_max_tw_buckets = 262144: the maximum number of TIME_WAIT connections; beyond this limit, connections are closed directly

  • lingering_close: delayed close

    If the receive buffer still holds data from the client when the server closes the connection, the server sends an RST, which can make the client see the RST and ignore the HTTP response

    lingering_close off|on|always;

    reset_timedout_connection on|off; when a read or write timeout fires, the connection is not closed through the normal procedure; instead an RST is sent to close it immediately and release resources

  • TLS/SSL handshake optimization


  • TLS/SSL session tickets

    Nginx encrypts the session information into a ticket and sends it to the client. When the client connects again it presents the ticket, and Nginx validates it and reuses the session

    Advantage: fewer full handshakes and fewer of their expensive encryption and decryption operations, which improves performance

    Disadvantage: security is reduced, so the ticket key needs to be rotated frequently

    ssl_session_tickets on|off;

    ssl_session_ticket_key file;

  • Use HTTP persistent connections (keep-alive)

    keepalive_requests number;

  • gzip compression

    Improves network transmission efficiency

    gzip on|off;

  • Use HTTP/2

Function call statistics (profiling)


pprof --text|pdf

google_perftools_profiles file;


Hardware level

  • Network card: a 10-Gigabit-class NIC, for example 10G/25G/40G
  • Disk: solid-state disk; pay attention to the IOPS/BPS figures
  • CPU: higher clock frequency, larger cache, better architecture
  • Memory: faster access speed


This article was written by [Tan Yingzhi]. Please include a link to the original when reposting. Thanks.
