详解http报文(2)-web容器是如何解析http报文的

方丈的寺院 2020-11-13 05:41:10
http Web 容器 详解 报文


摘要

详解http报文一文中,详细介绍了http报文的文本结构。那么作为服务端,web容器是如何解析http报文的呢?本文以jetty和undertow容器为例,来解析web容器是如何处理http报文的。

在前文中我们从概览中可以了解到,http报文其实就是一定规则的字符串,那么解析它们,就是解析字符串,看看是否满足http协议约定的规则。

start-line: 起始行,描述请求或响应的基本信息
*( header-field CRLF ): 头
CRLF
[message-body]: 消息body,实际传输的数据

jetty

以下代码都是jetty9.4.12版本

如何解析这么长的字符串呢,jetty是通过状态机来实现的。具体可以看下org.eclipse.jetty.http.HttpParse

 public enum State
{
START,
METHOD,
![](https://img-blog.csdnimg.cn/20191009220511890.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9mYW5nemhhbmcuYmxvZy5jc2RuLm5ldA==,size_16,color_FFFFFF,t_70),
SPACE1,
STATUS,
URI,
SPACE2,
REQUEST_VERSION,
REASON,
PROXY,
HEADER,
CONTENT,
EOF_CONTENT,
CHUNKED_CONTENT,
CHUNK_SIZE,
CHUNK_PARAMS,
CHUNK,
TRAILER,
END,
CLOSE, // The associated stream/endpoint should be closed
CLOSED // The associated stream/endpoint is at EOF
}

总共分成了21种状态,然后进行状态间的流转。在parseNext方法中分别对起始行 -> header -> body content分别解析

public boolean parseNext(ByteBuffer buffer)
{
try
{
// Start a request/response
if (_state==State.START)
{
// 快速判断
if (quickStart(buffer))
return true;
}
// Request/response line 转换
if (_state.ordinal()>= State.START.ordinal() && _state.ordinal()<State.HEADER.ordinal())
{
if (parseLine(buffer))
return true;
}
// headers转换
if (_state== State.HEADER)
{
if (parseFields(buffer))
return true;
}
// content转换
if (_state.ordinal()>= State.CONTENT.ordinal() && _state.ordinal()<State.TRAILER.ordinal())
{
// Handle HEAD response
if (_responseStatus>0 && _headResponse)
{
setState(State.END);
return handleContentMessage();
}
else
{
if (parseContent(buffer))
return true;
}
}
return false;
}

整体流程

整体有三条路径

  1. 开始 -> start-line -> header -> 结束
  2. 开始 -> start-line -> header -> content -> 结束
  3. 开始 -> start-line -> header -> chunk-content -> 结束

起始行

start-line = request-line(请求起始行)/(响应起始行)status-line

  1. 请求报文解析状态迁移
    请求行:START -> METHOD -> SPACE1 -> URI -> SPACE2 -> REQUEST_VERSION

  2. 响应报文解析状态迁移
    响应行:START -> RESPONSE_VERSION -> SPACE1 -> STATUS -> SPACE2 -> REASON

header 头

HEADER 的状态只有一种了,在jetty的老版本中还区分了HEADER_IN_NAM, HEADER_VALUE, HEADER_IN_VALUE等,9.4中都去除了。为了提高匹配效率,jetty使用了Trie树快速匹配header头。

static
{
CACHE.put(new HttpField(HttpHeader.CONNECTION,HttpHeaderValue.CLOSE));
CACHE.put(new HttpField(HttpHeader.CONNECTION,HttpHeaderValue.KEEP_ALIVE));
// 以下省略了很多了通用header头

content

请求体:

  1. CONTENT -> END,这种是普通的带Content-Length头的报文,HttpParser一直运行CONTENT状态,直到最后ContentLength达到了指定的数量,则进入END状态
  2. chunked分块传输的数据
    CHUNKED_CONTENT -> CHUNK_SIZE -> CHUNK -> CHUNK_END -> END

undertow

undertow是另一种web容器,它的处理方式与jetty有什么不同呢
状态机种类不一样了,io.undertow.util.HttpString.ParseState

 public static final int VERB = 0;
public static final int PATH = 1;
public static final int PATH_PARAMETERS = 2;
public static final int QUERY_PARAMETERS = 3;
public static final int VERSION = 4;
public static final int AFTER_VERSION = 5;
public static final int HEADER = 6;
public static final int HEADER_VALUE = 7;
public static final int PARSE_COMPLETE = 8;

具体处理流程在HttpRequestParser抽象类中

public void handle(ByteBuffer buffer, final ParseState currentState, final HttpServerExchange builder) throws BadRequestException {
if (currentState.state == ParseState.VERB) {
//fast path, we assume that it will parse fully so we avoid all the if statements
// 快速处理GET
final int position = buffer.position();
if (buffer.remaining() > 3
&& buffer.get(position) == 'G'
&& buffer.get(position 1) == 'E'
&& buffer.get(position 2) == 'T'
&& buffer.get(position 3) == ' ') {
buffer.position(position 4);
builder.setRequestMethod(Methods.GET);
currentState.state = ParseState.PATH;
} else {
try {
handleHttpVerb(buffer, currentState, builder);
} catch (IllegalArgumentException e) {
throw new BadRequestException(e);
}
}
// 处理path
handlePath(buffer, currentState, builder);
// 处理版本
if (failed) {
handleHttpVersion(buffer, currentState, builder);
handleAfterVersion(buffer, currentState);
}
// 处理header
while (currentState.state != ParseState.PARSE_COMPLETE && buffer.hasRemaining()) {
handleHeader(buffer, currentState, builder);
if (currentState.state == ParseState.HEADER_VALUE) {
handleHeaderValue(buffer, currentState, builder);
}
}
return;
}
handleStateful(buffer, currentState, builder);
}

与jetty不同的是对content的处理,在header处理完以后,将数据放到io.undertow.server.HttpServerExchange,然后根据类型,有不同的content读取方式,比如处理固定长度的,FixedLengthStreamSourceConduit

关注公众号【方丈的寺院】,第一时间收到文章的更新,与方丈一起开始技术修行之路
在这里插入图片描述

参考

http://www.blogjava.net/DLevin/archive/2014/04/19/411673.html

https://www.ph0ly.com/2018/10/06/jetty/connection/http-parser/

https://webtide.com/http-trailers-in-jetty/

http://undertow.io/undertow-docs/undertow-docs-2.0.0/

版权声明
本文为[方丈的寺院]所创,转载请带上原文链接,感谢
https://fangzhang.blog.csdn.net/article/details/102471022

  1. [front end -- JavaScript] knowledge point (IV) -- memory leakage in the project (I)
  2. This mechanism in JS
  3. Vue 3.0 source code learning 1 --- rendering process of components
  4. Learning the realization of canvas and simple drawing
  5. gin里获取http请求过来的参数
  6. vue3的新特性
  7. Get the parameters from HTTP request in gin
  8. New features of vue3
  9. vue-cli 引入腾讯地图(最新 api,rocketmq原理面试
  10. Vue 学习笔记(3,免费Java高级工程师学习资源
  11. Vue 学习笔记(2,Java编程视频教程
  12. Vue cli introduces Tencent maps (the latest API, rocketmq)
  13. Vue learning notes (3, free Java senior engineer learning resources)
  14. Vue learning notes (2, Java programming video tutorial)
  15. 【Vue】—props属性
  16. 【Vue】—创建组件
  17. [Vue] - props attribute
  18. [Vue] - create component
  19. 浅谈vue响应式原理及发布订阅模式和观察者模式
  20. On Vue responsive principle, publish subscribe mode and observer mode
  21. 浅谈vue响应式原理及发布订阅模式和观察者模式
  22. On Vue responsive principle, publish subscribe mode and observer mode
  23. Xiaobai can understand it. It only takes 4 steps to solve the problem of Vue keep alive cache component
  24. Publish, subscribe and observer of design patterns
  25. Summary of common content added in ES6 + (II)
  26. No.8 Vue element admin learning (III) vuex learning and login method analysis
  27. Write a mini webpack project construction tool
  28. Shopping cart (front-end static page preparation)
  29. Introduction to the fluent platform
  30. Webpack5 cache
  31. The difference between drop-down box select option and datalist
  32. CSS review (III)
  33. Node.js学习笔记【七】
  34. Node.js learning notes [VII]
  35. Vue Router根据后台数据加载不同的组件(思考-&gt;实现-&gt;不止于实现)
  36. Vue router loads different components according to background data (thinking - & gt; Implementation - & gt; (more than implementation)
  37. 【JQuery框架,Java编程教程视频下载
  38. [jQuery framework, Java programming tutorial video download
  39. Vue Router根据后台数据加载不同的组件(思考-&gt;实现-&gt;不止于实现)
  40. Vue router loads different components according to background data (thinking - & gt; Implementation - & gt; (more than implementation)
  41. 【Vue,阿里P8大佬亲自教你
  42. 【Vue基础知识总结 5,字节跳动算法工程师面试经验
  43. [Vue, Ali P8 teaches you personally
  44. [Vue basic knowledge summary 5. Interview experience of byte beating Algorithm Engineer
  45. 【问题记录】- 谷歌浏览器 Html生成PDF
  46. [problem record] - PDF generated by Google browser HTML
  47. 【问题记录】- 谷歌浏览器 Html生成PDF
  48. [problem record] - PDF generated by Google browser HTML
  49. 【JavaScript】查漏补缺 —数组中reduce()方法
  50. [JavaScript] leak checking and defect filling - reduce() method in array
  51. 【重识 HTML (3),350道Java面试真题分享
  52. 【重识 HTML (2),Java并发编程必会的多线程你竟然还不会
  53. 【重识 HTML (1),二本Java小菜鸟4面字节跳动被秒成渣渣
  54. [re recognize HTML (3) and share 350 real Java interview questions
  55. [re recognize HTML (2). Multithreading is a must for Java Concurrent Programming. How dare you not
  56. [re recognize HTML (1), two Java rookies' 4-sided bytes beat and become slag in seconds
  57. 【重识 HTML ,nginx面试题阿里
  58. 【重识 HTML (4),ELK原来这么简单
  59. [re recognize HTML, nginx interview questions]
  60. [re recognize HTML (4). Elk is so simple