Reading guide

Gaode's Serverless platform has been under construction for some time. Peak traffic of Gaode's Serverless business has now passed 100,000 QPS. The platform grew from zero to one and its QPS from zero to over 100,000, making Gaode the business unit (BU) with the largest Serverless deployment in Alibaba Group. How did this come about, and what problems did we run into? This article explains why Gaode adopted Serverless/FaaS, how we did it, what the technical solution looks like, where we are now, and what we plan next. We hope it helps readers who are interested in the topic.

1. Why Gaode adopted Serverless

The background is that Gaode started a "client to cloud" project, whose main goal is to speed up client development iteration. Previously, the client's business logic lived on the device. Any change in product requirements meant shipping a new client release, and every client release has to go through multiple rounds of testing, a gray (staged) rollout, and fixes for client crashes and similar problems.

After the client moves to the cloud, the more volatile business logic runs in the cloud. New product requirements are developed in the cloud and no longer have to wait for the monthly client release. This speeds up requirement iteration and takes us another step toward the ideal of product and R&D moving at the same cadence. (We say "another" because Gaode has already made other efforts toward that goal, but we hope end-cloud integrated development can be one of the most effective technical aids.)

1.1 The goal: an end-cloud integrated client development model

Although the development model changes from pure on-device development to cloud development plus on-device development, the developers should still be the same people who already own the corresponding business. As we all know, server development differs markedly from client development. Client development usually targets a single machine, whereas server development usually targets a cluster and has to deal with distributed coordination, load balancing, failover, degradation, and many other complex problems.

Developing in the traditional server style would make this transition risky. FaaS solves the problem well. We extended the existing xbus framework (a framework for local service registration and invocation on the client) with an xbus-cloud component, so that developing in the cloud feels like developing on the device. The goal is one code base that runs in two places: the same business code can run on the client and on the server.

The Gaode client targets three platforms: iOS, Android, and in-car head units (Linux-like operating systems). Two languages are mainly used, C++ and Node.js. Traditional map functions such as map rendering, route display, and voice guidance have to span all three platforms and are therefore written in C++. Application features built on top of map navigation, such as pre-trip and post-trip cards and destination recommendations, are mainly written in Node.js.

Within Alibaba Group, the Taobao front-end team had already developed a Node.js FaaS Runtime. For the Node.js part of the client-to-cloud project, we adopted that existing Node.js Runtime and connected to the group's FaaS platform, which moved part of the Node.js business to the cloud. It successfully supported Gaode's National Day (October 1) travel festival in 2020.

There was no existing solution for C++ FaaS, so we decided to build on the group's infrastructure and create a new C++ FaaS base platform to help the Gaode client move to the cloud.

1.1.1 The key to end-cloud integration best practice: the interface abstraction between the client and FaaS

Whether existing client logic is moved to the FaaS server or new requirements are developed partly on the FaaS server, **the key to success is the interface protocol between the client and FaaS, that is, the FaaS API definition.** A good API not only makes the system easier to maintain, it is also very important for supporting later business iterations.

Ideally, the client acts as a browser that parses the result data returned by FaaS. Once a browser's protocol is defined, it rarely changes, which is why browsers such as IE and Chrome seldom have to be updated for it.

Our browser is of course more complicated, since it is a map browser. How can we tell whether the interface between the client and FaaS is well defined? Look at the subsequent product iterations: if some of them can be completed entirely in FaaS, with no change to the client, the interface abstraction is a success.

1.2 Improving development efficiency at the BFF layer

When people think of Gaode, they think first of its role as a tool: Gaode is a navigation tool. (That is no longer entirely accurate, because in recent years Gaode has been transforming from a tool into a platform. Its transaction business is growing, and ride hailing, ticketing, hotels, and other businesses are developing rapidly.) For Gaode navigation, compared with the group's other businesses such as e-commerce, a large share of read-only scenarios is a major technical characteristic.

Many of these read-only scenarios are BFF (Backend For Frontend) scenarios. The reason is that the core navigation functions, such as routing, traffic, and ETA, are relatively stable. Work on them consists mainly of continuously optimizing the algorithms so that navigation is more accurate and the computed routes are better. These core functions are stable in both interface and behavior, while front-end requirements change often, for example adding a prompt for width-restriction bollards along a route.

FaaS is particularly well suited to BFF-layer development. A FaaS function calls the relatively stable back-end BaaS services, encapsulates the data and the call logic, and can be developed and released quickly. In the industry, BFF is the most common FaaS use case (it is also called SFF, Service For Frontend).
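To make the BFF idea concrete, here is a minimal Go sketch of a function that fans out to several stable back-end services and merges the results into one response for the client. Everything in it is an assumption made for illustration: the `Backend` and `RouteCard` types, `BuildRouteCard`, and the stub services in `demoBackends` are hypothetical and are not part of the Gaode FaaS SDK.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Backend is a hypothetical stand-in for a stable BaaS service
// such as routing, traffic, or ETA.
type Backend func(routeID string) (string, error)

// RouteCard is a hypothetical response shape that the client renders.
type RouteCard struct {
	Fields map[string]string // successful results, keyed by backend name
	Errors map[string]string // failures, keyed by backend name
}

// BuildRouteCard calls every backend concurrently and merges the results.
// A failing backend is recorded instead of failing the whole card.
func BuildRouteCard(routeID string, backends map[string]Backend) RouteCard {
	card := RouteCard{Fields: map[string]string{}, Errors: map[string]string{}}
	var mu sync.Mutex
	var wg sync.WaitGroup
	for name, call := range backends {
		wg.Add(1)
		go func(name string, call Backend) {
			defer wg.Done()
			value, err := call(routeID)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				card.Errors[name] = err.Error()
				return
			}
			card.Fields[name] = value
		}(name, call)
	}
	wg.Wait()
	return card
}

// demoBackends returns stub services used only for this illustration.
func demoBackends() map[string]Backend {
	return map[string]Backend{
		"eta":     func(string) (string, error) { return "25min", nil },
		"traffic": func(string) (string, error) { return "", errors.New("timeout") },
	}
}

func main() {
	card := BuildRouteCard("route-1", demoBackends())
	fmt.Println("fields:", card.Fields, "errors:", card.Errors)
}
```

A new front-end requirement, such as an extra prompt on the card, would then only mean adding one more backend call or field in the function, without touching the core services.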

1.3 Serverless is the high-level language of the cloud era

Although Gaode is already fully on the cloud, that is not the end state of the cloud era. So far, going to the cloud has mainly meant comprehensive Dockerization: containers are standardized and run at scale, so in terms of resource utilization we fully enjoy the dividends of the cloud. The business development model, however, is basically unchanged: we still build a large distributed system.

In terms of the R&D model we have not yet enjoyed the dividends of the cloud. By analogy, we are writing cloud services in assembly language. Serverless and cloud native can be seen as the high-level languages of the cloud era. They truly deliver "cloud as a computer": developers focus on business logic and do not need to deal with the complexity of a large distributed system.

1.4 Go-FaaS complements the Go ecosystem

As mentioned earlier, because the cloud project on the client , We are in alicloud FC( Function calculation ) Add on the team , Developed C++ Faas Runtime.

Beyond that, we also developed Go-FaaS. Why? Some background: the Go services on Gaode's server side already peak at over one million QPS. Together with the group's middleware department, Gaode filled in the missing Go clients for Alibaba's middleware. Observability and the automated testing system are also largely in place, so the Go ecosystem is now basically complete. With Go-FaaS added, we can write BaaS services in Go and also write FaaS functions in Go, choosing the implementation style that fits each business scenario. Go-FaaS is mainly used in the BFF scenarios described above.

2. Technical solution: building on the group's existing infrastructure

2.1 Overall technical architecture

The sections above explained why we did this. Now let's look at how: how it is implemented and what the concrete technical solution is.

Following the principle of building on the group's existing infrastructure and middleware, we worked with the CSE team and the Alibaba Cloud FC (Function Compute) team to develop the C++ FaaS Runtime and the Go FaaS Runtime. The overall architecture within the group is shown in the figure below. It is divided into three parts: the development state, the running state, and the operations state.

2.1.1 Running state

Let's start with the running state. Traffic enters through the gateway, is passed to the FC API Server, and is then forwarded to the C++/Go FaaS Runtime, which executes the user's function. The structure of the Runtime is described in Section 2.2.

Deployed alongside the Runtime container are several sidecars for monitoring, logging, and Dapr. The monitoring and logging sidecars collect and report logs, and the Dapr sidecar handles calls to the group's middleware.

Dapr is still in the pilot phase, so at present middleware calls mainly go through the Broker and the middleware proxies. Proxies exist for HSF, Tair, MetaQ, Diamond, and other middleware.

Finally, the Autoscaling module manages scaling function instances out and in, which provides automatic elasticity. Several scheduling strategies are available, such as scaling on the number of concurrent requests or on the CPU utilization of function instances. A number of reserved instances can also be configured in advance to avoid cold starts after scaling in to zero.
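As a rough illustration of concurrency-based scaling with reserved instances, the Go sketch below computes a target instance count from the number of in-flight requests. It is not the FC Autoscaling implementation: the `ScalePolicy` fields and the `DesiredInstances` function are hypothetical names, and a real scheduler would also smooth the signal over time and consider CPU utilization.

```go
package main

import "fmt"

// ScalePolicy holds hypothetical scaling knobs for one function.
type ScalePolicy struct {
	TargetConcurrency int // in-flight requests one instance should handle
	Reserved          int // instances kept warm to avoid cold starts
	Max               int // upper bound on instances; 0 means no limit
}

// DesiredInstances returns how many instances should run for the given
// number of in-flight requests, never dropping below the reserved count.
func DesiredInstances(inflight int, p ScalePolicy) int {
	if p.TargetConcurrency <= 0 || inflight <= 0 {
		return p.Reserved
	}
	// Ceiling division: 250 in-flight requests at 100 per instance need 3.
	n := (inflight + p.TargetConcurrency - 1) / p.TargetConcurrency
	if n < p.Reserved {
		n = p.Reserved
	}
	if p.Max > 0 && n > p.Max {
		n = p.Max
	}
	return n
}

func main() {
	policy := ScalePolicy{TargetConcurrency: 100, Reserved: 2, Max: 20}
	for _, inflight := range []int{0, 250, 5000} {
		fmt.Printf("inflight=%d -> instances=%d\n", inflight, DesiredInstances(inflight, policy))
	}
}
```

With `Reserved` set to a non-zero value, the function never scales in to zero, which trades a small idle cost for the absence of cold starts.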

Underneath, FC relies on the group's ASI. ASI can be loosely understood as the group's Kubernetes plus Sigma (the group's scheduling system). Function instances are ultimately deployed by FC calling ASI. The smallest unit of elastic deployment is the Pod shown in the figure above; one Pod contains the Runtime container and the set of sidecar containers.

2.1.2 Development state

Next, the development state. The running state determines how a function runs, while the development state is about the developer experience: how to make it convenient to develop, debug, deploy, and test a function.

C++ FaaS faces a tricky cross-platform problem. The C++ FaaS Runtime depends on a number of libraries, and C++ has nothing as convenient as Java's dependency management, so installing them is tedious. The FaaS scaffolding solves this: one command generates a C++ FaaS example project and installs all dependencies. To make local debugging easy, we developed a Boot module for the C++ FaaS Runtime. The Runtime's entry point lives in the Boot module, which integrates the Runtime together with the user's FaaS function, so developers can step through the Runtime in a debugger.

In cooperation with the group's Aone team, we integrated functions into the Aone environment, so Go or C++ FaaS functions can be released conveniently on Aone. Aone also integrates one-click generation of the example code base.

Building C++ and Go FaaS functions requires the corresponding build environment. Aone supports custom build images, so we uploaded our build images to the group's public image registry. A function's code base specifies which build image to use at build time. The build image contains the libraries, SDK, and other dependencies that FaaS needs.

2.1.3 Operations state

Finally, let's look at operations and monitoring. The Runtime integrates EagleEye and Sunfire log collection. It writes these logs, and the agents in the sidecar ship them to the EagleEye or Sunfire monitoring platform (FC collects them through SLS). After that, the group's existing monitoring platforms can be used to monitor FaaS functions, and the group's GOC alerting platform can be connected as well.

2.2 C++/Go FaaS Runtime architecture

The section above covered the overall architecture integrated with Aone, FC/CSE, and ASI. The Runtime is one part of it. This section looks at the Runtime's own architecture and at how it was designed and implemented.

At the top, the user's FaaS code depends only on the FaaS SDK. To write a function, the user simply implements the Function interface in the SDK.

To call an external system, the function uses the HTTP client in the SDK. To call middleware, it uses the Diamond, Tair, HSF, or MetaQ client in the SDK. These SDK interfaces hide the complexity of the underlying implementation: users do not need to know how the calls are eventually carried out or how the Runtime is implemented.

The SDK layer consists of the Function definition and the interface definitions for the middleware calls mentioned above. SDK code is exposed to FaaS developers. The SDK is lightweight: it is mostly interface definitions, with no concrete implementation. The concrete middleware calls are implemented in the Runtime, in two ways.
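The split between a thin SDK of interfaces and a Runtime that supplies the implementations can be sketched in Go as follows. The article does not show the real SDK, so the `Function` and `KVClient` interfaces, the `WeatherFunc` example, and the in-memory `memKV` stand-in for a Tair-like client are all hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// KVClient is a hypothetical SDK interface for a key-value middleware.
// The SDK only declares it; the Runtime would provide the implementation.
type KVClient interface {
	Get(ctx context.Context, key string) (string, error)
}

// Function is a hypothetical SDK interface that user code implements.
type Function interface {
	Handle(ctx context.Context, req []byte) ([]byte, error)
}

// WeatherFunc is user code: it depends only on SDK interfaces.
type WeatherFunc struct {
	KV KVClient
}

// Handle looks up the weather for the city named in the request body.
func (f WeatherFunc) Handle(ctx context.Context, req []byte) ([]byte, error) {
	value, err := f.KV.Get(ctx, "weather:"+string(req))
	if err != nil {
		return nil, fmt.Errorf("weather lookup failed: %w", err)
	}
	return []byte(value), nil
}

// memKV is an in-memory KVClient standing in for the Runtime's implementation.
type memKV map[string]string

func (m memKV) Get(_ context.Context, key string) (string, error) {
	value, ok := m[key]
	if !ok {
		return "", fmt.Errorf("key %q not found", key)
	}
	return value, nil
}

// invoke plays the role of the Runtime dispatching one request to a Function.
func invoke(f Function, req string) (string, error) {
	out, err := f.Handle(context.Background(), []byte(req))
	return string(out), err
}

func main() {
	fn := WeatherFunc{KV: memKV{"weather:beijing": "sunny"}}
	fmt.Println(invoke(fn, "beijing"))
}
```

Because `WeatherFunc` sees only `KVClient`, the Runtime can change how the call is carried out without any change to user code.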

The blue part in the middle of the figure is the overall structure of the Runtime. Starter is the Runtime's startup module. The Runtime is itself a server: on startup it reads the configuration from the Function Config module, and once started it begins listening for requests and for management commands.

Below that is the Service layer, which implements the middleware call interfaces defined in the SDK. There are two implementations, RSocket and Dapr. The RSocket implementation calls middleware through the RSocket Broker. The Runtime also integrates Dapr (Distributed Application Runtime), so middleware can be called through Dapr as well. During the early Dapr pilot, if a middleware call through Dapr fails, it is downgraded to the RSocket path.
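The downgrade from Dapr to RSocket described above can be sketched as a small wrapper that tries a primary transport and falls back to a secondary one. This is only an illustration under assumed names: `Caller`, `FallbackCaller`, and `stubCaller` are hypothetical, and the stubs do not talk to Dapr or RSocket.

```go
package main

import (
	"errors"
	"fmt"
)

// Caller is a hypothetical abstraction over one way of calling middleware.
type Caller interface {
	Call(service, method string, payload []byte) ([]byte, error)
}

// FallbackCaller tries Primary first and downgrades to Secondary on error.
type FallbackCaller struct {
	Primary   Caller // for example, a Dapr-based implementation
	Secondary Caller // for example, an RSocket-based implementation
}

// Call returns the primary result when it succeeds, otherwise the
// secondary result. If both fail, both errors are reported.
func (c FallbackCaller) Call(service, method string, payload []byte) ([]byte, error) {
	out, err := c.Primary.Call(service, method, payload)
	if err == nil {
		return out, nil
	}
	out, fallbackErr := c.Secondary.Call(service, method, payload)
	if fallbackErr != nil {
		return nil, fmt.Errorf("primary failed (%v), fallback failed: %w", err, fallbackErr)
	}
	return out, nil
}

// stubCaller is a test double that either fails or answers with its name.
type stubCaller struct {
	name string
	fail bool
}

func (s stubCaller) Call(service, method string, _ []byte) ([]byte, error) {
	if s.fail {
		return nil, errors.New(s.name + " unavailable")
	}
	return []byte(s.name + ":" + service + "." + method), nil
}

func main() {
	caller := FallbackCaller{
		Primary:   stubCaller{name: "dapr", fail: true},
		Secondary: stubCaller{name: "rsocket"},
	}
	out, err := caller.Call("tair", "get", nil)
	fmt.Println(string(out), err)
}
```

The wrapper implements the same `Caller` interface as the transports it wraps, so the code that calls middleware does not need to know whether a downgrade happened.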

Further down is the RSocket protocol layer, which encapsulates the various metadata protocols used in RSocket calls. Dapr calls are made over gRPC. The bottom layer integrates RSocket and Dapr themselves.

RSocket calls also involve choosing a Broker. The Upstream module manages the Broker cluster, including Broker registration and deregistration and keepalive checks. The LoadBalance module implements load-balanced Broker selection, along with event management, connection management, reconnection, and so on.
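A simplified picture of Broker management and selection is sketched below in Go: a registry with registration, deregistration, a health flag that a keepalive check would maintain, and round-robin selection over healthy Brokers. The article does not say which balancing strategy the Runtime uses, so round-robin is an assumption, and the `Upstream` type and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// broker is one Broker endpoint with the health state from keepalive checks.
type broker struct {
	addr    string
	healthy bool
}

// Upstream is a hypothetical registry of Brokers with round-robin selection.
type Upstream struct {
	mu      sync.Mutex
	brokers []*broker
	next    int // index at which the next Pick starts scanning
}

// Register adds a Broker and marks it healthy.
func (u *Upstream) Register(addr string) {
	u.mu.Lock()
	defer u.mu.Unlock()
	u.brokers = append(u.brokers, &broker{addr: addr, healthy: true})
}

// Deregister removes a Broker by address.
func (u *Upstream) Deregister(addr string) {
	u.mu.Lock()
	defer u.mu.Unlock()
	for i, b := range u.brokers {
		if b.addr == addr {
			u.brokers = append(u.brokers[:i], u.brokers[i+1:]...)
			break
		}
	}
	if u.next >= len(u.brokers) {
		u.next = 0
	}
}

// MarkHealth records the result of a keepalive check for one Broker.
func (u *Upstream) MarkHealth(addr string, healthy bool) {
	u.mu.Lock()
	defer u.mu.Unlock()
	for _, b := range u.brokers {
		if b.addr == addr {
			b.healthy = healthy
		}
	}
}

// Pick returns the next healthy Broker in round-robin order.
// The second result is false when no healthy Broker is available.
func (u *Upstream) Pick() (string, bool) {
	u.mu.Lock()
	defer u.mu.Unlock()
	for i := 0; i < len(u.brokers); i++ {
		idx := (u.next + i) % len(u.brokers)
		if u.brokers[idx].healthy {
			u.next = (idx + 1) % len(u.brokers)
			return u.brokers[idx].addr, true
		}
	}
	return "", false
}

func main() {
	up := &Upstream{}
	for _, addr := range []string{"broker-a", "broker-b", "broker-c"} {
		up.Register(addr)
	}
	up.MarkHealth("broker-b", false) // pretend a keepalive check failed
	for i := 0; i < 3; i++ {
		addr, ok := up.Pick()
		fmt.Println(addr, ok)
	}
}
```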

Finally, the Metrics module in the Runtime handles EagleEye tracing. It uses the Filter pattern to intercept the time spent on each FaaS call and writes EagleEye logs. It also prints Sunfire logs for the sidecar to collect. The figure below shows the Sunfire monitoring view of a real business:

2.2.1 Dapr

The Dapr architecture is shown in the figure below. See the official Dapr documentation for details.

Previously the Runtime called middleware over RSocket, and the RSocket Broker introduced a centralization problem. To decentralize outgoing traffic, Gaode and the group's middleware team jointly introduced the Dapr architecture. Dapr is integrated only at the Runtime layer, so user FaaS code is unaware of it and does not need to care whether middleware is called through RSocket or through Dapr. When the Runtime later switches its middleware calls to Dapr, user functions will not need any change.

3. How a business onboards to Serverless

As mentioned earlier, onboarding is done uniformly on Aone. We provide onboarding documentation for C++ FaaS and Go FaaS, together with an example code base that covers a variety of scenarios, including examples of calling the group's various middleware.

C++ FaaS and Go FaaS are open to the whole group. Besides Gaode, several other BUs already use C++/Go FaaS in their own businesses.

Node.js FaaS is onboarded with the Runtime and templates from the Taobao front-end team, and Java FaaS with the Runtime and templates provided by Alibaba Cloud FC.

3.1 Onboarding rules: the three stability essentials (monitorable, gray-releasable, rollbackable)

Some may worry about the stability of a new technology. The answer is Alibaba Group's three stability essentials: everything must be monitorable, gray-releasable, and rollbackable. We set up a FaaS end-to-end support group that brings together the upstream and downstream business teams and the base platform teams. Following the group's 1-5-10 requirement, we aim to respond to an online alert within 1 minute, locate the problem within 5 minutes, and recover within 10 minutes.

To standardize onboarding and avoid mistakes that cause online incidents, we drew up a FaaS onboarding specification and checklist to help businesses adopt FaaS.

Being monitorable, gray-releasable, and rollbackable is a hard requirement. Beyond that, it is even better if the business can degrade gracefully. Our C++ client-to-cloud services had degradation in place from the start of the pilot: if the FaaS side fails, the call is automatically downgraded to a local call. Client functionality is essentially unaffected; only the response latency increases.
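The degradation path can be sketched as "try the cloud within a time budget, otherwise run the local implementation". The real client is written in C++ on top of xbus; the Go version below is only an illustration of the control flow, and `Impl`, `CallWithLocalFallback`, and the stub implementations are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Impl is a hypothetical signature shared by the cloud and local
// implementations of the same business function.
type Impl func(ctx context.Context, query string) (string, error)

// CallWithLocalFallback calls the cloud implementation with a time budget.
// If the cloud call fails or times out, it runs the local implementation
// and reports degraded=true. The cloud Impl is expected to honor ctx.
func CallWithLocalFallback(budgetMillis int, cloud, local Impl, query string) (result string, degraded bool, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(budgetMillis)*time.Millisecond)
	defer cancel()
	if out, cloudErr := cloud(ctx, query); cloudErr == nil {
		return out, false, nil
	}
	out, localErr := local(context.Background(), query)
	return out, true, localErr
}

// slowCloud simulates a cloud function that does not answer in time.
func slowCloud(ctx context.Context, query string) (string, error) {
	select {
	case <-time.After(2 * time.Second):
		return "cloud:" + query, nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// fastCloud simulates a healthy cloud function.
func fastCloud(_ context.Context, query string) (string, error) {
	return "cloud:" + query, nil
}

// localSearch simulates the copy of the logic that ships with the client.
func localSearch(_ context.Context, query string) (string, error) {
	return "local:" + query, nil
}

func main() {
	fmt.Println(CallWithLocalFallback(50, fastCloud, localSearch, "gas station"))
	fmt.Println(CallWithLocalFallback(50, slowCloud, localSearch, "gas station"))
}
```

In the degraded case the caller still gets a result, but only after the time budget has been spent on the failed cloud attempt, which matches the observation that degradation costs latency rather than functionality.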

In addition, the version of a feature on the client may be somewhat older than the one on the server, but the functionality is forward compatible, so client use is essentially unaffected.

4. Current status

4.1 Infrastructure platform

  • Go/C++ FaaS Runtime development is complete, integration with FC-Ginkgo/CSE and Aone is complete, and a stable 1.0 version has been released.
  • Extensive work has been done on stability, graceful shutdown, performance optimization, and C++ compilation optimization. C++ FaaS is built with the compilation approach provided by the compiler optimization team of Alibaba Cloud's basic software department, which improved performance significantly.
  • C++/Go FaaS integration with EagleEye and Sunfire monitoring is complete, so functions are observable.
  • Pooling is complete, giving second-level elasticity. With pooled Runtime images connected to CSE, scaling out a new instance takes seconds instead of minutes.

4.2 Serverless business adoption at Gaode

C++ FaaS, Go FaaS, and Node.js FaaS are all widely used in Gaode. A few examples:

The first two screenshots above are businesses built with C++ FaaS: long-distance weather and search along the route. The last two are built with Go-FaaS: navigation tips and the footprint map.

Gaode is the BU with the largest Serverless deployment in Alibaba Group. The daily peak of its Serverless applications already exceeds 100,000 QPS.

4.3 Key benefits

What benefits has Gaode seen from deploying Serverless at this scale? The first and most important is higher development efficiency. Our Serverless-based end-cloud integration components help the client move to the cloud, remove the dependency on client releases when requirements are implemented, and speed up client iteration. The BFF layer built on Serverless likewise speeds up iteration for BFF scenarios.

**The second benefit is higher operations efficiency.** With Serverless automatic scale-out and scale-in, Gaode handles travel peaks with much less effort. For the annual travel peaks around National Day (October 1), May Day, Qingming, Christmas and New Year's Day, and the Spring Festival, operations staff and business developers no longer need to scale out before the holiday and scale back in afterwards.

Gaode's traffic peaks also differ from those of e-commerce: traffic does not spike within a single second. The second-level elasticity we currently get from pooling fully meets the needs of Gaode's scenarios.

**The third benefit is lower cost.** Gaode's traffic is high during the day and low at night, with a large gap between peak and trough and a clear daily pattern. Automatic scale-in during the low-traffic night hours greatly reduces server resource costs.

5. Follow-up plans

  • Optimize the use of FC inside the group. Work with the FC team to keep improving the performance, stability, and user experience of Function Compute within the group. Use the group's rich, high-traffic business scenarios to keep polishing the C++/Go FaaS Runtime, and eventually offer it on the public cloud to benefit more enterprises in the wave of digital transformation.
  • Roll out Dapr to decentralize outgoing traffic, and gradually move C++/Go FaaS to calling the group's middleware through Dapr.
  • Build FaaS chaos engineering, fault drill, and failure-escape capabilities. In the new fiscal year FaaS will take part in our BU's fault drills, and we will fix the problems they uncover one by one.
  • Integrate edge computing. In end-cloud integration scenarios, FaaS plus edge computing can deliver lower latency and a better user experience.

All of the above is a long road. We will also carry out more cloud-native pilots and rollouts in the future. As every engineer knows, there is a long way from technology selection and prototyping to real business adoption.

If you are interested in Serverless, cloud native, or Go application development and would like to build something together, you are welcome to join us, whatever technology stack you have worked with before. Send your resume to [email protected] with the subject "Name - Technical direction - From Gaode Tech". We offer large-scale production scenarios and a simple, open technical culture. Referrals and recommendations are welcome.
