High Availability Series, Part 2: Technical Solutions for Traditional Layered Architecture

This article was last updated on: February 7, 2024

High Availability Series, Part 1: Overview - Dongfeng Weiming Technology Blog (e-whisper.com)

3. Technical solutions

3.1 Overview

A single point of failure is the biggest risk to, and enemy of, a system's high availability, and should be avoided during system design.

Methodologically, the principle behind guaranteeing high availability is “clustering” (or “redundancy”). If there is only a single node, all services are affected and become unavailable when that node goes down; if there is redundancy or backup, other redundant or backup nodes can continue to provide service when one node goes down.

To ensure high availability, the core principle of architecture design is therefore redundancy.
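As a rough illustration of why redundancy helps: assuming node failures are independent and failover is instantaneous (both simplifying assumptions, not claims from this article), the service is unavailable only when every redundant node is down at the same time. The function name and the sample figures below are hypothetical.

```python
def combined_availability(node_availability: float, nodes: int) -> float:
    """Availability of `nodes` redundant nodes when any single node can serve.

    Assumes independent failures and instantaneous failover (a simplification).
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The service is down only if all nodes are down simultaneously.
    return 1.0 - (1.0 - node_availability) ** nodes


# Hypothetical example: each node is available 99% of the time.
for n in (1, 2, 3):
    print(n, "node(s):", f"{combined_availability(0.99, n):.6f}")
```

Under these assumptions, each additional node multiplies the probability of total outage by the unavailability of a single node, which is why even one redundant node makes a large difference.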

Redundancy alone is not enough: if every failure requires manual intervention to recover, the system's MTTR (mean time to repair) inevitably increases. High availability is therefore usually achieved through “automatic failover”.
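The effect of MTTR on availability can be made concrete with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). The sketch below compares manual recovery with automatic failover; the MTBF and recovery times are hypothetical figures chosen only for illustration, not measurements from this article.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean time to repair."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: one failure every 1000 hours on average.
MTBF_HOURS = 1000.0
manual = availability(MTBF_HOURS, 2.0)             # an operator needs 2 hours to recover
automatic = availability(MTBF_HOURS, 30.0 / 3600)  # automatic failover completes in 30 seconds

print(f"manual recovery:    {manual:.6f}")
print(f"automatic failover: {automatic:.6f}")
```

With the same failure rate, shortening the recovery time is the only lever left, which is why automatic failover matters as much as the redundancy itself.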

The technical solutions below describe in detail how redundancy + automatic failover ensure the high availability of the system.

3.2 Typical architecture of a manufacturing system

Manufacturing systems generally adopt a layered architecture, and the simplest typical architecture is shown as follows:

Notes:

The database uses MySQL as an example.

Typical architecture

A common system architecture, as shown above, is divided into:

  1. Client layer: typical callers are browsers and client applications.
  2. App service layer: implements the core application logic, usually over HTTP or TCP. This layer is often further subdivided into:
    1. Presentation layer: mostly web pages viewed in a browser;
    2. Business logic layer: applies the business rules and performs operations for specific problems, and also serves as the bridge between the presentation layer and the data layer;
    3. Optional: Service layer: present if service orientation, such as SOA or microservices, is implemented;
    4. Data access layer: implements create, read, update, and delete operations on data, and returns the results to the business logic layer;
    5. Optional: Data caching tier: a cache that speeds up access to storage.
  3. Database tier: persistent storage of data in the database.

The high availability of the entire system is achieved by combining redundancy + automatic failover at every layer.

Notes:

Availability can also be improved with finer-grained layering. For example, the architecture above can be refined to:

  1. Client layer
  2. Optional addition: Reverse proxy layer
  3. Optional addition: Presentation layer (web server)
  4. App server layer
  5. Optional addition: Service invocation layer
  6. Optional addition: Caching layer
  7. Database tier

This article does not cover this for the time being.

Applying the core principle of redundancy, the optimized high-availability architecture is shown in the following figure:

HA architecture

Legend:

  • Solid line: request call
  • Dash-dot line: heartbeat detection
  • Dotted line: a request call that does not actually occur
  • Short-dashed line: database master-slave synchronization
  • “X”: the corresponding node is down and unavailable

The adjustments made for high availability are as follows:

  1. Between the client tier and the App service tier, add a load balancing layer, which distributes client requests across the application servers according to a chosen load balancing method;
  2. Load balancing layer: this layer contains the following 2 components:
    1. NGINX: responsible for reverse proxying/load balancing TCP/HTTP requests;
    2. Keepalived: responsible for providing a single virtual IP, running heartbeat detection between the 2 NGINX nodes, and failing over when one of them fails;
  3. App service layer: at least 2 application servers.
  4. Database tier: the database uses master-slave synchronization and read/write separation.

Notes:

The figure above also depicts another high-availability measure, which this article does not discuss in detail:

  • Application horizontal splitting: a monolithic application is split by importance, and high-importance and low-importance services are deployed separately.
  • High-importance applications: for example low-latency applications and assembly line applications;
  • Low-importance applications: for example high-latency applications and report applications.

In addition, the database can be split further according to the actual business situation; this article does not discuss it in detail:

  • The database is split into 2 databases, one of which receives data from the other through data synchronization (real-time or at regular intervals). Example: a business database and a report database, where the business database regularly synchronizes data to the report database.
  • High-importance applications read from and write to the business database;
  • Low-importance applications, such as report applications, are not allowed to use the business database and use the report database instead.

Each layer is discussed in turn below.

3.4 Client layer -> Load balancing layer high availability

Load balancing layer high availability 1

High availability from the client layer to the load balancing layer is achieved through redundancy of the load balancing layer. The specific implementation is as follows:

There are at least 2 NGINX nodes: one provides service, and the other is redundant to guarantee high availability. Keepalived exposes both behind the same virtual IP (as shown above: 1.2.5.6) and uses heartbeat detection for failure detection and failover.

In the figure above, the NGINX master node provides external services.

When the NGINX master node (e.g. 192.168.0.1) goes down, Keepalived can detect it, automatically fail over, and automatically transfer traffic to the NGINX slave node (e.g. 192.168.0.2). Since the same virtual IP is used (again: 1.2.5.6), this switching process is transparent to the caller.

Load balancing layer high availability 2

3.5 Load balancing layer -> Application service layer high availability

High availability from the load balancing layer to the application service layer is achieved through redundancy of the application service layer. In the NGINX configuration file nginx.conf, the upstream directive can be used to configure multiple application servers, and NGINX can detect whether each of them is alive.

Application service layer high availability 1

Key point:

In the open source version of NGINX, if no third-party modules are used, liveness detection is passive: NGINX infers that a server has failed from errors or timeouts on real client requests rather than from dedicated probe requests.

When one of the nodes in the app service layer goes down, NGINX can detect it and fail over automatically: it stops distributing traffic to the failed node and sends it to the other healthy application server nodes instead. The whole process is completed automatically by NGINX and is transparent to the caller.

Application service layer high availability 2
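The passive detection and failover described above can be sketched as a small simulation. This is not NGINX code: the class and method names are hypothetical, and the logic is a simplification of what the max_fails and fail_timeout parameters of an upstream server do (the defaults of 1 failure and 10 seconds mirror the NGINX defaults). The balancer learns about failures only from the outcome of real requests.

```python
class PassiveUpstream:
    """Round-robin balancer with passive failure detection (simplified model)."""

    def __init__(self, servers, max_fails=1, fail_timeout=10.0):
        self.servers = list(servers)
        self.max_fails = max_fails        # failures before a server is taken out
        self.fail_timeout = fail_timeout  # seconds a failed server is skipped
        self._fails = {s: 0 for s in self.servers}
        self._down_until = {s: 0.0 for s in self.servers}
        self._next = 0

    def pick(self, now):
        """Return the next server in rotation that is not marked down, or None."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if now >= self._down_until[server]:
                return server
        return None

    def report(self, server, ok, now):
        """Record the outcome of a real request sent to `server`."""
        if ok:
            self._fails[server] = 0
            return
        self._fails[server] += 1
        if self._fails[server] >= self.max_fails:
            # Skip this server until fail_timeout has elapsed, then try it again.
            self._down_until[server] = now + self.fail_timeout
            self._fails[server] = 0


upstream = PassiveUpstream(["app1", "app2"])
first = upstream.pick(now=0.0)
upstream.report(first, ok=False, now=0.0)  # a real request to app1 failed
print(upstream.pick(now=1.0), upstream.pick(now=2.0))  # both go to the healthy node
```

Because detection is passive, at least one real request has to fail before the node is taken out of rotation; active detection would find the failure with probe requests instead.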

3.6 Application service layer -> Database tier high availability

For the database tier, a “master-slave synchronization, read/write separation” architecture is recommended. Database high availability can be subdivided into two categories: “read database high availability” and “write database high availability”.
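Read/write separation means the application (or middleware) has to decide, per statement, whether to use the master or a slave. The following is a deliberately naive sketch of that routing decision; the function name and the rule set are assumptions for illustration, and real routing layers also have to account for transactions and replication lag.

```python
def route_statement(sql: str) -> str:
    """Return "replica" for plain reads and "primary" for everything else.

    Naive rule set: only a SELECT that does not lock rows is treated as a read.
    """
    text = sql.strip().rstrip(";").strip().upper()
    if text.startswith("SELECT") and not text.endswith("FOR UPDATE"):
        return "replica"
    # Writes, DDL, locking reads, and anything unrecognized go to the master.
    return "primary"


for statement in (
    "SELECT id, status FROM work_orders",
    "UPDATE work_orders SET status = 'done' WHERE id = 1",
    "SELECT * FROM work_orders WHERE id = 1 FOR UPDATE;",
):
    print(route_statement(statement), "<-", statement)
```

Sending anything ambiguous to the master is the conservative choice: a misrouted read only costs some master capacity, while a misrouted write would fail or diverge on a slave.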

Notes:

The manufacturing industry uses a variety of databases, including but not limited to:

  • Oracle
  • SQL Server
  • MySQL

High-availability solutions differ from one database to another, so this article does not describe database-specific high-availability technologies in detail and gives only a theoretical treatment (taking MySQL as an example).

Common database high-availability solutions are:

  1. MySQL: Master-slave synchronization
  2. Oracle: RAC
  3. SQL Server: Always On

3.6.1 Read database high availability

High availability of the read database is achieved through redundancy of the read database.

To make the read database highly available, there are generally at least 2 slave databases. The database connection pool establishes connections to each of these read databases, and every read request is routed to one of them.

Read database high availability 1

When read database slave 1 goes down, the database connection pool in the application service layer can detect it and fail over automatically, migrating traffic to the other read databases, such as read database slave 2. The whole process is done automatically by the database connection pool and is transparent to the caller.

Read database high availability 2
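The failover behaviour of the connection pool can be sketched as follows. This is a minimal model rather than a real pool: the class name, the replica names, and the is_alive callback (standing in for whatever liveness check the pool performs) are all hypothetical.

```python
class ReadReplicaPool:
    """Round-robin over read replicas, skipping those that fail a liveness check."""

    def __init__(self, replicas, is_alive):
        self.replicas = list(replicas)
        self.is_alive = is_alive  # callable(name) -> bool; hypothetical liveness probe
        self._cursor = 0

    def acquire(self):
        """Return the next live replica, or raise if every replica is down."""
        for _ in range(len(self.replicas)):
            replica = self.replicas[self._cursor]
            self._cursor = (self._cursor + 1) % len(self.replicas)
            if self.is_alive(replica):
                return replica
        raise RuntimeError("no read replica available")


down = set()
pool = ReadReplicaPool(["slave1", "slave2"], lambda name: name not in down)
print([pool.acquire() for _ in range(4)])  # alternates between both slaves
down.add("slave1")                          # slave 1 goes down
print([pool.acquire() for _ in range(3)])  # all reads now go to slave 2
```

The caller only ever calls acquire(), which is what makes the failover transparent: the choice of replica stays inside the pool.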

Notes:

The database access layer of the application system, or the middleware, must implement database connection pooling.

3.6.2 Write database high availability

High availability of the write database is achieved through redundancy of the write database.

Taking MySQL as an example, you can set up two MySQL instances with dual-master synchronization: one provides service online, and the other is redundant to ensure high availability.

3.7 Technical Selection

3.7.1 Load balancer technology selection

Selection Result:

  • Load balancer: NGINX + Keepalived
  • Version:
    • NGINX: 1.16.1 (selected versions are evaluated at least semi-annually and adjusted according to the evaluation results)
    • Keepalived: 2.0.10 (selected versions are evaluated at least semi-annually and adjusted according to the evaluation results)
  • Operating System Type: Linux
  • OS version: SUSE 12 (adapted as needed, following the relevant technical specifications of the manufacturing industry)

Definition and Overview:

In a distributed system, load balancing is the process of efficiently distributing network requests across multiple servers. Balancing the load improves system response time and increases system availability. As systems become more complex, with more users and more network traffic, load balancing has become a necessary part of system design.

A load balancer can be hardware or software, and it distributes network requests across a cluster of servers.

Overview of the selection process

Common hardware load balancers include:

  • F5
  • A10

Common software load balancers include:

  • Nginx
  • LVS
  • HAProxy

Considering manufacturing industry best practices and the industry's actual situation, in which a manufacturer has multiple factories across the country or even the world and each factory has its own independent machine room, hardware load balancing is too costly. A software load balancer is therefore adopted as the load balancing technology. The software load balancers are discussed below:

NGINX:

Nginx (“engine x”) is a high-performance web and reverse proxy server developed by Russian programmer Igor Sysoev.

Advantages:

  1. NGINX works at layer 4 (TCP/UDP) and layer 7 (HTTP/WebSocket) of the network. It can apply traffic-splitting rules to HTTP applications, for example by domain name or directory structure, and its regular expression rules are more powerful and flexible than HAProxy's. NGINX is also among the most widely used load balancers and web servers, with a rich set of application scenarios.
  2. NGINX depends very little on network stability; in theory, it can load balance to any server it can ping. LVS, by contrast, depends heavily on network stability.
  3. NGINX is simple to install and configure and easy to test, and it provides comprehensive, customizable logging, including access logs and error logs. LVS takes much longer to configure and test.
  4. NGINX withstands high load and is stable; on reasonable hardware it easily supports tens of thousands of concurrent connections.
  5. NGINX can detect application server failures through the port, for example from the status code returned when the server processes a page or from a timeout, and resubmits the failed request to another node. If a user is uploading a file and the node handling the upload fails midway, NGINX switches the upload to another server for reprocessing, whereas with LVS the connection is simply cut off.
  6. NGINX is not only an excellent load balancer/reverse proxy; it is also a powerful web application server. LNMP has been a very popular web stack in recent years and is very stable in high-traffic environments.
  7. NGINX is also fairly mature as a reverse proxy cache for web acceleration and is faster than traditional Squid servers, so its use can be expanded to such scenarios in the future.
  8. As a reverse proxy, NGINX is the most widely used product of its kind. Similar products include lighttpd, but lighttpd does not yet match NGINX's full feature set, its configuration is less clear and readable, and its community is far less active than NGINX's.
  9. NGINX can also be used as a static web page and image server, where it performs very well.
  10. The NGINX community is very active, and there are many third-party modules.

Disadvantages:

  1. For health checks of backend servers, the open source version of NGINX supports passive detection but not active detection.
  2. For session persistence, the open source version of NGINX does not support cookie-based session persistence by default. It does implement source address session persistence through ip_hash, and cookie-based session persistence can be added with third-party modules.

LVS:

LVS (Linux Virtual Server) uses Linux kernel clustering to implement a high-performance, highly available load-balancing server with good scalability, reliability, and manageability.

Advantages:

  1. Strong load capacity: it works at network layer 4 and only distributes requests without generating traffic itself. This gives it the strongest performance among load balancing software, with relatively low memory and CPU consumption.
  2. Low configurability, which is both a disadvantage and an advantage: because there is not much to configure, it needs little hands-on attention, which greatly reduces the chance of human error.
  3. Stable operation: it has strong load capacity and a complete dual-machine hot standby solution, such as LVS + Keepalived.
  4. No traffic of its own: LVS only distributes requests, and the traffic does not flow out through LVS itself, so the balancer's I/O performance is not affected by heavy traffic.
  5. Wide range of applications: because LVS works at layer 4, it can load balance almost any application, including HTTP, databases, online chat rooms, and so on.

Disadvantages:

  1. LVS itself does not support regular expression processing and cannot separate dynamic from static content. Many websites now have strong demand for this, which is where NGINX + Keepalived has the advantage.
  2. If the website application is relatively large, LVS is more complicated to implement, configure, and maintain, especially when there are Windows Server machines behind it; NGINX + Keepalived is much simpler by comparison.

HAProxy:

HAProxy is a free and open source software written in C that provides high availability, load balancing, and application proxies based on TCP and HTTP.

Advantages:

  1. HAProxy supports virtual hosting.
  2. HAProxy's strengths compensate for some of NGINX's weaknesses, such as support for session persistence and cookie-based routing; it also supports checking the status of a backend server by requesting a specified URL.
  3. HAProxy, like LVS, is purely load balancing software.
  4. HAProxy supports load-balanced forwarding of the TCP protocol.
  5. HAProxy offers a relatively rich set of load balancing policies.

Disadvantages:

  1. The POP/SMTP protocols are not supported.
  2. The SPDY protocol is not supported.
  3. HTTP caching is not supported.
  4. Reloading the configuration requires a process restart; although it is a soft restart, it is not as smooth or friendly as NGINX's reload.
  5. Multi-process mode is not supported well enough.

Selection conclusion:

LVS suits only a narrow range of scenarios and has a serious weakness (its implementation is more complicated when there are Windows Server machines behind it), so it is excluded directly.

HAProxy's support for HTTP is not as rich as NGINX's, and its restart interruption takes longer than NGINX's.

In addition to serving as the load balancer, NGINX can also be used as a web server, an HTTP cache, and in other scenarios, and it is easy to use.

The final choice is NGINX as the load balancer for a manufacturing company.

3.7.2 NGINX high-availability technology solution selection

There is only one high-availability solution for NGINX: NGINX + Keepalived.

The keepalived open source project consists of three components:

  • The keepalived daemon for Linux servers.

  • An implementation of the Virtual Router Redundancy Protocol (VRRP) for managing virtual routers (virtual IP addresses, or VIPs).

    VRRP ensures that a master node is always present. The standby node listens for VRRP advertisement packets from the master node. If no advertisement is received for more than three times the configured advertisement interval, the standby node takes over as the master node and assigns the configured VIP to itself.

  • A health check tool that determines whether a service, such as a web server, PHP backend, or database server, is up and running.

    If the service on a node fails the configured number of health checks, keepalived reassigns the virtual IP address from the master (active) node to the standby (passive) node.
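The takeover rule for the standby node can be sketched as a small state machine. This is a simplified model, not keepalived code: the class and field names are hypothetical, time is passed in explicitly so the logic is easy to follow, and the threshold is exactly three advertisement intervals as described above (the VRRP specification additionally adds a small priority-dependent skew time, which is omitted here).

```python
from dataclasses import dataclass


@dataclass
class StandbyNode:
    """Standby side of a two-node VRRP-style pair (simplified model)."""

    advert_interval: float       # seconds between advertisements from the master
    last_advert_at: float = 0.0  # time the last advertisement was received
    holds_vip: bool = False      # True once this node has taken over the VIP

    def on_advertisement(self, now: float) -> None:
        # The master is alive, so this node stays (or goes back to) standby.
        # Assumes the master preempts when it returns.
        self.last_advert_at = now
        self.holds_vip = False

    def tick(self, now: float) -> None:
        # Take over when the master has been silent for more than 3 intervals.
        if now - self.last_advert_at > 3 * self.advert_interval:
            self.holds_vip = True


node = StandbyNode(advert_interval=1.0)
node.on_advertisement(now=0.0)
for t in (1.0, 2.0, 3.0, 3.5):
    node.tick(now=t)
    print(f"t={t}: holds_vip={node.holds_vip}")
```

Tolerating a few missed advertisements before taking over is a trade-off: a lower threshold fails over faster but risks both nodes claiming the VIP after a brief network glitch.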

3.7.3 NGINX Load Balancing Policy

Selection result:

Load balancing policy: RR (round robin, default policy)

Session persistence policy (optional):

  • If session persistence is not required, do not configure it.
  • If session persistence based on source address is required: ip_hash

NGINX load balancing policies:

  1. Round robin: RR (the default scheduling algorithm).
  2. Weighted round robin: WRR (set with the weight parameter of each server). Applicable scenario: the servers differ in performance, and higher-performance servers should handle more requests.
  3. Least connections: least_conn.
  4. Session persistence policies:
    1. IP session persistence: ip_hash
    2. hash: specifies a load balancing method for the server group in which the client-server mapping is based on a hash of a key. The key can contain text, variables, and combinations thereof.
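To make the differences between these policies concrete, here is a minimal sketch of each selection rule. These are illustrations of the ideas, not NGINX's implementations: the function names are hypothetical, the weighted variant simply repeats servers (NGINX uses a smoother interleaving), and CRC32 is a stand-in hash. The one detail taken from NGINX's documented behaviour is that ip_hash keys on the first three octets of an IPv4 address.

```python
import itertools
import zlib


def round_robin(servers):
    """Cycle through the servers in order."""
    return itertools.cycle(list(servers))


def weighted_round_robin(weights):
    """Cycle through servers, repeating each one `weight` times per round."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_conn(active_connections):
    """Pick the server with the fewest active connections."""
    return min(active_connections, key=active_connections.get)


def ip_hash(client_ip, servers):
    """Map a client to a server by hashing the first three octets of its IPv4 address."""
    key = ".".join(client_ip.split(".")[:3])
    return servers[zlib.crc32(key.encode("ascii")) % len(servers)]


servers = ["app1", "app2"]
rr = round_robin(servers)
print([next(rr) for _ in range(4)])
wrr = weighted_round_robin({"app1": 2, "app2": 1})
print([next(wrr) for _ in range(6)])
print(least_conn({"app1": 12, "app2": 3}))
print(ip_hash("192.168.0.10", servers) == ip_hash("192.168.0.200", servers))
```

Round robin and least_conn keep no relationship between a client and a server, while ip_hash and hash give up some evenness of distribution in exchange for sending the same client to the same server.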

Selection process:

The load balancing policy can be chosen according to the actual situation; RR can be used when there are no special requirements.
The session persistence policy needs to be chosen according to the specific scenario:

  • If session persistence is not required, do not configure it.
  • If session persistence based on source address is required: ip_hash
  • If cookie-based session persistence is required: sticky
  • If neither source address nor cookie-based session persistence is sufficient, hash may be needed to customize the session persistence strategy.

3.7.4 Application service layer -> Database tier high-availability selection

Outline.

Because application systems are so diverse, this article does not prescribe a specific high-availability selection for the application service layer -> database tier. Only top-level architecture requirements are provided:

  1. Database: master-slave synchronization, read/write separation
  2. Application:
    1. Implement database connection pooling, with support for configuring multiple database connections.
    2. Split the application horizontally according to the importance of its functional modules: separate reporting, statistics, and batch processing functions from the important, low-latency business functions.


https://e-whisper.com/posts/15216/
Author
east4ming
Posted on
September 28, 2021
Licensed under