High Availability Article Series - Overview
This article was last updated on: July 24, 2024
1. Overview
Availability is an important indicator of the quality of system operation.
As systems become more digital and intelligent, their availability matters more and more. For example, a manufacturing assembly line depends on a highly available MES system to keep the line running.
This article is a technical document on high availability. It describes in detail how high availability is applied and implemented in manufacturing, from the following aspects.
1.1 Availability definition
GB/T 3187-97 defines availability as:
The ability of a product to be in a state to perform a specified function under specified conditions, at a specified instant or over a specified time interval, provided that the required external resources are available. It is a combined reflection of the product's reliability, maintainability, and maintenance support.
Availability calculation formula:
Availability = MTBF / (MTBF + MTTR)
Here MTBF is the mean time between failures and MTTR is the mean time to repair. System availability is commonly expressed as a number of nines, for example 99.9% (three nines) or 99.999% (five nines).
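As a quick illustration of the formula above, here is a minimal Python sketch. The function name `availability` and the sample MTBF and MTTR values are assumptions made for this example, not figures from any real system.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as a fraction between 0 and 1.

    mtbf_hours: mean time between failures, in hours
    mttr_hours: mean time to repair, in hours
    """
    if mtbf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTBF and MTTR must be non-negative")
    total = mtbf_hours + mttr_hours
    if total == 0:
        raise ValueError("MTBF and MTTR cannot both be zero")
    return mtbf_hours / total


# Hypothetical system: fails on average every 999 hours, takes 1 hour to repair.
a = availability(mtbf_hours=999.0, mttr_hours=1.0)
print(f"{a:.3%}")  # 99.900%
```

The formula shows that availability can be raised either by failing less often (a larger MTBF) or by recovering faster (a smaller MTTR).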
📓 In one sentence:
- Availability: every request receives a non-error response
Downtime:
Definition: the time during which a machine is out of service because of a failure. Downtime is mentioned here because downtime per year is a more intuitive and easier-to-understand way to express system availability.
Correspondence between Availability and Downtime:
Availability | Downtime |
---|---|
90% | 36.5 days/year |
99% | 3.65 days/year |
99.9% | 8.76 hours/year |
99.99% | 52 minutes/year |
99.999% | 5 minutes/year |
99.9999% | 31 seconds/year |
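The figures in the table can be reproduced with a short calculation. The sketch below assumes a 365-day year; the function name `downtime_per_year_seconds` is chosen for this example only.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap, 365-day year


def downtime_per_year_seconds(availability_percent: float) -> float:
    """Return the downtime per year, in seconds, allowed by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


for pct in (90, 99, 99.9, 99.99, 99.999, 99.9999):
    seconds = downtime_per_year_seconds(pct)
    # Show the same value in several units so it can be compared with the table.
    print(f"{pct}%: {seconds / 86400:.2f} days = "
          f"{seconds / 3600:.2f} hours = {seconds / 60:.2f} minutes")
```

The table rounds these results, so 99.99% appears as 52 minutes/year rather than about 52.56 minutes.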
1.2 High Availability definition
High Availability Definition:
High availability (HA) is a characteristic of a system that aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period.
If users cannot access the system, then from their point of view the system is unavailable. Downtime generally refers to the periods during which a system is unavailable.
1.3 High Availability Implementation
Two patterns support high availability: failover and redundancy. In typical high availability scenarios, the two are used in combination.
Failover:
🔖 Definition:
In related technologies, such as computing and networking, failover refers to switching to a redundant or standby computer server, system, hardware component, or network in the event of a failure or abnormal termination of a previously active application, server, system, hardware component, or network.
Failover can be implemented in several ways. A commonly used one is:
Active-passive failover
In active-passive failover, the active server sends periodic heartbeat signals to the standby server. If the heartbeat is interrupted, the standby server takes over the active server's IP address and service is restored.
The length of the downtime depends on whether the standby server is already running in "hot" standby or has to be started from "cold" standby. Only the active server handles traffic.
Active-passive failover is also known as master-slave failover.
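The detection logic behind the periodic signal described above (often called a heartbeat) can be sketched in a few lines of Python. This is a simplified, single-process illustration under stated assumptions: the `StandbyNode` class, its fields, and the 3-second timeout are hypothetical, and timestamps are passed in explicitly instead of being read from a real clock or network.

```python
from dataclasses import dataclass


@dataclass
class StandbyNode:
    """Hypothetical standby server that watches heartbeats from the active server."""
    heartbeat_timeout: float      # seconds of silence tolerated before failover
    last_heartbeat: float = 0.0   # timestamp of the most recent heartbeat
    is_active: bool = False       # True once this node has taken over

    def on_heartbeat(self, now: float) -> None:
        """Record that a heartbeat arrived from the active server at time `now`."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Take over if the active server has been silent for longer than the timeout."""
        if not self.is_active and now - self.last_heartbeat > self.heartbeat_timeout:
            self.take_over()
        return self.is_active

    def take_over(self) -> None:
        # A real deployment would claim the active server's IP address here;
        # this sketch only flips a flag.
        self.is_active = True


standby = StandbyNode(heartbeat_timeout=3.0)
for t in (1.0, 2.0, 3.0):
    standby.on_heartbeat(now=t)

print(standby.check(now=4.0))  # False: last heartbeat was 1 second ago
print(standby.check(now=7.5))  # True: 4.5 seconds of silence exceeds the timeout
```

The timeout is a trade-off: a short value shortens downtime after a real failure, while a longer one reduces the risk of an unnecessary switch caused by a brief network delay.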
🔖 Knowledge Point:
For web server high availability, a common setup is NGINX + Keepalived, which is a typical example of master-slave failover.
In this setup, the standby NGINX server is kept in "hot" standby.
1.4 Objectives
The objectives of this article are:
To provide a practical reference for high availability solution standards for manufacturing systems, in order to support:
- High availability retrofits of existing systems;
- High availability architecture requirements for new systems;
2. Scope of application and requirements
2.1 Scope of Application
- Systems with mandatory availability metrics;
- Critical systems (e.g. MES systems);
- Systems whose technical architecture matches the technical solutions described below;
Reference documents
Reference files |
---|
Availability - Wikipedia |
High Availability - Wikipedia |
system-design-primer - GitHub |