Our cloud strategy framework provides the principles and requirements for the cloud architecture of the RIPE NCC services. To determine these requirements, it uses the service criticality framework, more specifically its availability component rating.
We published our cloud strategy framework in November 2021. Since then, we have developed the service criticality framework, published its final version in May 2022, and recently published the criticality ratings for some of our core services. At the same time, we realised that we needed to update the cloud strategy framework to better reflect the reality and diversity of the services we provide.
Changes from the previous version
Compared with the previous version, we aligned the criticality ratings with the service criticality framework. Furthermore, we introduced a new "lowered" level of strictness reserved for applications with higher recovery time objectives (RTOs), such as statistics or data analytics platforms. In doing so, we simplified the framework by removing the distinction between Global Internet Services and RIPE NCC-specific services.
Finally, we made two smaller changes:
- Reducing the availability requirement for applications falling in the strict level of strictness from five nines to a still very good, but cost-conscious four nines (99.99%), as shown below.
- Expanding the allowed use of cloud-managed services for applications falling in the heightened level of strictness from only open standards to also include industry standards or interfaces. The aim is to increase deployment flexibility while adhering to our principles listed below and still fulfilling the requirement of being able to exit any cloud provider within one hour.
Besides the above changes, everything else remains the same, including the cloud principles and requirements listed below.
Cloud principles and requirements
Shaping this strategy are a series of basic principles regarding how we work with the RIPE community:
- We solicit input on all services that are critical for the operation of the global Internet, or that directly affect the operations of our members or the RIPE community.
- We have full authority and responsibility for the design, deployment, and operation of our services.
- We must remain neutral and operate our services for the benefit of all members.
- The integrity of our services and data must be maintained in the face of geopolitical, regulatory, and economic threats.
- Open standards should be used; where open standards are not viable, we should prefer industry standards over proprietary interfaces.
Taking the principles above, and using input from the RIPE community, we have identified the following set of requirements for our use of cloud providers:
- Ensure resilience, accessibility, availability and low latency for our services
- Minimise vendor lock-in
- Avoid dependence on any single cloud provider
- Enable our engineers to innovate and improve the quality of our services
- Comply with applicable laws and regulations
- Ensure the security of our services
- Prefer providers in our service region
It is important to recognise that there are tensions between some of our principles and requirements, and trade-offs will be necessary. It is not worth choosing low-quality solutions simply because they are the best fit for this framework – we will apply common sense and engage with the community if something is not working.
Cloud strategy framework
Our cloud strategy framework uses the principles and requirements outlined above to set boundaries and identify where we need to be strict and where we can be more relaxed. This provides clarity about how we will approach the use of cloud providers and supports future discussions with the community when we look at moving specific elements to the cloud.
Requirements according to strictness
We have defined four levels of strictness for each of our requirements (Strict, Heightened, Standard and Lowered). What these levels mean for each requirement is outlined in the table below. Some requirements, such as ‘Comply with applicable laws and regulations’, apply equally across all levels. Because they lack any differentiation, they are not included in the table.
|Requirement||Strict||Heightened||Standard||Lowered|
|Ensure resilience, accessibility, availability and low latency of services||> 99.99% availability||> 99.9% availability||> 99% availability||Defined by the service owner|
|Minimise vendor lock-in||Only use bare-metal, VMs or containers||Managed services can be used but only with open or industry standards or interfaces||No restriction on managed services but keep track of switching costs||No restriction on managed services but keep track of switching costs|
|Avoid dependence on any single cloud provider||Fully distributed architecture / No downtime allowed||Stand-by backup infrastructure required / Fail-over within one hour||Ability to spin-off a new instance within 48 hours / Maximum outage of 48 hours||Recovery time defined by the service owner|
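To give a feel for what the availability thresholds in the table imply, the following minimal sketch converts each percentage into a maximum downtime budget. It assumes that availability is measured over a 365-day year; the measurement window and all names used here (`AVAILABILITY_TARGETS`, `downtime_budget`) are illustrative assumptions and not part of the framework.

```python
from datetime import timedelta

# Availability thresholds from the table above, keyed by level of strictness.
# "Lowered" is omitted because its target is defined by the service owner.
AVAILABILITY_TARGETS = {
    "Strict": 99.99,
    "Heightened": 99.9,
    "Standard": 99.0,
}

# Assumption: availability is measured over a 365-day year.
YEAR = timedelta(days=365)


def downtime_budget(availability_percent: float, window: timedelta = YEAR) -> timedelta:
    """Return the maximum downtime allowed in `window` at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    unavailable_fraction = (100 - availability_percent) / 100
    return timedelta(seconds=window.total_seconds() * unavailable_fraction)


if __name__ == "__main__":
    for level, target in AVAILABILITY_TARGETS.items():
        minutes = downtime_budget(target).total_seconds() / 60
        print(f"{level}: {target}% -> at most {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, four nines corresponds to roughly 52.6 minutes of downtime per year, compared with roughly 5.3 minutes for five nines, which illustrates the cost trade-off behind the reduced requirement for the strict level.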
Level of strictness according to criticality
The table above describes how we interpret our requirements on a scale from Strict to Lowered.
According to the service criticality framework, each service can have one of four criticality ratings, ranging from Very High to Low, depending on the potential impact of an availability-related incident (outage).
The table below indicates how we identify the level of strictness that should apply to specific services, according to their criticality. Example services are included to make this more concrete.
|Criticality Rating (availability component)||Very High||High||Medium||Low|
|Example of services||RPKI||RIPE Database||LIR Portal||Internal analytics platform|
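The lookup from criticality to strictness can be sketched as a simple mapping. The table as reproduced here lists only the criticality ratings and example services, so this sketch assumes that the ratings pair one-to-one, in order, with the four levels of strictness (Very High with Strict through to Low with Lowered). That pairing and all identifiers below are assumptions made for illustration.

```python
from enum import Enum


class Criticality(Enum):
    """Availability component of the service criticality rating."""
    VERY_HIGH = "Very High"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


class Strictness(Enum):
    """Levels of strictness defined by the cloud strategy framework."""
    STRICT = "Strict"
    HEIGHTENED = "Heightened"
    STANDARD = "Standard"
    LOWERED = "Lowered"


# Assumption: ratings and levels pair up in order; this row is not
# spelled out in the table above.
STRICTNESS_BY_CRITICALITY = {
    Criticality.VERY_HIGH: Strictness.STRICT,
    Criticality.HIGH: Strictness.HEIGHTENED,
    Criticality.MEDIUM: Strictness.STANDARD,
    Criticality.LOW: Strictness.LOWERED,
}

# Example services and their ratings, taken from the table above.
EXAMPLE_SERVICES = {
    "RPKI": Criticality.VERY_HIGH,
    "RIPE Database": Criticality.HIGH,
    "LIR Portal": Criticality.MEDIUM,
    "Internal analytics platform": Criticality.LOW,
}


def strictness_for(service: str) -> Strictness:
    """Look up the level of strictness that would apply to an example service."""
    return STRICTNESS_BY_CRITICALITY[EXAMPLE_SERVICES[service]]


if __name__ == "__main__":
    for name in EXAMPLE_SERVICES:
        print(f"{name}: {strictness_for(name).value}")
```

Keeping the rating-to-level pairing in a single table, separate from the per-service ratings, means that a change in a service's criticality rating would only require updating one entry.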