Razvan C. Oprea

Service Criticality Framework


Defining the criticality of RIPE NCC services provides a whole range of benefits. In this article, we take a look at the model we're adopting in order to determine a criticality rating for our services.


Last year, we embarked on a new project to define the criticality rating of our services. This project emerged from the discussions we had with the community about our cloud strategy. We decided that creating a model for determining the criticality level of our services was an important part of a clear strategy in this area, and we shared a first draft of the criticality framework on RIPE Labs and in a presentation at RIPE 83.

Although the feedback received from the community was largely positive, a fair number of people found the criticality framework complex and confusing. We also realised the need to extend the model beyond the availability component and incorporate data confidentiality and integrity when determining the criticality of a service. This will allow us to use the model outside of our work with the cloud. For instance, it can help us define service-level objectives and decide on the monitoring and alerting setup, security controls, and more.

To address both points, we went through multiple iterations, and we now have a version that we feel strikes the right balance between completeness and simplicity while incorporating most of the feedback we have received so far.

Service Criticality Rating

We propose a model for determining the criticality rating of a service, measured on a four-point scale: Low, Medium, High and Very High. In scope are the RIPE NCC services that are important for the operation of the global Internet, or that directly affect the operations of our members or the RIPE community. We chose these services because they have traditionally been deemed important enough to require 24/7 support. If the community thinks other services should be included, we are very open to that.

For each of these services, we consider how severe an impact a worst-case scenario would have in terms of service availability (e.g., outages) or data confidentiality or integrity (e.g., data leaks or hacking incidents). The highest impact severity level for any type of incident (availability, confidentiality, or integrity-related) yields the criticality rating of the service.
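The "highest severity wins" rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above: the `Severity` enum, the `criticality_rating` function, and the sample values are hypothetical names and data, not part of any RIPE NCC tooling.

```python
from enum import IntEnum


class Severity(IntEnum):
    """The four-point scale, ordered so that max() picks the most severe level."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    VERY_HIGH = 4


def criticality_rating(availability, confidentiality, integrity):
    """Return the highest worst-case impact level across the three incident types."""
    return max(availability, confidentiality, integrity)


# Hypothetical worst-case impacts for an imaginary service.
rating = criticality_rating(
    availability=Severity.MEDIUM,
    confidentiality=Severity.LOW,
    integrity=Severity.HIGH,
)
print(rating.name)  # HIGH
```

Using an ordered enum keeps the scale explicit and means a single High answer is enough to make the whole service High, which is the behaviour the model describes.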

The Process

The process starts by determining the maximum impact any type of incident can have on certain external areas for each of the services in scope. To define these external impact areas, we used the Internet Society’s Internet Impact Assessment Toolkit, which identifies five critical properties of the Internet. Two of these properties relate to the RIPE NCC's services: the Single Distributed Routing System and the Common Global Identifiers (i.e., IP addresses, AS numbers, and the Internet’s Domain Name System (DNS)).

The service owner, together with the team supporting the service, fills in a form answering the following three questions, for each of the following three areas (Global Routing, IP addresses and AS numbers, and DNS):

Data Confidentiality:
What is the highest possible impact of a data confidentiality-related incident (data leak)?

Data Integrity:
What is the highest possible impact of a data integrity-related incident (hacking)?

Service Availability:
What is the highest possible impact of a service availability-related incident (outage) of up to 22 hours in a quarter? (All our services are designed for at least 99% availability.)
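The 22-hour figure appears to be consistent with the downtime that a 99% availability target allows over one quarter. The article does not state how the number was derived, so the short Python check below is a sketch under two assumptions: that a quarter lasts 90 to 92 days, and that availability is measured over the whole quarter. The function name `downtime_budget_hours` is hypothetical.

```python
def downtime_budget_hours(availability, days_in_period):
    """Hours of downtime allowed in a period for a given availability target."""
    return (1.0 - availability) * days_in_period * 24


# A calendar quarter has 90, 91 or 92 days depending on the months it covers.
for days in (90, 91, 92):
    print(days, round(downtime_budget_hours(0.99, days), 2))
```

Under these assumptions, the allowance comes out at roughly 21.6 to 22.1 hours, which rounds to the 22 hours used in the question.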

The answers are quantified on a four-point scale (Low, Medium, High, and Very High), according to the definitions in Table 1.

Table 1: External impact areas and severity levels

The next step is asking the community for input, by sharing the form with the relevant RIPE Working Group. Table 2 lists the services in scope (at the time of writing) and the working groups we will engage for input.

Table 2: Services in scope and the relevant Working Groups

At the same time, we will also determine the maximum impact any type of incident can have on certain internal areas, which we defined using ISO/IEC 27005:2011 and our internal risk assessment. For example, we will look at how availability, confidentiality, and integrity incidents affecting any of these services could impact our organisation legally, financially, or operationally.

As stated before, the highest potential impact level in any area, internal or external, for any type of incident determines the criticality rating of that service. The rating will be published on our website.

In terms of applicability to the cloud, the three individual service criticality components (availability, confidentiality, and integrity) will be used in defining specific service architecture requirements. For instance, the “availability” component of the Service Criticality rating will be used for the Cloud Strategy Framework, while the “integrity” and “confidentiality” components will be used for the Security Controls Framework for Cloud Services.
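As a rough illustration of how the per-component ratings could feed the two frameworks, the Python sketch below takes a filled-in form (one answer per impact area and component), derives a rating for each component, and looks up which framework consumes it. The form layout, the `component_ratings` function, and all sample answers are assumptions made for this example. Only the component-to-framework mapping is taken from the text above.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    VERY_HIGH = 4


# Mapping described in the article: which framework uses which component.
FRAMEWORK_FOR_COMPONENT = {
    "availability": "Cloud Strategy Framework",
    "confidentiality": "Security Controls Framework for Cloud Services",
    "integrity": "Security Controls Framework for Cloud Services",
}


def component_ratings(form):
    """For each component, keep the highest severity reported in any impact area."""
    ratings = {}
    for area_answers in form.values():
        for component, severity in area_answers.items():
            ratings[component] = max(ratings.get(component, Severity.LOW), severity)
    return ratings


# Hypothetical answers for an imaginary service (external areas only).
example_form = {
    "Global Routing": {
        "availability": Severity.HIGH,
        "confidentiality": Severity.LOW,
        "integrity": Severity.MEDIUM,
    },
    "IP addresses and AS numbers": {
        "availability": Severity.MEDIUM,
        "confidentiality": Severity.MEDIUM,
        "integrity": Severity.VERY_HIGH,
    },
    "DNS": {
        "availability": Severity.LOW,
        "confidentiality": Severity.LOW,
        "integrity": Severity.LOW,
    },
}

ratings = component_ratings(example_form)
for component, severity in ratings.items():
    print(f"{component}: {severity.name} -> {FRAMEWORK_FOR_COMPONENT[component]}")

# The overall criticality rating is the highest of the component ratings.
overall = max(ratings.values())
print("overall:", overall.name)
```

Internal impact areas could be added as further keys in the same form without changing the function.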

Our Chief Information Officer, Kaveh Ranjbar, will present this framework to the RIPE NCC Services Working Group at the RIPE 84 Meeting.

We will be addressing the topic in each of the working groups listed above. If you would like to contribute, please make sure you are subscribed to the mailing list of the relevant working group.
