Last updated 20/07/2021
More and more businesses are adopting cloud platforms for cost optimization and risk transfer. The challenge is: can we apply traditional ITIL processes to manage cloud workloads and deployments? What differentiates change management in a traditional data center from change management in a cloud environment? Many people think of ITIL / ITSM as service management for a single customer. Shared services on cloud platforms raise a new challenge: applying ITIL processes to services that are provided to multiple customers at once.
I will try to address these questions in this article.
To start with, I will give a brief overview of the ITIL Change Management process as applied to shared and private cloud environments. This article assumes that we are the cloud service provider, offering Cloud Infrastructure as a Service (IaaS) and managing the cloud infrastructure.
To onboard a customer, the Cloud Service Provider (CSP) and the customer must first agree on the type of cloud service deployment (for example, private or shared) and on the scope of each type of deployment.
For private service deployments, change management is a little simpler than for shared services. Any change made under the change management process impacts only that particular private cloud service, which is customized for use by a single customer.
When a change is received for a shared environment, we need to understand its scope and impact. Impact parameters could include the criticality of the change and its exposure to customers and services. We then plan the change based on this impact analysis.
The criticality of a change in a shared cloud environment is driven by the number of customers impacted. A single change can cause a Priority 1 (P1) incident or Major Incident (MI), and because it impacts multiple customers, it can generate many P1s / MIs for the service provider.
To manage customers on a shared cloud more effectively, the service provider can divide the service into multiple sites, designating one as the master site and the others as slave sites. A change on the master site can impact all slave sites as well, whereas a change on a slave site might affect only that slave site.
If a change is a “No Impact” change, we can implement it on all sites together. But when a change must be implemented outside customers' office hours, we can split it by site and implement it during the off-business hours of each site.
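The per-site split described above can be sketched in a few lines of Python. This is only an illustration under assumed names: the `Site` class, its hour fields, the site names, and the `plan_site_windows` function are hypothetical and not part of any ITSM tool.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Site:
    """Hypothetical site record; hours are local time, 0-23."""
    name: str
    business_open: int
    business_close: int


def plan_site_windows(sites: List[Site], no_impact: bool) -> List[Tuple[Tuple[str, ...], str]]:
    """Return (site names, change window) pairs for one change.

    A "No Impact" change is planned once for all sites together.
    Any other change is split into one task per site, scheduled
    between that site's business close and its next business open.
    """
    if no_impact:
        return [(tuple(s.name for s in sites), "any time")]
    return [
        ((s.name,), f"{s.business_close:02d}:00-{s.business_open:02d}:00 local")
        for s in sites
    ]


# Example with made-up sites.
sites = [Site("EU-1", 9, 18), Site("APAC-1", 8, 17)]
for names, window in plan_site_windows(sites, no_impact=False):
    print(names, window)
```

A real schedule would also need time zones, freeze periods and agreed maintenance windows, which this sketch leaves out.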
Let’s simplify the concept of cloud deployment types by comparing it with a house. Think of a shared service as a house with several rooms, where we allocate one room to each customer. When you change something in that house, it might affect all rooms, only some rooms, or no rooms at all. It depends on what change you are implementing.
We can treat the hall of the house, through which we enter, as the master site. When you implement a change in the hall, it can affect all rooms or only the hall itself.
For example, when you change the main electricity switch in the hall, it will impact all rooms, but if you only change the colour of the hall, it will impact only the hall (the master site). Similarly, if you implement an individual change in a room, such as changing its colour, it will impact only that room, but if you do something like extending the room, it might impact the other rooms too. In the case of a private or dedicated cloud service, you can think of the service as an entire house allocated to one customer.
To establish a change management process, the cloud service provider (CSP) should create a cloud service inventory list covering its private and shared cloud services. This list should contain the name of each cloud service, its region, service manager, service architect, and process managers such as the change manager, incident manager and problem manager. We also need a list of customers per site for each shared cloud service. Together, these two lists help us define the scope and impact of changes on cloud services.
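As a rough sketch of what one entry in such an inventory could look like, the Python record below combines the service details with the per-site customer list. The class name, field names and sample values are all assumptions made for illustration; in practice this information would normally live in a CMDB or service catalogue.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CloudServiceRecord:
    """Hypothetical inventory entry for one cloud service."""
    name: str
    deployment: str            # "private" or "shared"
    region: str
    service_manager: str
    service_architect: str
    change_manager: str
    incident_manager: str
    problem_manager: str
    # Site name -> customers hosted on that site (shared services).
    customers_by_site: Dict[str, List[str]] = field(default_factory=dict)

    def customers_on(self, site: str) -> List[str]:
        """Customers hosted on a single site (empty if the site is unknown)."""
        return list(self.customers_by_site.get(site, []))

    def all_customers(self) -> List[str]:
        """Every customer of the service, de-duplicated and sorted."""
        return sorted({c for cs in self.customers_by_site.values() for c in cs})


# Example entry with made-up names.
record = CloudServiceRecord(
    name="Shared IaaS", deployment="shared", region="EMEA",
    service_manager="A. Manager", service_architect="B. Architect",
    change_manager="C. Change", incident_manager="D. Incident",
    problem_manager="E. Problem",
    customers_by_site={"master": ["Cust-A"], "site-2": ["Cust-B", "Cust-C"]},
)
print(record.all_customers())
```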
Each customer should have a customer-specific change manager who manages that customer's non-cloud changes. In addition, there should be separate cloud service management teams that manage changes on shared and private cloud services.
The CSP can define the scope for both the cloud change management team and the customer-specific change management team. The CSP can choose to delegate change and process management of a private cloud to the customer-specific change management team, since the impact is limited to that one customer. In such cases, changes are mostly handled in the same way as traditional changes for that customer.
The only procedural addition could be obtaining approvals from the cloud service architect and the cloud service manager. When we receive a change for a shared cloud, we need to confirm which cloud service or services are involved, since a single change sometimes covers multiple cloud services.
We need to understand the design of the cloud service involved in the change. If that service is divided into multiple sites, we need to confirm on which site the change will be implemented. If it is a single slave site, we can assume that only the customers on that site will be impacted. But if the change is on the master site, we need to confirm with the change requester, implementer or architect whether only the master site or all sites will be impacted.
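The scoping logic above can be written down as a small Python function. It is a minimal sketch: the function name, its parameters and the sample data are hypothetical, and the `master_change_affects_all` flag stands in for the answer obtained from the requester, implementer or architect.

```python
from typing import Dict, List


def impacted_customers(
    customers_by_site: Dict[str, List[str]],
    master_site: str,
    target_site: str,
    master_change_affects_all: bool = True,
) -> List[str]:
    """Return the sorted list of customers impacted by a change.

    A change on a slave site is assumed to impact only that site.
    A change on the master site impacts every site when the
    requester/architect confirms it does, otherwise only the master.
    """
    if target_site not in customers_by_site:
        raise KeyError(f"unknown site: {target_site}")
    if target_site == master_site and master_change_affects_all:
        sites = list(customers_by_site)
    else:
        sites = [target_site]
    return sorted({c for s in sites for c in customers_by_site[s]})


# Example with made-up data.
layout = {"master": ["Cust-A"], "site-2": ["Cust-B"], "site-3": ["Cust-C"]}
print(impacted_customers(layout, "master", "site-2"))
print(impacted_customers(layout, "master", "master"))
```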
Accordingly, we need to obtain approvals from customers. But when a change on a shared cloud service impacts many customers (perhaps 100), how can we collect approvals from all of them for every change? Depending on the service provider, we need to decide when customer approvals are required and when it is enough to send customers a notification describing the change and its impact on them.
Ideally, when there is only portal/API downtime, meaning the service remains available but customers cannot perform some actions through the portal/API, we can send a customer notification and skip customer approvals. If the only impact is slow performance, a customer notification can also be sufficient.
But if there is service downtime, we need to obtain customer approvals. The decision about when to obtain customer approvals and when to send only a notification should be made in advance and agreed by the Customer Service Delivery Managers and the Cloud Process Head. Lead times for customer approvals and customer notifications should also be agreed in advance, documented and communicated. For urgent / emergency changes, we can also manage with only a customer notification if that has been agreed.
We differentiate between customer approvals and customer notifications because the number of customers can be very large (perhaps 100 per change, depending on the cloud service involved), so it is not feasible to collect 100 approvals every time. If there are fewer customers and fewer changes, you can obtain customer approvals every time, provided you think a notification alone is not sufficient and your resource bandwidth allows it.
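The approval-versus-notification rule described above can be captured as a small decision function. This is a sketch of one possible policy, not a standard: the enum names, the function name and the `emergency_notify_agreed` flag are assumptions, and each provider would replace the rule with whatever was agreed in advance.

```python
from enum import Enum


class Impact(Enum):
    PORTAL_API_DOWNTIME = "portal/API downtime"
    SLOW_PERFORMANCE = "slow performance"
    SERVICE_DOWNTIME = "service downtime"


class CustomerAction(Enum):
    NOTIFY = "send customer notification"
    APPROVAL = "obtain customer approval"


def customer_action(
    impact: Impact,
    emergency: bool = False,
    emergency_notify_agreed: bool = False,
) -> CustomerAction:
    """Decide whether a change needs customer approval or only a notification.

    Portal/API downtime and slow performance need a notification only.
    Service downtime needs approval, unless the change is an emergency
    and notification-only handling of emergencies was agreed beforehand.
    """
    if impact is not Impact.SERVICE_DOWNTIME:
        return CustomerAction.NOTIFY
    if emergency and emergency_notify_agreed:
        return CustomerAction.NOTIFY
    return CustomerAction.APPROVAL


print(customer_action(Impact.SERVICE_DOWNTIME).value)
```

Keeping the rule in one place makes it easier to show the agreed policy to the Customer Service Delivery Managers and to apply it consistently.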
Once we know which cloud service or services are involved, the number of regions and customers, and the impact of the change, the scope and impact of the change are clear.
From here, the change is managed like any other change until it is discussed in the CAB. For example, if your organization works with implementation plans (covering change impact, change window, pre- and post-implementation plans, implementation plan, back-out plan, test plan, risk and CIs), we need to obtain the plan from the implementer and review it from a change management process point of view.
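Part of that review, checking that every expected section of the implementation plan has been filled in, is easy to automate. The sketch below assumes the plan arrives as a simple dictionary keyed by section name; the section keys and the `missing_sections` function are hypothetical.

```python
from typing import Dict, List

# Sections taken from the list above; the key spellings are an assumption.
REQUIRED_SECTIONS = (
    "change impact",
    "change window",
    "pre-implementation plan",
    "implementation plan",
    "post-implementation plan",
    "back-out plan",
    "test plan",
    "risk",
    "CIs",
)


def missing_sections(plan: Dict[str, str]) -> List[str]:
    """Return the required sections that are absent or blank in the plan."""
    return [
        section
        for section in REQUIRED_SECTIONS
        if not str(plan.get(section, "")).strip()
    ]


# Example: a plan that is missing most sections.
draft = {"change impact": "Portal downtime", "change window": "Sat 22:00", "risk": " "}
print(missing_sections(draft))
```

A completeness check like this does not replace the change manager's review of the plan's content; it only catches empty sections before the CAB.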
It is better to have a separate Cloud CAB in which only cloud changes are discussed, attended by all cloud service managers, cloud service architects, the cloud process head and the global cloud change manager. This will be an iCAB, so no customers will be present, because changes for multiple customers are discussed in it.
We can send the CAB agenda to all service managers and architects in advance so that they know how many changes for their service will be discussed. We can also send them the change implementation plans in advance so that they can review the changes that will need their approval in the CAB.
After the CAB, once the cloud service managers and cloud architects have approved the scope, impact and implementation of the changes, we can send customer notifications or obtain customer approvals, as decided earlier and within the agreed lead times. The implementers can then implement the changes in their change windows.
Many service providers also have CAB hierarchies. If you think of a single-customer CAB as a Local CAB, then the Cloud CAB for multiple customers is a Global CAB. After approval from the Cloud CAB, if a change is major, with high risk and high impact on multiple customers, we can add it to the separate GBU CAB, where all major changes for that GBU (Global Business Unit) are discussed, or, if required, to the top executive CAB, where top-level management of the service provider reviews such changes. This CAB hierarchy has to be designed in advance by the service provider.
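One way to picture such a hierarchy is as a routing function that lists the CABs a change must pass through. The sketch below is only one possible design under assumed names (`cab_route` and its flags are hypothetical); the real hierarchy is whatever the service provider has defined.

```python
from typing import List


def cab_route(
    is_cloud_change: bool,
    is_major: bool = False,
    needs_executive_review: bool = False,
) -> List[str]:
    """Return the ordered list of CABs a change is assumed to go through.

    Non-cloud changes stay in the customer's Local CAB. Cloud changes go
    to the Cloud (Global) CAB; major ones are escalated to the GBU CAB
    and, if required, on to the executive CAB.
    """
    if not is_cloud_change:
        return ["Local CAB"]
    route = ["Cloud CAB"]
    if is_major:
        route.append("GBU CAB")
        if needs_executive_review:
            route.append("Executive CAB")
    return route


print(cab_route(is_cloud_change=True, is_major=True))
```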
Monika has a total of 8+ years of experience in IT. She has been working in Service Management / Change Management for the last 5+ years and is currently the Cloud Change Management Lead at Atos India. She holds a Master's in Computer Science from Pune University and the certifications RHCSA, RHCE 6 and ITIL Expert (MALC).