
HANA Multitenant Database Containers, or HANA MDC, is something fairly new in the SAP world. It first became supported in SPS 09 (Revision 90), less than two years ago, in late 2014. Since it's a new technology, there are relatively few customers using it. At oXya, we have customers for whom we've implemented MDC. In this blog I will describe the technology, what challenges it solves, what issues it actually creates, and which types of organizations can benefit from it.

 

 

Background: HANA before MDC


 

SAP HANA is an innovative in-memory database system. We're taking large database systems and putting them entirely in-memory, which is demanding from the hardware perspective – it requires expensive hardware, with strong compute power and lots of memory.

 

Before HANA MDC, you had to have a separate SAP HANA instance – possibly on the same hardware, but still a separate instance – for each different database (meaning for each landscape). In practice, this meant that every time you wanted to install a new HANA system, you either had to buy a new physical appliance, or you had to install a new HANA instance for each database on the existing hardware.

 

The challenge is that HANA appliances are installed by the vendor, and delivered to the customer as an already-configured system. The vendor also certifies that the appliance was correctly configured. Now, if you want to install another HANA instance on that appliance, you either have to engage the appliance vendor again, for them to install another HANA instance in a certified manner; or the customer has to go through the painstaking steps of making sure that the new HANA instance is installed correctly.

 

To perform the installation, you needed a HANA expert – often the vendor that sold you the appliance – to install the new instance. In other words, each new installation of HANA was complex, took a lot of time, and often also resulted in the need to purchase a new appliance.

 

All of the above is a rather inefficient way to install new HANA instances, and it obviously generated a lot of extra work.

 

Another challenge, as you probably guessed, was that the above method was very wasteful with regard to hardware usage. Not only did SAP face the challenge of making the HANA install process quicker and easier, but also the challenge of reducing the hardware costs that were slowing down HANA sales. The hardware needed for this in-memory computing system is very expensive; SAP's goal was to reduce the hardware costs and footprint of the HANA landscape.

 

Let’s use an example, to explain all of the above. Imagine a customer (and we have an actual customer like that), who has SAP ERP and SAP BW (Business Suite applications), and they’re both running on HANA. Each of these applications has Sandbox, Development, Quality Assurance, and Production systems, all running on HANA. Before MDC, if the customer wanted to run their ERP on HANA, they needed to buy multiple appliances, one for each system:

  1. One appliance for Sandbox
  2. A second appliance for Development
  3. A third appliance for Quality Assurance
  4. A fourth appliance for Production.

 

So far we’re talking about four HANA appliances, and that’s only for their ERP application. If the customer wanted to duplicate their data and also run their BW on HANA, then they would need to buy additional appliances for the BW systems (Sandbox, Development, Quality Assurance, Production).

 

And of course, it's not just about hardware costs, but also license costs. Do additional instances also mean additional licenses, and therefore additional license costs?

 

Well, the answer in general is “Yes”, or sort of. As a general rule, HANA is licensed by the amount of memory that the application needs. There is a difference between using separate appliances, and using one larger appliance with MDC; that can affect not only hardware costs, but also (to a lesser degree) the licensing costs. I’ll explain that later on, when referring to MDC sizing.

 

 

So what is SAP HANA MDC?


 

SAP released a newer feature, Multitenant Database Containers (MDC), supported as of HANA SPS 09 (Revision 95) and higher. In a nutshell, MDC allows us to take multiple SAP systems – say 2 sandboxes, 1 ERP, 1 BW, 2 development systems and 2 quality assurance systems – and put them all on the same HANA appliance.

 

The beauty of it: once the hardware provider has delivered the appliance and done the initial setup of the HANA system, we can quickly and easily create a new database container for each of our systems, allowing these different products to cohabit on the same appliance.

 

The MDC feature provides huge savings in hardware, because instead of buying a separate appliance for each system (or perhaps an appliance for each landscape – say, your ERP's Dev and QA on one appliance, and the same for BW), you can now put all of these on a single appliance and better utilize that hardware.

 

Of course, as a general rule, we're not going to mix production and non-production systems. So, we still need to separate the Production system. However, for all of our non-production systems, MDC introduced significant flexibility.

 

 

MDC Challenges

 

MDC does come with some challenges, and we hope that SAP will eliminate them in the future. These are the current challenges you should be aware of:

 

First, when you perform replication at present, you replicate your HANA system's databases to another site as a whole. There is no ability to replicate just a single tenant database – you have to replicate everything. Replication treats all tenant databases as a single database system, as opposed to completely independent database systems. Hence, in the case of a failover to a secondary site, you'd have to perform the failover for all HANA tenant databases on that appliance, not just one database.

 

Second, there’s a challenge in terms of version maintenance. In a traditional SAP landscape, we apply patches gradually:

  • We apply the patches to our Sandbox, and test
  • Then, we apply the patches to the Development, and we test
  • Then, we apply the patches to the QA system, and test again
  • And only then, after patching has been applied and tested on multiple non-production systems, do we patch the Production system.

 

The benefit of the above method is that we can test patching in many different systems before going to Production. The challenge with MDC is that when you apply patches to this multitenant environment, you're applying them to all the systems simultaneously, because it's impossible to patch each tenant database individually. Essentially, once you patch your HANA appliance with MDC, you have patched all the environments on that appliance – all at the same time!

 

“But wait”, you may say, “is this really a challenge? Is patching everything a challenge? Don't you want to patch everything? Doesn’t it save you work and time?”

 

The answer is – take another look at the patching process I outlined above. Our goal is to prevent impact to the Production system. In an SAP environment, we perform multiple tests on patches in non-production systems, before implementing in Production. The way to ensure that patching is successful, and doesn't cause any negative impact, is by performing multiple iterations: first Sandbox, then Development, then QA. By the time you get to Production, you've already performed patching 3 times.

 

However, in the case of MDC, once you perform patching on your non-production MDC system, you've done so simultaneously to all the systems within the same appliance. You're not getting those multiple iterations, which is a huge disadvantage.

 

Also, there's another caveat to patching. Let's say you have ERP and BW on the same appliance, and patching is only required for one product, but not for the other. In the case of MDC, we don't have a choice – we have to patch them both, at the same time.

 

This patching challenge is not an insurmountable issue, but it is something to get used to. This is also another reason why the Production system must be separate from non-production – you certainly don’t want to do patching on Production without testing it first!

 

But all in all, and despite the above challenges, MDC is a great technology. We're able to spin up new containers in minutes, with a simple command on the HANA system – as opposed to having to engage a HANA expert to install a new instance, or perhaps even purchase new hardware, every time you need a new system. MDC adds significant flexibility, and I believe SAP is working to enable replication of individual containers in the near future.

 

 

Difference between HANA system and the Containers

 

One important thing to understand is the difference between the HANA system and the containers: each container has its own SID (system ID), in addition to the SID of the HANA appliance itself. A customer adopting MDC has to pick a unique SID for the appliance as a whole, and then name each container based on its purpose. In other words, you have an SID for the HANA instance on the appliance, and then a separate SID for each container.

 

Let’s explain that using an example. Let’s say that the SID of the HANA appliance would be HA1. Then, we’ll give various names to the containers, usually according to some kind of a key. For example, we will use "Q" for quality assurance, “D” for Development, “E” or “B” for ERP and BW, and an "H" to indicate that it's a HANA system.

 

With the above method, your development ERP container could be “DEH”, and your development BW container could be “DBH”. The QA containers will be “QEH” and “QBH”.
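The naming key above is easy to mechanize. Here is a minimal sketch – the `ROLE`/`PRODUCT` tables and the `build_sid` helper are our own illustration of the convention from this example, not an SAP standard:

```python
# Build 3-character container SIDs from the naming key described above:
# <role letter><product letter>H, where the trailing "H" marks a HANA system.
ROLE = {"development": "D", "quality": "Q"}
PRODUCT = {"erp": "E", "bw": "B"}

def build_sid(role: str, product: str) -> str:
    """Compose a container SID such as 'DEH' (Development ERP on HANA)."""
    return ROLE[role] + PRODUCT[product] + "H"

for role in ROLE:
    for product in PRODUCT:
        print(f"{role}/{product} -> {build_sid(role, product)}")
```

Running this prints the four container SIDs from the example: DEH, DBH, QEH and QBH.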

 

 

MDC Sizing


 

Earlier, I mentioned sizing and said that customers may also save some money on license purchases. In order to explain that, let’s start with the traditional sizing for HANA, meaning separate appliances for each system.

 

HANA is a database, and each database system (Development, QA, Production, etc.) needs its own memory to contain its database. There are no savings in terms of reducing the volume of each database, because they stay the same, whether in separate HANA appliances or when using MDC, on one larger appliance.

 

However, when performing sizing, one doesn't size just perfectly – you always leave room for growth, meaning a buffer. When you size separate HANA systems (say one Dev and one QA), you will oversize both systems, to make sure you have sufficient space. When you do that, you often significantly oversize, and you actually "waste" hardware. In fact, I can guarantee that if you look around the world at various HANA systems, there's lots of memory and CPU not being used. Customers are paying for this overcapacity, not only in hardware costs, but also in HANA licensing costs.

 

Consolidating systems avoids the above, and provides great savings. When consolidating HANA databases, we don't need to oversize each database separately. We can better utilize the resources on the server, and oversize to a smaller extent for all databases combined.

 

Basically, if before we needed two appliances, say each with half a terabyte of memory, then when we combine them on the same appliance we need one appliance with one terabyte. In fact, we may even be able to get by with a little less than one terabyte, because we no longer need to buffer – to oversize – twice.

 

Take one of oXya's customers, for example. This customer has 5 different HANA systems, with each database needing 200GB (space for future growth already included), meaning we need a total of 1TB. If you went ahead and bought separate appliances for each database, as was the case before MDC, then you would buy five appliances with 256GB each (5 x 256GB). This means you're purchasing more than 250GB above and beyond your needs. These systems would be gravely underutilized.

 

However, when you consolidate those systems, then you only need to buy a 1TB appliance (which again, already includes space for future growth). Then, as some databases grow, those databases that need the additional resources are able to utilize them. Furthermore, not only do you better utilize the hardware and licenses, but you may even be able to add additional systems onto the same hardware, once it's running.
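The arithmetic behind this customer example can be checked in a few lines. A quick sketch, using only the figures quoted above (200GB per database, 256GB per small appliance, a 1TB consolidated appliance):

```python
# Sizing example from the text: five HANA databases, each needing 200 GB
# (growth buffer already included in that figure).
db_need_gb = 200
n_databases = 5
total_need_gb = db_need_gb * n_databases        # 1000 GB, i.e. ~1 TB

# Pre-MDC: one appliance per database; smallest fitting appliance is 256 GB.
appliance_gb = 256
separate_total_gb = appliance_gb * n_databases  # 1280 GB purchased
wasted_gb = separate_total_gb - total_need_gb   # idle capacity you paid for

# With MDC: a single 1 TB (1024 GB) appliance covers all five tenants.
consolidated_gb = 1024

print(f"needed={total_need_gb}GB, separate appliances={separate_total_gb}GB, "
      f"wasted={wasted_gb}GB, consolidated={consolidated_gb}GB")
```

The separate-appliance approach buys 1280GB to serve a 1000GB need, leaving 280GB idle – the "more than 250GB" of waste mentioned above.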

 

Having said all of the above, it's important to understand that sizing is a bit of an art (and not really a science, as some people claim). When you perform sizing, you make assumptions about how the system will behave once it's running, and it doesn't always turn out that way. That's one reason why there's a tendency to oversize. When we consolidate several databases onto a single system using MDC, we're able to better utilize resources.

 

Come see us at SAP SAPPHIRE this week – stop by the Hitachi booth, look for the oXya team, and discuss your SAP challenges with us.

 

----------------------------------------------------------------------

 

Sevag Mekhsian is a Service Delivery Manager at oXya, a Hitachi Group company. Sevag is an SAP expert with more than eight years of experience, the last four of which leading a team of ten SAP professionals, at oXya's service center in New Jersey.

 

oXya, a Hitachi Group company, is a technical SAP consulting company, established in 1998. oXya provides ongoing managed services (outsourcing) for enterprises around the world. In addition, oXya helps customers that run SAP with various projects, including upgrades and migrations, including to SAP HANA. oXya currently employs more than 650 SAP experts, who service more than 330 enterprise customers with more than 300,000 SAP users around the world.

This blog was co-written with Dominik Herzig, VP of Delivery for US & Canada at oXya, and Melchior du Boullay, Managing Director, Americas, at oXya.

 

----------------------------------------


Our previous blog in the Outsourcing SAP series focused on the Planning Phase of moving your SAP environment to oXya's datacenter. That blog was the 4th in our "Outsourcing SAP" series, covering various aspects of outsourcing your SAP environment. In the first blog, we asked whether we should outsource our SAP environment, and suggested a few directions to think about. In the 2nd blog we wrote about choosing the right partner to manage our SAP environment, and listed various criteria and tips for choosing that partner. The 3rd blog discussed various infrastructure and headcount considerations that each and every customer faces when considering whether and how to outsource their SAP.

 

Today we present the second part dealing with moving your SAP landscape to oXya’s management. Last week we wrote about the Planning Phase, and this time we focus on the Technical Operations Phase.

 

As a reminder, when providing managed services for our customers’ SAP landscape, oXya has two high-level types of offerings:

  1. Run Management only: oXya provides only the managed services portion, while the customer chooses to maintain responsibility for their hardware and for hosting SAP. In this case, oXya requires a quick and reliable link between oXya and the customer's datacenter, to provide the support services. We covered that option in previous blogs in this "Outsourcing SAP" series.
  2. Full Cloud Delivered service: oXya takes full ownership and responsibility for the customer’s infrastructure and hosting, and the infrastructure is installed at one of oXya’s datacenters.

 

This blog and the previous one, about the two phases of moving your SAP landscape to oXya datacenter, focus on the second option—the process of moving your datacenter and infrastructure to the oXya private cloud. The blog is based on past experience with multiple North American customers, where we’ve migrated these customers’ full SAP landscape, plus all of their servers (including non-SAP servers), to the oXya cloud.

 

The previous blog ended with oXya and the customer finishing the Planning Phase: all the necessary workshops and coordination were done, the hardware was prepped, and everything was ready for the actual transfer of the servers. This is where the Technical Operations phase begins. This phase includes the actual transfer, dry runs, and all actions until we go live, and oXya accepts full responsibility over all systems.

 

At this point, it’s important to note that there are differences between migrating SAP servers and migrating non-SAP servers (which oXya also runs for its customers).

 

Migrating SAP Servers


 

This is our main expertise: we've been performing SAP migrations and hosting for hundreds of enterprise customers over many years. There are several migration options; the specific process depends on the customer's sensitivity to downtime – some customers are OK with having their SAP servers down over a weekend, while others are extremely sensitive to downtime and need it minimized. Generally speaking, and this is true for any type of migration (both SAP and non-SAP systems), whenever you want to minimize the downtime, there's a tradeoff to be had, a balance of pros and cons.

 

Whichever process is chosen, there are some common operations for any SAP system. We will:

  • Begin with Development & Quality, and make sure the process completes smoothly, from start to finish.
  • Then execute a dry run and test the process for the Production environment, and get validation from the customer that everything works fine, all the data is there, and they can access the system and perform basic operations.
  • If the customer is happy with the dry run, perform the actual systems migration and Go Live on the following weekend.
  • Finish with a test, to validate that everything is OK and everyone can do their job.

 

If the customer is not happy with the dry run (step 2), then we’ll do the necessary modifications, perform another dry run to validate that this time everything is OK, and perform the Go Live migration afterwards.

 

The way we actually migrate each SAP system depends on the criticality and downtime sensitivity of the specific customer. I'll list the methods starting with the most standard process – a customer who is least sensitive to downtime and can afford taking the systems down over a weekend – and ending with the most sensitive customer, who needs the systems down for the shortest time possible.

 

Weekend Downtime

 

If the customer is OK with having 2-3 days of downtime over the weekend, there are two methods that can be used. Which one to use depends on whether we want to change the operating system and/or database during the migration, or continue to use the same OS and database as in the original customer SAP system:

  • Systems Copy. This method supports changing the OS and/or DB as part of the migration process, so the OS and/or DB on the new oXya systems can differ from what the customer ran before the migration. Systems Copy is done via export and import, using SAP's "R3Load" tool. The result is the customer's SAP system running on a different OS and/or database.
  • Backup. If we’re not changing the OS and database, we can build an entirely new system on the oXya side, with the same OS and database, take a backup on the customer side, and ship the backup to oXya. Then, at the oXya datacenter, we’ll apply that backup to the blank SAP setup we prepared, and start the system from the backup. The result will be an exact copy of what the customer has at their datacenter.

 

Minimal Downtime

 

The above methods are the simplest approaches to a SAP migration, given that the system can be shut down for several days. However, if you need to perform the migration with minimal downtime, there’s one method that trumps all others – the DR method.

 


Disaster Recovery-like mechanism. If downtime is really an issue, the easiest method is to use a DRP-like mechanism for SAP. We will set up an SAP system at our datacenter, which is an exact replica of the customer’s system (same hardware, OS and DB), and configure it as a DRP (asynchronous) of the customer’s existing SAP system. This method enables us to shorten the downtime to a very short time period (half an hour, if everything goes well), of switching from the primary system (customer’s datacenter) to the DR system (oXya datacenter). We will still follow the common procedure described earlier – first do a dry run and a test with customer’s users to validate everything, make any necessary corrections, and then perform the real migration.

 

The dry run is done over one weekend. We will simulate the exact same system as the customer’s primary system. If successful, we will repeat the process on the following weekend. The purpose of the dry run is to make sure the process works, and everyone can validate and test the new system.

 

Once the dry run is finished we will shut it down, bring back the primary system on the customer’s side, rebuild the DR system on oXya’s side during the week, and repeat the operation on the following weekend, for the Go Live. This process significantly reduces downtime, to almost nothing (in a nutshell, the dry run is exactly the same operation as doing a DR test, and the Go Live is the same operation as switching to a DR system in a real crisis situation).

 

The main thing to remember regarding a DR-like mechanism – the new SAP system, in the oXya datacenter, must be the same (hardware, OS and database) as the current system on the customer side. The DR method only works if you’re not changing your SAP landscape (i.e. keeping the same DB/OS platform and SAP application versions, etc.).

 

What if you need to change the system?

 

We mentioned the DR method doesn't work if you want to change the hardware, OS or DB between the current customer system and the new system at the oXya datacenter. If you want to perform such a change, there are two options:

  1. We first switch to an identical system at oXya (using the DR method), and later switch components in the system (OS, DB, etc.).
  2. We use the first method mentioned, Systems Copy, to perform an export/import, in which case the OS and database can be changed. However, this approach requires a longer downtime (you cannot configure the new system as a DR).

 

The bottom line: if you want to change your landscape – hardware, OS or DB – there's no getting around a longer downtime period, typically a weekend. What can be done, to minimize the risk of "all at once", is to perform the migration in two steps: first migrate the system "as is" to oXya using DR, and then perform the OS or DB migration over another weekend. However, it doesn't matter whether you're moving from the customer datacenter to the oXya datacenter, or making the change within oXya's datacenter – whenever you change your landscape, you'll need a longer downtime.

 

Migrating non-SAP Servers

 

As already mentioned, we have many customers for which oXya manages their entire IT infrastructure, not just their SAP systems. With regards to migrating non-SAP servers, it depends on the type of the system.


There are essentially 3 types of other systems:

 

  1. Systems virtualized using VMware. If the customer already has a system virtualized using VMware, we will use one of the following methods:
    1. Migrate the VMs to oXya using the VM container (this option applies when the customer already has SAP virtualized with VMware, and the system keeps the exact same server names, VLAN, IP addresses, etc. – no change as part of the transfer).
    2. Export the VMs and ship them to oXya (this option applies when the SAP system is already virtualized, but the move to oXya involves some type of transformation/change within the VMware system, such as changing the host name or an IP address). If the system is small enough, we can ship it over the network; if too big, we will put it on a fast disk and fly it over to the oXya datacenter.
    3. If a very short downtime is required, there's a third method when migrating VMware-based systems, using a tool such as Veeam Backup & Replication. This tool creates a backup, moves it to oXya, and keeps it in sync with the original customer system. This allows us to "bypass" the transfer time to oXya, significantly shortening downtime. This method obviously has a cost.

  2. Systems virtualized with a different hypervisor. The second type of migration is for servers that are virtualized, but not using VMware (for example, with Hyper-V). At oXya, we typically use VMware, though there are some exceptions. If the customer’s servers are virtualized using Hyper-V, and oXya servers are typically virtualized with VMware, then we will perform a Virtual to Virtual (V2V) migration, using the VMware vCenter Converter tool.

 

  3. Non-virtualized systems. The last type of server is the physical server. If the customer has these, there are two options:
    1. If the server can be virtualized, we’ll perform a Physical to Virtualized (P2V) migration using the VMware converter, and ship the newly created VMware container to oXya.
    2. Some servers cannot be virtualized. If this is the case, then the situation is a bit trickier, and requires a conversation with the customer about what to do with this server. There are a few solutions:
      1. Sometimes the application needs to be upgraded, in order to be compatible with VMware. This is seemingly the easiest solution, but it’s not always possible.
      2. Another seemingly easy solution is to physically move the server to the oXya datacenter. However, it's often not as easy as it sounds. The downside to physically moving a server is that it takes time, both for planning and for the transfer itself, plus there are inherent risks involved. We will use a specialized moving company that does these things – take the server from the customer's datacenter, put it in a truck, and move it to the oXya datacenter. Once we receive it at the oXya datacenter we will install it and bring it back up. The risk is that whenever you physically move a server like that, some components may break, or the server may not start after the move, and that kind of thing.
      3. If physically moving the server is not an option, then we need to get a bit creative. It all depends on the specific application, the reason why it cannot be virtualized, and what the customer wants to do with that application. These are the types of discussions we hold with the customer when we face a server that is physical and can't be virtualized. When migrating a customer's entire IT infrastructure to the oXya cloud, these servers are the most complicated to handle.

 

 

What has been your experience, regarding SAP landscape migrations?

 

Did you experience any difficulties and issues? Where exactly?

 

How sensitive to planned downtime is your organization?

 

What do you think about this entire topic?

 

 

----------------------------------------

 

Our last blog in the “Outsourcing SAP” series, to be published in 1-2 weeks, will be a Q&A, based on questions we receive from customers, on all the topics covered in this blog series. Send us questions about any aspect of “Outsourcing SAP”, either by posting them to the Comments below, or sending them directly to us, Sevag Mekhsian <smekhsian@oxya.com>,  Dominik Herzig <dherzig@oxya.com> or Melchior du Boullay <mduboullay@oxya.com>.

 

 

----------------------------------------

Sevag Mekhsian is a Service Delivery Manager at oXya, a Hitachi Group Company. Sevag is an SAP expert with more than eight years of experience, the last four of which leading a team of ten SAP professionals, at oXya's service center in New Jersey.

 

Dominik Herzig is VP of Delivery for US & Canada at oXya. Dominik has 10 years of SAP experience, starting as a Basis Admin, and then moving to SAP project management and to account management. Dominik was one of the first few people to open oXya’s US offices back in 2007.

 

Melchior du Boullay is Managing Director, Americas at oXya, responsible for all of oXya’s operations across North, Central and South America since 2007, when oXya started operating in the Americas. Melchior has nearly 15 years of experience as an SAP Basis admin and technical consultant, SAP project manager, SAP account management, and business development.

 

oXya, a Hitachi Group Company, is a technical SAP consulting company, established in 1998. oXya provides ongoing managed services (outsourcing) for enterprises around the world. In addition, oXya helps customers that run SAP with various projects, including upgrades and migrations, including to SAP HANA. oXya currently employs ~600 SAP experts, who service more than 260 enterprise customers with more than 250,000 SAP users around the world.

This blog was co-written with Dominik Herzig, VP of Delivery for US & Canada at oXya, and Melchior du Boullay, Managing Director, Americas, at oXya.

---------------

 

In previous blogs of this series we covered various aspects of outsourcing your SAP environment. We started by asking whether we should outsource our SAP environment, and suggested a few directions to think about. Then, in the 2nd blog, we dealt with choosing the right partner to manage our SAP environment, and listed various criteria and tips for choosing that partner. In the 3rd blog, we discussed various infrastructure and headcount considerations that each and every customer faces when considering whether and how to outsource their SAP.

 

The next two blogs of this series focus on the entire process of moving your datacenter to oXya.

 

When providing managed services for our customers’ SAP landscape, oXya has two high-level types of offerings:

  1. Run Management only: oXya provides only the managed services portion, while the customer chooses to maintain responsibility for their hardware and for hosting SAP. In this case, oXya requires a quick and reliable link between oXya and the customer's datacenter, to provide the support services. We covered that option in previous blogs in this "Outsourcing SAP" series.
  2. Full Cloud Delivered service: oXya takes full ownership and responsibility for the customer’s infrastructure and hosting, and the infrastructure is installed at one of oXya’s datacenters.

 

The next two blogs focus on the second option, and specifically on the full process of moving your datacenter and infrastructure to the oXya private cloud. These blogs are based on past experience with multiple North American customers, where we’ve migrated these customers’ full SAP landscape, plus all of their servers (including non-SAP servers), to the oXya cloud.


The oXya migration process entails two main phases:

  1. Planning – covered in this blog
  2. Technical Operations – to be covered in next week’s blog

 

The project as a whole may seem a bit daunting, and like it's going to take a lot of time, yet both impressions are untrue. Leveraging oXya's processes and experience, our customers enjoy a well-defined and streamlined process, which can be completed relatively quickly.

 

How quickly?

 

For example, let’s take one of our customers, for which we’ve migrated their entire datacenter, consisting of ~100 servers (both SAP and non-SAP servers), to an oXya datacenter. The entire project, from the moment the contract was signed, and until the last server was moved and started running within the oXya datacenter, took 4 months (and BTW, there’s a difference in process between migrating SAP and non-SAP servers; we will cover these in next week’s blog).

 

While other vendors may try to sell you on a shorter cycle, or you may have business pressures which demand otherwise, remember that a timeframe of 4 months is considered extremely quick in the industry. Obviously, if the customer does not have extreme time constraints, then we can allow for more time and buffers within the process; still, if the customer requires that, oXya is able to make the entire transition very quickly.

 

Important First Steps

 

Timing is the first thing you think about, when starting such a project. Some things take a long time to order, so they need to be handled immediately upon signing the contract.

 

MPLS connection. Most customers prefer a fast and reliable connection (MPLS) between their location and the oXya datacenter; MPLS is a dedicated line between the two sites. The challenge is that getting an MPLS connection in place takes a relatively long time, as do other things that involve the ISP. When a customer asks an ISP for a date commitment, the ISP will usually commit to anywhere between 45 and 90 business days, meaning the minimum wait is more than 2 months, and it can sometimes exceed 4 months (actual delivery is sometimes a bit sooner, but these are the timeframes ISPs are willing to commit to). Hence, ordering an MPLS connection must be done as early as possible.

It’s important to understand that MPLS is not a must. oXya has customers who do not have MPLS; these customers access their systems via a VPN connection only. The difference between the two is that a VPN goes over the public Internet, hence it is not as reliable as an MPLS connection.

 

Even if the customer ordered an MPLS connection and the ISP is slow in providing it, we can still perform the migration and provide the customer with a connection to their servers, using a VPN connection. It won’t be as reliable as an MPLS connection, but it will get the job done while waiting for the MPLS.

 


Hardware. For the hardware we have two options, with two different timeframes. Customers can either leverage oXya’s already-available Hitachi-based hardware (UCP) for shared / on-demand needs, or they can ask for other, specific hardware they would like their IT environment to run on:

  1. Hitachi UCP hardware: oXya had standardized on Hitachi hardware long before we were acquired by Hitachi, and we recommend customers base their SAP on Hitachi UCP. Whether you need dedicated infrastructure or shared/on-demand infrastructure—and choosing between the two usually depends on the size of the landscape, your needs, and additional criteria—oXya has a large pool of Hitachi UCP hardware, so we’re able to offer these options immediately.
  2. Other vendors’ hardware: oXya is open to accepting other types of hardware—per the customer’s request—as we explained in detail in our previous blog. The challenge is that it takes quite some time to get other vendors’ hardware. From our experience, and depending on the specific vendor, we’re talking about a minimum of 4-6 weeks to get hardware from some vendors, and often significantly longer from others. Then, once we get the hardware, we need another couple of weeks to set everything up (explained later). Unfortunately, there is no working around the hardware: until it arrives, no technical action can be performed, though we do perform other preparation work in the meantime.

 

Two Approaches for a Datacenter Move

 

When you think about moving your infrastructure and your datacenter to a partner, there are two main options to perform this move:

 

The “Big Bang” move. You move all servers at once, from your datacenter to the oXya cloud (or to another datacenter). This is not easy to do, especially when you have many servers, because each server is unique. A “big bang” will likely result in lots of headaches while troubleshooting various issues; furthermore, it will be challenging to pinpoint where the issues come from. For that reason, oXya typically does not recommend this approach. We prefer the phased approach.

 

The Phased Approach. We define groups of servers that work together, and build “migration groups”, where each migration group consists of servers to be migrated together. Each group is migrated separately, usually over a weekend, so the entire migration takes multiple weekends. In the case of the customer I mentioned earlier, for example, the 100 servers were divided into 5 migration groups. The phased approach requires some work from the customer on a functional level – we need to identify which servers work in conjunction with each other; at this early stage of the process, the customer still has a much better understanding of their servers than oXya does. We work with the customer to help identify possible issues that may occur with respect to each group of servers.
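The grouping logic described above can be sketched as a connected-components computation over the server communication map: any two servers that talk to each other, directly or through intermediaries, land in the same migration group. The server names and the `migration_groups` helper below are hypothetical illustrations, not oXya tooling:

```python
from collections import defaultdict

def migration_groups(links):
    """Group servers into migration groups: servers that communicate
    with each other (directly or indirectly) are moved together."""
    graph = defaultdict(set)
    for a, b in links:          # each link is an undirected dependency
        graph[a].add(b)
        graph[b].add(a)
    seen, groups = set(), []
    for server in graph:
        if server in seen:
            continue
        # iterative graph walk collects one connected component
        group, stack = set(), [server]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

links = [("erp-app1", "erp-db"), ("erp-app2", "erp-db"),
         ("crm-app", "crm-db")]
print(migration_groups(links))
# two groups: the three ERP servers together, the CRM pair together
```

In practice the "links" come out of the functional knowledge-transfer sessions with the customer, not from an automated scan.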

Timeline for the Planning Phase

 

The following steps are needed to prepare the hardware before we can begin moving servers from the customer to the oXya datacenter:

  • Install all the hardware components (not relevant if you use our Hitachi UCP hardware; only for other vendors)
  • Prepping the servers
  • Installing the OS (in a typical VMware environment, this means preparing the images that will be used for the other servers)
  • Network setup (typically VPN, unless MPLS has already been installed)

 

The above steps are the beginning of the Technical Operations phase, which we’ll expand on in our next blog. What’s important to understand about this stage is the following:

  • The above steps are relevant when the customer wishes to use non-Hitachi hardware
    • In such a case, the hardware preparation stage typically takes two weeks from the time we get the hardware.
  • When using our available Hitachi UCP hardware, some of these steps are not relevant; hence the prep work is very fast, and the server transfer can begin almost immediately.
  • At this stage, no data has yet been sent; these steps just get the hardware ready for the actual transfer of data.

 

Knowledge Transfer & Planning

 

Another important part of the Planning Phase is around knowledge transfer and planning. This is where a joint team from the customer and oXya define the migration groups: find what each server does, how it works, what it communicates with, etc.

 

In addition, oXya will have some joint workshops with the customer (typically 5-6 workshops), to cover specific topics around the migration process:

  • Network & Security. Covers the customer’s requirements, so we can implement them in our datacenter. This is sometimes split into two separate workshops, depending on the specific customer and their systems.
  • Servers Migration. Discusses what the servers do, what they communicate with, and defines the migration groups mentioned earlier.
  • Communications. Defines how the oXya team will work with the customer’s team. Covers how the customer prefers to submit tickets: via email, phone calls, or a ticketing tool (oXya’s or the customer’s). We also address the escalation process, and everything around defining how we will work together.
  • Monitoring. Defines how monitoring is going to work for each specific customer. Includes what needs to be monitored, what actions to perform if one of the checks fails (do we call someone or send an email), etc.
  • Global Governance. Overview of the project and weekly follow-up, to make sure the project stays on track.
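The outcome of a Monitoring workshop can be captured as a simple routing table, mapping each agreed check to the action the customer wants when it fails (phone call vs. email). The check names, severities, and `route_alert` helper below are hypothetical, shown only to illustrate the idea:

```python
# Hypothetical routing rules, as might be agreed in a Monitoring workshop.
ALERT_RULES = {
    "disk_space":    {"severity": "critical", "action": "call"},
    "dialog_resp":   {"severity": "warning",  "action": "email"},
    "backup_status": {"severity": "critical", "action": "call"},
}

def route_alert(check_name, rules=ALERT_RULES):
    """Return 'action:severity' for a failed check.
    Unknown checks fall back to a safe default (email, warning)."""
    rule = rules.get(check_name, {"severity": "warning", "action": "email"})
    return f"{rule['action']}:{rule['severity']}"

print(route_alert("disk_space"))   # call:critical
print(route_alert("unknown"))      # email:warning  (safe default)
```

Writing the rules down this explicitly is the point of the workshop: both teams agree, per check, who gets woken up and who gets an email.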

 

The workshops are spread over the entire length of the project. Some workshops are more urgent than others; Servers Migration is one of the most urgent ones, and so is Security. On the other hand, the Communications and Monitoring workshops can be done later in the project, right before we move the servers.

 

All of these workshops are customized for each and every customer. We may create another workshop, should there be another subject that requires a deep dive with the customer and a thorough discussion. The overall goal of all these workshops, and of everything else that oXya does, is to make sure the project is a success.

 

It’s important to note that the same team who will manage your systems long-term is the one that also drives and executes all the migration phases, including the workshops. This personal touch & dedicated team approach, in which oXya is unique, ensures both consistency and proficiency in the details of your landscapes, throughout your relationship with oXya. Our approach also builds the relationship from day one, and ensures that your team knows the oXya support team who will handle your systems going forward, and vice versa. We are very proud of this personal touch & dedicated team approach; it has been one of the cornerstones of our great success over the years.

 

 

What has been your experience, regarding SAP landscape migrations?

 

Did you experience any difficulties and issues? Where exactly?

 

Does your vendor also provide the same personal touch and dedicated team approach? Or is each phase of the project handled by a different team, and each ticket you open handled by different people?

 

 

---------------

Our next blog in the “Outsourcing SAP” series will cover the Technical Operations stage of moving your datacenter to oXya. Then, the blog after that will be a Q&A, based on questions we receive from customers, on all the topics covered in this blog series. Send us questions about any aspect of “Outsourcing SAP”, either by posting them to the Comments below, or sending them directly to us, Sevag Mekhsian <smekhsian@oxya.com>,  Dominik Herzig <dherzig@oxya.com> or Melchior du Boullay <mduboullay@oxya.com>.

 

---------------

Sevag Mekhsian is a Service Delivery Manager at oXya, a Hitachi Data Systems company. Sevag is an SAP expert with more than eight years of experience, the last four of which leading a team of ten SAP professionals, at oXya’s service center in New Jersey.

 

Dominik Herzig is VP of Delivery for US & Canada at oXya. Dominik has 10 years of SAP experience, starting as a Basis Admin, and then moving to SAP project management and to account management. Dominik was one of the first few people to open oXya’s US offices back in 2007.

 

Melchior du Boullay is Managing Director, Americas at oXya, responsible for all of oXya’s operations across North, Central and South America since 2007, when oXya started operating in the Americas. Melchior has nearly 15 years of experience spanning SAP Basis administration and technical consulting, SAP project management, SAP account management, and business development.

 

oXya, a Hitachi Group Company, is a technical SAP consulting company, established in 1998. oXya provides ongoing managed services (outsourcing) for enterprises around the world. In addition, oXya helps customers that run SAP with various projects, including upgrades and migrations, such as to SAP HANA. oXya currently employs ~600 SAP experts, who service more than 260 enterprise customers with more than 250,000 SAP users around the world.

SAP is one of the most mission critical applications for any organization, because it runs the entire business. An IDC study, published in 2015, determined that for a Fortune 1000 company, an unplanned downtime of a mission critical application will cost between $500,000 to $1,000,000 per hour of downtime. In fact, I can think of many organizations for which the damage resulting from unplanned SAP downtime will be significantly higher than $1 million per hour, because these businesses rely on SAP for conducting their ongoing operations, including sales.

 

I recently had a discussion with a colleague regarding the main causes of unplanned downtime in SAP environments, and how these can be prevented. That discussion led to the blog you’re now reading. The content and advice in this blog are based not only on my own 8 years of experience as an SAP expert and team leader, but also on the accumulated experience of nearly 600 SAP experts at oXya, a Hitachi Data Systems company specializing in managed services for SAP customers. oXya has been managing SAP environments for enterprise customers for the past 18 years; we currently manage SAP environments for more than 260 enterprises around the world, with more than 250,000 SAP users.

 

I’ll begin by listing some of the most common causes that lead to a crash of SAP, and then move to some proactive steps that can be taken to avoid these crashes.

 

What are the main causes for downtime of an SAP environment?

 

Outages may occur in the SAP Production environment due to various causes. The following list describes the most common causes we’ve seen over the years for unplanned downtime of SAP environments:

 

Lack of Monitoring. This is by far the leading cause of outages in the SAP Production environment. For example, we have customers that, prior to working with oXya, never used to monitor their servers – not even the Production servers. By “monitoring the servers” I mean, for instance, watching the disk drives: when drives were running out of space, no one was aware of it. If a drive is crucial for database operations, and the database can no longer write to it, that brings down the entire Production instance, leaving users unable to do their work.
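As a minimal illustration of the kind of disk-space check described above — the thresholds are hypothetical examples, not oXya’s actual monitoring stack — a scheduled script can compare used space against warning and critical levels long before the drive fills up:

```python
import shutil

def disk_usage_pct(path="/"):
    """Return used-space percentage for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def check_disk(path="/", warn_at=80.0, crit_at=90.0):
    """Classify the filesystem's fill level against example thresholds."""
    pct = disk_usage_pct(path)
    if pct >= crit_at:
        return "CRITICAL"   # page someone now, before the DB stops writing
    if pct >= warn_at:
        return "WARNING"    # plan a cleanup or an extension
    return "OK"

print(check_disk("/"))
```

Run from cron (or any scheduler) every few minutes, even a sketch like this would have caught the full-disk scenario above days in advance.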

 

Hardware issues. When there is a device failure on the system’s server, whether this is a hard drive failure, a motherboard issue, or any other type of general hardware issues, and there’s also a lack of redundancy in place – this can bring down the Production instance.

 

Network related issues. Network issues can occur for various reasons. They can be hardware-related, such as a case in which one of the firewalls or switches has an issue, but they don’t have to be. When a network issue occurs, users are blocked from accessing the SAP system. Hence, even though the SAP system itself is up and functioning properly, this type of issue is considered an outage, because the business cannot connect to the SAP systems and users are unable to do their work. In many cases, network issues occur due to lack of redundancy, meaning systems that are not built with sufficient redundancy in place. If there is a single point of failure somewhere, and that single point does fail, it will cause an outage or outage-like consequences. The key here is to identify and minimize single points of failure, as much as possible.

 

Performance related issues. A performance issue is not necessarily an outage, yet it can impact users to the point where the SAP system becomes unusable. For example, during end-of-month activities on an SAP Production system, there are critical tasks that must be completed within a certain timeframe. If there is a performance issue, such as poor database performance, the business is unable to complete the work within the allocated timeframe. This is considered an outage to some extent, because the system is not working as intended; it does not provide what it needs to, even though it is still running and didn’t crash.

 

Insufficient proactivity or reactivity. From a pure technical perspective – and this is also tied to monitoring – if there is a critical issue and it is not resolved, then down the line, if it continues to be neglected, there’s a high chance it will come back and bite you. This is one of the sure ways to guarantee unplanned downtime. Let’s begin with an example of what I call ‘lack of reactivity’. Assume we observe a disk that is on its way to getting full. It still has some space, so we take no steps now. Then it indeed gets full and causes a system crash. This type of crash can fall under the ‘lack of monitoring’ category, but it’s also a failure to take the necessary steps when witnessing something that can cause issues. That is lack of reactivity.

 

By lack of proactivity, I mean failing to make sure that alarms are not being triggered on the system in the first place. For example, in SAP there are health check jobs that need to run – cleaning certain tables, and so on. If these health checks and housekeeping jobs are not performed, they can lead to issues that cause downtime. Proactivity here means that at oXya, we have people handling customers’ systems on a daily basis, making sure the systems are in a healthy state. If this piece of proactive treatment is missing from your Production environment, it can certainly cause issues, and even downtime, at some point.

 

Testing. We view lack of testing as a major cause of downtime, and it has both technical and business aspects. From a technical aspect, patches and updates (software, firmware) need to be applied in order to keep the system up to date. If a system is out of date (patch-wise), a software failure can take the system offline. Lack of testing when applying a patch to the Production system can cause issues. For example, let’s say there’s a Windows patch that needs to be applied to the server, because it’s a critical software patch that fixes a known bug. Let’s further assume that this bug is known for its ability to crash the SAP system. The correct way to perform testing is to apply this patch to your Dev environment and test it for a week, to make sure everything is working fine. Then you install the patch on your Quality environment, and do another week of testing to make sure there are no issues. Only then do you install the patch on your Production environment.

 

We’ve seen cases where this testing routine was not performed, and the patch was applied simultaneously to all three systems (Dev, QA, Production). If this is done, you’re at risk of discovering there’s a potential issue with SAP and this patch—for whatever reason the patch causes an issue with the SAP application, which can lead to an outage.

 

The key is to perform rigorous testing whenever a change is introduced to the SAP environment, whether it’s a patch, new functionality implemented by the business, etc. Such changes need to be applied gradually and tested thoroughly, with a large-enough time window, so that any issue is caught before the change hits Production.
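The Dev → QA → Production routine described above can be sketched as a promotion gate: a patch may only move to the next environment after soaking for the agreed test window in every earlier one. The environment names, one-week soak windows, and `may_promote` helper are hypothetical illustrations of the rule, not a real change-management tool:

```python
from datetime import date, timedelta

SOAK_DAYS = {"DEV": 7, "QA": 7}          # example one-week test window each
PROMOTION_PATH = ["DEV", "QA", "PRD"]    # patches move left to right

def may_promote(history, target, today):
    """history maps environment -> date the patch was applied there.
    A patch may reach `target` only after soaking the required number
    of days in every earlier environment on the promotion path."""
    for env in PROMOTION_PATH[:PROMOTION_PATH.index(target)]:
        applied = history.get(env)
        if applied is None:
            return False                              # an environment was skipped
        if today - applied < timedelta(days=SOAK_DAYS[env]):
            return False                              # not enough soak time yet
    return True

history = {"DEV": date(2016, 3, 1), "QA": date(2016, 3, 8)}
print(may_promote(history, "PRD", date(2016, 3, 15)))  # True
print(may_promote(history, "PRD", date(2016, 3, 10)))  # False: QA soaked 2 days
```

The same gate makes the anti-pattern from the next paragraph impossible: a patch with no Dev or QA history simply cannot be promoted to Production.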

 

Updates & patches. Keeping the system up to date should be a top priority for any SAP team; unfortunately, we see many systems that are not kept up to date, especially Production systems. We do ‘understand’ the reasoning: downtime, especially planned downtime where you have to bring a system down to apply a patch, is not something the customer/business can easily provide. Hence, it’s always a challenge to keep the systems up to date, whether it’s an operating system patch, a firmware update, or any other type of patching that needs to be applied.

 

If the entire system is not kept up to date, potential bugs or issues can occur and bring it down. We’ve seen that in the past – there was a bug in the Windows environment that crashed systems; this occurred because a specific patch, released two years earlier, was never applied to the systems that crashed. If patching and updating of the environment is neglected, it can (and most likely will) eventually cause downtime to your SAP environment.

 

Unplanned Downtime & Human Error

 

The majority of unplanned downtime events can be tied to human error, whether the result of lack of planning, lack of testing, lack of reactivity or proactivity, and so on. Of course, downtime can also occur without human error – for example, if there is a circuit outage at the datacenter, and the entire datacenter goes offline. Such a case would usually be considered force majeure; and still, one can argue that even a circuit outage comes down to human error, because something wasn’t tested, and/or something wasn’t done right, and/or there should have been a disaster recovery (DR) site so the entire system (spread across more than one site) never fully crashes, and so on. However, at the end of the day, in such cases it’s difficult to put the blame on human error, because there are a lot of teams and many elements in play.

 

Some items are usually attributed to human errors and some aren’t. The items that are usually attributed to human error are:

 

Performance issues. Let’s look at an Oracle database, for example. The database has tables, and the tables have indexes. If an index is in ‘bad shape’ and hasn’t been rebuilt (because the database administrators didn’t catch it), this can lead to performance issues. Furthermore, it can lead to a scenario where, at the end of the month, the business cannot run specific actions on time. This causes headaches and ‘downtime’ for the users. If the administration team, through their proactivity and monitoring tools, had found that this index needed to be rebuilt and caught it before month-end, this downtime could have been avoided.

 


Patching (or lack thereof) issues. If a crash resulted from a known bug, and a patch for that bug existed but was not installed, then that also falls into the human error category. Such patches should be installed, after scheduling with the customer.

 

Lack of DR and/or single points of failure. Many Production systems have disaster recovery and replication in place, especially for enterprise customers, so that even if the main Production system goes offline, it is brought up on the replicated DR site pretty quickly. Still, there are many smaller companies that do not have a DR site. Such companies often host their SAP onsite, at their own offices, sometimes in a small onsite datacenter. Such installations typically have many single points of failure: relying on a single source of electricity without a generator; having no DR or replication for the system; depending on a single network device; and so on.

 

If we take an overall, honest look at human error, we have to admit that mistakes do happen. For example, administrators have above-average access to the system, so they can accidentally delete files, shut down systems, and so on. It’s not unheard of for an administrator to delete a critical piece of software that was a must for the system to be up and running. Human errors along those lines can and do cause outages.

 

The constant struggle is to have some sort of system in place that minimizes such human errors as much as possible. This involves having second checks in place. I can tell you how we operate at oXya: if a critical action is being performed on a Production system over the weekend, for example, we don’t let a single person perform it. We have a minimum of two experts, and sometimes three, working on the same item. The purpose of having multiple people working together on the same item is to double-check everything and go over the actions that any specific person would take, in order to eliminate human mistakes as much as possible.
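The two-person rule described above can be sketched as a simple sign-off check: a critical Production action proceeds only once a minimum number of distinct experts have approved it. The names and the `authorized` helper below are hypothetical, for illustration only:

```python
def authorized(action, approvals, min_approvers=2):
    """A critical Production action proceeds only when at least
    `min_approvers` distinct experts have signed off on it.
    Non-critical actions are not gated."""
    if not action["critical"]:
        return True
    return len(set(approvals)) >= min_approvers   # dedupe: one vote per person

change = {"name": "apply kernel patch on PRD", "critical": True}
print(authorized(change, ["sevag", "dominik"]))   # True: two distinct approvers
print(authorized(change, ["sevag", "sevag"]))     # False: same person twice
```

Note the deduplication: the same expert approving twice still counts as one set of eyes, which is exactly the failure mode the rule exists to prevent.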

 

Through these types of check systems, we can eliminate, or at least significantly reduce, the human errors that occur. At oXya, a second person is assigned to check all the work being done, especially when dealing with the customer’s Production environment.

 

How can unplanned downtime events be minimized or avoided?

 

I’ve already provided some advice earlier, like always having more than one person when applying changes to the Production system; or keeping the systems up to date.

 

Taking the ten-thousand-foot view, the key to minimizing downtime is proactivity. All of the monitoring and automated checks that exist on most systems are great, but they’re not enough. You need something extra on top of those automated systems, and that something extra comes from the “human touch”. Through proactivity, and the experience oXya has in handling these sophisticated landscapes, we provide this extra layer of security for our customers.

 

What do I mean by “human touch”?

 

At oXya, of course we have all of the possible monitoring on the Production landscapes, including automated monitoring. However, we don’t settle for these alone. The “human touch” means that on a daily basis, we have people logging into these systems, performing manual daily checks, and making sure the systems are up, running, and in good condition. We believe that in addition to the automated checks, which are very thorough, what really ensures that everything is caught is daily human interaction with the systems; and if something is caught, it is handled immediately by SAP experts.

 

Furthermore, testing is very important, especially with patches, to prevent unplanned outages. Make sure that both the hardware (firmware) and software are up to date, and that these updates are done through rigorous testing. If there’s a database patch that needs to be applied, for example, we would coordinate that very carefully with the business; if there is a sandbox environment then we’ll do that first, to have less impact on the business.

 

To summarize, I see four key elements to avoiding unplanned downtime: patching, monitoring, reactivity to issues, and lots of proactivity. These are the key elements that oXya has followed over the years, and they have led to lots of success with our customers.

 

------------------------------------

 

Sevag Mekhsian is a Service Delivery Manager at oXya, a Hitachi Data Systems company. Sevag is an SAP expert with more than eight years of experience, the last four of which leading a team of ten SAP professionals, at oXya’s service center in New Jersey.