As more customers deploy SAP HANA, they are also moving to virtualize their SAP HANA environments. Hitachi and VMware recently published two joint papers on virtualizing your SAP HANA environment.

One of these is a case study of Infosys, a global SAP service provider. Early in 2015, Infosys deployed the world’s largest single instance of SAP Business Suite on HANA (SoH), with over 150,000 users, running on Hitachi’s converged platform, UCP. Since then, Infosys has decided to deploy VMware vSphere 5.5 on Hitachi UCP at their Center of Excellence (CoE). If you want to hear more about the Infosys story, check them out at SAPPHIRE 2016 on May 18 at 2pm.


Here is a little synopsis of the challenges they faced and the benefits they have gained by virtualizing the SAP HANA environment.

At their CoE, Infosys demonstrates how SAP systems can improve their customers’ business efficiency. Infosys was challenged with meeting the growing demand for demos from its customer base, and had to accelerate the deployment time for SAP apps running on SAP HANA instances. To tackle this problem, they first needed to upgrade their IT infrastructure, and they also wanted hardware that could be virtualized so they could maximize resource utilization. They decided to deploy VMware running on Hitachi Unified Compute Platform (UCP). VMware vSphere was the only product certified by SAP to deliver virtualization featuring capabilities such as dynamic scalability and rapid provisioning through cloning and template-based deployment. They also deployed VMware vMotion, and Hitachi UCP was able to use these capabilities to migrate live virtual machines within clusters. “Selecting the Hitachi UCP converged platform to run the virtualization layer that would deliver SAP HANA enables us to take advantage of a ‘common building block’ approach,” explained Mr Sharad Pandey, Senior Consultant, Infosys. “This would enable us to grow from 512 GB appliances to 6 TB appliances in the scale up model and up to 32 TB on the scale out model just by adding Hitachi enterprise grade infrastructure components into the same stack, and help us maximize the returns on our investments.” Additionally, Hitachi created SAP HANA VM templates to further streamline SAP HANA deployment.

 

The benefits Infosys realized as a result of this deployment were:

 

1) New SAP HANA virtual machines deployed in a few minutes using Hitachi’s SAP HANA VM templates

2) Reduction in the total cost of delivery by 85%

3) Cut SAP HANA infrastructure costs by 2.5 times with consolidation offered by Hitachi UCP

4) Reduction in power requirements by 96%

 

Learn more:

Infosys Case Study

VMware vSphere and Hitachi UCP White Paper

Every SAP professional knows about SAP TechEd. It is a huge event that encompasses many sessions, covering both present and upcoming technologies. My colleague, Melchior, already wrote a blog summarizing his experiences from TechEd and recommending a few sessions; in this blog, I’d like to recommend a few additional great sessions that I attended. I’m sure these would be interesting and helpful for those who did not attend SAP TechEd, as well as for those who attended and want a good refresh. At oXya, we use these recorded sessions for “training” and for introducing new SAP aspects to our teams. We distribute a list of recommended sessions to all oXya experts who did not attend TechEd in person.

 

The following is a short list of sessions I loved at TechEd Las Vegas. For each session I give a short explanation of what was interesting, and a link so you can watch the recording. The first two videos I recommend watching are the main keynotes that cover the SAP strategy; they show you what could possibly be the future of your IT. Then there are four additional sessions that I recommend: one on the user interface, one on the direction of HANA, one on S/4HANA, and one on the Internet of Things.

 

 

SAP Executive Keynote: Steve Lucas, Las Vegas 2015

The first SAP keynote, given by Steve Lucas, Global President, SAP Platform Solutions Group, was not very techy. The keynote focused on how SAP was helping businesses solve real-life challenges using amazing tools; it showed how companies could use big data and other technologies to be more productive. They demonstrated a few things using Amazon Alexa (Amazon’s voice assistant, much like Siri on Apple devices). Steve Lucas also spoke about SAP HANA Vora, an analytics tool for big data that helps analyze both structured and unstructured data. This was a “big picture” keynote, with lots of examples of how SAP could help your company using some upcoming technologies, and how you would be able to leverage them.

 

 

SAP Executive Keynote: Bernd Leukert, Las Vegas 2015

The second SAP keynote was given by Bernd Leukert, a member of the Executive Board and the Global Managing Board of SAP SE. This session covered the SAP HANA platform and various tools around HANA, and how these integrate into your IT platform. One of the interesting ideas in this session was the transition from a B2C model (Business to Customers) into a C2B model (Customers to Business). The meaning was that customers provided lots of valuable information, big data, to the business (via social networks, surveys and more); the business—using various digital platforms—needed to digest all of that information and provide the right answers to customers.

 

Leukert showed all the IT you could build on the cloud, using HANA and HYBRIS. This was a general session about the HANA Cloud Platform, all the tools powered by HANA, and the architecture that SAP developed to help customers become more digital. The keynote showed a big picture of what could be achieved when using the SAP HANA Cloud Platform.

 

 

UX111: SAP Fiori Apps: An Overview

Those of you who have already read my previous blog about SAP Fiori know that I’m a big fan of the new SAP user interface (UI), and not very fond of the older SAP GUI. The new SAP UI has been a hot topic for the last couple of years; this session about SAP Fiori Apps provided a high-level overview of all the developments in this area. It showed how you could implement the new SAP UI in your company, from developing to installing to end-user extensions, and more. Several tools were shown in this session, like the SAP Fiori Library (a website provided by SAP, where you could see how your application would look using Fiori), and the SAP Fiori Demo Platform (a demo platform for selected apps; you could use it to see whether Fiori fits your business and your applications). This session also provided some previews of forward-looking developments, like Fiori on smart watches, and Fiori together with the Internet of Things.

 

Many customers have questions about the new UI. This session provides a good introduction for any customer considering moving now to the new SAP UI. The session helps understand what SAP Fiori is all about and how it will help the organization, including demonstrations of current technologies and future ones. It’s one of the best sessions I’ve seen at TechEd, for both Basis developers and for business people on the IT side.

 

 

TEC103: SAP S/4HANA Overview, Strategy, and Road Map

This is one of various S/4HANA sessions you could see during this SAP TechEd event. S/4HANA was presented as the future of SAP applications, similar to the role that business suite SAP R/3 had played in the ‘90s. This revolution starts now, and is based on IT simplification, leveraging the HANA platform and Fiori.

 

This specific session focused on S/4HANA Overview, Strategy and Roadmap, and provided the “big picture” on SAP S/4HANA and the new tools around it, as well as the future of S/4HANA. The presentation was a great introduction that ran from Simple Finance, the first component of S/4HANA, all the way to the latest component released, Simple Logistics. It showed which directions SAP was considering for the future, and how these technologies could help customers simplify their IT. The session also showed how supply chain and CRM would be integrated in future versions of S/4HANA. It enabled listeners to “see” the future, and envision how their company could implement S/4HANA. On a related subject, I would also recommend reading Mel’s blog about considerations before migrating to SAP HANA, as it’s a great blog with valuable information.

 

 

TEC105: SAP HANA Road Map

Another session I would recommend watching is the one covering the SAP HANA Roadmap. HANA has been around for the last 5-6 years and has been SAP’s main topic during that time. HANA is also one of the most drastic changes SAP has ever made to its platform, and it is the foundation for all the new projects happening at SAP right now to help your company become more digital (like S/4HANA, Vora, and more). This session was interesting because of its structure; it covered subjects for various types of professionals (developers, Basis admins, data architects, UX experts and more), emphasizing what was available at present and what would come in the future of HANA for each of them. It also focused on many Basis tools and tools for developers: how to handle SAP HANA, how to use it and how to leverage it, both at present and in the future (near future and further away). This session also covered Multi-Tenancy, which was pretty new on HANA and interesting to hear about.

 

 

The Internet of Things and Its Impact on Corporate IT

The last session I recommend watching covers the Internet of Things and Its Impact on Corporate IT. Internet of Things (IoT) was one of the hottest topics at SAP TechEd 2015. This session was very impactful as it showed how IoT could reinvent businesses by helping them build new offerings, help their customers, and make a tremendous impact. One of the most surprising things for me was a piece of data brought in this presentation, according to which the impact of IoT would reach more than $10 trillion within the next 10 years. Hence, when you are planning for the future of your company, IoT is not something you can ignore.

 

The session brought many real life examples, like capturing data during driving and then analyzing it using SAP HANA; or an interesting project done at the city of Hamburg in Germany, collecting data from trucks and other vehicles used, analyzing that data, and managing to cut down 5,000 hours of trucks’ driving, which obviously saved a lot of money and also contributed to the environment. It was an awesome session that made me realize that all of these IoT gadgets were not just geeky things, but rather the new way that IT could help your business.

 

 

The above sessions are just a small sample of all the content and recorded sessions from TechEd, that you can review online. You can find all the replays at: http://events.sap.com/teched/en/sessions.aspx?Year=2015

 

---------------------------------------------------------

Mickael Cabreiro is a Senior SAP Basis Consultant, currently based in oXya’s Montreal, Canada office. Mickael joined oXya in 2008, and has consulted for customers across Europe and the Americas, capitalizing on his multilingual proficiency. oXya was acquired by Hitachi Data Systems in early 2015.

Hitachi is a proud Gold Sponsor for SAP FKOM 2016 from Jan 26-29 in Las Vegas.

 

Here is the lineup of SAP solutions we will be discussing at the event:

 

Enterprise Real-time Visibility and Analysis
A Hitachi SAP HANA and Business Objects Solution. Ties together transactions, operational data, and financial planning in real-time.

 

Industry Cloud for SAP Solutions
Hitachi Managed Cloud Industry Solution. Subscription based, supporting Process, Discrete, Wholesale and Consumer industry verticals.

 

Telco Churn – Real-time Predictive Analytics Solution

Hitachi’s end-to-end Predictive Analytics Telco Churn solution - on premise or in the cloud.

 

SAP HANA Analytics Platform
Optimizing Big Data Analytics using SAP® S/4HANA with SAP HANA® Vora and Hadoop

 

SAP and SAP HANA Application Hosting and Remote Services
Hitachi Cloud-based SAP and SAP HANA hosting services including analytics solutions

 

Zero Zero SAP Project Financing
0% financing. 0 payments until go-live or for the first 6 months.* Payment terms aligned with project milestones. All hardware, services, and software/maintenance bundled.

 

Compute and Storage infrastructure solutions for SAP and SAP HANA (Appliance and TDI)

Hitachi offers enterprise-class business and analytic applications, database, compute and storage for SAP that deliver future-proof scalability, five nines availability and unmatched agility.

SAP is one of the most mission critical applications for any organization, because it runs the entire business. An IDC study, published in 2015, determined that for a Fortune 1000 company, unplanned downtime of a mission critical application costs between $500,000 and $1,000,000 per hour. In fact, I can think of many organizations for which the damage resulting from unplanned SAP downtime would be significantly higher than $1 million per hour, because these businesses rely on SAP for conducting their ongoing operations, including sales.
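
To put that hourly range in concrete terms, here is a minimal Python sketch of the arithmetic. Only the per-hour range comes from the study cited above; the function name and the four-hour outage are illustrative assumptions.

```python
# Hourly cost range quoted above for a Fortune 1000 company (USD per hour).
COST_PER_HOUR_LOW = 500_000
COST_PER_HOUR_HIGH = 1_000_000


def downtime_cost_range(hours):
    """Return the (low, high) estimated cost in USD for an outage of `hours`."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * COST_PER_HOUR_LOW, hours * COST_PER_HOUR_HIGH


# Hypothetical example: a four-hour unplanned outage.
low, high = downtime_cost_range(4)
print(f"Estimated cost: ${low:,.0f} to ${high:,.0f}")
```

Even a single afternoon of unplanned downtime lands in the millions under these assumptions, which is why the causes below deserve attention.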

 

I recently had a discussion with a colleague regarding the main causes of unplanned downtime in the SAP environment, and how these can be prevented. That discussion led to the blog you’re now reading. The content and advice in this blog are based not only on my own 8 years of experience as an SAP expert and team leader, but also on the cumulative experience of nearly 600 SAP experts working at oXya, a Hitachi Data Systems company specializing in managed services for SAP customers. oXya has been managing SAP environments for enterprise customers for the past 18 years; we currently manage SAP environments for more than 260 enterprises around the world, with more than 250,000 SAP users.

 

I’ll begin with listing some of the most common causes that lead to a crash of SAP, and then move to some proactive steps that can be taken, in order to avoid these crashes.

 

What are the main causes for downtime of an SAP environment?

 

Outages may occur in the SAP Production environment due to various causes. The following list describes the most common causes we’ve seen over the years for unplanned downtime of SAP environments:

 

Lack of Monitoring. This is by far the leading cause of outages in the SAP Production environment. For example, we have customers that, prior to working with oXya, never monitored their servers, not even the Production servers. By “not monitoring the servers” I mean that disk drives were not being monitored, so when a drive was running out of space, no one was aware of it. If that drive is crucial for database operations and the database can no longer write to it, the entire Production instance goes down and users cannot do their work.
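
As a rough illustration of the most basic check described above, here is a minimal Python sketch that flags filesystems that are close to full. It uses only the standard library; the 85% threshold and the monitored path are assumptions for the example, not oXya’s actual tooling or settings.

```python
import shutil

# Hypothetical alert threshold; choose one that fits your landscape.
WARN_PERCENT_USED = 85.0


def percent_used(path):
    """Return the percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def check_paths(paths, threshold=WARN_PERCENT_USED):
    """Return (path, percent) pairs for every path at or above the threshold."""
    alerts = []
    for path in paths:
        pct = percent_used(path)
        if pct >= threshold:
            alerts.append((path, round(pct, 1)))
    return alerts


if __name__ == "__main__":
    # "." stands in for the real mount points (database data, logs, and so on).
    for path, pct in check_paths(["."]):
        print(f"WARNING: {path} is {pct}% full")
```

In practice a check like this would run on a schedule and feed an alerting system; the point is simply that someone, or something, has to be looking.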

 

Hardware issues. When there is a device failure on the system’s server, whether it is a hard drive failure, a motherboard issue, or any other type of hardware issue, and there is no redundancy in place, this can bring down the Production instance.

 

Network related issues. Network issues can occur, for various reasons. This can be hardware-related, such as a case in which one of the firewalls or switches has an issue, but it doesn’t have to be hardware related. What happens when a network issue occurs is that the users are blocked from being able to access the SAP system. Hence, even though the SAP system itself is up and functions properly, this type of issue is considered an outage, because the business is not able to connect with the SAP systems, and users are unable to do their work. In many cases, network issues occur due to lack of redundancy, meaning systems that are not built with sufficient redundancy in place. If there is a single point of failure, somewhere, and that single point does fail, then it will cause an outage or outage-like consequences. The key here is to identify and minimize single points of failure, by as much as possible.
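
One simple way to start identifying single points of failure is to list every component on the path between users and the SAP system, and count how many independent instances of each exist. The sketch below assumes a hand-maintained inventory; the component names and counts are hypothetical.

```python
def single_points_of_failure(inventory):
    """Return the components that have fewer than two independent instances."""
    return sorted(name for name, count in inventory.items() if count < 2)


# Hypothetical inventory: component name -> number of independent instances.
inventory = {
    "firewall": 1,
    "core switch": 2,
    "WAN link": 1,
    "application server": 3,
}

print(single_points_of_failure(inventory))
```

Anything this list returns is a candidate for added redundancy, or at least for a documented recovery plan.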

 

Performance related issues. A performance issue is not necessarily an outage, yet it can impact users to the point where the SAP system becomes unusable. For example, during end of month activities on an SAP Production system, there are critical tasks that must be completed within a certain timeframe. If there is a performance issue, such as database performance or something similar, the business is unable to complete the work within the allocated timeframe. This is considered an outage to some extent, because the system is not working as intended; it does not provide what it needs to, even though it is still running and didn’t crash.

 

Insufficient proactivity or reactivity to the system. From a purely technical perspective, and this is also tied to monitoring: if there is a critical issue and it is not resolved, then down the line, if it continues to be neglected, there’s a high chance that it will come back and bite you. This is one of the sure ways to guarantee unplanned downtime. Let’s begin with an example of what I call ‘lack of reactivity’. Assume that we observe a disk that is on its way to getting full. It still has some space, so we don’t take any steps now. Then it indeed gets full and causes a system crash. This type of crash can fall under the ‘lack of monitoring’ category, but it is also a failure to take the necessary steps after seeing a warning sign. This is lack of reactivity.
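
The “disk on its way to getting full” example can be made actionable with a simple projection. The sketch below assumes linear growth between two usage samples, which is a simplification, and all the numbers are hypothetical.

```python
def days_until_full(used_then_gb, used_now_gb, days_between, capacity_gb):
    """Estimate days until the disk is full, assuming linear growth.

    Returns None when usage is flat or shrinking, since no fill date
    can be projected from the two samples.
    """
    if days_between <= 0:
        raise ValueError("days_between must be positive")
    growth_per_day = (used_now_gb - used_then_gb) / days_between
    if growth_per_day <= 0:
        return None
    return (capacity_gb - used_now_gb) / growth_per_day


# Hypothetical numbers: 700 GB used a week ago, 770 GB today, 1000 GB disk.
remaining = days_until_full(700, 770, 7, 1000)
print(f"Projected days until full: {remaining:.0f}")
```

A projection like this turns “it still has some space” into a deadline, which makes it much harder to postpone the fix.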

 

By proactivity, I mean making sure that alarms are not triggered on the system in the first place. For example, in SAP there are health check and housekeeping jobs that need to run (cleaning some tables, and so on). If these jobs are not being performed, it can lead to issues that cause downtime. Proactivity here means that at oXya, we have people who handle customers’ systems on a daily basis, making sure that the systems are in a healthy state. If this proactive treatment is missing from your Production environment, it can certainly cause issues and even downtime at some point.

 

Testing. We view lack of testing as a major cause of downtime, and it has both technical and business aspects. From a technical aspect, patches and updates (software, firmware) need to be applied in order to keep the system up to date. If a system is out of date (patch wise), there can be a software failure that causes the system to go offline. However, applying a patch to the Production system without testing can also cause issues. For example, let’s say that there’s a Windows patch that needs to be applied to the server, because it’s a critical software patch that fixes a known bug. Let’s further assume that this bug is known for its ability to crash the SAP system. The correct way to perform testing is to apply this patch to your Dev environment and test it for a week, to make sure that everything is working fine. Then you install the patch on your Quality environment and do another week of testing to make sure there are no issues. Only then do you install the patch on your Production environment.

 

We’ve seen cases where this testing routine was not performed, and the patch was applied simultaneously to all three systems (Dev, QA, Production). If this is done, you’re at risk of discovering there’s a potential issue with SAP and this patch—for whatever reason the patch causes an issue with the SAP application, which can lead to an outage.

 

The key is to perform rigorous testing whenever a change is introduced to the SAP environment, whether it’s a patch, a functionality that is being implemented by the business, etc. Such changes need to be applied gradually and tested thoroughly, with a large enough time window, so that any issue is caught before the change hits Production.
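
The Dev, then Quality, then Production routine described above can be expressed as a simple promotion gate. This is a minimal sketch under stated assumptions: the stage names and the one-week test period follow the example in the text, while the function and the data layout are hypothetical.

```python
# Promotion order described above; the one-week soak time follows the example.
STAGES = ["Dev", "Quality", "Production"]
MIN_SOAK_DAYS = 7


def next_allowed_stage(history):
    """Return the next environment a patch may go to, or None.

    `history` is a list of (stage, days_tested, passed) tuples for the
    stages the patch has already been applied to, in order.
    """
    if len(history) > len(STAGES):
        raise ValueError("history has more entries than there are stages")
    for index, (stage, days_tested, passed) in enumerate(history):
        if stage != STAGES[index]:
            raise ValueError(f"{stage} was patched out of order")
        if not passed or days_tested < MIN_SOAK_DAYS:
            return None  # still testing, or testing failed: do not promote
    if len(history) == len(STAGES):
        return None  # already in Production
    return STAGES[len(history)]


# A patch that passed a full week in Dev may move on to Quality.
print(next_allowed_stage([("Dev", 7, True)]))
```

A gate like this makes it impossible, by construction, to apply a patch to all three systems at the same time.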

 

Updates & patches. Keeping the system up to date should be a top priority for any SAP team; unfortunately, we see many systems that are not kept up to date, and this is especially the case with Production systems. We do ‘understand’ the reasoning for systems that are not up to date; downtime, especially planned downtime where you have to bring a system down to apply a patch, is not something that can be easily provided by the customer/business. Hence, it’s always a challenge to keep the systems up to date, whether it’s an operating system patch, a firmware update, or any other type of patching that needs to be applied.

 

If the entire system is not kept up to date, there are potential bugs or issues that can occur, and can bring down the system. We’ve seen that in the past – there was a bug in the Windows environment that crashed systems; this occurred because a specific patch, that was released two years earlier, was never applied to the systems that crashed. If patching and updating of the environment is neglected, this can (and most likely – will) eventually cause downtime to your SAP environment.

 

Unplanned Downtime & Human Error

 

The majority of unplanned downtime events can be tied to human error, whether it is the result of lack of planning, lack of testing, lack of reactive or proactive action, and so on. Of course, downtime can also occur without human error, for example if there is a circuit outage at the datacenter and the entire datacenter goes offline. Such a case would usually be considered force majeure. Still, one can say that even a circuit outage can be considered a human error, because something wasn’t tested, and/or something wasn’t done right, and/or there should have been a disaster recovery (DR) site so that the entire system (spread across more than one site) never fully crashes, and so on. However, at the end of the day, in such cases it’s difficult to put the blame on human error, because there are a lot of teams and many elements in play.

 

Some items are usually attributed to human errors and some aren’t. The items that are usually attributed to human error are:

 

Performance issues. Let’s look at an Oracle database, for example. The database has tables, and the tables have indexes. If an index is in ‘bad shape’ and hasn’t been rebuilt (because this was not caught by the database administrators), then this can lead to performance issues. Furthermore, it can lead to a scenario where it is the end of the month, and the business cannot run specific actions on time. This causes headaches and ‘downtime’ for the users. If the administration team, through its proactivity and monitoring tools, had found that this index needed to be repaired and had fixed it before month-end, this downtime could have been avoided.

 

Patching (or lack thereof) issues. If a crash resulted from a known bug, and a patch for that bug existed but was not installed, then that also falls into the human error category. Such patches should be installed, after scheduling with the customer.

 

Lack of DR and/or single points of failure. Many Production systems have disaster recovery and replication in place, especially for enterprise customers, so that even if the main Production system goes offline, it is brought up on the replicated DR site pretty quickly. Still, there are many smaller companies that do not have a DR site. Such companies often host their SAP onsite, at their own offices, sometimes in a small onsite datacenter. Such installations typically have many single points of failure, such as relying on a single source of electricity with no generator; having no DR and replication for the system; depending on a single network device; and so on.

 

If we take an overall, honest look at human error, we have to admit that mistakes do happen. For example, administrators have above-average access to the system, so they can accidentally delete files, shut down systems, and so on. It’s not unheard of for an administrator to delete a critical piece of software that the system needs in order to be up and running. So, human errors along those lines can and do cause outages.

 

The constant struggle is to have some sort of system in place that minimizes such human errors as much as possible. This involves having second checks in place. I can tell you how we operate at oXya: if there’s a critical action being performed on a Production system over the weekend, for example, we don’t let a single person perform that action. We have a minimum of two experts, and sometimes three, working on the same item. The purpose of having multiple people working together on the same item is to double check everything and go over the actions that any specific person would take, in order to eliminate human mistakes as much as possible.

 

Through these types of check systems, we can eliminate or at least significantly reduce the types of human errors that occur. At oXya, there’s a second person assigned to checking all the work being done, especially when dealing with the customer’s Production environment.
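
The second-check principle can also be enforced in tooling, not only in process. The following is a minimal sketch of such a gate, assuming a hypothetical change record with `performed_by` and `reviewed_by` fields; it illustrates the idea and is not a description of oXya’s internal systems.

```python
def may_execute(change):
    """Return True if a Production change has an independent second check."""
    reviewers = set(change["reviewed_by"]) - {change["performed_by"]}
    # At least one reviewer other than the person making the change.
    return len(reviewers) >= 1


# Hypothetical change records.
solo = {"performed_by": "alice", "reviewed_by": ["alice"]}
paired = {"performed_by": "alice", "reviewed_by": ["bob"]}

print(may_execute(solo), may_execute(paired))
```

The check deliberately ignores self-review, so a change only passes when a different person has looked at it.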

 

How can unplanned downtime events be minimized or avoided?

 

I’ve already provided some advice earlier, like always having more than one person when applying changes to the Production system; or keeping the systems up to date.

 

Taking the ten thousand foot view, the key to minimizing downtime is proactivity. All of the monitoring and automated checks that exist on all (or most) systems are great, but they are not enough. You need something extra on top of those automated systems, and that something extra comes from the “human touch”. Through proactivity, and the experience that oXya has in handling these sophisticated landscapes, we provide this extra layer of security for our customers.

 

What do I mean by “human touch”?

 

At oXya, of course we have all of the possible monitoring on the Production landscapes, including automated monitoring. However, we don’t settle for these alone. The “human touch” aspect is that on a daily basis, we have people logging into these systems, performing manual daily checks, and making sure that the systems are up, running, and in good condition. We believe that in addition to the automated checks, which are very thorough, what really ensures that everything is caught is having human interaction with the systems on a daily basis; and if something is caught, it is handled immediately by SAP experts.

 

Furthermore, testing is very important, especially with patches, to prevent unplanned outages. Make sure that both the hardware (firmware) and software are up to date, and that these updates are done through rigorous testing. If there’s a database patch that needs to be applied, for example, we would coordinate that very carefully with the business; if there is a sandbox environment then we’ll do that first, to have less impact on the business.

 

To summarize, I see four key elements to avoiding unplanned downtime: patching, monitoring, reactivity to issues, and lots of proactivity. These are the key elements that oXya has been following over the years, and they have led to a lot of success with our customers.

 

------------------------------------

 

Sevag Mekhsian is a Service Delivery Manager at oXya, a Hitachi Data Systems company. Sevag is an SAP expert with more than eight years of experience, the last four of which were spent leading a team of ten SAP professionals at oXya’s service center in New Jersey.