Since my last blog on UCP for Oracle, an Exadata expert at Oracle was asked for his comments by a customer who wanted corroboration of our claims. Rather than talk to us, the expert paid us a great compliment but also made some assumptions (some of which were wrong) and consequently gave some inappropriate advice.
This presents us with the opportunity to explain our unique infrastructure and features more clearly and in more detail. It also highlights the incorrect assumptions made by Oracle in their findings.
Let's take a look at a summarised version of what Oracle claims:
"Thanks. I am familiar with the HDS solution, although it doesn’t look bad at first sight, [however] it is a solution with a couple of serious issues.
- In short, what is the differentiator of the HDS box? An enormous amount of flash memory and ASM to make use of it, the remaining components [merely] support this. What is good is that the management is integrated [with Oracle's native toolset] through API’s with EM and the existing API’s for their storage arrays. It is -x86 blade based, a chassis with a maximum of 8 blades and space for SSD’s (including Fusion IO flash cards), placed in separate enclosures or "Flash Containers". This actually is the core of their solution.
- They use Oracle software solutions and plugins/API’s including Management with Enterprise Manager and ASM to tier the DB over the flash layers. The Key differentiators are the large amount of flash and that this solution integrates with the database and supporting applications from Oracle software solutions in the right way.
- With each generation HDS uses other x86 components. First the IBM PureSystems OEM chassis- and servers, now they have a Cisco UCS partnership, but this picture(solution) suggest again another x86 OEM vendor. Continuity seems not to be a priority on their list, that will generate issues for support and future extensions or changes.
- they talk about "Intel virtualisation technologies" It seems they have adjusted the firmware of the servers to make use of Intel VT. Perhaps similar to what Cisco has done. The SMP parallel test results points into that direction. It’s a kind of proprietary solution, one that nobody else uses. How long do they think they can continue with this?
- Hitachi LPAR cannot be used for capping license, the document suggests this immediately after the slide on "Reducing Oracle licenses." by using LPARs. 2 slides below it suggest to consolidate your workload and thus consolidate your licenses
- old versions of the Oracle database do not automatically recognize fast memory like flash. You have to manually arrange that.
- Then on the POC results: Those are fully based on using large amounts of flash, and only based on IOPS. Which actually is referred to on slide 28, it’s all about throughput. Flash is not a generic performance improver; extreme enhancements only occur in random read.
- Sequential read/write, and random writes benefit less from flash alone. For that you require other techniques, as we offer in our Exadata software layer.
- The value of this you see back in the POC results: an FR Exadata is faster than the HDS solution with a much lower quantity of flash. Also surprising: an FR Exadata is in the POC 9x as fast as a QR. You would expect a 4x improvement (a QR being 1/4 of a full rack). Obviously a case of selective testing, in this case a test which doesn’t fully exploit the Exadata SW but obviously only flash.
- And it’s the question if Exadata is optimized for this type of workload
- From a configuration perspective a 1/4 HDS UCP is compared with a 1/4 Exadata. The HDS has 8 blades in a chassis, so 2 blades in a 1/4. So half of the number of database servers and also half the number of cores. In that way you save on your license costs.
- In short, a machine with much more flash than an Exadata, like a Violin solution only suitable for IOPS enhancements.
- Nice that they use the Oracle SW integration capability."
So if we are to take this point by point the reality is this:
- 1. Flash is not the core of the solution although, as correctly pointed out, it is an important part of it. However, using an HUS VM with HDT and HDP you get Exadata-like performance on any version of Oracle and save on storage costs. (This is something that Oracle does not do.)
- 2. He got this right and we agree: native integration makes it simple for the administrators of the DB and the applications. However, Fusion IO cards are only part of our HybridIO stack. We also deploy different types of drive in the controller, including FMD (or not), according to the application needs, and then split the workload using Oracle tools according to the application requirements.
- 3. Again, this is an erroneous judgement. Yes, we do work with UCS from Cisco, and we will soon integrate it into UCP Pro (for VMware and vCenter). However, for these tests and for UCP 4 Oracle we use Hitachi servers, precisely because we can use the feature set criticised in point 4.
- 4. Hitachi indeed works with Intel but it does this on its own x86 servers that have a unique feature set born out of the mainframe heritage. The feature set includes SMP, LPAR and HybridIO. Since Hitachi builds its own servers and has incorporated this technology it will continue to support them as a unique and value adding feature set. These features help with improved utilisation, improved QoS and hooking into the orchestration tools in UCP Director.
- 5. We do this by consolidating RAC via N+M or by using an LPAR and right-sizing based on usage. We allow the customer to consolidate servers, protect them and in the end use fewer licenses; where we can eliminate a RAC node, we reduce license fees. The N+M technology allows a huge ROI because you can eliminate non-productive RAC nodes and have a cold failover instead.
- 6. We make it simple: just store it all on SSD (with no tuning) or, if you need more, add an HUS VM All Flash Array (AFA) and use Hitachi Dynamic Tiering (HDT), and you will see a huge benefit without needing to migrate to another Oracle version. There are options.
- 7. Actually not true; our tests show otherwise. Also, these were customer tests working against their applications, and they met the criteria required (IOPS and otherwise). We used our storage on the back end and still got Exadata-like performance without all the costs and with much more flexibility (support for more versions of Oracle: 9i, 10g, 11g).
- 8. This is an incorrect assumption. The performance is delivered by the combination of flash (Hitachi's own Flash Module Drive, not the OEM MLC flash drives that Exadata uses), the OS and server cores, Symmetric Multiprocessing (SMP, or the clustering of cores), HybridIO (splitting the IOs according to read or write) and other controller-based technologies. We see great sequential read/write and random write performance with either flash or our own FMDs.
- 9. All of these tests were created by the customer and met their application needs from the outset without huge amounts of tuning (see 7 above). They were not some random benchmarks. In one example Oracle had to send in 4 experts for a week just to get the test to work; they then had to tune the Exadata FR to get the performance improvement, and it was only marginally faster than a UCP 1/4 of its size (47 minutes for UCP versus 42 minutes for Exadata). Given the difference in size and cost, the extra 5 minutes were well worth it.
- 10. This could not be further from the truth. This is the workload that the customer wants or needs (we can show Oracle the workload and how it works). We have done it all for the customer, at the customer, with the customer.
- 11. A quarter rack has 48 CPUs. If the customer feels they need that many we can get them close: 2 blades would have 40 cores, which handles what is needed. This is all a sizing argument. Hitachi always recommends performing a HiSED (sizing exercise) first; by the way, this is something Oracle does not have. Once we have sized the environment (based on application and workload output needs) we right-size the UCP without guessing. If they need more, we can do 3 nodes of 16 cores, which happens to be 48. Again, not a serious technical issue.
- 12. This illustrates the greatest lack of understanding of SMP + LPAR + HybridIO with HDS back-end storage; the Hitachi environment is nothing like a Violin (or any other SSD device). Hitachi ran UCP 4 Oracle against Exadata and won on performance. The Hitachi benefit is the way the flash is connected and the number of cores we can combine with SMP, something that Exadata does not have, together with using LPARs to put more databases on a secure platform and right-sizing them. Most Oracle databases are 80% CPU idle, so why would you pay for nothing? Maybe Oracle should give you 80% of your license cost back!
- 13. Exactly, that's what is so impressive about Hitachi's policy of OPENNESS; work with the native code base and it simplifies the life of the DBA and Application specialist.
In summary, we feel there are two themes that illustrate the specific environments behind our individual responses to each of Oracle’s assumptions:
- Both PoCs that we have featured are for decision support applications, not OLTP. IOPS are thus important, because these DWH applications still work with indexes and not via full table scans. In both PoCs HDS used smaller configurations than Oracle. In the case of the motor manufacturer we had an “M-Size” (80 cores, 20 FusionIO cards) against an Exadata X2-8 (160 cores and 14 storage nodes). HDS was slower than Exadata, but the customer’s target was to reduce the runtime from 26h to less than 1h, and both systems achieved this. There was a total runtime difference of 4 minutes, Exadata taking 43 minutes and UCP 47 minutes, or a 9% difference. The difference in price for the system was 80%, with half the number of Oracle licenses.
- Oracle prepared and tuned their own configuration (with a team of 4 Exadata experts). To make the Exadata work the team changed table layouts (with the customer’s agreement), but there was also a need to modify the structure to get to the performance. HDS did not change anything in the setup that the customer uses today. Should the customer choose to proceed with an Exadata deployment, these changes mean additional work and risk in the transition process.
- For the Telco operator we used only 4 FusionIO cards, and the customer’s DBA at Vodacom tried very hard to tune the Exadata schema to beat the UCP performance. With the “raw” IO he failed to do so (but we both had to tune the systems a bit as well).
- Oracle refused to give LPARs the status of hard partitions, so we have to license some cores in a physical partition. We therefore looked at other possibilities to save on Oracle licenses using LPAR and SMP. The most important is consolidation. Almost every HiSED (sizing exercise) shows that OLTP systems run at an average CPU usage of <20%. By using SMP we can consolidate them to increase system utilisation to 70%; this represents a huge saving on Oracle license fees and support costs. In RAC environments we can achieve a higher utilisation with N+M, using LPAR for cold failover and either reducing the number of RAC nodes or removing RAC altogether.
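
To make the arithmetic behind these two themes concrete, here is a rough sketch in Python. It is an illustration only, not an HDS sizing tool: the 160-core starting estate is a hypothetical figure, the function names are made up for this example, and it assumes that license cost scales linearly with licensed cores and that average utilisation is a fair proxy for the work to be carried (it ignores peaks). The 20% and 70% utilisation figures and the 43 and 47 minute runtimes are the ones quoted above.

```python
import math


def cores_after_consolidation(current_cores: int,
                              current_util: float,
                              target_util: float) -> int:
    """Cores needed to carry the same average work at a higher utilisation."""
    busy_cores = current_cores * current_util   # cores' worth of real work
    return math.ceil(busy_cores / target_util)  # round up to whole cores


def relative_gap(slower: float, faster: float) -> float:
    """How much longer the slower run took, relative to the faster one."""
    return (slower - faster) / faster


# Hypothetical estate (assumption): 160 licensed cores at 20% average usage.
before = 160
after = cores_after_consolidation(before, current_util=0.20, target_util=0.70)
saving = 1 - after / before

print(f"cores before: {before}, after consolidation: {after}")
print(f"licensed cores saved: {saving:.0%}")

# PoC runtimes from the motor manufacturer test: UCP 47 min, Exadata 43 min.
print(f"runtime gap: {relative_gap(47, 43):.1%}")
```

Since the HiSED figure is an average of less than 20%, the real saving for a given estate can only be established by the sizing exercise itself; the sketch simply shows why moving from roughly 20% to 70% utilisation cuts the licensed core count by well over half.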