Storage Economics Dojo
by David Merrill on Dec 1, 2011
Storage Economics at HDS is approaching its 13th year, and along with the thousands of engagements done around the world, we have prepared many people to think like an economist, talk like an accountant, and act like an architect.
Hundreds of HDS employees, partners and even some customers have taken the path of IT economics enlightenment. Part of this includes a formal learning path, much of which is available through the HDS Academy. I am proud that so much of our economic material (theory papers, case studies, tools) is in the public domain, and that so much training is available to partners and customers on this subject matter.
There is a formal learning path, as outlined below, and it is a mixture of formal training, informal virtual classes, and select readings. Customers, partners and HDS employees are encouraged (when applicable) to take the white- and yellow-belt courses. Blue- and black-belt levels tend to be reserved for practicing consultants and specialists. HDS Platinum partners are invited to become certified up through the blue-belt level.
| Classification | HDS Academy Course | Additional Requirements |
| --- | --- | --- |
If you are a frequent visitor to my blog page (and can prove it) I will put special recognition ‘bands’ on your belt as well.
- Cloud economics
- Data center economics
- VDI economics
- Hypervisor Economics
- Converged server/storage solutions
by David Merrill on Nov 21, 2011
Storage economic principles and concepts are expanding (at HDS) beyond the storage realm—where they have grown up over the last 12 years. Data center costs, VDI, converged platforms, and server and hypervisor infrastructures all have the same requirement to identify and reduce unit costs.
I have collaborated with hypervisor guru Michael Heffernan (his friends and colleagues call him “Heff”) to create a baseline paper on Hypervisor Economics. The framework for VM cost determination and reduction is very similar to storage economics:
- Understand and determine your costs. Storage economics has 34 different types of costs (OPEX and CAPEX), and VM economics shares some 24 of these cost categories.
- Measure your costs – you cannot improve what you cannot measure, so before undertaking a VM sprawl cost reduction plan, you had better know what your total costs are today. A TCO baseline is normally the place to start, with TCO-per-VM a common and simple metric.
- Take specific actions to reduce VM costs. There are technical, operational, process and business ‘levers’ that can reduce costs. These levers are tied to the types of costs that you choose in step #1 above. Your plan and the mixture of levers you apply will result in a predictable unit cost reduction.
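The three steps above can be sketched as a minimal TCO-per-VM baseline. The cost categories and dollar figures below are hypothetical placeholders, not the actual 24 VM cost categories the paper defines:

```python
# Hypothetical TCO-per-VM baseline: sum annualized OPEX and CAPEX
# cost categories, then divide by the VM count. All categories and
# dollar figures are illustrative placeholders only.

annual_costs = {                    # step 1: identify the costs
    "hw_depreciation":   400_000,   # CAPEX, annualized
    "sw_licenses":       250_000,
    "hw_sw_maintenance": 120_000,
    "admin_labor":       300_000,   # OPEX
    "power_cooling":      80_000,
}

vm_count = 1_200

tco = sum(annual_costs.values())    # step 2: measure the costs
tco_per_vm = tco / vm_count         # the baseline unit-cost metric

print(f"Total annual cost: ${tco:,.0f}")
print(f"TCO per VM:        ${tco_per_vm:,.2f}/year")
```

Step 3 is then a matter of re-running the same baseline after applying each lever and comparing the unit cost.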
This new publication outlines these three steps, the types of costs that apply to the VM world, and most importantly the levers to reduce VM costs. The paper is located at the URL below. Please tell us what you think, and whether there are parts of Hypervisor Economics that need further development in future publications.
Storage Virtualization – A Foundation for Economically Superior Storage Architectures
by David Merrill on Nov 15, 2011
Two weeks ago, I shared the stage at Gartner’s ITxpo with Paulette Scheffer from Adobe. Our combined message was on the economic impact of storage virtualization. It was a great fusion of theory (me) and real-world experience (Paulette) on how storage virtualization has a direct impact on the unit cost of storage. The presentation was very popular (standing room and overflow – check out the pics!) and we had several requests to re-record the event.
I am glad to share with you this link of a recorded session with the exact same material (and technical/economic passion) that we presented at the ITxpo.
Enjoy the presentation!
Note: You will have to download the media player to watch the presentation.
What is wrong with this picture?
If you guessed that water and hard drive manufacturing do not mix, you are correct!
Recent news articles are covering the floods in Thailand and the manufacturing impact on disk drives for the remainder of 2011. Some articles are predicting a rise in drive prices of 10% in early 2012. Shortages and demand in the consumer market will also put pressure on enterprise quantities. I don’t think that rationing is in the forecast, but you may recall how supply/demand curves behave when there are constraints on supply.
Global events like flooding, earthquakes, or the European debt crisis may seem like far-away problems for an IT architect or storage planner, but we have to ingest these global events and make provisions in our own IT plans for how they may impact price, cost of ownership, availability, etc. We have to accelerate internal capabilities to ‘do more with less’ and have some contingency plans in our back pockets if worst-case scenarios do materialize.
There are technology, process, operations and procurement ‘levers’ that can be positioned to offset unforeseen events that may impact your ability to acquire capacity. I have blogged about these for years, but the most popular these days include:
- Storage virtualization to reduce waste and reclaim usable capacity
- Thin provisioned volumes to reclaim capacity
- Compression, de-duplication to reduce backup capacity
- Dynamic tiers of storage to put data in the right place, for the right cost
- Storage cloud offerings (lower tiers, archive) to pay-as-you-grow this type of capacity
- Chargeback, showback and other internal schemes to help reduce the appetite of IT infrastructure capacity
- Consider collapsing or refining the data protection schemes (that are currently in place) in order to reclaim resources. Do you have too much infrastructure coverage protecting your apps and data?
- Does your procurement process inherently waste capacity with a buy-ahead approach?
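As a sketch of the first two levers above (virtualization and thin provisioning), here is a rough reclaimable-capacity estimate; every figure is a hypothetical assumption to be replaced with your own measurements:

```python
# Rough sketch: estimate capacity reclaimable through storage
# virtualization and thin provisioning. All figures are
# hypothetical; substitute your own measurements.

allocated_tb = 500           # capacity allocated to hosts
written_tb   = 210           # capacity actually written
drive_price_per_tb = 900     # assumed street price, USD

# Thin provisioning lets you reclaim the allocated-but-unwritten gap
# (keep some headroom for growth, say 20% of written capacity).
headroom_tb    = 0.20 * written_tb
reclaimable_tb = allocated_tb - written_tb - headroom_tb

deferred_spend = reclaimable_tb * drive_price_per_tb
print(f"Reclaimable capacity: {reclaimable_tb:.0f} TB")
print(f"Deferred purchase:    ${deferred_spend:,.0f}")
```

In a supply-constrained market, capacity you reclaim is capacity you do not have to buy at an inflated price.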
I do not want to be a Chicken Little and claim the sky is falling, but it is important to keep an eye on the sky, as well as your balance sheet.
The Economics of Data Protection
by David Merrill on Nov 8, 2011
A few of us at HDS are combining blog entries and developing an online storytelling exercise around data protection. My contributions will be around the costs of protecting data and how to measure and compare different data protection methods to find the optimum price/cost/risk/protection point. Claus started our discussion last week with this blog entry.
I will leave the historical and technical aspects to Claus and Ros Schulman to develop over the next few weeks, and I will try to comment on where we are seeing evidence of different price points and cost inflections for data protection. Here is a Wikibon article that is also a great reference point—we may refer to it from time to time.
Data protection is a classic cost and economic enigma that has been around IT since its inception. Data protection is not free, but it has to be commensurate with the value of what you are protecting. There has to be a balance between protecting the data at a certain cost, and the potential of increased risk if you do not protect data properly. My next post will describe this risk, cost, and data value protection model, to help deterministically calculate the right protection scheme in an environment of decreasing costs and increasing risks. It looks something like this:
There is an element of time that has to be applied to these various dimensions of the economics of data protection:
- Diminishing value of data over time
- The change and risk over time
- Time value of money
- Probability of data recovery
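One way to make these time dimensions concrete is a simplified expected-loss model. This is my own illustrative sketch, with all parameters invented, not the model described in this series:

```python
# Simplified model of the protection/risk trade-off over time.
# All parameters are illustrative assumptions: data value decays,
# loss probability is annual, and future losses are discounted at
# the time value of money.

def expected_discounted_loss(initial_value, decay_rate,
                             annual_loss_prob, discount_rate, years):
    """Sum of expected annual losses, discounted to present value."""
    total = 0.0
    for t in range(1, years + 1):
        value_t = initial_value * (1 - decay_rate) ** t    # diminishing value of data
        expected_loss = annual_loss_prob * value_t         # risk over time
        total += expected_loss / (1 + discount_rate) ** t  # time value of money
    return total

# Compare doing nothing vs. protection that cuts the loss probability 10x.
unprotected = expected_discounted_loss(1_000_000, 0.15, 0.05, 0.08, 5)
protected   = expected_discounted_loss(1_000_000, 0.15, 0.005, 0.08, 5)
annual_protection_cost = 8_000

print(f"Expected loss, unprotected: ${unprotected:,.0f}")
print(f"Expected loss, protected:   ${protected:,.0f}")
print(f"Risk reduced vs. 5yr protection spend: "
      f"${unprotected - protected:,.0f} vs ${5 * annual_protection_cost:,.0f}")
```

The point of such a model is that the right protection scheme falls out of the comparison: protection is justified only while the discounted risk it removes exceeds what it costs.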
We hope you will stay tuned as several HDS experts chime in on the topic of data protection, alternatives, costs and risks. We hope it will be a meaningful discussion and dialog for you, as well.
Economic Overlay On The Cloud Part 2 (of 2)
by David Merrill on Nov 3, 2011
The HDS cloud announcement presents cloud solutions in three stages, each with unique opportunities to reduce total cost.
Infrastructure Cloud Elements
- Costs to be impacted in the infrastructure cloud space need to be hard costs. Organizations will not move into a new IT operational model without clear business or operating improvements. Hard costs such as power/space reduction, HW and SW maintenance, and capital spending reduction are all examples of costs that can be targeted in this space.
- Virtualization of storage, servers and operating environments. Virtualized servers and storage provide better asset utilization, thereby reducing the total HW and SW infrastructure costs needed to run the environments. Virtual environments also reduce the cost of application, array or workload migrations and thereby reduce roadblocks to mobility. A mobile or agile infrastructure allows cost rates and performance features to change seamlessly as the time or the environment changes. Alignment of QoS and costs can be better achieved.
- CoD (capacity on demand) is another quality of this cloud layer, where resources are procured when they are needed and can be given back when the costs and resources are not required.
- Cloud Services and solution packages from HDS include specific capabilities to reduce operational and CAPEX costs:
- Hitachi Cloud Service for Private File Tiering
- Hitachi Cloud Service for File Serving
- Hitachi Cloud Service for Microsoft SharePoint Archiving
- Quoting Miki Sandorfi, our resident cloud guru:
These solutions are focused on TCO reduction for unstructured data, and are delivered with self-service and pay-per-use capabilities. Here you can continue with already-deployed traditional NAS and complement it with a cloud solution for 30% or more TCO reduction (file tiering); augment or replace NAS filers with a backup-free, bottom-less cloud implementation that still “looks and feels” like traditional NAS but is deployed with next-generation cloud technology; and complement SharePoint environments by offloading a bulk of content into the private cloud. Because customer choice is extremely important, we have designed all of these new solutions to be modularly delivered: customers can purchase these offerings as cloud packages and build their own cloud around them. They can optionally enable self-service and billing/chargeback by electing to deploy the management portal. Or we can provide fully managed solutions including a true OpEx pay-per-use consumption model with no upfront capital expense to the customer.
- Cost reductions in the content cloud move from on-the-floor infrastructure costs to data and operational impact areas that reduce the cost of doing business. These cost areas can be hard as well; consider the time and effort required to find data. This is especially pronounced in litigious environments and hospitals, where timely data discovery and assimilation matters.
- Discoverable, searchable content has to be dollarized in terms of the time and effort to locate meaningful data. Content cloud provides a layer of intelligence to make the data more accessible and useful to the business.
- Risk reduction is another cost reduction in the content cloud: avoiding lawsuits and the mismanagement of patient records or patient data lowers the cost of risk. The value of unified data and the timeliness of discovery and ingestion is a cost-to-the-business measurement for the content cloud.
- Moving higher in the cloud abstraction, the business value of analytics will provide new opportunities, not just for cost reduction, but more likely for increased revenue generation. Big data and the ability to span and correlate data from the far reaches of the storage cloud infrastructure will enable new business models, and therefore new business segments and revenue potential.
- The cost savings or revenue generation may be softer in this cloud layer, but the business impacts are real. As an industry we will need better metrics and methods to value new cloud services and data services enabled with information cloud capabilities.
- People will pay for meaningful and rich data, and for rich/timely information. Whether as a subscription service or an internal harvesting of information, this level of integration will provide the richest business value proposition over time.
So the cloud will impact our costs. Some costs are hard, others are softer and in the future. Moving to commercial or internal cloud services may (in some cases) just transfer the costs without an actual reduction. As you move into or evaluate cloud strategies to reduce costs, you need to ensure that you know what your costs are, how they behave now, and how they might change as part of a cloud transformation strategy.
Want to read more about our Cloud Roadmap? Visit our bit.ly bundle here: http://bitly.com/pCt5Gk
Economic Overlay On The Cloud Part 1 (of 2)
by David Merrill on Oct 27, 2011
Earlier this week, HDS announced new services and features for our three-tiered cloud strategy. I would like to put an economic overlay on these offerings and the HDS approach to cloud and the Information Center.
In my opinion, every major IT transformation needs a gut-check against the costs (new and old) and their benefits. I do not want to be a wet blanket with all the interest in cloud transformation, but I believe we need financial and business case justifications along with, and supporting, the cloud transformation. I posted a blog a few weeks back warning about transferring, rather than reducing, real costs with a cloud approach. I think this topic needs more substantive documentation (sounds like a new white paper might be needed) to ensure that the realities of growth and reductions of costs are met with any new IT transformation investment.
A few of the most demonstrable economic points to the cloud story are provided in the outline below:
Cloud Qualities – Analysts have written a lot about the qualities and characteristics of a cloud approach, saying they must be:
- Metered for consumption and utilization. This also implies that a charge-back or payment system has to be part of the metering process. We still see very, very low adoption of charge-back or any type of metering in today’s data centers. This is an important quality and business change since most IT consumers are not used to paying for what they consume. This may present some initial push-back to a cloud approach when projects or application owners have to pay-per-use.
- Virtualized. Virtualization is a common building block—for servers, storage, networks and even the end-user device. Storage and server virtualization has been proven economically over the past few years, and will continue to provide the scale and management requirements for the cloud.
- Self-servicing or self-provisioning. This will be a new concept, tied to metering and charge-back. Applications and unchecked data that can now grow (essentially) without limits will have financial and economic tethers immediately controlling their appetite.
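The metering and chargeback qualities above come down to a rate card applied to measured consumption. Here is a minimal sketch; the rates and usage figures are invented for illustration:

```python
# Minimal pay-per-use chargeback sketch. The rate card and monthly
# usage numbers are hypothetical assumptions.

rate_card = {            # $ per unit per month
    "vm_hours":   0.06,  # per VM-hour
    "storage_gb": 0.12,  # per provisioned GB
    "backup_gb":  0.04,  # per protected GB
}

usage_by_team = {
    "web":       {"vm_hours": 14_400, "storage_gb": 2_000, "backup_gb": 1_500},
    "analytics": {"vm_hours": 7_200,  "storage_gb": 8_000, "backup_gb": 6_000},
}

def monthly_bill(usage):
    """Price each metered item and sum into one monthly charge."""
    return sum(rate_card[item] * qty for item, qty in usage.items())

for team, usage in usage_by_team.items():
    print(f"{team}: ${monthly_bill(usage):,.2f}")
```

Even a simple show-back report like this changes behavior: once consumption carries a visible price, the appetite for unchecked growth shrinks.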
Delivery Options are often used to describe how we will consume virtual cloud resources.
- Private solutions will still require local effort and resources (power, cooling, management labor) to operate the assets. Many private cloud offerings will differ very little from current capitalize-and-own operational models.
- Hybrids will bring a new option to scale (with cost) to an external resource. This flex capability will blur some of the operational cost lines that exist now with traditional IT services. When engineering the hybrid cloud it will be essential to know what costs stay local and what additional costs will be incurred with the hybrid capacity. I see some surprises in hybrid cloud solutions where the total gross costs increase (not decrease) with the variable scale hybrid approach. This is not all bad, as there will be a real business benefit to this capable flexibility, which is worthy of the additional cost.
- Public – lowest cost – with some new, and often the highest, risk. There is a general risk that comes with moving data, processing or applications to the external world. Recent issues with cloud outages highlight this. Gartner published a reality-based report worth reading, entitled: “The Realities of Cloud Services Downtime – What You Must Know and Do“. There is lots of news on the topic, so be sure to balance the lower asset cost against offsetting costs in terms of risk, availability or performance.
In my next post, I will review the economic points of the services and solutions HDS has recently announced.
Want to read more about our Cloud Roadmap? Visit our bit.ly bundle here: http://bitly.com/pCt5Gk
by David Merrill on Oct 21, 2011
My son sent me this article on psychic benefits and the current NBA lockout (although it can be applied to any sports management and labor conflict). There are some interesting points in the article that can also be applied to IT psychic benefits. Let me explain…
Over 20 years ago, I worked for a US Defense Industry contractor, developing high tech missile and avionics systems. These projects were often shrouded in top secret protections, until the weapon systems were prominently shown on CNN as part of some US-led coalition attack somewhere in the world. Anyway, back in those days we had a love affair with high-end computing—not with VAX or UNIX systems, but with the ultra-elite Cray and Convex supercomputers. I had frequent access to these systems to perform thermal analysis, blast calculations and flight simulation programs that we would time-share. Management in Dallas was so enamored with these system purchases, they built a glass display case data center exclusively for them—this became the showcase for any high-ranking brass that came for a tour of our facility. Run-of-the-mill VAX and mainframe computers were boring to look at, but these supercomputers were fun to display. There was definitely a psychic benefit to owning one of these systems.
We just don’t build computers that look this cool anymore.
Even today, I find people have a strange psychic attachment to some storage systems, storage architectures and infrastructure. This attachment can get in the way of progress in terms of utilization, costs, performance and overall effectiveness. Some of these attachments come by way of personal investment in, or development of, the systems. Sometimes someone had to ‘put their neck out’ to justify these systems a few years ago, and cannot lose face by letting them go too quickly. Overcoming technical or operational obstacles is rather easy compared to overcoming emotional attachment.
So, let’s get the owners and NBA players back in the stadium sponsoring games. And let’s not let emotional or other attachments get in the way of IT infrastructure improvements.
SAN ROI 12 Years Later
by David Merrill on Oct 14, 2011
One of the first pieces of IT economics work that I did at HDS 13 years ago was on the ROI of Storage Area Networks (SAN). At that time HDS had started supporting and re-selling FC SAN products from Inrange, Troika, Ancor, CNT, McData and others. HDS had ESCON solutions for years by then, but the open systems SAN market was just emerging. Directors, switches and HBAs were new, unknown and expensive. From some old notes and proposals that I had kept, the purchase price of some of these devices looked something like this (at the turn of the century):
- 64 port FC director ~$200K
- 16 port switch ~ $27K
- HBA for a Unix server ~$3K
ROI and economics for SAN were essential since the costs were high, and the justification for the connection cost was fairly difficult. It was not difficult to price and configure a basic core/edge SAN proposal with about $700K of new SAN hardware. I found an old paper I authored 10 years ago on SAN ROI that put together the rationale for this new-fangled connection technology, and how to help management swallow (justify) these new connection costs.

Now, 10 years later, SAN costs can still be a significant part of storage TCO, even with much lower unit costs. When I talk with customers about average SAN costs, it is reasonable to use a $2K per-port cost for FC SAN connectivity. Some have told me this unit cost can be higher. This $2K acquisition cost (not operational cost) of a single FC SAN port includes:
- The local HBA (just 1)
- Fractional port cost on the edge switch
- Cabling and patch panel
- Fractional cost of the director or ISL switch
- Fractional port cost on the storage array
- Software to manage the port.
If you have a server with multiple HBAs, then the cost will of course be multiplied.
One of our large retail clients in North America has over 8,500 SAN ports consuming CAPEX and OPEX costs that make up about 30% of the unit cost of storage (TCO/TB/year). Clearly any effort that can be expended to identify and reduce un-used ports (anywhere in the fabric) can have an impact on total data center costs.
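A rough sketch of the per-port arithmetic, and of what port reclamation can be worth at fleet scale. Only the ~$2K total and the 8,500-port example come from the text; the component split and the 5% reclamation rate are hypothetical:

```python
# Sketch of the ~$2K FC SAN port acquisition cost. The component
# split below is a hypothetical illustration; only the ~$2K total
# and the 8,500-port fleet come from the post.

port_cost_components = {
    "hba":                 600,  # the local HBA (just 1)
    "edge_switch_port":    350,  # fractional port cost on the edge switch
    "cabling_patch_panel": 150,
    "director_isl_share":  400,  # fractional cost of the director or ISL switch
    "array_port_share":    300,  # fractional port cost on the storage array
    "mgmt_software":       200,  # software to manage the port
}

cost_per_port = sum(port_cost_components.values())   # ~$2,000

# Reclaiming even a modest share of a large port estate defers real capital.
ports = 8_500
reclaim_rate = 0.05                                  # assumed 5% reclaimable
savings = ports * reclaim_rate * cost_per_port
print(f"Cost per port: ${cost_per_port:,}")
print(f"Deferred CAPEX from 5% port reclamation: ${savings:,.0f}")
```

Multiply in the operational cost of each port (which this sketch deliberately excludes) and the case for finding unused ports gets stronger still.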
HDS has a technology partner, Virtual Instruments, that has scanning and workload monitoring software to help identify and recommend SAN consolidations. They told me about a recent project where they were able to do a network scan of the enterprise storage network and recommend a storage-side port reduction that saved/reclaimed hundreds of ports, worth tens of thousands of dollars.
SAN costs were expensive (so therefore relevant) 10-12 years ago. It seems like the opportunity to identify, measure and act on SAN costs is still important and relevant, even today.
TCO Reduction: A Customer Perspective
by David Merrill on Oct 5, 2011
I have the opportunity to speak at the Gartner ITxpo in Orlando two weeks from today, on Wednesday, October 19th. If you are attending, I would love to see you at the session and exchange cards afterwards. My topic will focus on the TCO impact that can be achieved from storage virtualization. You can find several related white papers and blogs on this topic that I have worked on over the years.
The highlight of participating in this session will be that I get to share the stage with Paulette Scheffer, Sr. Director of IT Infrastructure and Service Management at Adobe. She will review their multi-year journey of storage cost reduction, and the role that storage virtualization played in their achievement. In a four-year period that produced 64% year-on-year storage capacity growth, Adobe was able to reduce unit cost (TCO-per-GB/year) from $11.00 to just under $5.00. Now this unit cost is much more than the purchase price, and incorporated some 15 types of cost, including depreciation, maintenance, labor, DR infrastructure, circuits, SAN, management, migration, cost of waste, NAS infrastructure and file servers, management tools, power, cooling and floorspace. They also measured the cost of DR and outage risk; costs that not many companies measure or track these days.
I do not want to step on all of her talking points, but this is a compelling graphic on the unit cost achievements over the years at Adobe (note that TCO measurements were not done in 2010).
You will also notice two measurements in 2011 for unit cost. The first is 2011P and is the unit cost per physical TB. The second is 2011V and is the virtual TB unit cost. Adobe is quickly moving to deploying and presenting virtual capacity for most applications.
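Some back-of-the-envelope arithmetic on the Adobe figures quoted above. The 64% growth rate and the $11-to-$5 endpoints come from the text; the derived values are my own arithmetic, not Adobe's reported numbers:

```python
# Back-of-the-envelope check on the quoted Adobe figures: 64%
# year-on-year capacity growth over four years, with unit cost
# falling from $11 to just under $5 per GB/year. Derived values
# are my arithmetic, not Adobe's reported results.

growth = 1.64             # year-on-year capacity multiplier
years = 4
start_unit_cost = 11.00   # $/GB/year at the start
end_unit_cost   = 5.00    # $/GB/year at the end

capacity_multiple = growth ** years                            # ~7.2x more capacity
unit_cost_cagr = (end_unit_cost / start_unit_cost) ** (1 / years) - 1
total_cost_multiple = capacity_multiple * end_unit_cost / start_unit_cost

print(f"Capacity multiple over {years} years: {capacity_multiple:.1f}x")
print(f"Unit cost CAGR: {unit_cost_cagr:.1%}")                 # roughly -18%/year
print(f"Implied total annual spend multiple: {total_cost_multiple:.1f}x")
```

The striking part: capacity grew roughly sevenfold, yet the implied total spend grew only about threefold. That gap is what unit-cost reduction buys you.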
Since not all of my readers can attend this conference, I will dedicate two or three blog entries later in October and November to summarize my presentation on virtualization economics, as well as Paulette’s presentation on what cost reduction benefits have been achieved at Adobe.
by David Merrill on Sep 23, 2011
Actually the title of this blog should be “Cost Savings Sharing,” but that is not a very catchy way to draw you in.
But, now that I have you…
As I work with customers around the world, it is increasingly common for medium-to-large IT shops to use third-party contractors, system integrators, outsourcers or cloud providers to run some of their IT operations. When it comes to storage, servers, VMs and backup infrastructure, there are a variety of arrangements for asset ownership, labor allocation, task allocation, etc. Therefore, when we embark on a cost reduction plan, it is important to determine whose costs are being impacted and targeted.
HDS has defined 34 different and unique costs associated with storage and IT ownership. We also are tracking some 29-30 different ‘levers’ (technology, best practices, business operations, people skills) that can reduce costs. So when we narrow down the costs to be reduced, and the options to reduce these costs, there is often another step to determine whose money will be saved at the end.
As an example, if a customer undertakes a plan to increase storage utilization through thin provisioning, de-duplication and dynamic tiering, they may be surprised to find out their investment in these changes will save costs – but not necessarily for them. They may have an SI or outsourcer that owns and manages some of the storage capacity; the changes that they pay for will benefit the architecture and utilization rates of someone else’s infrastructure. If they want to eventually see the savings, they might have to re-negotiate the contract. I am not saying this is a bad thing, but most organizations want to reap what they sow, or recognize the cost savings to their own budget first, before helping an external provider.
One of the steps in our Storage Econ methodology (that our consultants would do in a workshop) is to:
- Identify the costs
- Measure the costs
- Determine what levers would impact the cost
- And then determine the recipient organization of the cost savings
Step four is often overlooked until the transformation process has started. If you really stop and measure step four, it may impact your choices in step three.
Go ahead and be selfish in this area. Your budget and local economics may depend on it.
Zen and the Art of Storage Maintenance
by David Merrill on Sep 19, 2011
Most of us have a car, or once had a car, and know that an oft-overlooked task is the maintenance the manufacturer recommends every 5,000 miles (8,000 km). Skip it, and an automobile will deteriorate – in keeping with the 2nd law of thermodynamics (the law of entropy). This deterioration will result in accelerated engine wear, poor performance and worsening gas mileage.
You may be wondering how this relates to IT systems. Are they thermodynamic?
I am working this week in Hong Kong, and met with a banking customer here. Part of our discussion turned to the time and effort they expend to monitor, optimize and sustain the storage system. The CIO quickly stated that they do not do this. The few engineers and architects they have work on larger projects, and have no time for this ‘maintenance’ work.
That phrase caught my attention, as did the correlation to automobile maintenance (not to be confused with Zen and the Art of Motorcycle Maintenance). The systems seem to work okay, but the company cannot or will not afford maintenance for the systems, seeing it as an added cost (unjustifiable most of the time) to IT operations.
I often find that IT organizations do not have the luxury of regularly scheduled or continuous optimization and maintenance of the storage infrastructure. So what is the impact, i.e. what is the wear-and-tear or lower performance from ignoring this task?
- Higher degrees of waste (storage capacity resources)
- The purchase of more capacity than what is needed
- Imbalance of performance, cost and availability
- Over-engineering to provide a common view of capability, regardless of cost
- Performance hot spots that come and go, and are difficult to trace
- Problematic troubleshooting when problems do arise
- External resources have to be brought in to identify and remediate problems
And the list could go on….
The Need for Remote Operations Service
A popular option now available from many vendors is a remote operations contract or service. This is not outsourcing or management per se, but rather a service that tracks and monitors operations, OLA, SLA, bottlenecks, SNMP alarms, etc. This type of service can graduate into a MSU offering, but it does not have to.
Since the labor is remote (where labor costs are much lower than your own system engineers’) and the monitoring can be ongoing, the benefits usually outweigh the costs of living with a poorly performing system. You would have to do the calculations (I will help you) to see if the improved performance and utilization justify the expense, but for most IT operations (somewhere around 200TB or larger) this does make very good business sense. There is also the benefit of relieving your on-site engineers and architects of some nagging tasks, leaving them free for the more pressing design and IT operations work that does require local presence and expertise.
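The calculation mentioned above can be sketched like this. Every figure here is a placeholder assumption, to be replaced with your own quotes and measurements:

```python
# Hypothetical break-even sketch for a remote operations/monitoring
# service. Every figure is an assumption; substitute your own
# quotes and measured costs.

capacity_tb = 250
service_cost_per_tb_year = 120   # quoted service rate (assumed)

# Benefits: utilization recovered through tuning, plus engineer hours freed.
tco_per_tb_year = 2_500          # your measured storage TCO (assumed)
utilization_gain = 0.05          # 5% capacity reclaimed via continuous tuning
engineer_hours_freed = 400       # nagging tasks offloaded, per year
loaded_hourly_rate = 90          # fully loaded engineer cost

service_cost = capacity_tb * service_cost_per_tb_year
benefit = (capacity_tb * utilization_gain * tco_per_tb_year
           + engineer_hours_freed * loaded_hourly_rate)

print(f"Annual service cost:      ${service_cost:,.0f}")
print(f"Estimated annual benefit: ${benefit:,.0f}")
print("Worth it" if benefit > service_cost else "Not at this scale")
```

The break-even scale depends heavily on the assumed TCO and utilization gain, which is exactly why the calculation should be done per environment rather than taken on faith.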
HDS has such a remote operations service. Most storage vendors and outsourcers also provide this service. Be sure to differentiate this remote monitoring service from a remote management service, as there is a big gap in price and functionality between the two.
How do you maintain your storage system?
Go Figure – This Economy…
by David Merrill on Sep 15, 2011
The other day I ordered this poster to hang in my office from one of those de-motivator sites…
Many people have predictions about the second coming of another recession, QE3, QE4 (what does Queen Elizabeth have to do with our economy anyway?), and long-term growth in a number of emerging countries.
(You can choose to listen or ignore the pundits)
Your micro-economics will be tied to global and national economics, and your company’s micro-economics will impact your IT economics. This is the nature of trickle-down economics.
There are some proven and stable principles related to IT economics that are important to help you navigate current and future trouble-spots. Here are my top 10 ideas (for this month) if you are looking into a cost reduction exercise for your company’s economy:
- Price is not cost, so focusing on price reductions alone will impact cost very little. The price of disk is only around 15-17% of storage TCO. So, even if you negotiate a 10% or better price reduction, that effort will only impact 1.5 – 1.7% of the TCO.
- Focus on cost reductions in a down economy by pulling a fine-tooth comb over operational costs. Labor, power, cooling, maintenance, data protection and DR protection all need to be reviewed to determine where costs can be trimmed.
- Save some bullets for later – a cost reduction strategy is one that is planned for over a few years, two at a minimum. Take care of the easier cost reduction items first, saving the harder ones (political or organizational) for later.
- Do the easy things first that impact total cost, but not all of them – remember to save some easy cost options for later next year, too.
- Isolate and work on costs with those people that measure (or care about) and own the costs. You need alliances in cost reduction planning, and the IT department may not directly own all the related costs.
- Beware of quick-fix solutions (and by the way, most solutions to reduce costs are not free) that appear too good to be true. Make sure that vendors are helping you reduce the costs that are important to you – not the costs that they can achieve for you.
- Attitudinal and political changes are the most difficult to implement.
- Changes often require rewards and penalties, so factor that into your plans.
- Understand the differences between hard costs and soft costs.
- Measure twice, cut once – make sure that you can measure now and measure later what you can impact.
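The first point in the list above can be checked with trivial arithmetic, using the price-share range quoted there:

```python
# Arithmetic behind "price is not cost": if the purchase price of
# disk is only 15-17% of storage TCO, a 10% price discount moves
# TCO by just 1.5-1.7%.

def tco_impact(price_share, discount):
    """Fraction of TCO saved by a price discount on one cost component."""
    return price_share * discount

for share in (0.15, 0.17):
    print(f"price share {share:.0%}, 10% discount -> "
          f"{tco_impact(share, 0.10):.2%} of TCO saved")
```

The same one-liner works for any cost component: the leverage of a discount is capped by that component's share of total cost.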
by David Merrill on Sep 6, 2011
VM cost reduction tactics need to be tangible and actionable—hope is not a strategy. A systematic approach to employ proven tactics to reduce the costs of a VM yields the best results. When roadmaps and IT strategies are aligned with actionable cost reduction plans, OPEX and CAPEX reductions can be more predictable. There are several levers available to IT architects and planners to reduce these costs. Some levers can be purchased from a vendor, others require an internal review and change to operational and business processes. These levers include:
Hypervisor Selection: This is one of the fundamental choices a customer must make for their VM environment – “the bigger the ecosystem the more flexibility for choice.”
• Scalability – Hypervisor technology has a massive effect on cost. Memory management, compression, and page sharing all differ between Xen, VMware, KVM and Hyper-V. For example, light workloads with memory oversubscription can lead to high levels of consolidation.
• Interoperability – Ensuring that all components are completely certified reduces the risk for downtime and makes it easier to troubleshoot when issues arise. API programs also add a massive benefit—providing standardization for hypervisor to hardware integration.
• Operational efficiencies will reduce the need for skilled engineers (skills required to manage and maintain), thereby reducing salary and training costs.
• Ecosystem flexibility is fundamental for the selection of the hypervisor and for 3rd party backup products, disaster recovery and management software. Hypervisors that build API ecosystems for partners to integrate provide choices.
• Hardware – Storage features that provide advanced integration allow for higher levels of I/O and pooled resources. This includes storage technologies like thin storage, externalized, virtualized storage and APIs like VMware’s VAAI.
• As IT systems converge, more staff and organizational convergence is possible. Redesigning the server and storage team can produce more cross-trained and effective team members, resulting in fewer divisions of labor. It is no longer practical to have siloed teams handling servers, storage, network and basic OS management.
• The consolidation and cross-pollination of a single virtual team will improve staffing levels and total labor costs.
• Implement a chargeback or show-back system in order to assess the true cost per VM to the business unit.
• Use a server/storage services catalog to communicate VM offerings, configurations, and pricing structures (total cost or cost of goods) to the requesting organization. Outline SLA and OLA arrangements by each class of service, and incentives and penalties for non-compliance with company standards.
There are several considerations that are fundamental to purchasing a hypervisor infrastructure. Some people call these a stack (Server + Storage + Hypervisor), and these options can impact OPEX, CAPEX, enterprise agility (i.e. robustness of the solution) and ease of use/management. Benefits include:
• Simplicity of procurement.
• Warranty and support (maintenance contracts and ongoing support).
• Single vs. multiple points to support this stack (who does the customer call?)
• Simplification of the stack’s lifecycle (deploying, maintaining, refreshing).
• Elasticity of the stack (are you forced to buy more than you need?)
Thanks for reading this three part series on Hypervisor Economics. What do you think? Would be great to get your feedback for the final paper. Leave comments or questions below, and I’ll do my best to respond to each of you.
Hypervisor Economics – Excerpts From An Upcoming Paper – Part 2
by David Merrill on Sep 2, 2011
This post is the second of a three-part series of excerpts from an upcoming paper on Hypervisor Economics (you can find part one here). As always, I look forward to reading your comments and feedback below.
New server and storage ecosystems—and the projections of VM sprawl—will demand better control and measurement systems to provide improvements and justifications for these growth areas. Storage and Hypervisor Economics offer an objective approach to measuring and improving these new systems. There is an old saying:
“When performance is measured, performance improves. When performance is measured and reported back, the rate of improvement accelerates.”
In this case we are interested in the improvement of unit costs (being reduced) for VM instances. A proven framework including measurements and reports is essential to deliver on cost reduction actions.
Step 1 is to identify the costs associated with Hypervisor TCO. There are as many as 24 different types of costs; some are CAPEX costs but most are operational. Centering on the purchase price alone may only account for 15-20% of the total cost of a VM, so focusing on the best deals from your vendors will only get you so far with a cost reduction strategy.
Step 2 requires a measure of the costs. Taking the costs that you isolate as candidates for your VM TCO will then require that parametric values be applied for the total cost calculation. There are several cost metrics possible and popular with hypervisors, including:
• Unit cost measurement $/VM/year
• Performance cost measurement $/IO/VM
Other metrics can include the number of VMs per square unit of floor space or per rack. Defining the costs and measurements correctly will allow you to find the cost problem areas in your TCO, and that will lead to choosing the appropriate corrective actions or investments to reduce the total cost.
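As an illustration of the first metric, the $/VM/year unit cost is just annualized costs divided by VM count. All figures below are hypothetical placeholders, and the category names are illustrative, not the official cost taxonomy:

```python
# Hypothetical annualized cost pools (a small subset of the 24 categories)
annual_costs = {
    "hardware_depreciation": 120_000,
    "software_and_maintenance": 80_000,
    "labor": 150_000,
    "power_cooling_floorspace": 40_000,
    "data_protection": 60_000,
}
vm_count = 300  # VMs in the environment

cost_per_vm_year = sum(annual_costs.values()) / vm_count
print(f"${cost_per_vm_year:,.0f} per VM per year")
# -> $1,500 per VM per year
```

The same shape works for $/IO/VM: swap the denominator for measured I/O volume per VM.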
Step 3 is defining the actions to reduce the costs. Identification and measurement of costs allows for prioritization of the costs, and for choosing the technology, operational changes or business processes that will reduce them. There are direct and correlated investments that will reduce certain costs of a virtual machine. While understanding these options is critical, it is more important to know your costs before a rapid set of investments is made to reduce the costs of a VM.
HDS has extrapolated 24 unique types of cost that can be considered for the total cost of a VM, and can be used in VM economic calculations. They are as follows:
- Hardware depreciation (or lease) expense
- Software purchase & depreciation
- Hardware maintenance
- Software maintenance
- General admin/ management, labor
- Migration, remastering
- Data or workload mobility
- Power consumption & cooling
- Data center floor space
- Provisioning time
- Cost of growth
- Cost of scheduled outage
- Cost of unscheduled outage (machine)
- Cost of unscheduled outage (people/ process)
- Cost of disaster risk, business resumption
- Reduction of hazardous waste
- Cost of performance
- Data protection or backup infrastructure
- Disaster protection infrastructure
- CIFS, NFS related infrastructure
- Storage area networking
- Security, encryption
- Cost of procurement
Of these 24 VM OPEX and CAPEX costs, there are perhaps 8-10 costs that are used most frequently for VM total cost models. Each IT department is different, and may choose various combinations of costs that reflect their local business environment, operational demands and IT processing needs, including:
- Server and storage hardware, including memory, network adapter cards and the resulting SAN or LAN infrastructure.
- Operating systems (hypervisors) and all other software installed on the VM. This cost may or may not include the application or database software costs.
- Additional licensing costs for hypervisor software advanced features (rapid cloning, dedupe, etc.)
- All hardware and software maintenance that is charged after the warranty period is over.
- Labor, administration costs associated with managing the VM and storage infrastructure.
- Power, cooling, data center floor space.
- Migration costs (time and effort) related to VM workloads movement.
- Data protection or DR protection costs, related to redundant server clusters, network connections and staff.
- Outage risks and costs, whether planned or unplanned.
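One way to organize a chosen subset of these categories is to tag each as CAPEX or OPEX and roll up the split. A sketch with hypothetical figures and illustrative category names:

```python
# Each chosen cost category tagged as CAPEX or OPEX (hypothetical values, $)
chosen = {
    "server_storage_hardware": ("capex", 200_000),
    "hypervisor_software":     ("capex", 90_000),
    "maintenance":             ("opex", 45_000),
    "labor":                   ("opex", 150_000),
    "power_cooling_space":     ("opex", 40_000),
    "migration":               ("opex", 20_000),
}

for kind in ("capex", "opex"):
    total = sum(v for name, (tag, v) in chosen.items() if tag == kind)
    print(f"{kind.upper()}: ${total:,}")
# -> CAPEX: $290,000
# -> OPEX: $255,000
```

Seeing the split makes the earlier point concrete: most of the cost categories, and often most of the money, sit on the OPEX side.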
Look for the third and final post in this series on Tuesday, 9/6 where I will go through the various levers available to IT architects and planners to reduce VM costs. Have a great Labor Day weekend!
Hypervisor Economics – Excerpts From An Upcoming Paper – Part 1
by David Merrill on Aug 31, 2011
In my next few posts, I’ll be sharing some excerpts from an upcoming paper on hypervisor economics. Feel free to provide comments and feedback below.
Global, national, local and business economic factors require IT departments to constantly evaluate and implement plans to reduce costs. As the size of IT continues to grow (in terms of storage capacity, servers, applications), businesses demand that IT costs are held flat, and that more is needed to be done with less (meaning fewer resources).
Applications and architectures are introducing new technologies that both stretch the IT capability and IT budgets, like Virtual Machines (VM), which since becoming more common in recent years have been promoted from a lab-only capability into a staple for production computing. In this escalation of VM deployments (often called sprawl), a new set of views are required to understand and improve on the VM architectures and VM costs.
If you are reading this, then you are probably not interested in the ROI of virtual machines; you made that financial and business decision years ago. At this point the ROI is irrelevant, and the focus for continuous improvement has shifted to cost reduction – or unit cost reduction per VM. During the IT slowdown of 2007-2008, VM architects were relatively unaffected by CAPEX limitations and budget cuts, since the VM revolution had just started and the seed money necessary to get started was small—if it even existed at all.
Today’s economic times are different, and the VM segment of IT consumes a large percentage of total IT spending and operational budgets. As previously noted, VM planners and architects will be asked (in the next budget crunch) to do more with less, and to reduce unit costs even as VM quantities grow. Unit cost measurement and reduction is a necessary approach, as IT cannot always grow the VM budget or change VM demand. Measuring and taking steps to reduce the unit cost, or TCO per VM, is an essential business practice.
At HDS, we are observing that customers are having many management issues with their VM environments because of the complex configurations that have evolved over the years, including:
- Many data stores are on different RAID groups and arrays
- Many different architecture configurations for server and storage
- VM clones and backup issues
- Single points of failure occur for hosts running large numbers of VMs
- VMware vMotion / Microsoft Live Migration cannot easily move VMs around the infrastructure
- No disaster recovery plan is in place
- Islands of arrays exist with no use of storage virtualization
IT architects and planners are starting to see powerful impacts and results from converged storage and server virtualization, mostly because:
- Storage virtualization is just being discovered by these new x86 adopters
- VMs now need enterprise-class storage with integration
- VM infrastructure needs 100% guarantees for no downtime (risk is now a factor)
- NFS is no longer suitable for large consolidated heavy I/O VM environments
- API Integrations between the hypervisor and storage array are offloading more processing into the storage array
Check back on Friday for part two of the hypervisor economics series, where I will discuss the necessary steps for reducing unit costs in VM instances.
So Let’s Do A TCO
by David Merrill on Aug 23, 2011
For years I have encouraged clients to ask for TCO (Total Cost of Ownership) data when they create requests for purchases (RFP, RFI, ITT, etc). Customers need to ask vendors for more than just the price, they need to ask for the total cost. Lately, I have seen more of these requests come in from customers around the world—nothing like a global recession to push for more cost transparencies and actionable plans to reduce total costs.
Best in TCO models/requests need to have three steps: Identify, Measure, and Reduce.
Identify the costs – Within storage economics we have defined some 34 different costs. This paper outlines each of these costs, and this blog entry compares the storage costs to hypervisor costs. If someone asks me to help build a TCO, I need to know what cost elements will be going into the TCO. Out of the 34 cost categories, the most popular tend to be:
- Hardware depreciation
- Software depreciation
- Hardware maintenance after warranty
- Software maintenance after the warranty period
- Cost of growth or upgrades
- Power and cooling
- Floor space
- Administrative labor
- Migration costs
- SAN costs
- Long distance and local circuit costs
- Data protection costs
- Disaster protection costs
- Scheduled or unscheduled outages
- Provisioning time
- Procurement time
Measure – In order to build the TCO after knowing the categories, we have to have some local parametric data to create or measure the TCO. If power and cooling is to be included, then I have to know the local cost per kilowatt hour of electricity. If administrative labor is included, then the annual labor rate (fully burdened) is required.
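A minimal sketch of this measurement step, using the two parametrics named above (local electricity rate and fully burdened labor rate); all values are hypothetical:

```python
# Turn local parametric data into annualized cost measurements.
kwh_rate = 0.12        # local electricity cost, $ per kWh (hypothetical)
avg_draw_kw = 25       # average draw of the storage estate, in kW
hours_per_year = 24 * 365

power_cost = avg_draw_kw * hours_per_year * kwh_rate

burdened_labor_rate = 110_000   # fully burdened annual labor cost, $ per FTE
storage_ftes = 2.5              # full-time equivalents managing storage
labor_cost = burdened_labor_rate * storage_ftes

print(f"Power: ${power_cost:,.0f}/yr  Labor: ${labor_cost:,.0f}/yr")
# -> Power: $26,280/yr  Labor: $275,000/yr
```

Each cost category that goes into the TCO needs a parametric input like these; without the local rates, the model is guesswork.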
In some cases, people are interested in TCA (Total Cost of Acquisition) and not TCO. In this case the top 5 (above) would only be considered, since these are the common capital expenditure (CAPEX) costs.
I have some very cool predictive modeling tools that I use to approximate TCO, but only when I know the country of origin, and the types of costs to be included. I also need to know the total storage capacity, average age of the assets, and the relative growth rate of the storage over the past few years. With these few pieces of data, I can build baseline TCO models, and when I know the proposed solutions, I can further predict TCO behaviors over time.
Reduce the Costs – In a response to the tender, vendors need to map or correlate how their solution will reduce the customer’s cost. The customers have to identify what costs are important, and the vendors need to correlate their solution to those costs. This is surprisingly absent from the TCO process. I cannot be clairvoyant and guess what types of money a customer wants to include or reduce in the TCO process, but if I do have that information, the correlation of a particular solution can be made against those costs.
The following is a simple TCO model, built for a customer that wanted a six year TCO for a proposed storage replacement. The types of money included depreciation expense, maintenance (HW and SW), professional services, power and cooling, floor space and labor for management. The customer chose the types of money, and the sales team had to define the solution and overall price. The efficiency of the design (with virtualization, thin provisioning, tiering, archiving) is demonstrated in a cost per TB/year that the customer can compare to other vendor proposals.
A common mistake I see is people confusing TCA with TCO. There is nothing wrong with TCA—it tells an important CAPEX story—but it may only be 15-20% of the TCO. In the example above, the labor will be more than the HW and SW purchase price, and power/cooling will be about half of the purchase price. Compared to other proposed architectures, the customer can now review side-by-side price and cost comparisons, and make an intelligent decision as to which architecture has the lowest price and lowest cost of ownership.
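The roll-up to a $/TB/year figure can be sketched as follows. The numbers are hypothetical, but shaped to match the observations above (labor exceeding the HW/SW purchase cost, power/cooling at roughly half of it):

```python
# Six-year TCO rolled up to cost per TB per year for side-by-side comparison.
years = 6
capacity_tb = 500                   # hypothetical usable capacity
costs = {                           # six-year totals, $ (hypothetical)
    "hw_sw_depreciation": 900_000,  # purchase, depreciated over the term
    "maintenance":        300_000,
    "professional_svcs":  100_000,
    "power_cooling":      450_000,  # ~half the purchase price
    "floor_space":        150_000,
    "labor":            1_000_000,  # exceeds the HW/SW purchase price
}
tco = sum(costs.values())
print(f"${tco / capacity_tb / years:,.0f} per TB per year")
# -> $967 per TB per year
```

A competing proposal run through the same model gives a directly comparable $/TB/year number, which is the whole point of asking vendors for TCO rather than price.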
In creating effective RFI or ITT requests with TCO as a requirement, be sure to identify your costs, provide parametric data to measure the costs, and put the burden on the vendor to show you (the customer) how their solutions will reduce your costs. This is the best recipe for purchasing systems and solutions that provide critical functionality at the best price.
Don’t Just Transfer The Costs
by David Merrill on Aug 17, 2011
Cloud offerings are the hot topic of the day, and there are various options for cloud services: Capacity on Demand, Managed Service Utility, SaaS, etc. When we talk about clouds being a solution to reduce costs, we make these recommendations after we know what kind of costs the client is interested in reducing.
Within storage economics, we have a mapping process in which we align the costs (to reduce) with options (that reduce them). The graphical view of this mapping looks like this:
Solutions need to be mapped to the right costs.
This brings me back to cloud. What I am seeing in many parts of the world is a blind approach to moving to cloud in the HOPE that it will reduce costs. I am also seeing a transferring of costs, and no real net-reduction in unit costs. A customer can do a mapping process to identify and measure these costs.
When they work with a cloud provider, they transfer some of these costs to the new provider, retaining some of the costs.
Simply shifting the costs without a fundamental re-architecting of the solution may not (and often does not) reduce total cost. In this example, $A < $B + $C: the total costs have increased with the cloud offering. To be sure, cloud providers can, with economies of scale, provide better unit costs. However, with margin, transformation cost and risk, total unit costs do not change as much as was originally expected.
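The before/after comparison reduces to a one-line check. Here A, B and C follow the labels in the text: A is the original in-house cost, B is what the customer retains after the move, C is what the provider bills (hypothetical index values):

```python
# Compare total unit cost before and after a cloud move: A vs. B + C.
a_before = 100      # original in-house unit cost (hypothetical index)
b_retained = 35     # costs the customer keeps after the move
c_provider = 75     # costs transferred to (and billed by) the provider

after = b_retained + c_provider
print("costs went", "down" if after < a_before else "up",
      f"({a_before} -> {after})")
# -> costs went up (100 -> 110)
```

If this check comes out "up", the move transferred costs rather than reducing them, and the justification has to rest on the qualitative benefits instead.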
Before jumping on a new cloud offering, be sure to understand your costs before and after the cloud migration. If total costs do not go down, then you need to be satisfied with the other qualitative benefits that cloud services can offer (there are many). Don’t be surprised if this approach does not impact the costs that matter most to you.
The Hungry Ghosts of Storage Economics
by David Merrill on Aug 12, 2011
I am working in Malaysia this week, and find it a great merging of Muslim and Chinese cultures and traditions. Meal time is interesting, since we are in the middle of the holy month of Ramadan. Muslims fast from dawn to dusk. There is also a popular Chinese tradition called Festival of the Hungry Ghosts and it is this weekend.
According to tradition every year the doors of hell are opened for one day and all the hungry ghosts come from hell to torment people on earth. To appease them, you stay at home, lock your doors, and prepare food that the ghosts would like to eat. That food is candy (like Halloween), beef, pork, rice, moon-cakes (not like our moon pies) and other sweets. You also take paper money and burn it, along with lots of incense.
How does this relate to storage economics?
Being the culturally diverse guy that I am, I started thinking about how much money we burn to appease the (quite real) ghosts of Budget, Planning, and IT finance groups. IT costs continue to rise, even though the unit cost of components (storage, servers, bandwidth) continues to contract. We are often seduced into believing that if we pressure vendors and negotiate better prices, our costs will go down. This simply is not true.
The hungry ghosts of IT OPEX tend to come from the underworld (unfortunately, many times a year) and require all of us to sacrifice budgets. Better to start planning ahead and be able to distinguish and differentiate all of our costs. Creating a storage or server TCO baseline is an important start to this process, since we cannot improve what we cannot measure. If you need some help jumpstarting this process (before the piles of money get burned), refer to this white paper.
Above all, hunker down, feed them what they want, but avoid burning too much of your cash at one time. There are certainly hungry ghosts out there, but if we need to fast it should be for religious or personal reasons, not to appease a phantom.
Think Like an Economist, Talk Like an Accountant, Act Like a Technologist
by David Merrill on Aug 9, 2011
I have been marginalized.
With all the recent news and talk of budget caps, debt limits and austerity plans, it seems as if everyone is an economist these days. There certainly is a lot of frightening news, and some predictions that we may return to business and national economic patterns that we saw 3-4 years ago. This time, we can get ahead of the curve and make preparations to secure last minute CAPEX or budget funding to optimize our IT and storage infrastructures against another round of cutbacks and IT spending limitations.
First, think like an economist
- Look at your macro and micro economic situations and determine if your company might be heading into an austerity phase (If you work for a state, regional, provincial or national government IT shop – sorry to tell you, but you are already there)
- Understand your capital approval process and slip in an un-forecast request to invest in technology and practices that can get your storage capacity demand through another (possibly extended) drought
- In your justification, use the following phrases to capture the attention of capital owners and make a lasting impression:
- Wanting to improve ROA (return on assets), I want to invest in some areas that will increase the usage and value of what we already have, without buying more capacity needlessly (don’t admit to your utilization rates). In fact, with a small investment I can reclaim enough capacity to meet our organic growth needs for xx months
- I want to sweat our assets, without incurring additional new costs, but it will take some small investments now
- By changing or influencing consumption behaviors, I can squeeze out more efficiencies in our IT infrastructure and work to provide continued storage optimization
- Did you catch the key words from above? Optimize, ROA, sweat, reclaim. There are many key words that economists like to use.
Second, talk like an accountant
- We need to understand our costs, all the costs – not just purchase price. I have a paper that outlines 34 types of costs – so this is the time to isolate and prioritize the costs associated with storage, VM, server infrastructure, etc.
- Take the high priority costs, and put in place tactical plans to invest and reduce these costs now (in this quarter)
- Measure your costs before and after you act, that way your tactical decisions (technically) will make you a rock star (economically)
Finally, act like a technologist
- In a capital crunch (like what we saw from 2007-2009) all budgets are reduced, which is more painful in the storage area where demand and growth continue, even during budget-cutting times
- Take action to invest and extend what you already have:
- Consolidation – identify and act to collapse multiple, smaller storage islands into a single pool of manageable storage. Virtualization can be a key enabler to make this happen with limited time, effort and business impact
- Reclamation – reclaim space (not delete data) with zero page reclaim, virtualization, active archive (move from production tier to archive tier)
- Optimization – with tools like thin provisioning, de-duplication, and policy based management/provisioning and capacity re-balance
- Oversight – start capturing and reporting on usage abuse, those applications or departments that notoriously underutilize or lock-up precious capacities
Now you want to instill confidence that these technology levers are proven and stable. See the Gartner Hype Curve below, and show your management and architects that these are indeed proven processes and game-changers that work (and are viable).
This is not a prediction of chicken-little, or a reminder call to think of Joseph (in Egypt), who advised the Pharaoh to store away during times of plenty – but this is an opportunity to look into the recent past and take actions now for cost and asset improvement.
Fool me once, or fool me twice, I will not be marginalized or spooked again.
Is IT Automation and Productivity Responsible for High Unemployment and a Slow Recovery?
by David Merrill on Jul 27, 2011
I came across several articles this weekend that implied the stalled economy and high unemployment can be attributed to IT automation, doing more with less and improved business operations and functions. Are these not the holy grails of business and IT operation professionals? Some of the points in these articles seem to place poor employment data at our very doorstep:
- Businesses’ ability to do more with the same or less — what economists term increased productivity — has been rising since the 1990s, thanks partly to technological advancements
- Structural cost reductions that we have achieved over the past few years have allowed us to see strong bottom-line results
- Economists say the ability to do more with less has helped create a two-speed U.S. recovery
- Many (companies) benefited after slashing costs when the financial crisis hit and then keeping tight control on them even as sales recovered
So I guess all the preaching and planning to drive increased productivity is now being vilified as the source of stubbornly high unemployment….
Sorry to have contributed to this mess….
But I feel that placing some of this blame will not stop the ever-increasing demand for better returns, higher efficiencies and stronger bottom-line results. That is the driving force of capitalism.
The way I see it, IT and technological advancement have taken us far with global efficiencies and doing more with less, but in the last decade or so, IT efficiency has been the focus of business owners.
For many organizations, the IT productivity improvement has been real and demonstrable. Look at the admin ratios for servers, applications, or storage. Look at the number of virtual servers, logical TB or applications that can sit in a single square meter of DC floor space. Look at the shrinking (relative) data center space as we move to adjacent cloud computing infrastructures. Look at the cost of a TB of storage 12, 8 or even 4 years ago compared to now. We have seen dramatic impacts to both price and operational cost.
Do you remember the days of a good $5,000 TRS-80 personal computer? And do you really want to go back to those good ol’ days?
Here is a link to a great e-book / How-To Guide HDS just completed with InformationAge (UK). This is a nice overview that can support any Virtualization initiative that you still may need to sell within your IT organization.
Just be careful that all these efficiencies do not throw you under the bus when it comes to finding blame for the economy or unemployment.
by David Merrill on Jul 14, 2011
When I started college, I first selected math as my emphasis of study. I think it was the second calculus class that helped me realize that multi-variable problems were not my strong point, so I dropped my math major and I chose (for me) a more systematic or logical degree in computer engineering.
Even now, my head hurts when I have to explain 3 and 4 dimensional situations to customers or colleagues with regard to cost models, configuration options, etc. Architects have to balance multiple-dimensions in making IT commitments for the business:
- Total cost, including price and recurring costs over the life of the asset
- Size – to start small and grow, or grow into solutions. The impact of the cost of growth (and the rate of growth) is not always a simple process
- RAS (reliability, availability, serviceability) – three more nested options within its own dimension. How much security and availability is good enough?
- Performance – this dimension speaks for itself
I am spending time on hypervisor economics, looking to see how storage economic patterns differ from or align to virtual machines. In this work, one of the similarities that I am seeing from talking with clients around the world is the notion that VM configurations will have tiers. These tiers (gold, silver, bronze, dirt, etc.) all have different price or RAS features. These VM tiers have to be aligned with storage tiers. The VM and storage tiers each have their own definition or dimensions around cost, size, performance and RAS. Sometimes there are network tiers or security tiers that have to be considered. Pretty soon we have 3, 4 and 5 dimensions to consider in how we vend virtual storage, servers and IT services.
To reduce the number of configurations (you don’t want 3x3x3 variations or configurations floating around), a service or PaaS catalog can be created to help communicate the standard offerings of VM, applications and storage. Some of my customers think of these pre-packaged solutions as all-in-one meals, because if you buy it as this bundle, you get a better price and a toy. Infrastructure teams that cannot afford to manage hundreds of a-la-carte configurations set up popular combinations or all-in-one meals for the customers:
- Oracle production meal
- Test/Dev meal
- Email meal
- File/print/VDI meal
Multi-dimensional planning and consideration is vital so as to capture all the relevant requirements needed in complex IT systems. Users cannot choose on price alone, or may not understand the management and cost complexities if they only choose based on performance (and they will always ask for the highest performance).
IT service catalogs are valuable instruments to help constrain or contain all the variables into a packaging program that meets most of the needs of the departments and users. Wise and prudent architects will spend time working with the user community to build and offer all-in-one meal bundles that meet the balanced need of the IT consumers. Presentation and marketing of the catalog is just as important as the development of the bundles. Let the consumers know what they are getting and help them to have a positive experience acquiring virtual machines, virtual storage, and applications from your IT infrastructure.
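A service catalog of this kind can be sketched as a simple lookup of pre-packaged bundles. The bundle names follow the "meal" examples above; the tier fields are illustrative placeholders:

```python
# A minimal PaaS-style catalog: pre-packaged "meals" instead of a-la-carte picks.
catalog = {
    "oracle_production": {"vm_tier": "gold",   "storage_tier": "tier1", "dr": True},
    "test_dev":          {"vm_tier": "bronze", "storage_tier": "tier3", "dr": False},
    "email":             {"vm_tier": "silver", "storage_tier": "tier2", "dr": True},
}

def order(meal):
    """Look up a standard bundle rather than letting users mix 3x3x3 combinations."""
    if meal not in catalog:
        raise ValueError(f"not a standard offering: {meal}")
    return catalog[meal]

print(order("test_dev"))
# -> {'vm_tier': 'bronze', 'storage_tier': 'tier3', 'dr': False}
```

The design choice is the constraint itself: consumers pick from a short menu, so the infrastructure team never has to cost or support the full combinatorial space of tiers.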
And be sure to include a cool toy…..
Economics of the Storage Computer
by David Merrill on Jun 30, 2011
I had a meeting with a State Government CIO this week. As is the case with most government clients, the discussion centered on cost reductions, and their perspective that in the future they need to “just buy cheaper disks” to get around their current IT budget difficulties.
I blogged about cheap disk a few months back, and you can see these entries here and here. I am starting to see a divergence in thoughts and plans around storage intelligence and capacity. When budgets are tight, we tend to marginalize the intelligence and boil down data access and storage to brown spinning disk.
I can see faults in my own approach over the years, using storage economics to identify and measure costs in terms of capacity. That is because people talk about capacity and growth, not intelligence and feature/functionality. Measuring $/TB is much easier than $/RTO or $/IO or some other measure of the intelligence. The government customer seemed to imply that the added costs of enterprise or modular storage were hard to justify, and that they may just move the entire state IT to JBOD.
So let’s step away from the economics of the capacity and talk about the economics of the storage computer where the real magic and intelligence happens. In the end, most of the storage array vendors purchase the drives from a few sources, so any real differentiation in technology and economics is left to the storage computer.
This topic will be approached in an upcoming series of blog posts, since the topics and measurement systems will be very diverse. Hu Yoshida talks all the time about the storage computer, most recently in this post.
Let me stop and talk about the features that have to be quantified in this new measurement system:
- New applications, cloud and converged architectures will depend more on compute and feature/function as compared to simple storage
- Hypervisor sprawl is an example of new IT architectures. We have seen simple storage arrays or DAS used in the first wave of VM deployments, but as larger and more critical workloads are put there, the storage computer scaling, off-load and performance capabilities become more critical. I hear a lot of people comparing VM sprawl design to be that of the old mainframe environments.
- De-dupe functionality can be done with an in-band computer or as an appliance. The intelligence and longevity of the de-dupe process will rely on the power of the computer and software
- The computer architecture determines the effectiveness of scale-up, scale-out, and scaling deep with virtualization
- The storage computer enables virtualization, so the right computer has to be priced against the business and operational effect of virtualization
- RAS is a function of the effectiveness of the computer, tightly coupled with the MTBF of the disk drives
- The storage computer applies and enforces rules for data tiering and life cycle management
- The storage computer allows for intelligent replication, snap copies and fail-over capabilities
- The storage computer is essential for faster, easier migrations that are non-disruptive
- The storage computer attributes contribute to technical and economic differences – such as the cache separation of storage for the meta data and the global pool for processing
If we isolate or commoditize the brown spinning disk, then economic analysis is left to compare the cost effectiveness of the computer’s functions. Differences in cache management, metadata management, control path and processors will all impact performance and features. Some of the new storage computer economic metrics will help to separate capacity-only views from data functionality views.
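As a thought experiment, the capacity-only and function-aware views can be contrasted with a few unit-cost metrics. This is a minimal sketch with invented numbers, not HDS reference data:

```python
# Sketch: the same storage spend viewed as $/TB, $/IO and $/RTO-hour.
# All inputs are hypothetical, for illustration only.

def unit_costs(annual_tco, usable_tb, peak_iops, rto_hours):
    """Return several unit-cost views of one storage spend."""
    return {
        "cost_per_tb": annual_tco / usable_tb,           # the classic capacity view
        "cost_per_iop": annual_tco / peak_iops,          # values the storage computer
        "cost_per_rto_hour": annual_tco / max(rto_hours, 0.1),  # values recoverability
    }

jbod = unit_costs(annual_tco=200_000, usable_tb=500, peak_iops=20_000, rto_hours=48)
enterprise = unit_costs(annual_tco=600_000, usable_tb=500, peak_iops=200_000, rto_hours=2)
```

In this made-up comparison JBOD wins on $/TB, while the enterprise array wins on $/IO, which is exactly the shift in measurement the post argues for.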
More on this in upcoming blog entries. What metrics do you think are important?
Cost of Decommissioning Applications and Data
by David Merrill on Jun 7, 2011
Hu Yoshida posted a blog last week on the issues around data center transformation and the impact of Decommissioning Applications. Hu and I exchanged emails while I was in Europe on this topic, and he has asked that I put this issue into context around storage economics.
First, the solution that he mentions in his blog can be addressed separately from the costs. I concur that data de-referencing is essential for long-term data and audit traceability. I will not touch on the costs of the solution, but rather focus on the costs of long-term application and data ownership.
There are several cost elements involved in maintaining data for a very long time. The sum of these elements has to be applied to the time period for data retention, and this period of time is now often measured in decades, not just years. As architects and data owners sit down to look at technical solutions for long-term retention, it might be beneficial to consider the recurring and one-time (but repeatedly incurred) costs associated with keeping data after an application has been decommissioned. Based on my 34 cost categories whitepaper, here is how I would break down these costs.
1. The cost of migration
- HDS has done a lot of research on the costs associated with migrations (usually of the array or infrastructure). You can refer to a paper on this topic here
- Migration costs will emerge every time a tech refresh is applied to the core infrastructure
- Virtualization of infrastructure elements can drastically reduce the costs and time of migration, but this cost still has to be factored into a multi-decade data retention policy
2. The costs of data re-mastering
- Please refer to my recent blog posting on this topic
- Moving data to a new media type, or having to re-format the data can be time consuming tasks, and if there are immutable data requirements, this becomes more complex
3. Media and storage costs
- Regardless of whether the data is on brown spinning disk, tape, CD or fiche (not practical), there is a recurring media and infrastructure cost. Every 4-7 years this hardware infrastructure may require a tech refresh, either as capital purchases or expensed items
- Media formats tend to (radically) transform every 5-8 years, so most depreciation terms will not span more than 1 media remastering event
4. Environmental costs
- Whatever format or method is used, power, cooling, and floor space costs are involved
- If off-site storage is employed, then there is an added cost of transportation (CTAM- Chevy truck access methods, or network bandwidth)
5. Administration (if not included in the re-mastering costs)
- The labor of people who touch the data for validation or remastering purposes has to be calculated. Even if the data is immutable, there are labor elements involved in validating the reference tables, indexes or logs associated with the data
- Will you change encryption methods once or twice during the life of the data? What will this cost?
6. Protection costs
- Even if in a deep, cold sleep, data is often protected (replication) during this time period
You can see that there are several dimensions to consider when architecting long-term data archive options. Decommissioning data can add new wrinkles in terms of effort and cost. Short-sighted plans that ignore these related costs may save money up-front, but end up costing more down the road. Total lifecycle costing is appropriate and recommended for long-term retention planning.
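To make the lifecycle point concrete, the cost buckets above can be rolled up over a multi-decade retention period. This is a rough sketch; every rate in it is an illustrative assumption, not a benchmark:

```python
# Sketch: lifetime cost of retaining decommissioned data for N years,
# following the cost buckets in the post. All dollar rates are invented.

def retention_cost(years, tb,
                   migration_per_tb=50.0, refresh_every=5,   # migration recurs per tech refresh
                   remaster_per_tb=30.0, remaster_every=7,   # media/format remastering
                   media_per_tb_yr=25.0,                     # media + storage infrastructure
                   env_per_tb_yr=10.0,                       # power, cooling, floor space
                   admin_per_tb_yr=8.0,                      # validation/administration labor
                   protect_per_tb_yr=12.0):                  # replication/protection
    migrations = years // refresh_every
    remasters = years // remaster_every
    one_time = tb * (migrations * migration_per_tb + remasters * remaster_per_tb)
    recurring = tb * years * (media_per_tb_yr + env_per_tb_yr
                              + admin_per_tb_yr + protect_per_tb_yr)
    return one_time + recurring
```

For example, 100 TB held 20 years under these assumed rates incurs four migrations and two remastering events on top of the recurring costs; the recurring items dominate, which is the total-lifecycle argument in miniature.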
Hypervisors: Welcome to Storage Economics
by David Merrill on Jun 1, 2011
For the past 12 years, I have been defining and refining principles, concepts, categories and models around storage economics. In the past few years, it has become necessary to move into adjacent infrastructure areas to identify, measure and reduce the costs of IT.
Next Up: Hypervisors and concepts to reduce the cost of VM sprawl.
I want to use the term Hypervisors rather than isolate various commercial products (from Microsoft, Citrix, VMware and others). As an industry (and at the local level), hypervisor sprawl has been well documented. When I ask customers about the move to the Virtual Machine (VM) architecture, and whether it has radically reduced total costs, I cannot seem to find any clear answers. And except for product-based claims, I do not see a lot written on the topic of VM total costs or measurement techniques. All of the hypervisor vendors have tools and methods to show their product’s economic (mostly price) benefits, but trending and total cost measurements seem to be thin (no pun intended).
In talking with customers around the world, I am seeing some key patterns in VM sprawl and the impact on total costs:
• First generation Hypervisor deployments went great, and were generally well received. Management bought into the concepts of several VMs running on 1 physical server. CAPEX for servers slowed down for a period of time
• First generation VM systems were primarily deployed for Test/Dev environments, or moderate-to-low transactions/applications
• The popularity has now driven VM roll-out to higher level apps and infrastructures moving to VM. The sprawl continues with thousands of VMs running many types of applications
• Choke-points start to appear, with performance and scalability becoming real problems
• Some have to reduce their VM-to-physical ratio, or buy larger servers to handle the increased performance load. Some have to move from blade to rack servers, thus increasing CAPEX costs
• CAPEX costs start to rise again, and even with a 5:1 or 10:1 ratio, management forgets the old server purchase model and is starting to push back on more/larger servers
• Disk, memory, license fees, environmental costs all start to rise
• And no one is measuring the unit cost of a VM during the sprawl…
Now there has been relief, with vendor APIs (VAAI) to help offload some workload from the server to storage arrays. This helps keep the CPU cycles dedicated to the server workload, thus improving VM ratios once again. Storage scale-up and scale-out architectures also provide relief for the VM costs. Here is a short business case (co-developed with ESG) on the impact of VAAI for example. But addressing server costs is just one element of VM TCO modeling and measurement.
It is in this time of servers, storage, memory and license cost increases that I have adapted storage economic principles and methods to VM economics. I have several papers in the works, but I will try through this blog site to dribble my findings and results to you at a steady pace.
Step one for VM economics is to identify the costs associated with VM sprawl, and the total cost of ownership. I suspect that VM econ will behave similarly to storage in that the purchase price is just a fraction of the total (lifecycle) cost. I don’t have percentages yet (let me know if you have measured this), but I suspect that the price will be around 15% of the total cost.
What are the other cost elements over time? The VM total cost has to include license fees, maintenance, and the storage, SAN and data circuits. It has to include power, cooling and the floorspace. In this first step, I have mapped the 34 types of costs from Storage Economics into VM economics. Here is a summary table.
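As a sketch of this first step, the mapped cost categories can be rolled into a per-VM unit cost. Every figure below is a placeholder, not measured data, and the category names are simplified stand-ins for the full 34-category mapping:

```python
# Sketch: rolling annual cost categories into a unit cost per VM.
# All figures are invented placeholders for illustration.

def cost_per_vm(vm_count, annual_costs):
    """Total annual cost divided by the number of VMs carried."""
    return sum(annual_costs.values()) / vm_count

costs = {
    "server_depreciation": 120_000,
    "hypervisor_licenses": 60_000,
    "maintenance": 30_000,
    "storage_and_san": 90_000,
    "network_circuits": 20_000,
    "power_cooling_space": 40_000,
    "labor": 80_000,
}
unit_cost = cost_per_vm(500, costs)  # note how small a share hardware price is
```

Tracking this number quarter over quarter, as sprawl grows, is the measurement the post argues is missing today.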
There will be much more on this topic in the next few weeks and months to come. Some of the more interesting research is modeling the cross-over point for DAS, NAS and high-end SAN storage architectures, especially when vendor APIs can be employed to off-load work and increase the VM ratio. Bringing thin, virtual and dynamic tiers to a VM will also reduce the cost and impact the crossover point. Negotiating an ELA arrangement with the hypervisor vendor will impact TCO for larger environments. Server architectures are changing and improving, and this will impact total costs.
The patterns and costs that I have modeled for the past dozen years for storage will most certainly apply to virtual machines, and later virtual desktops. I invite you to provide feedback, data points and results from your VM growth and sprawl experiences. As an industry, we cannot improve what we cannot measure, and this is a new IT cost area that we need to measure.
(down)Time is Money
by David Merrill on May 26, 2011
I was forwarded a link to an article today that reported on the cost of an unscheduled outage. The report states that downtime (across various industries) costs approximately $5,600 per minute. I have collected a series of time and outage risk costs over the years (see table below), and while this rate of roughly $5,600/min may seem high, it could be spot on given our on-line world.
I regularly come across people who have a hard time justifying expanded disaster recovery (DR) coverage, or replication capacity, in order to mitigate the risk of downtime. The issue is putting a monetary value on the risk of an outage. Perhaps this data can help in developing that business case.
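A back-of-envelope way to monetize that risk uses the cited per-minute figure together with assumed outage frequency and duration (both of which you would replace with your own history):

```python
# Sketch: expected annual cost of unplanned downtime.
# $5,600/minute is the industry figure cited above; the outage
# frequency and duration are assumptions to be replaced with real data.

def expected_annual_outage_cost(cost_per_minute, outages_per_year, avg_minutes):
    """Expected yearly loss from unplanned outages."""
    return cost_per_minute * outages_per_year * avg_minutes

risk = expected_annual_outage_cost(5_600, outages_per_year=2, avg_minutes=90)
# A DR or replication investment that costs less than this expected loss
# (times the fraction of outages it would prevent) pays for itself.
```

Even two 90-minute outages a year monetize to over a million dollars at that rate, which is the kind of number a DR business case needs.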
VMworld Session: Saving Money While Battling VM Sprawl
by David Merrill on May 12, 2011
As my colleague Heidi Biggar has pointed out, this is the time of the year when VMworld opens up their public voting process to help select which sessions will make it to the conference’s final agenda. There are a number of HDS sessions up for the vote, one of which is my session on the economics of VMs.
My proposed session focuses on how you can lower your total infrastructure costs per virtual machine by making sure you have the right storage solutions in place. I think it’s a compelling and important topic for storage and IT managers today, many of whom are dealing simultaneously with VM sprawl and tightening budgets.
If you agree and would like to hear me speak on this topic at VMworld in Las Vegas, please vote! You can access the voting page via http://www.vmworld.com/cfp.jspa, where you’ll need to log in with a VMworld account.
My session is number 3230, entitled “Penny Saved Is a Penny Earned: How the Right Storage Can Help You Lower Total Infrastructure Costs per VM.” You’ll find the full proposal below. Thanks for your support!
As customers look to accelerate VMware adoption, the type of storage they leverage becomes increasingly critical – not just from a performance standpoint but increasingly from a cost perspective. VAAI can help to achieve the economic benefits of greater densities, and the right storage architecture can help customers achieve short- and long-term cost savings above and beyond competitive scale-out or converged solutions.
Learn how the right storage architecture can help customers lower their total infrastructure costs per VM.
- Storage economics 101 – Uncovering hidden data center costs
- Understanding resource utilization today and how over provisioning may make virtualization more expensive. (Reference Gartner research.)
- Compare/contrast with scale-out architectures and explain how the customer benefit is directly related to the power of the VAAI-supported array.
This session will walk attendees through a storage economics tutorial with the objective of unveiling some of the hidden, but common, costs in today’s data center. Particular attention will be paid to the hidden costs of storage and how VAAI-supported scale-out architectures can not only help reduce total costs over the short- and long-term but also how, with the right underlying storage, they can enable the lowest infrastructure cost per VM. The session will compare and contrast storage architectures as well as identify key metrics to help you measure and lower costs within your environment.
VAAI-supported scale-up architectures can help customers lower their total infrastructure costs per VM. This is something competitive scale-out architectures can’t claim, and something from which converged solutions provide only a short reprieve.
by David Merrill on Apr 27, 2011
I love the movie Inception. I have to admit it took 2-3 viewings before I understood and followed the sequence of events, especially as the scenes were split between 3, 4 or even 5 levels. The nesting of these stories (or stories within the story) made my head hurt at first, but I was later able to appreciate the details of the story-telling and the total entertainment of the film.
I believe we are seeing advanced capabilities (not just storytelling) with convergence of different forms of IT virtualization. Let me summarize:
- Desktop virtualization – where apps and processing power is abstracted from the end-user.
- Server virtualization – the ability to present several machines virtually running on the same server.
- Storage virtualization – at the array level where a subordinate array can inherit the qualities and capabilities of the host controller.
- LUN and volume virtualization – where thin volumes present a virtual capacity to the host, yet consume smaller actual capacity.
- Application virtualization – through cloud and applications as a service.
- Data center virtualization – achieved (in part) with cloud services physically local or remote to the facility.
Virtualization gets very interesting, and economically very different, as we compound these virtualization levels. I am not talking (yet) about virtualizing within the virtual arena, but bundling virtual elements to where the whole is much greater than the sum of the parts.

I have been gathering and compiling economic data on the impact of marrying server virtualization with storage virtualization. The compounding effect is quite dramatic, but only at a certain scale of virtual machines (VMs). Think about a virtual or thin VM using virtual capacity from a storage array. Key technologies like VAAI can now allow the off-loading of some storage tasks away from the server, leaving those CPU cycles for application processing and crunching. With virtual+virtual bundling, even higher densities of VMs per host are possible. Fewer physical gigabytes per host are possible. A single point of management and orchestration (of all elements) is possible.

As an industry we have seen the economic impact of both server virtualization (VMware, Hyper-V, Xen) and storage virtualization (Hitachi, SVC). Moving into a world where virtual-upon-virtual will be the norm, we may have to re-think basic storage and server architectures. These bundlings can unleash new levels of efficiency, operational simplification, power and cooling savings, and therefore lower total cost. These bundlings are not automatic. The right storage architecture is needed to unleash server efficiencies, just as the right server architecture can unleash storage efficiencies.

The key ingredients for a virtual-upon-virtual model include:
- VAAI – the newest glue to optimize CPU and inter-operation.
- Virtual Storage – arrays, volumes, pools, and tiers.
- High performance storage – able to scale for both capacity and performance without outage.
- Data mobility between tiers.
- Unified management – QoS, metering, chargeback.
As with any economic impact model, we look for cross-over points. Where does any new idea or architecture become economically superior? I know that with very few VMs (fewer than a few hundred), it may make economic sense to use DAS or NAS storage, and not bundle server virtualization with storage virtualization. But there comes an inflection point where larger VM sprawl starts consuming lots of extra server CPU cycles, wasting lots of physical disk (since it does not use thin volumes), and the management of these complex environments becomes untenable. I am working on locating these cross-over points now, with customer parametric data, as well as with sizing and configuration models from very large VM environments. I hope to share these in a blog posting in the next month or two, and will have these results formally published.
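While the real cross-over points await that customer data, the shape of the analysis can be sketched with invented cost curves: a cheap-entry DAS curve whose per-VM cost grows with sprawl, against a higher fixed-cost virtualized-SAN curve with a flatter slope.

```python
# Sketch: locating the VM-count cross-over where a virtualized SAN beats DAS.
# Both cost curves are made up for illustration; real curves come from
# customer parametric data and sizing models.

def das_cost(vms):
    # cheap to enter, but per-VM cost grows as sprawl wastes CPU and raw disk
    return 2_000 + vms * (300 + 0.4 * vms)

def virtual_san_cost(vms):
    # higher fixed cost, flatter slope thanks to thin volumes and VAAI off-load
    return 150_000 + vms * 250

crossover = next(v for v in range(1, 10_000) if virtual_san_cost(v) < das_cost(v))
# below the cross-over, DAS is cheaper; above it, the virtualized SAN wins
```

With these particular invented coefficients the cross-over lands in the mid-hundreds of VMs, consistent with the "very few VMs" intuition above; the point of the exercise is the method, not the number.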
Then we can continue the virtualization (inception) bundling by adding cloud and VDI concepts, thereby modeling the IT infrastructure of the future. Then we can also model the total costs of that nested bundle of virtualization. No doubt my head will hurt again, but the second or third time around the block it will become clearer for all of us.
So how cheap is that disk – Part 2
by David Merrill on Apr 5, 2011
A couple of comments on my previous post, in which I presented a TCO analysis of the retail-store hard disk as enterprise disk, have driven me to do another post on this topic.
As you can read in the comments, there are several aggressive or conservative approaches to simulate enterprise-class storage with bargain-store commodity parts. The comments from Steve and others indicate that you can re-create this scenario at lower or higher prices and costs. And the good news is that everyone is right.
One thing to remember is how the price and cost change at scale. Building a 10 or 20TB disk enclosure can be done one way, but when you approach thousands of TB, then the configurations and price points start to change. The cost of enclosures, racks, cabling systems, replication software, data movers, remote and local management consoles all start adding up quickly. For smaller systems, you may not need these features. As the capacity increases, and business impact/criticality increases, so does the investment. And the investment is non-linear compared to the price of the drives.
This phenomenon reminds me of a discussion I had with a colleague after attending a college football game (BYU vs. OU) at the Dallas Cowboys stadium in 2009. One key feature of the stadium is the set of enormous screens above the middle of the field. Built by Mitsubishi, each HD screen facing the long sides of the field measures 60 yards by 25 yards. Additionally, the end-zone seats have mini-versions of these monitors. The total cost of these four screens was $42M, which would be more than enough to build a very nice traditional stadium.
So let’s create a fictional discussion with the stadium designer and the stadium owner over how to construct this video system, using the concept of cheap commodity components to build a large system like the Cowboys Stadium screen.
Owner: So I’m looking for four large screens for my stadium, the largest of their kind. I have a bid from Mitsubishi over $42 million. I think they make nice TVs, but the price is a bit high.
Designer: What options do you think you have to build a system that large, and with the demanding picture quality that you need?
Owner: Well I think a bunch of iPad 2 systems bolted together would work. I love watching TV on my new iPad.
Designer: Given the dimensions, you would need over 50,000 devices…..
(25,000 square feet of viewing screen would need a little more than 50,000 units)
Owner: Yeah, but I can get them for only $400 each, and I’ll bet I can probably negotiate a volume discount. Compare roughly $20M to $42M from Mitsubishi.
Designer: But….. How would you assemble it to work with….
Owner: So I was thinking we get some large cork boards and glue them side by side. Then get some 6-plug power strips…about 8,000 of them at $4 each…. And some apps to do the modulation/demodulation… I think there is an app for that…. So about $7 for each iPad app that we get from the app store…..
Owner: So worst case the total cost is $20 or $21M. Still miles ahead of the $42M TV proposal. Whaddya say we go for that instead?
I think we can all see how ludicrous this example is. And there is certainly a parallel in using low-cost consumer drives cobbled together to approximate enterprise solutions. Still, you are free to shop it out, and see what your total costs would look like.
Keep the comments and design options/pricing coming.
There’s one for you, nineteen for me
by David Merrill on Mar 30, 2011
I was working on my taxes two weeks ago, and as usual ended up humming the Taxman song from the Beatles. Part of the lyrics penned by George Harrison goes as follows: “Let me tell you how it will be; There’s one for you, nineteen for me. ‘Cause I’m the taxman, Yeah, I’m the taxman”. Perhaps our income tax rate is not 95% here in the US, but I am interested to note personal income tax levels from different countries around the world.
I also find interesting an apparent tax rate that we pay related to un-utilized storage capacity. I have written about the cost of waste before and the options (virtualization, thin provisioning) that are available to reduce the rate of wasted capacity.
We all realize that it takes many GB of actual raw disk capacity to store a single GB of data — many times more. Especially when you factor in:
- RAID overhead
- Reserve capacity held by the storage admin
- Reserve capacity held by the DBA or application owner in what is used vs. what is allocated
- Usable vs. allocated waste
- Replication or copy space (this is not wasted capacity, but will increase the total GB needed)
I often see IT operations reports with metrics that show this written-to-raw ratio. This can be a damaging or embarrassing metric to show management, but it can underscore the need for a more advanced architecture and operational process related to storage allocation and usage. The written-to-raw ratio is simply a measurement of the first instance of data written to disk compared to the raw capacity purchased to store that original data.

One of our storage economics black-belts here at HDS is Reg Mills. He recently shared with me some analysis where a customer had a 16:1 raw-to-written ratio. Not quite as bad as the taxman, but darn close. Here is an excerpt from his work:
TCDO is something we often discuss, but rarely do we get real client numbers to show the effect in their own world. In a recent Storage Economics workshop, I was having challenges finding significant hard savings around a new virtual architecture, and was looking for avenues to identify deeper hard cost saves to build a viable business case, so I asked the storage team to prepare the vendor-based utilization reports. Analyzing the reports identified a 16X TCDO between raw array storage and used server storage (data). Using this as a discussion point (that $6 per GB raw, the purchase price, was really costing them $96 per GB for data stored, and that the focus should be on eliminating the highest-cost waste in the server through reclamation), we were able to identify over $2M in savings. Some of the key points in this 16:1 ratio:
- Using the client’s own numbers (16X and $77 per GB) got them immersed in the storage economics approach and metrics
- In accounts willing to give their utilization reports, we can reasonably quickly identify the TCDO and the related reclamation story
- Savings to reclaim space (and therefore grow into current capacity) can be achieved with a combination of software (HDT, HDP and others) as well as a new virtualized core storage array with VSP
- TCDO also uncovered additional potential savings opportunities at each abstraction layer of storage to address/discuss. In this situation we chose to focus only on thinning/reclamation
- By reclaiming current space waste, the storage growth curve drops down permanently. Using the reclaimed space over the next two-plus years, no storage purchases are required; once the reclaimed storage is fully utilized in year 3, the new purchase curve has a much lower 6:1 ratio, requiring significantly smaller storage purchases.
- This next chart shows business as usual (BaU) storage growth when you have the 16:1 waste-taxation rate. With a new architecture (shown in blue) the same effective (virtual and physical) capacities can be presented, but with a much lower tax rate. The blue line is an HDS architecture using array virtualization, thin provisioning and dynamic tiering.
So if you think the taxman rates are high, take a closer look at your own cost-of-storage-waste-taxation-ratio to see where you stand. The impact and benefit of reclaiming some of that waste and changing your storage demand appetite should be a positive message to your management. Especially during tax season.
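For readers who want to compute their own tax rate, the raw-to-written factor is just the product of the overhead layers listed earlier. The factors below are made up for illustration (Reg's client measured 16:1 overall), and the $6/GB raw price is the figure from his excerpt:

```python
# Sketch: deriving the "storage tax" from stacked raw-to-written overheads.
# Every overhead factor here is an illustrative assumption.

def raw_to_written(raid=1.25,          # RAID overhead
                   admin_reserve=1.3,  # reserve held by the storage admin
                   app_reserve=1.6,    # reserve held by the DBA/app owner
                   alloc_waste=2.0,    # usable vs. allocated waste
                   copies=2.0):        # replication/copy space
    return raid * admin_reserve * app_reserve * alloc_waste * copies

factor = raw_to_written()              # ~10.4x with these made-up factors
cost_per_written_gb = 6.00 * factor    # $6/GB raw inflates to this per written GB
```

Multiply your own layers together and apply the result to your raw purchase price; if the product approaches 16, you are paying taxman rates.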
by David Merrill on Mar 28, 2011
Like most of us, I have been following the tragic events in Japan since the March 11 earthquake/tsunami and the ever-changing nuclear crisis unfolding in Fukushima. We continue to see swings in global stock, bonds and commodities markets. Energy demands (in terms of supply and price) are impacted by events in Japan, as well as in Libya. There is little economic certainty, as it seems, as we scan the headlines and seek any kind of positive news or trends.
Closer to home, we see countries inspecting national nuclear plants. We see closer scrutiny and monitoring of financial markets, wanting to avoid another 2010-style flash crash. We review our own personal economic outlook, and (for me) balance heartburn and euphoria over noticeable fluctuations in retirement balances. It all confirms the butterfly effect (or other popular chaos theories) that we deal with in business, finance, technology planning, and in our own personal lives.
In the middle of chaos, though, there are threads of resiliency, stability and business as usual. I have been profoundly impressed with updates and news of manufacturing, logistics and accelerated business effort in Japan, from my own company and many others. The resiliency and determination of the Japanese nation is very inspiring. We observed this same get-the-job-done attitude after the events of 9/11. We are reading about it in real time in Egypt and New Zealand. The Japanese have a tendency toward long-term strategic planning, and even with the devastation impacting those plans, they will return. Many investors are showing their bullishness with their money.
Economic resiliency is more than a state of mind. It is more than hope (I always tell my customers that hope is not a strategy). Resiliency comes as a byproduct of planning, acting, reacting, adjusting and measuring. Not all unpredicted events are a disaster, in the IT sense of the word. There are many examples of non-disaster events that require resiliency through planning and action:
- What is your plan in the event of a hack-attack or breach of your company secure data?
- What options are planned for in the event of a work stoppage, strike or pandemic that strikes your workforce?
- If supply chains are disrupted, do you have a contingency?
- What if someone (won’t say who) turns off the Internet?
- Is there a volcano nearby, perhaps not shutting down the IT center but impacting logistics, regional flights, or mild panic?
- If the cost of electricity doubles or triples in the next year, do you have a contingency?
- What if your local currency drops, or rises at a meteoric rate?
- If IT budgets, demands, availability or capability radically changes, do you have a contingency?
Those with the best recoverability are resilient, because they have planned and protected against the worst of times.
Becoming resilient and recoverable does not happen by accident. It does not happen with hope alone. It cannot be left to will-power or luck. It is more than a cultural, national, or personal trait. Resiliency comes as a result of careful planning and execution of strategic plans, with clear goals and guiding business principles of integrity and vision. Sometimes it takes a series of national and global events to make us evaluate our own position, situation and resolve, and then to act.
So how cheap is that disk??
by David Merrill on Mar 17, 2011
We have all heard it before.
“I can go down to <insert name of local retail tech store> and buy a 1TB disk for $60. Why is your enterprise disk so expensive??”
We have all tried to fumble our way around price and cost and how the enterprise disk is so much better… yada yada yada. Well, someone went down to their local retail store, used petty cash and built a system. Yes, the unit cost of the disk was cheap ($168), but in the end the total cost was not at all cheap — $8,500 for that TB of disk.
A colleague of mine in Australia did this experiment. I asked him to document the process, and the math behind the amazing results. I have done some editing to his material (they speak a different English down-under), but I give him full credit for the write-up and the total experiment.
Remember that Price ≠ Cost. Happy reading.
True <discount store> Disk TCO
Consider that if a person was to buy a 1TB drive from <discount store> for $168 (that was the <discount store> price as of January 2011 for a half-decent-quality drive, even though you can pay as little as $79), they would need to buy a second drive to have redundancy like we deliver, and a third drive to have hot-sparing like we deliver. That makes $504.
Now consider that the formatted size would actually be 940GB not 1TB so the cost is closer to $534 per Terabyte usable. A 1TB USB drive to use at home seems to be what uneducated (ignorant?) people always use as a comparison so you can see already their argument is flawed.
Then factor in how you would back that up. The current technology capable of backing that up in a single run is LTO4 and a cheap but reliable LTO4 drive costs $2,800 with one year’s warranty. You would need a minimum of 41 tapes to maintain the GFS (grandfather, father, son) daily-fortnightly-twice annual backup rotation and retention for one year that we use, at a cost of $69 per tape ($2,829).
So, the cheap 1TB of storage at <discount store> is now looking like $504 + $2,800 + $2,829 = $6,133.
And no one is looking after it and monitoring it. No one will come out and replace/fix it if it has an issue at 2:00 a.m. on a Thursday (the best you might get is a 1-year return-to-shop warranty). The alternative would be to buy 41 extra hard drives to make copies onto, at $168 each (although you could then drop the hot spare, so only 40 extra drives would be required to use more <discount store> disks as the backup target).
The 41 extra hard drives is to counter the argument that you just need to make a copy of the data on another disk the way people would in their own homes. If you do that you don’t actually have a point in time backup. You just have a single copy of the current data for DR that does not meet any regulatory requirements. If you were to make a copy every day and every week as you would with a backup regime, once the 1TB SATA drive was more than half full you would need 41 extra drives to be able to have point in time recovery for every business day for the most recent 14 days, and every fortnight going back a full year.
Now realize that this is only SATA disk, in fact the slowest type of storage we offer, with no cache in front of it. It would not be capable of supporting databases or enterprise applications, and it has a very slow access rate because it is only USB 2.0 or at best FireWire.
NOTE: Most enterprise storage arrays have a background process that does block-level error checking, so for a more accurate comparison you need to factor in the time it would take to write a script that did a file-by-file CRC check; then you would need to manually monitor the results and do the restores for all the corrupt content. You would need to do it at least twice a week, so that is a minimum of 52 weeks, times twice a week, times 30 minutes each time, at an IT support tech pay rate, which comes to about $2,300 a year.
So now your 1TB <discount store> drive is costing you close to $8,500 a year to deliver something close to enterprise levels of service but still delivering woeful performance — with no on-site or after hours support.
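The running total can be laid out as a quick back-of-the-envelope model. All the dollar figures come from the text itself; only the variable names are mine:

```python
# TCO of the "cheap" 1TB <discount store> drive, using the figures above.
drive_array = 504        # RAID-protected 1TB of <discount store> disk
tape_drive  = 2800       # LTO4 drive with a one-year warranty
tapes       = 41 * 69    # GFS rotation: 41 tapes at $69 each = $2,829
crc_labor   = 2300       # ~52 hours/year of manual integrity checking

hardware_only = drive_array + tape_drive + tapes
print(f"Hardware + media: ${hardware_only:,}")              # $6,133
print(f"With labor:       ${hardware_only + crc_labor:,}")  # $8,433
```

The $8,433 result is the "close to $8,500 a year" cited above, and it still buys no on-site support, no after-hours coverage, and woeful performance.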
This is all taken to the extreme, and of course if you bought 100 drives from <discount store> you would demand a substantial discount, but the TCO has more backup and management costs than CAPEX spend. If you could buy a tape drive with a 3-year warranty, the TCO would be less than $2,800 per year, and a proportion of the backup media could be recycled for a couple of years, so the media cost would be less over a multiyear model. BUT you normally only get a 1-year warranty on a hard drive from <discount store>, so the cost modeling over a single year is still valid.
Remember, if you build all systems with cheap SATA internal or DAS disk, there are issues backing them up. Anything larger than a few hundred GB (like a database) can't be reliably written out to tape every night without dedicating a backup device to that specific system. Now we have a $10,000 tape library attached to every server. This is not very cost effective or space efficient. If we just use a single standalone drive for every system instead of a library, there is no bar-coding to manage the thousands of media required each year in an enterprise environment. Now tapes are getting lost, misplaced or simply overwritten because no one has a central console to track and monitor them.
Bear in mind that the reason SAN even exists is because having disk specifically locked in a given server is inefficient. Not all boxes will use up their storage at the same rate. But extra space in one server can’t be used by another server the way SAN space can be assigned to whichever system needs it.
People who have been in IT long enough remember the 80s and 90s, and data management wasn't great back then. The models from that era simply cannot scale to the sizes required in the 21st century.
So there you go. It is possible to construct an apples-to-apples comparison from discount store disk drives to enterprise disk architectures. There are features and functions that change the price and total cost:
- Drive failure protection with RAID
- Reliability of the device
- Disaster protection
- Data protection (backups and replication)
- High performance connection (Fibre Channel, not USB ports)
- Management and operations
In discussing total price versus total cost, you may want to point your management team to a paper outlining all the types of costs that make up storage TCO.
Storage Economics Talking Points
by David Merrill on Feb 18, 2011
I have been blogging about storage economics for almost five years. I have spoken to hundreds of customers, and at hundreds of events over the past three years on this topic. We have a lot of material available on our Storage Economics website. But sometimes, a short review with talking points is needed.
There is not a week that goes by without a colleague or customer comparing enterprise disk to the price of a single disk drive (typically USB) for sale at the local retail store. Why is data center or enterprise disk 10X more expensive? Where is the value?
And so the questions go.
The following is a short outline of my talking points around price, cost and achieving economic nirvana. If you have heard me speak, these will sound familiar. They are also outlined in this paper.
1. Price does not equal cost
The price of disk is becoming less significant relative to the total cost of owning the assets over time
10-12 years ago, the price of disk represented 60-70% of the TCO, now it is below 20% (and declining)
I completed a recent economic analysis for a large customer (15PB) where the price of disk was only 8% of the TCO
My colleague Claus will tell you that you have to assume that the price of disks is approaching zero (the drives that is, not the arrays)
There are many other cost elements that may surpass the price of storage, and those tend to be more relevant when trying to reduce OPEX
CAPEX is important, but most organizations will be focused on reducing OPEX. Ask your CFO if he/she will approve CAPEX to reduce OPEX
2. There are 34 (used to be 33) types of costs that make up storage TCO
Price (capital expense or lease costs) are one type, there are 33 others
I cannot tell a customer what their cost interests are, but I can present 34 and then you decide what applies to you
Some anecdotal observations of costs:
- For most parts of the world, the four-year costs to electrify and cool a storage array will surpass the original purchase price of the storage array
- For organizations that think fully-depreciated disk is ‘free,’ stop and calculate the monthly costs of HW and SW maintenance for gear that is out of warranty
- The cost of migration, that is incurred at the end of the asset life, can be calculated at $7,000/TB (see more here)
- The cost of waste or the cost of copies usually consume 50% or more of the capital expense costs within the TCO
- The cost of labor and storage management may be as much or more than the purchase cost of the storage
- Most people tend to copy production databases six to nine times, and those copies tend to be on the same tier as the original data
- Data protection, remote DR protection and copies can quickly inflate the total cost of ownership
- Reading, understanding and selecting the costs that are important to your organization is critical in starting a cost reduction plan.
Your unique choice of cost interests will determine your storage cost sensitivities
- Are you labor sensitive?
- Do you have environmental sensitivities around floor space, power, cooling and carbon emissions?
- Are you risk sensitive? Has there been outage, data loss or corruption that impact your company?
- Are you litigation sensitive? Do you need to find and recover data (from different sources) in the event of frequent lawsuits?
- Are you growth sensitive? Can you not afford long lead times to acquire, provision and present storage to the business?
3. There are economically superior storage architectures
These superior architectures are directly defined by your cost sensitivities
Do not be seduced by a low price. Align your architecture to the variety of costs that make up your storage TCO
4. Econometrics are essential
You cannot improve what you cannot measure. You cannot start a cost reduction initiative until you have isolated, prioritized and measured your total costs
Many metrics are available to help tell your storage progression story
- Total cost of data ownership (dividing all the costs NOT by the usable capacity, but by the first instance data capacity)
- Carbon emissions per TB
- TB per kW or TB per square meter of data center floor space
Continuous improvement can only be effective when the right metrics are created, reported and measured
- “When performance is measured, performance improves. When performance is measured and reported back, the rate of improvement accelerates.” (Monson)
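As a minimal sketch of how these metrics are built, the calculation below uses entirely hypothetical capacity, power and cost figures (invented only to show the arithmetic; your own 34-category cost baseline goes in the numerator):

```python
# Hypothetical environment -- every number here is invented for illustration.
total_annual_cost = 4_000_000  # sum of all cost categories, USD/year
usable_tb         = 1_000      # usable capacity
first_instance_tb = 400        # unique data, before copies and waste
power_kw          = 120        # power draw of the storage estate
floor_space_m2    = 80         # data center floor space consumed

# Dividing by usable capacity hides the cost of copies and waste;
# dividing by first-instance data exposes it.
print(f"Cost per usable TB:         ${total_annual_cost / usable_tb:,.0f}")
print(f"Cost per first-instance TB: ${total_annual_cost / first_instance_tb:,.0f}")
print(f"TB per kW:                  {usable_tb / power_kw:.1f}")
print(f"TB per square meter:        {usable_tb / floor_space_m2:.1f}")
```

Notice how the per-first-instance number is 2.5x the per-usable number in this made-up case: that gap is exactly the copies-and-waste story the total-cost-of-data-ownership metric is designed to tell.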
So there you have it: a quick primer on defusing the ever-decreasing-price-of-disk pressure. Turn the conversation to environmentally and economically superior storage architectures. Start to create econometrics that show your total cost, and set plans to reduce those costs with the right technologies, processes and personnel improvements.
Remastering vs. Migration
by David Merrill on Feb 11, 2011
One of the 33 types of costs outlined in storage economics is the cost of migration, or the cost of remastering. Last week, colleagues in Australia asked me the difference between the two, and how these costs factor into the total cost of ownership (TCO) of certain storage architectures. If you follow the links above, you will see how we differentiate the costs, but let me summarize them here.
Remastering tends to be applied at the data level. Over time the data may change in terms of media it was stored on, software used to access or modify the data, and the use of the data.
A patient record or rich media used by a hospital is a good example. An x-ray image may be stored in digital format, paper photo format or actual film format (if it is old enough). Going forward, it is hard to predict future applications or media, but we know that the data may reside on paper, film, digital tape, digital disk etc. Over time, the data may be required to transform 7 or 10 times as the media and infrastructure changes. This is the effort to re-master the data.
This remastering effort is not free, and since the data needs to be accessible for the life of the patient, the remastering exercise will continue every time there is a media or infrastructure change. The only media that does not need to be remastered is paper. And we all know the risks and impact of time with paper (just look at my shoebox full of tax receipts for 1994 as an example).
Migration costs tend to be applied at the array level. The storage array will hold many types of data, in different formats (but on one media type: disk). As arrays are replaced (due to depreciation rules or obsolescence), the data has to be migrated to the new platform. This migration may or may not involve a remastering event. A migration event will impact all data on the array. All data has to be moved at the (relatively) same time. Hosts are impacted with this move. Applications are usually set to point to a LUN, but they tend to be media-unaware.
Now, there can be extra effort as data, volumes or LUNs are impacted with an array change where they may be put into thin volumes or the data is de-duplicated. This again is an impact to the original or source data, but should be transparent to the application that accesses the data. Nevertheless, there is a migration or remastering effort involved, and this is more time and money for the event.
We can be sure of one constant: there will be change. Keeping data for five, 25 or 100 years will require several migration and remastering events. Remastering and migration may be coincident. Underlying storage architectures, operational processes and data retention rules all will impact the effort, frequency and costs of these events. The economic approach of moving to thin volumes or virtual LUNs will reduce total capacity, but the time and effort to migrate/remaster has to be factored into the transformation investment.
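A rough lifetime model makes the point concrete. The $7,000/TB migration cost is the figure cited earlier in this blog; the four-year array refresh cycle and the 10TB data set are my own hypothetical assumptions, and the model deliberately ignores remastering events, which would only add to the totals:

```python
# Lifetime migration-cost sketch under assumed inputs.
MIGRATION_COST_PER_TB = 7_000  # $/TB, figure cited in the talking-points post
REFRESH_YEARS         = 4      # assumed array depreciation/refresh cycle
data_tb               = 10     # assumed data set that must survive every refresh

for retention_years in (5, 25, 100):
    events = retention_years // REFRESH_YEARS  # one migration per array refresh
    cost   = events * data_tb * MIGRATION_COST_PER_TB
    print(f"{retention_years:>3} yrs retention: {events:>2} migrations, ~${cost:,}")
```

Under these assumptions a 100-year retention requirement means roughly 25 migration events and $1.75M of migration cost for a mere 10TB, which is why migration effort belongs in the TCO alongside the purchase price.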
What has your experience with remastering and migration been? Let’s talk about it here.
2011 Storage Economics Forecast – Mostly cloudy with a 70% chance of……
by David Merrill on Jan 20, 2011
Like most people, I enjoy reading predictions for the coming year. Predictions related to sports, politics, and what Apple is going to announce in 2011 are all very entertaining. This is true of storage predictions as well, so when I came across this article from Dave Raffo of SearchStorage (and since I was not interviewed for my opinions or projections for 2011), I thought that I can contribute to the prognostications, at least from a storage economics perspective.
Here they are, in no significant order.
1. Price reduction of SSD, and the blending with traditional drives via sub-LUN tiering, will promote further adoption. The drives themselves continue to become more cost effective, and hybrid use will continue to grow in popularity in larger sub-system designs.
2. Environmental pressures will mount for larger data centers; and in organizations with very large growth
- Pressure continues on data center floor space
- Metrics measuring TB per square meter or per rack unit will tell the tale of these limited resources
- Pressure for cost effective power, power reduction and all things green will continue
3. I agree with the cloud projections made by Raffo, but I believe that the underlying storage architecture for private clouds may differ from public clouds. The seductive price points for public services will continue to stretch internal IT architects to come up with internal/local storage solutions that reach cloud 'cost parity'.
- Continued surge of local/DAS/commodity storage architecture for some cloud architectures
- There needs to be continued caution of price taking control of the conversation, without measuring the real multi-year costs
4. Reluctance to hire more storage administrators, even with improving economic climate
- The trend continues to do more with less (or the same) staff
- I would have thought that the hiring freezes over the last 18-36 months would finally require a catch-up phase, but what I observe is that automation and resourcefulness will be the mantra for the next few years
5. We are not done with reclamation, and many organizations may choose a reclamation route before specific investments in data reduction. Reclamation can be accomplished by:
- Archiving old data
- Moving data from high-value tiers to lower tiers
- White space reclaim with thin provisioning
- Reserve capacity reclamation with array virtualization
I’m looking forward to tracking these issues during 2011, and writing about them as appropriate.
In the meantime, what storage economic factors do you see having an impact on your strategies and deployments this year?
What was of most interest in storage economics in 2010?
by David Merrill on Jan 5, 2011
I'm always interested in the trends in storage economics, and in identifying them I take into account as many sources of information as possible, especially among our customers. Sharing these trends with you on this blog is an important focus, so it's interesting to consider what generated the most interest over the past year.
In the spirit of being transparent with you, I want to share three of the most popular posts of 2010. Hopefully, this will help you as you think about storage economics in the coming year. It’s also instructive to me on what trends and topics to consider revisiting in 2011.
1. Hybrid Tiers
We have seen a shift in the automotive world to hybrid vehicles with part electric and part gas-powered motors. There are points of efficiency to operate with batteries, and then again when horsepower is needed. The intelligence of the engine control determines when the car should operate on battery or pistons, thus removing the decision-making for the driver.
The analogy of hybrid cars may clarify where we are as a storage industry with piston disks (mechanical SATA or SAS) and the new electric motors (solid state drives). We know that all-SSD systems are powerful and consume less electricity and cooling, but they are expensive and limited in the capacities available in the industry today.
2. ROI and ROA
I am frequently asked about the difference between ROI and ROA, both techniques that we use in defining storage economics. Here you'll find my IT-econ perspective.
Earlier in the year, I was in a global summit meeting, strategizing on extending HDS’ economically superior storage architecture theme with our cloud-enabling and cloud-producing storage products. As we discussed TCO comparative methods the notion came up to “just ignore the cost or price of the disk drive.” Everyone in the room sat silent. We are a storage company; how can we ignore the price/cost of the drives? Heresy? Insanity?
Happy New Year — here’s to more storage economics dialogue in 2011.