Data virtualisation is high on the agenda of many organisations. Traditional data warehouses can no longer keep up with the explosive growth of data volumes and user queries, and more and more companies are looking for an alternative architecture. The Logical Data Warehouse (LDW), in which data duplication is minimised as far as possible, is then the most obvious choice. By unlocking data virtually instead of physically, you can respond to changes faster and access new information much sooner. Only then can we truly speak of Agile BI.
Two justifiable questions
A professional data virtualisation platform forms the foundation for a logical data warehouse. Software products like Denodo Platform make it possible to integrate data sources into one uniform information model without duplicating data, regardless of their location or format. The possibilities such a platform offers in terms of connectivity, authorisation and performance optimisation enable companies to migrate to a logical data warehouse architecture exceptionally quickly. In practice, I see that data virtualisation often exceeds expectations, but also that many organisations struggle with the investment in new data integration software. In fact, they usually ask me the same two questions:
- Which alternatives can I choose between and what are the differences?
- How can I justify the investments to my organisation?
Both questions are legitimate and relevant. So relevant, in fact, that they should be answered at the earliest possible stage.
With respect to the alternatives: of course there are alternatives, and their purchase costs differ significantly. However, the golden rule of ‘you get what you pay for’ applies here too! The available data virtualisation solutions differ a great deal in functionality. At first glance they do more or less the same thing (virtualise data streams), but the devil is in the detail. The benefits you reap from a data virtualisation platform depend strongly on the (standard) functionality the software offers. So by all means consider alternatives, but investigate the costs and savings associated with each alternative properly, and in advance. Only then can you make a well-considered decision based on a realistic business case.
This takes us straight to the second question: how do I substantiate the investment in data virtualisation software? In this article, I explain how to put together a business case in four steps. This not only allows you to compare alternatives quickly; the business case will also show, rather soon, that your organisation stands to gain substantially from data virtualisation!
A business case in four steps
The business case should demonstrate that the investment in data virtualisation is justified, in other words that the revenue will exceed the costs. To determine this, we compare the costs to the expected revenue over time. For each alternative under consideration, you will have to answer these four questions:
- Which costs will we incur when implementing the new platform?
- Which costs are incurred in the current process?
- What are the potential savings of the new platform, compared to the current process?
- What opportunities does the new platform offer and of what value are they?
1 – Implementation costs
We start with the easiest question: what does it cost to implement a data virtualisation platform and to keep it up and running? Not just the initial costs, but the annual recurring costs as well. We can place these costs in three categories: licenses, infrastructure and personnel.
Licensing fees. The amount and type of licensing fees differ per product and often depend on the number of servers (DTAP environments, production clusters, etc.) and CPU cores. Calculate beforehand exactly which licenses your situation requires. A proof of concept based on realistic use cases can help with this.
Costs related to infrastructure. Take all servers into account that must be acquired for the data virtualisation platform, in addition to your existing infrastructure. Also remember to include database servers that might be required by the platform for data caching.
Personnel costs. The costs for setting up the platform depend markedly on the complexity of the platform, the available internal knowledge and the required external expertise. Best practices are available for many platforms, to give you a proper indication of the efforts that are required.
The last category shows the biggest differences between the various data virtualisation products. Comprehensive, mature platforms with ample standard functionality, like Denodo, may be more expensive to acquire, but definitely pay for themselves through simpler installation and configuration. Experience with ETL and SQL is enough to get started with such a platform.
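To keep the three cost categories comparable across alternatives, it helps to separate one-off from recurring costs and total them over a fixed horizon. A minimal sketch in Python, with purely illustrative placeholder figures (none of these amounts come from a real quotation):

```python
from dataclasses import dataclass

@dataclass
class CostCategory:
    initial: float  # one-off costs in year 1 (e.g. hardware purchase)
    annual: float   # recurring costs per year (e.g. license renewal)

def total_cost(categories: dict[str, CostCategory], years: int) -> float:
    """Sum one-off costs plus recurring costs over the given horizon."""
    return sum(c.initial + c.annual * years for c in categories.values())

# Hypothetical figures for illustration only.
implementation = {
    "licenses":       CostCategory(initial=0,      annual=120_000),
    "infrastructure": CostCategory(initial=40_000, annual=15_000),
    "personnel":      CostCategory(initial=80_000, annual=30_000),
}

print(total_cost(implementation, years=3))
```

Running the same calculation for each alternative makes the license, infrastructure and personnel trade-offs directly comparable.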
2 – Costs associated with the current process
In the second step, you identify the costs of the processes you currently use to unlock data and that you aim to replace with data virtualisation. An obvious example is the traditional BI process: the chain from source system to report. However, you can also analyse the data desk: the supply of information to external parties. What does it cost to manage your processes and to implement changes?
It is important to gain insight into the total cost of ownership (TCO) of the entire process, from analysis, development, testing and acceptance to deployment and management. This step amounts to a baseline measurement: you suddenly gain insight into the costs you currently incur to unlock information, and in my experience the results are often confronting. Even so, it is an essential step in determining which savings are feasible.
3 – The savings achieved with data virtualisation
This is where it gets interesting: what savings will the data virtualisation platform you are considering actually deliver? To find out, you need to know how much more efficiently you will work with the new platform. But how do you go about this?
First, you must think in detail about what future work processes will look like once you apply data virtualisation. It would be an illusion to think you can replace everything with data virtualisation overnight. So start by getting a clear picture of the near future:
- What portion of the data streams (in %) can be virtualised in the near future?
- For the streams that CAN be virtualised: what are the actual potential savings (in %) in terms of TCO?
- For the streams that CANNOT be virtualised: are there sub-steps in the processes where data virtualisation can still provide assistance and speed? And if so, what does this mean for the savings on these streams (in %) in terms of TCO?
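The three questions above can be combined into a single weighted estimate: the share of streams that can be virtualised times the savings on those streams, plus the remaining share times the (smaller) savings on their sub-steps. A sketch with hypothetical percentages, assuming a known current TCO:

```python
def blended_savings(current_tco: float,
                    virtualisable_share: float,  # fraction of streams that CAN be virtualised
                    savings_full: float,         # TCO savings on virtualised streams
                    savings_partial: float       # TCO savings on the remaining streams
                    ) -> float:
    """Weighted annual savings across virtualisable and non-virtualisable streams."""
    remaining_share = 1.0 - virtualisable_share
    return current_tco * (virtualisable_share * savings_full
                          + remaining_share * savings_partial)

# Illustration: EUR 1M annual TCO, 60% of streams virtualisable at 50% savings,
# the remaining 40% still saving 10% through faster sub-steps.
print(blended_savings(1_000_000, 0.60, 0.50, 0.10))
```

The point of the weighting is the nuance: even streams that cannot be fully virtualised may still contribute to the total savings.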
Once you have conducted this analysis, you will have a nuanced picture of the savings the relevant data virtualisation platform can deliver. This applies to analysis and development, but also to maintenance, management, testing and acceptance. The savings can be substantial, since unlocking data through data virtualisation is a much shorter and simpler process than in a traditional architecture where data is duplicated several times.
The best way to conduct the analysis is to map out (dissect) the processes and to pinpoint the savings per process step. Think realistically about how you plan to work with the relevant platform in the future, which functionality you will use and what knowledge you will need for it. Only then can you properly estimate the actual savings the platform will deliver. Use best practices, or ask for historical figures from companies that already work with a data virtualisation platform. A proof of concept is a very good way to substantiate and verify the results of your analysis.
Done right, this analysis will reveal major differences in the possible savings between platforms. The more functionality a platform offers for your specific situation, the greater the savings on your current processes. In my opinion, usability also plays a major role here: as soon as highly specialised knowledge is required to virtualise data streams, the potential savings shrink considerably. The more accessible the platform and the broader its functionality, the more you stand to save.
4 – New opportunities with data virtualisation
Often, a business case already comes out positive from the potential savings alone. That is great, but it is at least as interesting to look at the new opportunities that data virtualisation creates: opportunities you cannot capitalise on right now, but that your organisation could benefit from if more information were available, faster. Why? Because (big) data that you could not analyse in the past becomes accessible. Or because you can quickly link external information to your own data. Or perhaps simply because you can introduce changes to your data warehouse in a matter of hours instead of weeks or months.
In my opinion, this is where the greatest benefits of data virtualisation can be reaped. If it becomes much simpler to unlock new information fast, perhaps even in real time, this often leads to an immediate competitive advantage or even to entirely new business models. There is no ready-to-use recipe for this, because the opportunities differ per company. My only advice is to always draw up the business case together with representatives from the business. Take the time to properly inform people about the data virtualisation concept and to think creatively about potential opportunities. Explore all the opportunities and think outside the box! Consider what each opportunity could mean for your organisation and quantify it in monetary terms. Once you have done that, you will have all the building blocks for your business case.
Pros and cons
The thinking work is done and now it’s just a matter of adding and subtracting. The first version of your business case is ready.
In the example below, I compared the investment in the implementation of a ‘medium-scope’ data virtualisation platform (Denodo, in this case) to a conservative estimate of the costs of a traditional data warehouse. Converted into a multi-year ROI overview, you will see that, even with a conservative estimate, the investment is already recovered in the second year, despite the fact that only 25% of the savings and revenue are counted for the first year.
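The mechanics of such a multi-year ROI overview can be sketched as follows. All figures here are hypothetical placeholders, not the numbers from my example; the first-year factor models the ramp-up in which only 25% of the savings and revenue count:

```python
def cumulative_cash_flow(investment_y1: float, annual_cost: float,
                         annual_benefit: float, years: int,
                         first_year_factor: float = 0.25) -> list[int]:
    """Cumulative net position per year; benefits ramp up in year 1."""
    cumulative, result = 0.0, []
    for year in range(1, years + 1):
        benefit = annual_benefit * (first_year_factor if year == 1 else 1.0)
        cost = annual_cost + (investment_y1 if year == 1 else 0)
        cumulative += benefit - cost
        result.append(round(cumulative))
    return result

# Hypothetical figures: the cumulative position turns positive in year 2,
# i.e. the initial investment is recovered in the second year.
print(cumulative_cash_flow(investment_y1=200_000, annual_cost=150_000,
                           annual_benefit=450_000, years=4))
```

The break-even year is simply the first year in which the cumulative position is no longer negative.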
This example is no exception. Actual practice shows that organisations often recover the investments in data virtualisation over a very short period.
That’s only the start
You’ve now drawn up the first version of your business case, or perhaps multiple versions for different alternatives. This gives you the information to choose one or more platforms for further investigation in a proof of concept (PoC). In this PoC, you can verify and substantiate the assumptions from your business case. You can supplement and fine-tune the business case based on the results of the PoC. This will give you all the required information to make a well-considered decision to invest in data virtualisation.
Also, it doesn’t stop after this decision. The business case gives you a tool with which to continue to monitor and adjust the process – after implementation – so you can capitalise on the intended savings and revenue in actual practice!