The goal of data analysis is simple – inform a business decision. Whether you are testing a hypothesis or explaining a pattern, the goal is the same.
The buzz in the data space is often about advanced Machine Learning (ML), Robotic Process Automation (RPA), or some other advanced functionality. Those capabilities are important – they bring distinct business value, and your analytics platform should accommodate them. However, this should not be the first step in your analytics journey.
What we do not talk about enough in the data space are the foundational data aspects – and these are not just technical in nature. Before you get your user community to actively engage with your analytics environment, for example, you must build trust in the data. Data modeling and analytics practitioners tend to think about this in terms of “master data” or governance. Those are necessary supporting activities, but data trust – from a user’s perspective – is more fundamental than that. Data trust occurs when a user can see a number, drill into the details and quickly explain “why” something is occurring in simple terms.
Your data environment does not have to have every piece of information conceivable. But it does need to be set up in a way that allows the business community to ask an unanticipated question (around a set of subject areas) and quickly determine an answer.
Put yourself in the end user’s position: They are performing analysis based on a request, likely from their management team. Decisions are perishable. They must be made within a given timeframe to be effective. And people want to make decisions with as much information as possible. At the same time, decisions must be made regardless of whether complete data is available or not.
Data trust is about being able to be confident in what your analysis is telling you and being able to explain why, quickly. This requires access to information from myriad sources (or modules) in the modern enterprise.
Historically, technology implementations focused on answering known questions – and even then, they took months or years to deploy. Data trust requires integrating data access in an agile fashion to support the ever-increasing pace of the marketplace. Legacy technologies and approaches do not work that way. They were built for the world, as it was, two decades ago. Data trust requires a different approach.
Supporting an Agile Business and Building Trust with Direct Data Mapping
Incorta’s Direct Data Mapping allows you to traverse databases and join data from multiple sources at scale, without memory consumption climbing steeply with every additional join.
So, what can you actually do with it? Here’s one big thing: In near real time, you can have access to very detailed data from Oracle EBS to understand patterns you’re seeing and make business decisions based on trusted information, as opposed to going with your gut.
When we think about the data in Oracle EBS, we’re talking about a platform with something like 138 different functions and databases: inventory, order management, transportation management, accounts receivable, accounts payable, etc. All of this information is stored in different areas inside of EBS, and within each area you might have 50,000 or 60,000 different tables, comprising billions upon billions of lines of information.
Because of the size and complexity, there are some inherent difficulties in trying to work with system data – the primary one being that EBS is designed to be a system of record. It’s not designed for people to run heavy-duty analytics queries against it.
For example, let’s say you’re a major retailer. Every time you make a sale, multiple records are generated and stored in various databases. If you want to do analytics to answer a basic question such as, “what is my cost of sales?” the data could come from 10 different systems and require you to churn through a billion rows. But when you start pulling that much data from EBS, it slows down because of the memory and processing power that requires. You can’t slow down the source system you’re using to sell something to somebody in the store and record that transaction, so you have to work around that.
Deciding with Incomplete Data
You could replicate the data, but you still may not have the processing power to answer the question, or it may be cost prohibitive. There’s also a question of latency; the data is out of date as soon as you copy it.
Up until now, the solution has been to have an IT person or an analyst build a cube or data mart for you. They will go to those 10 tables with the billion rows and construct a small data set for you that allows you to do pivot tables or something of that nature. It’s going to take weeks for them to do that for you, and you will probably have to learn some SQL to make sense of it.
And then you will probably have more questions. You can get more answers, but it will take another couple of weeks and you still may not get to the bottom of it. Eventually you are going to have to go with the data you have and your gut. This has been a pain point for decades.
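The limitation of the cube approach can be shown with a minimal sketch. Everything in it is an assumption made for illustration: the sample rows, the column names (region, month, sku, cost) and the build_cube helper are hypothetical and are not taken from EBS or from any vendor's product. It pre-aggregates cost over the dimensions someone asked for up front, then shows what happens when the next question involves a dimension the cube never kept.

```python
from collections import defaultdict

# Hypothetical detail rows; in practice these would be the billion-row source tables.
detail = [
    {"region": "East", "month": "Jan", "sku": "A", "cost": 4.0},
    {"region": "East", "month": "Jan", "sku": "B", "cost": 2.5},
    {"region": "West", "month": "Jan", "sku": "A", "cost": 4.0},
    {"region": "West", "month": "Feb", "sku": "C", "cost": 1.5},
]


def build_cube(rows, dimensions, measure):
    """Pre-aggregate one measure over a fixed set of dimensions."""
    cube = defaultdict(float)
    for row in rows:
        cube[tuple(row[d] for d in dimensions)] += row[measure]
    return dict(cube)


# The anticipated question: cost by region and month.
cube = build_cube(detail, ("region", "month"), "cost")
east_jan = cube[("East", "Jan")]

# The unanticipated follow-up: cost by SKU. The cube's keys no longer carry
# the SKU, so the only option is to go back to the detail rows and rebuild.
by_sku = build_cube(detail, ("sku",), "cost")
```

The second call to build_cube is the "another couple of weeks" in miniature: the summary cannot answer the new question, so someone has to return to the detail and construct a new data set.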
Adding a Bigger Engine
To address it, data analytics software vendors have tried to find ways to make data transformation a little simpler; to shave a week off here and there. They’ve thrown more memory and processing power at the questions – an Oracle Exadata box, for example. They’ve given people visualization tools to make it easier to make sense of the data.
Instead of coming up with a better car, most vendors have been putting a bigger and bigger engine in it to the point where we’ve got an engine that’s wider than the two lanes of road we’re trying to drive down.
The business climate over the last decade has become more and more unforgiving of this approach. If we had the processing power of today with the volume of information and the cultural expectations around information of 20 or 30 years ago, it would be unreal how fast things would operate. But now we find ourselves in a situation where the process by which we do analysis is not different, but the amount of data and the speed at which we demand answers are radically different. You’ve got to be able to deliver data faster and at a level of detail that was previously inconceivable.
Creating a Map of the Data
Direct Data Mapping provides a way to do this. Instead of assuming that you have to process data in a certain way, Incorta creates a map of the data.
When you type a query into a search engine like Google, for instance, you’re going to get an enormous number of results (not all of them relevant, but that is beside the point). That’s because the search engine is constantly indexing information on the internet so that for any given query, it knows where to find related information. If it had to do an on-demand check of every web page in response to every query, it would be unusable.
The way Direct Data Mapping operates is similar. Instead of holding your question in memory, looking for the matching data, holding those matches in memory, looking for more matches and holding them in memory, etc., our engine looks at the data as it is consumed and determines how all of it is related.
Imagine you are looking for a needle in a haystack. Following the old way, you have to bring in the entire hayfield, bale the hay, and then take apart every bale until you find the needle. With Direct Data Mapping, we walk the field of hay and map it, and we know exactly where the needle is: 100 yards over there and two inches deep. As more hay grows (more data comes in) we’re constantly updating the map. Now we’ve got a nail sitting over here, a hundred yards the other way, four inches deep. We saw that come in yesterday. And then, by the way, if you’ve got barley, we’ll map out the barley next. Incorta is doing that for all the data in your 138+ subject areas in Oracle EBS.
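The general idea of mapping data as it arrives, rather than searching it on every question, can be sketched in a few lines. This is a toy illustration of index-on-ingest and not a description of how Incorta's engine is actually built; the MappedTable class, the full_scan helper and the sample order lines are all hypothetical names and data invented for the example.

```python
from collections import defaultdict


class MappedTable:
    """Toy table that records where each key value lives as rows arrive."""

    def __init__(self, key):
        self.key = key
        self.rows = []
        self.positions = defaultdict(list)  # key value -> row positions

    def ingest(self, row):
        # Update the map at load time, not at query time.
        self.positions[row[self.key]].append(len(self.rows))
        self.rows.append(row)

    def lookup(self, value):
        # Go straight to the recorded positions; no scan of the table.
        return [self.rows[i] for i in self.positions.get(value, [])]


def full_scan(rows, key, value):
    """The 'take apart every bale' approach: check every row on every query."""
    return [row for row in rows if row[key] == value]


# Hypothetical order lines arriving from a source system.
order_lines = MappedTable("order_id")
for order_id, sku, cost in [(1, "A", 4.0), (2, "B", 2.5), (1, "C", 1.5), (3, "A", 4.0)]:
    order_lines.ingest({"order_id": order_id, "sku": sku, "cost": cost})

mapped = order_lines.lookup(1)
scanned = full_scan(order_lines.rows, "order_id", 1)
cost_of_order_1 = sum(line["cost"] for line in mapped)
```

Both approaches return the same rows. The difference is where the work happens: the scan touches every row for every question, while the map is updated once per incoming row and then consulted directly, which is the point of the hayfield analogy.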
Answering Unlimited Questions
I don’t mean to trivialize the math or the technology, because it’s very advanced. The radical idea here is that instead of trying to tack on something to make the existing process work better, Incorta is doing it differently. Now, instead of having something that can answer 10 questions, you have something that can answer 1,000 questions.
It’s important to note that Direct Data Mapping doesn’t change or replace EBS. It allows you to interact with and analyze the data that’s already there.
When you have a product that generates as much data as EBS – but don’t have a system that allows you to easily work with it in that granularity – then the best you can do is inspect a few hay bales and hope the needle is in there. And if it’s not, you have to go back and bale more hay.
You don’t have time for that. Every decision is perishable. You will make it with all of the data, or you will make it with whatever you have at hand. Because of the limitations of technology and our inability to keep pace with the growth of data and the expectations of an ever more dynamic business mindset, we have created a situation where you often can’t get answers to the hard questions.
And that’s dangerous. That’s what should be keeping people up at night – the questions people aren’t asking anymore because they’ve given up on ever getting an answer.
Providing answers to unanticipated questions and enabling a culture of curiosity: that’s power. That’s what people want. Not just data, but the ability to break free from the preconceived and predefined. That is what Direct Data Mapping enables, and that is where competitive advantage lies.
Mike Nader is Managing Director at EY. He formerly worked at Incorta, where he served as Principal Technical Evangelist & Architect, Head of Channels, and Head of Product Marketing.