Picking up from where I left off in the last post, let's now discuss why and how leveraging this mammoth volume of data can revolutionise how a business organisation is run. This is the second part of a two-part blog post; if you haven’t read part 1 (Data Revolution … Big Data), I highly recommend you read it before delving into this one.
A use-case: Entertainment industry
The example in figure 1 shows how the movie industry went from a product oriented business model to a data driven model. The earliest businesses focussed solely on selling VHS tapes and DVDs to consumers. This model gives the consumer poor value, as there is little lasting value in owning a movie: once you have watched it, most of the value is gone.
The industry then evolved into a service oriented one. Instead of transferring ownership of the good to the consumer, the vendor would sell a subscription and make a collection of media available to its members. As the internet boomed and hardware flourished, innovative companies like Netflix and Spotify took away the whole physical aspect of having to collect a DVD or CD and return it to a physical location. With the improvement of internet technologies, content is now delivered to consumers digitally. At this point, the service oriented business has evolved into a technology focussed model, where the business process and the customer experience have been completely changed by new technology.
This technological trend has enabled businesses to capture more user behaviour signals, and advances in software have let these companies analyse this digital footprint and make more informed decisions to further improve customer experience. This is what transforms technology focussed companies into data driven business organisations. Using techniques such as Machine Learning, Data Science and Computational Statistics, these organisations have found new ways to improve components such as content search, information retrieval and personalisation, which improves the user experience drastically. The availability of big data, and of tools to manage this data source, is the key driver of this transformation. Methods such as Machine Learning and Data Mining complement this trend by offering better ways to create disproportionately large value from the data that is being captured.
What this use case suggests is that the business landscape is truly changing with the emergence of new technologies. This gives you an opportunity to use these methods to enhance your business organisation and reap the benefits of the technologies that have been developed in the last decade.
But naively applying big data and machine learning to every business can greatly harm the return on investment (ROI). Jumping onto the big data and machine learning bandwagon too aggressively, without caution, carries risks. Some obvious points to keep in mind are:
- Big data and Machine Learning are not for every company. It doesn’t make sense to invest big money in infrastructure and personnel if your business has very little value to extract from the data you accumulate.
- It helps greatly if the stakeholders of the organisation (such as directors, accountants and financial controllers) are aware of the end goal of these projects.
- If the endeavour is a long term one, it is quite important that your superiors have the patience to see it through.
- Often, big data and machine learning projects might not have direct $$$ numbers that can be associated with them although they bring in great value passively.
For the reasons above, it makes a lot of sense to follow a systematic approach that keeps these factors under control and reduces the risk of your vision going down in the books as one of those “unnecessary financial blackholes”.
One very popular approach to applying data intelligence in business organisations is the DIKW pyramid. DIKW stands for Data –> Information –> Knowledge –> Wisdom. The DIKW pyramid is a popular representation of how data can be turned into a valuable resource by adding more and more context. It is a systematic approach that starts with the raw forms of data and uses a step by step strategy to develop features that add value by refining the useful signals from that data.
Before analysing how to use this tool, let's understand what Data, Information, Knowledge and Wisdom are.
As you can see from figure 2, the left side represents the pyramid while the right side describes what data looks like at each stage. Data can be harnessed at different levels to extract increasingly valuable information from it.
- Data consists of values (letters, numbers, symbols, etc.) in their raw form. It is evident from figure 2 that the bottom layer is a bunch of values that do not mean anything by themselves.
- Information is data enriched with some context. By this stage, additional detail has been added to the data to make more sense of it. The value “Red” now has some meaning, as there is an additional piece of information that “it relates to traffic lights”. The two values in degrees are also meaningful now, as together they identify a location.
- Knowledge is understanding the patterns in the information. As the right side of the figure shows, there is now additional information about the traffic light and its relationship with traffic violations.
- Wisdom is being able to react using the knowledge at hand. Based on the knowledge of the recognised pattern, i.e. “crossing red traffic lights triggers traffic violations”, it is now possible to react to that knowledge and actively avoid negative consequences.
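To make the four levels a little more concrete, here is a minimal Python sketch of the traffic light example. It is only an illustration: the coordinate values, the `Observation` class and the function names are all made up for this post and do not come from figure 2.

```python
from dataclasses import dataclass

# Data: raw values with no context (the coordinates are hypothetical).
raw_values = ("Red", 6.9271, 79.8612)


@dataclass
class Observation:
    """Information: the same values, now with context attached."""
    light_colour: str   # "Red" relates to a traffic light
    latitude: float     # the two degree values form a location
    longitude: float


def to_information(values):
    """Attach context to raw values by naming what each one means."""
    colour, latitude, longitude = values
    return Observation(colour, latitude, longitude)


# Knowledge: a pattern recognised in the information,
# i.e. crossing a red light triggers a traffic violation.
VIOLATION_COLOURS = {"Red"}


def crossing_is_violation(observation):
    """Apply the known pattern to a single observation."""
    return observation.light_colour in VIOLATION_COLOURS


def decide(observation):
    """Wisdom: act on the knowledge to avoid the negative consequence."""
    return "stop" if crossing_is_violation(observation) else "proceed"


observation = to_information(raw_values)
print(observation, "->", decide(observation))
```

Each step adds context to what came before: the tuple on its own says nothing, while the final function turns the same values into an action.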
What happens throughout this process is that we continuously enrich data with additional context, which yields increasingly actionable information. Now let’s see how this pyramid is applied in the real world.
Different data related operations fall into different stages of the DIKW pyramid. The base is mainly about getting the right data into the system and unifying the data coming in. The data layer mainly consists of building the right data sensors, ones that can capture the right data in the right volume and at the right velocity. In this phase, the enrichment of data is minimal. The main focus is to use the raw data to build accurate representations that are useful. The main activities can involve things like de-duplication of data, extreme value removal and making sure that the data is captured and stored as reliably as required.
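As a rough illustration of the data layer activities mentioned above, the sketch below de-duplicates a small batch of readings and then drops extreme values. The record layout and the sample numbers are hypothetical, and the 1.5 × IQR rule is just one common convention for flagging extreme values, not the only option.

```python
import statistics


def deduplicate(records):
    """Drop exact duplicate records, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def remove_extremes(records, field, k=1.5):
    """Keep records whose `field` lies within k * IQR of the quartiles."""
    values = [record[field] for record in records]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [record for record in records if low <= record[field] <= high]


# Hypothetical sensor readings: one exact duplicate and one extreme value.
records = [
    {"id": 1, "value": 10.0},
    {"id": 2, "value": 11.0},
    {"id": 2, "value": 11.0},   # duplicate of the previous record
    {"id": 3, "value": 12.0},
    {"id": 4, "value": 13.0},
    {"id": 5, "value": 11.5},
    {"id": 6, "value": 12.5},
    {"id": 7, "value": 500.0},  # likely a faulty reading
]

unique = deduplicate(records)
cleaned = remove_extremes(unique, "value")
print(len(records), len(unique), len(cleaned))
```

In practice the threshold `k`, and whether extreme values should be dropped or merely flagged, depends on the business context, so treat these defaults as placeholders.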
The information and knowledge layers of the pyramid are where data mining and machine learning are used to enrich the captured data with more context. This is the data that is mainly consumed in the reports and dashboards that give visibility into the business process. In this phase, a lot of pattern recognition, statistical inference and prediction is used to enrich the data.
On the top, we have the action phase, which uses wisdom to actively take decisions. Based on traffic prediction, pattern recognition on orders and so on, a distribution company can forecast stocking, warehousing and distribution more effectively. A trading algorithm uses the knowledge it extracts from numerous data sources, including stock movements, weather data, news and social media, to execute trades.
In summary, the DIKW pyramid is a good approach for introducing incrementally more complex and valuable data driven features to a business, without forcing the organisation into leaps of faith that are too large for it.
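The distribution example can be sketched in a few lines of Python. This is a deliberately simple illustration under my own assumptions: a moving average stands in for whatever forecasting model a real company would use, and the order numbers, function names and safety stock figure are all hypothetical.

```python
import math


def moving_average_forecast(history, window=3):
    """Knowledge: forecast the next period as the mean of the last `window` periods."""
    if len(history) < window:
        raise ValueError("need at least `window` observations")
    recent = history[-window:]
    return sum(recent) / window


def reorder_quantity(forecast, stock_on_hand, safety_stock=0):
    """Wisdom: turn the forecast into an action, i.e. how many units to order."""
    needed = forecast + safety_stock - stock_on_hand
    return max(0, math.ceil(needed))


# Hypothetical weekly order counts for one product.
weekly_orders = [120, 135, 128, 140, 150, 145]

forecast = moving_average_forecast(weekly_orders)
to_order = reorder_quantity(forecast, stock_on_hand=100, safety_stock=20)
print(f"forecast: {forecast:.1f} units, order: {to_order} units")
```

The point is the split rather than the model: the forecast sits in the knowledge layer, while the reorder decision is the action taken on top of it.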
Applying the DIKW pyramid to everyday businesses
As explained in the sections above, the data layer is mainly leveraged by building the right sensors and data ingestion infrastructure to store the data. Once the data layer has useable data, it is possible to start building components that will lead to higher levels in the DIKW pyramid. Figure 4 below gives a concise summary of different levels of complexity that can be introduced incrementally to transform your organisation from a data warehouse into a proactive data driven engine.
The most important thing to bear in mind during this transformation is that a business organisation is a system that needs all its nuts and bolts to turn for the business to move forward. A data transformation cannot be achieved single-handedly by enhancing the technology that drives the business. It is enabled by improving the technology, the people and the processes that make the business operate as one unit. If you fail to keep all these aspects in mind when introducing change, it can have dire consequences.
Summary
- With technological advances leading to cheaper yet more powerful infrastructure, data collection, processing and storage have become simple and feasible.
- Seeing the opportunity, business organisations have adopted a culture where they record as much data as possible → Big Data
- With the emergence of big data, methods and techniques to extract insight from this mammoth volume of data have emerged → Data Science and machine learning
- With the proper use of Big Data and Data Science, a business can unlock many opportunities to evolve into a data centric organisational culture.
These two posts are a summary of an invited talk I gave at the Post Graduate Institute of Agriculture at the University of Peradeniya in Sri Lanka in January 2017. The slides from the presentation can be found below: