In my recent blog posts, I’ve focused on data analytics, including deep dives on the methods and approaches that go into building IoT solutions such as Bsquare DataV. Now I’d like to move up a level and outline the steps that businesses need to take to put these approaches to work effectively. I believe four steps are common to any effective analytic problem-solving approach for data-driven decision-making, and that they translate directly to business processes:
Step 1: Develop a clear understanding and articulation of the problem you are trying to solve. A useful template is: “If I had an answer to X, it would create business value for me because it would enable me to do Y.” Understanding the problem also means being able to articulate the business action or response.
Step 2: Enumerate the data sources and types of data that are available to answer your business questions. Contrary to common practice, this should always come after identifying the business problem you are trying to solve. Ask four critical questions:
Question: What data do you already have that you need? In traditional data-driven decision-making, people start with the data and say, “Here’s what we have, so this is clearly the data that we need. Let’s start asking questions of it!” When data requirements end at this stage, customers are often left feeling very unfulfilled with their end solution.
Question: What data do you have that you don’t need to include in your solution? This is a very important step because it helps clear the space and focus the solution. It reduces complexity in ETL (“extract, transform, load”) and implementation, and often reduces the cost and pain of getting at answers. Only pull in data if it clearly helps to answer a specific question.
Question: What data do you not have, but can get? Quite often there is simple publicly available data, or a simple enrichment step that can get you the additional data you need. If you don’t ask the business question first, you’ll never know that you could have had it – and likely will never explore the data that you already have to answer the question that you really wanted to ask.
Question: What data do you need to answer the question, but do not have and have no possible way of getting? This is a showstopper and should not be ignored. Identifying it upstream significantly reduces downstream problems and broken analytic promises. Finding such a gap is not necessarily a bad thing; it just means that you need to revisit the first question and, if necessary, re-scope the analytic solution.
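The four questions above can be sketched as a small triage routine. This is only an illustration under stated assumptions: the `Bucket` names, the `DataSource` fields, and the example sources are all hypothetical and are not part of DataV or any other product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Bucket(Enum):
    # One bucket per question in Step 2 (names are illustrative).
    HAVE_AND_NEED = "have and need"
    HAVE_NOT_NEEDED = "have, but leave out"
    CAN_GET = "do not have, but can get"
    CANNOT_GET = "need, but cannot get"


@dataclass
class DataSource:
    name: str
    needed: bool      # does it help answer the business question?
    on_hand: bool     # do we already have it?
    obtainable: bool  # could we acquire it (public data, enrichment)?


def classify(src: DataSource) -> Optional[Bucket]:
    """Place a data source in one of the four buckets, or None if irrelevant."""
    if src.on_hand:
        return Bucket.HAVE_AND_NEED if src.needed else Bucket.HAVE_NOT_NEEDED
    if not src.needed:
        return None  # neither held nor needed: outside the inventory
    return Bucket.CAN_GET if src.obtainable else Bucket.CANNOT_GET


# Hypothetical inventory for a fleet fuel-consumption question.
sources = [
    DataSource("engine telemetry", needed=True, on_hand=True, obtainable=True),
    DataSource("office badge logs", needed=False, on_hand=True, obtainable=True),
    DataSource("public weather feed", needed=True, on_hand=False, obtainable=True),
    DataSource("competitor fuel contracts", needed=True, on_hand=False, obtainable=False),
]

inventory = {src.name: classify(src) for src in sources}
showstoppers = [name for name, b in inventory.items() if b is Bucket.CANNOT_GET]
print(showstoppers)
```

Any source that lands in the last bucket is a signal to re-scope before implementation starts, rather than after.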
Step 3: Identify the audience to whom you are trying to communicate your results. This will largely determine what the deliverable or product should do and how the data should be visualized. Typical approaches include dashboards, automated reporting, or alerts, but understanding your audience is critical.
Step 4: Understand the complexity and dimensions of the space you are analyzing. How complex is the space, and can it be reduced to layers of two or three dimensions? The answer depends on a clear enumeration of your data types. The complexity of the space also directly affects how you plan to communicate insight to the audience you identified in Step 3. Simplicity is the key to effective data visualization, which is the way analytic results are communicated. It is very easy to add tremendous chaos to your data analytics process by underestimating the complexity of more than two or three dimensions.
To better understand what is meant by dimensions, let’s consider a simple example from the transportation industry. For a semi-trailer rig, how is fuel consumption (e.g., mileage) related to engine horsepower? Generally, you can picture what the relationship would look like: as engine horsepower goes up, so too does the consumption of fuel, i.e., a positive correlation between a single independent variable and a single dependent variable.
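As a minimal sketch of this two-dimensional case, the snippet below computes a Pearson correlation coefficient between horsepower and fuel consumption. The numbers are invented purely for illustration and are not measurements from any real fleet; the variable names are likewise assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up illustrative values: one independent and one dependent variable.
horsepower = [400, 425, 450, 475, 500, 550]
fuel_per_100_miles = [14.1, 14.6, 15.0, 15.8, 16.1, 17.2]  # gallons

r = pearson(horsepower, fuel_per_100_miles)
print(f"correlation: {r:.2f}")  # a value near +1 indicates a strong positive relationship
```

With only two variables, a single number or a single scatter plot is enough to communicate the relationship.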
Now let’s make this simple example more complex and messy: How would you visualize the relationship between fuel consumption, type of route traveled (in-city, long haul, short haul), engine horsepower, truck configuration, time of day, and driver experience level, all in the same visual? Not easily! But each of these variables could be analyzed and characterized separately in a very meaningful way.
When the dimensionality of the space is complex, the best approach is to break the problem into multiple layers of analysis, each with a simple two- or three-dimensional component. The DataV platform from Bsquare can deliver analysis in multiple (in fact, an arbitrary number of) dimensions, but findings are best reported in a simpler form.
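One way to picture the layered approach is to summarize fuel consumption against one factor at a time, so that each layer stays two-dimensional. The sketch below assumes a hypothetical list of trip records with made-up values; the field names and the `mean_by` helper are illustrative and do not reflect how DataV works internally.

```python
from collections import defaultdict


def mean_by(records, factor, target="fuel"):
    """Average the target value for each level of a single factor."""
    groups = defaultdict(list)
    for row in records:
        groups[row[factor]].append(row[target])
    return {level: sum(vals) / len(vals) for level, vals in groups.items()}


# Made-up trip records (fuel in gallons per 100 miles).
trips = [
    {"route": "in-city", "experience": "junior", "fuel": 18.0},
    {"route": "in-city", "experience": "senior", "fuel": 16.0},
    {"route": "long haul", "experience": "junior", "fuel": 14.0},
    {"route": "long haul", "experience": "senior", "fuel": 12.0},
]

# Each layer relates fuel consumption to exactly one other dimension.
layers = {factor: mean_by(trips, factor) for factor in ("route", "experience")}

for factor, summary in layers.items():
    print(factor, summary)
```

Each entry in `layers` can then be reported as its own simple chart, instead of forcing every variable into one visual.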