New IoT programs often face a critical decision early in the development process, and the full ramifications of that decision are not seen until field deployments begin many months or even years later. Two groups are usually involved in the decision: the development team and product management. Never having done an IoT project before, neither has a base of experience to draw on.

The development team says, “What data do you want?”

The product manager says, “What data do you have?”

Development says, “We have this, but it may be expensive. What do you need?”

To which the product manager says, “We aren’t sure yet. Why don’t you send it all, and we will use the power of the Cloud to sort it out later?” Or someone else pipes up, “Wouldn’t it be cool if we…?”

The decision is made and work begins.

As the number of deployed devices grows, the DevOps team notices that message traffic is starting to stress the system. Data storage is growing far faster than anyone planned. In the meantime, the users are trying to sort through massive amounts of data to find the key values or events that matter.

DevOps solves the problem by adding more server capacity, which adds cost. Engineering says it will take a lot of effort to update the devices in the field, and besides, you still don’t know what you want…

Let’s go back in time and do the decision meeting again. This time we approach the problem from the perspective of a business requirement or product feature. Product management will have defined the stakeholders that need information, how that information will be consumed, and how it will produce value. The users have stated expectations. The business has a stated cost expectation. With that in hand, the user story for the developers is very crisp: we need this data, at this time or event, in this data store, so we can perform this type of presentation or analysis, within this cost model.
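One way to keep that user story honest is to write each data item down as a small record and check the total against the cost model before any firmware is written. The Python sketch below is only an illustration: the `DataRequirement` fields, the `within_budget` helper, the two sample requirements, and every number in it (payload sizes, fleet size, price per GB, budget) are hypothetical, and it counts data volume only, ignoring messaging and compute costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRequirement:
    """One data item tied to a stakeholder and a purpose (hypothetical fields)."""
    name: str
    stakeholder: str          # who needs the information
    trigger: str              # at what time or event it is sent
    data_store: str           # where it lands
    purpose: str              # the presentation or analysis it supports
    payload_bytes: int        # assumed size of one message
    messages_per_device_per_day: float

    def monthly_gb(self, device_count: int, days: int = 30) -> float:
        """Estimated data volume per month for the whole fleet, in GB."""
        total_bytes = (self.payload_bytes * self.messages_per_device_per_day
                       * days * device_count)
        return total_bytes / 1e9


def within_budget(requirements, device_count, cost_per_gb, monthly_budget):
    """Return (estimated monthly cost, True if it fits the cost model)."""
    total_gb = sum(r.monthly_gb(device_count) for r in requirements)
    total_cost = total_gb * cost_per_gb
    return total_cost, total_cost <= monthly_budget


# Illustrative requirements; names and figures are made up for this sketch.
requirements = [
    DataRequirement("pump_fault", "service team", "on fault event",
                    "alerts table", "dispatch a technician", 200, 0.1),
    DataRequirement("daily_runtime", "operations manager", "once per day",
                    "time-series store", "weekly utilization report", 100, 1),
]

total_cost, ok = within_budget(requirements, device_count=10_000,
                               cost_per_gb=0.50, monthly_budget=100.0)
print(f"Estimated monthly data cost: ${total_cost:.3f}, within budget: {ok}")
```

A "send it all" proposal can be run through the same check by adding it as one more record with its real payload size and message rate, which makes the cost of the decision visible in the meeting rather than in the field.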

Will product management get everything right? Probably not. But by starting with the destination in mind, they will have a list of specific data for identified purposes. As data comes in, learning will happen. The assumptions will be confirmed or disproved. Decisions can then be made on what data to drop and what new data to add in the next revision.

Best of all, the costs of too much data are not crippling the business case for continuing the project.

Postscript: Any time someone starts a suggestion for IoT data with the phrase “Wouldn’t it be cool if…”, the idea may be cool, but it is usually not a good business choice.