
We can all agree that we live in a world increasingly driven by data and information. In fact, the pace is accelerating, and the amount of data being collected is growing almost unabated. Moreover, the data being generated comes not only from our interactions with the digital world through various touch points (e.g., mobile devices, smartphones, the Internet), but also from a vast array of digital sensors that surround us. They are everywhere – on the street, in the subway, on the highway, in the sky, and even on our bodies as wearable devices that track our fitness and health.

How can we make sense of this data?
Before we answer that question, we need to understand the nature of the data and where it comes from. Let’s address the problem from a data science point of view.

Collection

Because data can exist in multiple systems, each with varying capabilities and complexities, and each focused on a particular goal or need, it is essential for any technology solution to remain agnostic about how the data was gathered and where it originated.
Although standardized formats and content are preferable, the solution must remain flexible enough to accept data regardless of format or means of connectivity.
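
As a minimal sketch of this source-agnostic approach (all names here are hypothetical, not from any particular system), the snippet below accepts readings as either a JSON or a CSV payload and converts both into one in-memory representation:

```python
import csv
import io
import json

def ingest(payload: str, fmt: str) -> list[dict]:
    """Parse a raw payload into a list of reading dicts,
    regardless of the wire format it arrived in."""
    if fmt == "json":
        data = json.loads(payload)
        # Accept either a single object or a list of objects.
        return data if isinstance(data, list) else [data]
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(payload)))
    raise ValueError(f"unsupported format: {fmt}")

# Two sources, two formats, one common representation.
readings = ingest('{"sensor": "s1", "temp_c": 21.5}', "json")
readings += ingest("sensor,temp_c\ns2,20.1", "csv")
```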

Transformation

A critical piece of the data analytics puzzle is the transformation of the acquired data into a single format that can be queried, alerted on, and reported against. During this step, it may also be necessary to perform some data cleansing to ensure that inaccurate readings (e.g., from a sensor that is disconnected or malfunctioning) do not pollute the integrity of the data set.
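
A hedged sketch of that cleansing step might look like the following, where the plausible temperature range is an assumed threshold for illustration, not one prescribed by any particular system:

```python
PLAUSIBLE_RANGE = (-40.0, 60.0)  # assumed bounds for an outdoor sensor, in Celsius

def clean(readings: list[dict]) -> list[dict]:
    """Drop readings that are missing a value or fall outside the
    plausible range (e.g., a disconnected sensor reporting zero)."""
    cleaned = []
    for r in readings:
        value = r.get("temp_c")
        if value is None:
            continue  # the sensor sent no reading at all
        value = float(value)
        if PLAUSIBLE_RANGE[0] <= value <= PLAUSIBLE_RANGE[1]:
            cleaned.append({**r, "temp_c": value})
    return cleaned
```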

Normalization

Gathering and funnelling data from disparate sources is only half the battle; the data must be normalized to be useful. Normalization involves not only merging potentially different representations of the same user into a single view but also tweaking and translating the data into a common representation (e.g., units of measurement).
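
For example, a minimal normalization pass, assuming some sources report temperatures in Fahrenheit and identify the same user by different aliases (the lookup table below is made up for illustration), could be sketched as:

```python
def to_celsius(value: float, unit: str) -> float:
    """Translate temperature readings into a common unit of measurement."""
    return (value - 32.0) * 5.0 / 9.0 if unit == "F" else value

# Hypothetical lookup that merges different representations of the same user.
USER_ALIASES = {"alice@example.com": "user-42", "a.smith": "user-42"}

def normalize(reading: dict) -> dict:
    """Collapse aliases into a single user view and unify units."""
    return {
        "user": USER_ALIASES.get(reading["user"], reading["user"]),
        "temp_c": to_celsius(reading["value"], reading.get("unit", "C")),
        "ts": reading["ts"],
    }
```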

Anonymization

In some instances, it is also very important to mask or encrypt certain details in the incoming data in order to protect the privacy of the source.
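
One common way to do this, shown here only as an illustrative sketch, is to replace direct identifiers with a salted one-way hash before the data is stored:

```python
import hashlib

SALT = b"replace-with-a-secret-salt"  # assumed placeholder; keep real salts out of source control

def anonymize(reading: dict) -> dict:
    """Replace the user identifier with a salted one-way hash so records
    can still be grouped per user without revealing who the user is."""
    digest = hashlib.sha256(SALT + reading["user"].encode()).hexdigest()[:16]
    return {**reading, "user": digest}
```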

Aggregation

To make use of data, it is often necessary to summarize it along one or more dimensions of the dataset (e.g., summarizing readings by hour or by day) in order to conduct statistical comparisons along other dimensions. To illustrate, comparing the temperature at 12:01 to the temperature at 12:02 may not be very interesting, but comparing the average temperature from 12:00-12:15 across days might be highly valuable.
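
Sticking with the temperature example above, a sketch of that aggregation, bucketing readings into 15-minute windows and averaging each bucket, might be:

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def aggregate_15min(readings: list[dict]) -> dict:
    """Average temperature readings within each 15-minute window,
    e.g., all readings from 12:00-12:15 collapse to one value."""
    buckets = defaultdict(list)
    for r in readings:
        ts = datetime.fromisoformat(r["ts"])
        # Floor the timestamp to the start of its 15-minute window.
        window = ts.replace(minute=ts.minute - ts.minute % 15,
                            second=0, microsecond=0)
        buckets[window].append(r["temp_c"])
    return {window: mean(values) for window, values in buckets.items()}
```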

Interpretation

Determining whether data is actionable requires interpreting its value against a known reference (e.g., “normal body temperature”) or against the underlying statistics of the dataset (e.g., the number of standard deviations from the mean). In an increasing number of cases, the proper action must also be determined automatically and within a very short time window.
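
The statistical comparison mentioned above can be sketched as a simple z-score test; the three-standard-deviation threshold here is an assumption for illustration, not a universal rule:

```python
from statistics import mean, stdev

def is_actionable(value: float, history: list[float],
                  threshold: float = 3.0) -> bool:
    """Flag a reading that sits more than `threshold` standard
    deviations from the mean of the historical values."""
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return value != mu  # any deviation from a constant signal is notable
    return abs(value - mu) / sigma > threshold
```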

Visualization

Once the data has been unified and consolidated, we need to overlay different components that may seem unrelated to achieve a global picture and interpretation of the data. This means that we need to present the data to the user in a way that is contextually relevant and time critical.
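
As a minimal illustration of overlaying seemingly unrelated components on one timeline, the sketch below (using matplotlib, with entirely made-up data) plots temperature and an activity level against a shared time axis:

```python
import matplotlib.pyplot as plt

hours = list(range(24))
temperature = [18 + (h % 12) * 0.5 for h in hours]  # made-up hourly temperature
activity = [abs(12 - h) * 4 for h in hours]         # made-up activity level

fig, ax_temp = plt.subplots()
ax_temp.plot(hours, temperature, color="tab:red", label="temperature (C)")
ax_temp.set_xlabel("hour of day")
ax_temp.set_ylabel("temperature (C)")

# A second y-axis lets an unrelated signal share the same timeline.
ax_act = ax_temp.twinx()
ax_act.plot(hours, activity, color="tab:blue", label="activity level")
ax_act.set_ylabel("activity level")

fig.legend(loc="upper right")
plt.show()
```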

The Power of Real-Time Data Analytics

We all strive for order, predictability, and simplicity. But we live in a world increasingly driven by and dependent on data generated by a myriad of different inputs. How can we make sense of this data? The answer lies in real-time data analytics: the ability to accurately identify, capture, and track multi-sensor data in order to enable more immediate, effective, and actionable decision making about our personal health, our physical environment, and our social lives.