Is Embedded Data Big Data?


So are embedded systems considered Big Data? Typically you see Big Data used as a term for a large amount of user or business data that is acquired via some sort of web portal and processed to get more useful information and/or money out of it. This data can span petabytes of storage and can be gathered from millions of sources and/or users.

But what about your hardware? Could the data coming out of your little embedded controller be considered “Big Data”? It may not seem like much, but what an embedded controller may lack in data sources it can more than make up for in resolution. A single device can easily generate more data than we know what to do with, even with a single data stream. While there may not be quite the large, widespread sampling of data sources that a distributed website/network may have, we do have very high speed data sources that we need to see and track in real time to find any bugs or glitches in the system. That could mean sample periods of well under 1 ms. With even 100 sources at about a 1 ms sample period, we can very quickly start to run into issues similar to those you may see with Big Data, although with some slightly different problems to consider.
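To put a rough number on that, here is a back-of-the-envelope sketch. The 4-byte sample size is an assumption for illustration, and the estimate ignores timestamps and any protocol framing.

```python
def payload_bytes_per_second(num_sources: int, sample_rate_hz: int, bytes_per_sample: int) -> int:
    """Raw payload rate for evenly sampled sources (no timestamps, no framing)."""
    return num_sources * sample_rate_hz * bytes_per_sample


# Assumed for illustration: 100 sources, 1 kHz each (1 ms period), 4-byte samples.
rate = payload_bytes_per_second(num_sources=100, sample_rate_hz=1000, bytes_per_sample=4)
per_day_gb = rate * 86_400 / 1e9  # 86,400 seconds in a day

print(f"{rate:,} B/s, about {per_day_gb:.1f} GB per day")
```

Under those assumptions the controller produces 400 kB/s of raw payload, or roughly 34.6 GB per day if it is logged continuously, which is already far more than a spreadsheet will handle.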

Things like SQL start to become increasingly important as Excel will start to fall apart with these large growing data sources. Some things like a time series database (TSDB) may even be worth investigating.

Resolution

Typically resolution will be limited by the interface you decide to use to gather your data.

  • Serial Port

You typically see these ports used at around 57.6k baud, but you can certainly run them well over 200k baud with the right hardware. They are often used via an RS-232 interface for updating firmware or running a GUI, and can be run over a USB converter that is either external or integrated into the controller. You often see MODBUS implementations using a similar RS-485 interface for industrial controls applications.

  • CAN Bus

CAN has many of the same uses as RS-485 and is much more robust in its hardware implementation, although it may be slower. It is typically limited to about 1 Mbps or less depending on the application and wiring distances. These interfaces have similar cards and device interfaces to RS-232 and RS-485, however the way you access the data is not quite as standardized. See [CAN Bus CAN Do] for more information.

  • JTAG

JTAG may have the highest resolution of all the methods, however it is also likely the hardest data to get at. Every microcontroller, JTAG variant and development tool can have its own intricacies to handle in gathering the data. However, you can’t beat talking directly to the diagnostic bus for speed when high resolution is required. Sometimes it may be best to just use the development tools provided, depending on what is being done.
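As a rough illustration of how the interface limits resolution, the sketch below estimates how many 1 kHz channels fit on a serial port. It assumes common 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits on the wire per byte) and 4-byte samples, and it ignores any protocol overhead such as headers or checksums, so real numbers will be lower. The function names are made up for this example.

```python
def serial_payload_bytes_per_second(baud: int, bits_per_byte_on_wire: int = 10) -> float:
    """Best-case byte rate of a UART; 10 bits per byte assumes 8N1 framing."""
    return baud / bits_per_byte_on_wire


def max_channels(baud: int, sample_rate_hz: int, bytes_per_sample: int) -> int:
    """How many evenly sampled channels fit in the best-case byte rate."""
    bytes_per_channel = sample_rate_hz * bytes_per_sample
    return int(serial_payload_bytes_per_second(baud) // bytes_per_channel)


# Assumed for illustration: 1 kHz sampling, 4-byte samples.
for baud in (57_600, 230_400):
    print(baud, "baud:", max_channels(baud, sample_rate_hz=1000, bytes_per_sample=4), "channel(s)")
```

With these assumptions a 57.6k baud link carries a single 1 kHz channel and a 230.4k baud link carries five, which is why the choice of interface tends to set the ceiling on how much you can log.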

Storage and Retention

There are many ways and places to dump your stream of data, however your application will likely define how to do this.

  • Raw CSV Files

Very simple and straightforward. You can even dump raw data to a CSV file and figure out what to do with it later if you don’t have the tools in place to handle it.

  • SQLite Database

Putting your data into a database format without the need for a server can be very useful. SQLite is pretty much just a storage format, but you get all the SQL goodness that will let you query up your data and search quickly.

  • SQL Database

A proper database may be overkill for many low level endeavors, but if you want to share your data or you are truly dumping a lot of data, a proper SQL database server will be very useful. There are many choices in this area, but that could be its own write up.

  • Time-series Database

A time series database is a rather new option, but it fits very well into the embedded systems world, as it’s all time series data. There are many tools out there that can make handling data very easy once you manage to get the data into a TSDB, and it may be seriously worth considering depending on your application and organization’s needs.
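To make the SQLite option above a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the channel names and the generated data are all made up for illustration; a real logger would insert samples as they arrive from the device.

```python
import sqlite3

# ":memory:" keeps the example self-contained; pass a file path to persist the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples (t_us INTEGER NOT NULL, source TEXT NOT NULL, value REAL NOT NULL)"
)
# An index on (source, time) keeps time-window queries fast as the table grows.
conn.execute("CREATE INDEX idx_samples_source_t ON samples (source, t_us)")

# Fake data: two hypothetical channels sampled every 1 ms for one second.
rows = [
    (i * 1000, name, float(i % 50))
    for i in range(1000)
    for name in ("motor_current", "bus_voltage")
]
with conn:  # commits on success, rolls back on error
    conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", rows)


def window_stats(conn, source, start_us, end_us):
    """Return (count, min, max) for one source in the window [start_us, end_us)."""
    cur = conn.execute(
        "SELECT COUNT(*), MIN(value), MAX(value) FROM samples "
        "WHERE source = ? AND t_us >= ? AND t_us < ?",
        (source, start_us, end_us),
    )
    return cur.fetchone()


count, lowest, highest = window_stats(conn, "motor_current", 100_000, 200_000)
print(count, lowest, highest)
```

Storing the timestamp as integer microseconds is just one choice; it avoids floating point rounding on the time axis and sorts naturally, but any consistent time base would work.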

Analytics and Visualization

Coming soon!!

  • Testing and verification
  • Excel, Grafana and others
