Downsampling

When you visualize very large data sets, it can be hard to get a clear picture of the information in your charts. Downsampling addresses this by reducing the number of data points while preserving the general trend, which makes your data easier to understand. Below, we show you how to downsample in datapine, explain how it works, and outline why it can be a good idea to do so.

 

datapine lets you downsample the number of data points displayed in a chart, creating a clearer picture of your data by retaining only its important visual characteristics. The downsampling method used in datapine, called Largest-Triangle-Three-Buckets (LTTB), is a very common one, notably in cartography. It divides the data points in a chart into a number of buckets of roughly equal size and keeps only one point from each bucket: the point that forms the triangle with the largest area together with the data points of the previous and next buckets.
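To make the bucket-and-triangle idea more concrete, here is a minimal Python sketch of the Largest-Triangle-Three-Buckets algorithm as it is commonly described. It is an illustration only, not datapine's internal implementation: the function name `lttb_downsample` and the parameters `points` and `n_out` are hypothetical. The sketch assumes the usual formulation, in which the first and last points are always kept, the "previous" corner of the triangle is the point already kept from the previous bucket, and the "next" corner is the average of the points in the next bucket.

```python
import math


def lttb_downsample(points, n_out):
    """Reduce `points` (a list of (x, y) tuples sorted by x) to `n_out` points."""
    n = len(points)
    if n_out >= n:
        return list(points)  # nothing to reduce
    if n_out < 3:
        raise ValueError("n_out must be at least 3")

    # The first and last points are always kept; the points in between
    # are split into n_out - 2 buckets of roughly equal size.
    bucket_size = (n - 2) / (n_out - 2)
    sampled = [points[0]]
    prev = 0  # index of the point kept from the previous bucket

    for i in range(n_out - 2):
        # Average point of the next bucket (clipped to the end of the data).
        nxt_start = int((i + 1) * bucket_size) + 1
        nxt_end = min(int((i + 2) * bucket_size) + 1, n)
        nxt = points[nxt_start:nxt_end]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        # Candidate points in the current bucket.
        start = int(i * bucket_size) + 1
        end = int((i + 1) * bucket_size) + 1
        px, py = points[prev]

        # Keep the candidate that forms the largest triangle with the
        # previously kept point and the average of the next bucket.
        best, best_area = start, -1.0
        for j in range(start, end):
            x, y = points[j]
            area = abs((px - avg_x) * (y - py) - (px - x) * (avg_y - py)) / 2
            if area > best_area:
                best, best_area = j, area
        sampled.append(points[best])
        prev = best

    sampled.append(points[-1])
    return sampled


# Usage: reduce a 1,000-point sine wave to 50 points.
series = [(x, math.sin(x / 50)) for x in range(1000)]
reduced = lttb_downsample(series, 50)
print(len(reduced))  # 50
```

Because each bucket contributes a point that already exists in the data, the reduced series contains no interpolated or averaged values. Isolated peaks also tend to survive, since a spike forms a much larger triangle than its flat neighbors.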

 

Follow these easy steps to decrease the number of data points displayed in a chart.

 

illustration of a downsampling example

 

 

  1. Open your chart in the Analyzer.
  2. Right-click the chart background to display the quick settings.
  3. Hover over "Show most relevant data" and select the number of data points you want to display from the dropdown list.

steps to follow in datapine for downsampling

 

WHY IS DOWNSAMPLING USEFUL?

 

As human beings, we cannot process data at the scale a computer can. We are highly visual creatures who need to see information to make sense of it, and that becomes a challenge when large amounts of data are involved, because the big picture gets lost in the detail. This is where downsampling helps: it keeps only the important visual characteristics.

 

Besides, having a lot of data does not mean that you always need all of it, or that your machine can process it all. By reducing the number of data points, you keep the data that offers the most valuable information and discard the rest without distorting the overall trend.