Big files: XML or JSON ? TeeBI !!

TeeBI Dashboards

 

Introduction

XML and JSON are very typical text formats used to store data, designed to be more comfortable than plain old “csv” text and allowing hierarchical (parent -> child) relationships.

However, even if there are many wonderful standard libraries to process them, there is still a speed problem when loading big quantities of data (say, hundreds or thousands of megabytes).

Content has to be parsed and stored into memory structures representing the “tree” nature of data nodes and attributes, which can be very fast (milliseconds) for small files, but terribly slow (minutes !) for big ones.

TeeBI core base class (TDataItem) is an “agnostic” memory structure providing parent -> child connections, using simple arrays to store data (one array per field or column).

Pseudo-code:


TDataItem = class
Name : String;
Items : Array of TDataItem;   // <--- Children
Kind : TDataKind;  // <-- Integer, String, Float, DateTime, Boolean, or "List of TDataItem"
Data : Array of ...    //  <--  one array for each Kind: "Int32Data : Array of Int32"
end

 

With a TDataItem, loading and saving big quantities of data is insanely fast (0.2 seconds to load or save 1 million rows with 4 columns on a normal PC).

The arrays are saved / loaded to / from Streams directly in one Write / Read operation.

That means we can import data from XML or JSON (or any other format like database datasets, server connections, Excel, etc, etc) into a TDataItem and then save it to a binary TeeBI file for later reuse.


Data := TDataItemPersistence.Load( 'my data.bi ')

 

Once a TDataItem is created or loaded, we can use it in many ways:

  • Search and modify data, re-structure columns
  • Sort data by multiple fields, and by custom expressions
  • Run ultra-fast SQL-like queries and summaries against TDataItems
  • Set master -> detail relationships between different TDataItems
  • Filter rows by code or using expressions (as strings or as TExpression classes)
  • Create calculated columns (using code or expressions)
  • Merge several TDataItems
  • Compare the structure and / or full data of TDataItems to obtain difference sets
  • Present TDataItems using Grids, Charts, Dashboards and PDF output
  • Connect TDataItems to a super-fast TBIDataset (a normal memory TDataset class)
  • Export to any other format (for example XML to JSON and vice-versa)
  • Access remote TDataItems from web servers transparently
  • Apply machine-learning algorithms using R or Python Scikit-learn
  • Access basic statistics of any TDataItem or child item

 

Note to TeeChart developers:

TeeBI includes a new TBIChart control (derived from TChart) that is capable of automatically creating new chart series and fill them from any TDataItem.

BIChart1.Data := MyDataItem;

A planned new feature is to integrate the Data property editor dialog inside the normal TeeChart editor, for design-time support (zero code charting !)

 

TeeBI library is available for download at the following link:

https://github.com/Steema/BI

Supported development environments:

  • Embarcadero Studio (Delphi, C++) from XE4 version and up
  • Lazarus FreePascal
  • …and soon for Microsoft Visual Studio .NET

Several 3rd party products can be optionally used with TeeBI:

https://github.com/Steema/BI/wiki/3rd-party-supported-products

 

For more information:

Please visit the TeeBI community at Google+ and the TeeBI home website for more information and technical details.

 

 

Generic Delphi Tree structure

A generic Tree structure made in Delphi is freely available at this GitHub repository:

https://github.com/davidberneda/GenericTree

See the readme at GitHub for details and updated documentation.

This small class allows creating hierarchical structures where each “node” is a small object (20 bytes plus your own data size), containing a property of your desired type.

For example, a Tree of string nodes:

var MyTree : TNode<String>;

MyTree.Data :=’hello’;