Getting started with Druid: A high-performance, column-oriented, distributed data store.



The goals of this post are: being able to work with arbitrary data generated from http://www.json-generator.com/, ingesting it into Druid, and performing filtering and aggregation on the data. We'll start with timeseries data and then look for ways to work with non-timeseries data.

Generating data

I used the following JSON spec to generate data from json-generator:

Ingestion

We generated sample data from json-generator.com. Now we need to generate the index file, a metadata document that Druid uses to ingest the files. The ingestion spec is a JSON document of the following structure: The
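To make the shape of an ingestion spec concrete, here is a minimal sketch that builds one as a Python dictionary and serializes it to JSON. It assumes the native batch layout used by recent Druid releases (`dataSchema`, `ioConfig`, `tuningConfig`); older releases describe the input with a `parser`/`parseSpec` block instead, so the spec in the original post may look different. The datasource name, column names, and file paths are hypothetical placeholders, not values from the post.

```python
import json


def build_ingestion_spec(data_source, timestamp_column, dimensions, base_dir, file_filter):
    """Build a native batch ingestion spec as a plain dict.

    All argument values are supplied by the caller; nothing here is
    taken from the original post's data.
    """
    return {
        "type": "index_parallel",
        "spec": {
            # What the data looks like: datasource name, timestamp, columns.
            "dataSchema": {
                "dataSource": data_source,
                "timestampSpec": {"column": timestamp_column, "format": "auto"},
                "dimensionsSpec": {"dimensions": dimensions},
                "metricsSpec": [{"type": "count", "name": "count"}],
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "day",
                    "queryGranularity": "none",
                    "rollup": False,
                },
            },
            # Where the data lives and how it is encoded.
            "ioConfig": {
                "type": "index_parallel",
                "inputSource": {
                    "type": "local",
                    "baseDir": base_dir,
                    "filter": file_filter,
                },
                "inputFormat": {"type": "json"},
            },
            # Task tuning; defaults are fine for a small sample file.
            "tuningConfig": {"type": "index_parallel"},
        },
    }


# Hypothetical values for a file produced by json-generator.
spec = build_ingestion_spec(
    data_source="generated_events",
    timestamp_column="registered",
    dimensions=["name", "gender", "company"],
    base_dir="/tmp/druid-sample",
    file_filter="generated.json",
)
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The resulting JSON can be saved to a file and submitted to Druid as an ingestion task. The `timestampSpec` is the part that matters most for timeseries data, since Druid needs a timestamp for every row it ingests.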
