I successfully completed my Google Summer of Code project last week. It has been an enriching journey, and I am both delighted and a little sad that it has come to an end. My project started off as a dashboard for TCIA but ended up being much bigger: a data visualization engine.
The first few weeks were spent mainly on prototyping and running benchmark tests to push dc.js and crossfilter to their limits. We soon realized that the dc.js approach of doing everything (loading, filtering, and rendering) on the client side would not work at the scale of data we needed. We moved the loading and filtering to the server, left the rendering on the client, and had the two communicate using AJAX.
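To give a rough idea of that split, here is a minimal sketch of what the server side of one such request could look like. It is not the project's actual code: the sample records, the attribute names (`modality`, `age`), the request shape, and the function names are all made up for illustration. The point is only that the server filters the full dataset and sends back small aggregated counts, so the browser never has to hold every record.

```javascript
// Hypothetical in-memory dataset standing in for the server-side data store.
const records = [
  { modality: "CT", age: 54 },
  { modality: "MR", age: 61 },
  { modality: "CT", age: 47 },
  { modality: "PT", age: 70 },
];

// Keep only the records that satisfy every active filter.
// `filters` maps an attribute name to either a list of accepted values
// or a { min, max } range (both shapes are assumptions of this sketch).
function applyFilters(rows, filters) {
  return rows.filter((row) =>
    Object.entries(filters).every(([attr, f]) =>
      Array.isArray(f)
        ? f.includes(row[attr])
        : row[attr] >= f.min && row[attr] <= f.max
    )
  );
}

// Count the surviving records per distinct value of `attr`.
function countBy(rows, attr) {
  const counts = {};
  for (const row of rows) {
    counts[row[attr]] = (counts[row[attr]] || 0) + 1;
  }
  return counts;
}

// What a server endpoint might compute for one AJAX request:
// filter first, then return only the aggregate the chart needs.
function handleFilterRequest(request) {
  const filtered = applyFilters(records, request.filters);
  return { total: filtered.length, counts: countBy(filtered, request.groupBy) };
}

// Example request: the client asks for modality counts among ages 50 to 75.
const response = handleFilterRequest({
  filters: { age: { min: 50, max: 75 } },
  groupBy: "modality",
});
console.log(JSON.stringify(response));
```

In this toy example three of the four records pass the age filter, so the client receives a total of 3 and one count per modality, which is all a bar chart needs to redraw itself.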
The biggest challenges we faced were:
- Handling the large scale of the data. Our system can handle millions of records with 15-20 attributes each, and we are working on supporting even larger datasets.
- Making the system generic enough to be easily configured for any sort of data. An author needs to configure only four files (dataSource.json, dataDescription.json, interactiveFilters.json, and visualization.json) to create their own dashboard.
My mentor was Ashish Sharma, an assistant professor at Emory University. It was an amazing experience to work with him. We talked every week on Google Hangouts, and we often used its drawing tools, especially during the early stages, to sketch out prototypes.
We are hoping to extend the project and submit research papers on how the system can be used as a generic engine for visualizing multivariate data.