Advanced Data Visualization

This course is taught by Dr. Katherine E. Isaacs. It is one of the most useful courses I have taken during my PhD. I learned a lot of visualization techniques, including a major JavaScript library that I now use often (D3.js).

A visualization I like

In one of the first classes, we had to submit a visualization that we really liked and explain why we liked it. Here is mine:

Description of visualization:

TensorBoard is an optional module of TensorFlow, a popular deep learning framework. After creating and training a deep learning model, we can use TensorBoard to visualize how effectively the model has learned.
While the model is being trained, we take the output of one of its layers and store it in the TensorBoard directory. The data can be multidimensional (more than 3 dimensions). Later, TensorBoard takes the data and embeds it into 3-dimensional space using the t-SNE algorithm. This 3-dimensional representation of the layer output can be used to understand what the model has learned from training. In the visualization, we can see that the output of a hidden layer gradually separates samples with different labels (0-9). So the model has learned some of the differences between the labels.
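As a rough illustration of the "store the layer output" step, here is a minimal sketch that writes vectors and their labels to two tab-separated files, a format the Embedding Projector can load. This is an assumption about one possible workflow, not necessarily how the visualization above was produced; the function name `export_for_projector` and the file names are hypothetical, and the layer outputs are plain Python lists rather than real model activations.

```python
import csv
import os


def export_for_projector(vectors, labels, out_dir):
    """Write layer outputs and labels as TSV files (hypothetical helper).

    vectors: list of equal-length lists of floats, one row per sample
    labels:  list with one label per row of `vectors`
    """
    if len(vectors) != len(labels):
        raise ValueError("need exactly one label per vector")
    os.makedirs(out_dir, exist_ok=True)
    vec_path = os.path.join(out_dir, "vectors.tsv")
    meta_path = os.path.join(out_dir, "metadata.tsv")

    # One sample per line, dimensions separated by tabs.
    with open(vec_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerows(vectors)

    # A single-column metadata file is written without a header row.
    with open(meta_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerows([label] for label in labels)

    return vec_path, meta_path
```

Calling `export_for_projector(hidden_outputs, digit_labels, "projector_data")` after a training step would produce the two files, which can then be loaded into the projector and colored by label.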

Why I chose it:

I have trained and experimented with a lot of deep neural networks while learning deep learning. One of the most challenging parts is knowing whether the model is actually learning something.

In some cases, the loss of the model may not decrease over time, or the loss may decrease while the accuracy does not improve. In these cases, all you have to inspect is a large set of trained weights. Because the weights themselves do not carry human-understandable information, looking at them alone does not tell you whether the model is capturing anything about the data.
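To make the "loss decreases but accuracy does not improve" case concrete, here is a small sketch with made-up numbers. It assumes a binary classifier with cross-entropy loss and a 0.5 decision threshold; the probabilities are invented for illustration and do not come from any real model.

```python
import math


def mean_cross_entropy(probs_of_true_class):
    """Average negative log-likelihood of the correct class."""
    return -sum(math.log(p) for p in probs_of_true_class) / len(probs_of_true_class)


def accuracy(probs_of_true_class, threshold=0.5):
    """Fraction of samples whose correct class gets probability above the threshold."""
    return sum(p > threshold for p in probs_of_true_class) / len(probs_of_true_class)


# Probability assigned to the correct class for four samples (invented values).
early = [0.60, 0.70, 0.40, 0.30]  # earlier in training
late = [0.95, 0.99, 0.45, 0.35]   # later in training

print("early: loss=%.3f acc=%.2f" % (mean_cross_entropy(early), accuracy(early)))
print("late:  loss=%.3f acc=%.2f" % (mean_cross_entropy(late), accuracy(late)))
```

The model becomes more confident on the two samples it already gets right, so the loss drops, but the two misclassified samples never cross the threshold, so the accuracy stays the same.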

I found that the outputs of the different layers of a deep neural network are useful for seeing what role each hidden layer plays in producing the target result. But the output of a hidden layer is rarely 2- or 3-dimensional, so we need a dimensionality reduction algorithm such as PCA to view it. For visualizing high-dimensional data, t-SNE often works better than PCA. In TensorBoard this functionality is built in, so all that needs to be done is to pipe the outputs of the layers to TensorBoard.
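The dimensionality reduction step can be sketched in a few lines. The example below uses PCA (not t-SNE, and not TensorBoard's own implementation) computed with plain power iteration, purely to show the idea of projecting high-dimensional layer outputs down to 2 dimensions. The function name `pca_project` is hypothetical, and the sketch assumes the data has at least `n_components` directions with nonzero variance; a real project would use a numerical library instead.

```python
import random


def pca_project(rows, n_components=2, n_iter=200, seed=0):
    """Project rows (equal-length lists of floats) onto their top principal components."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # d x d sample covariance matrix
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(n_iter):
            # Multiply by the covariance matrix.
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            # Remove the directions that were already found.
            for u in components:
                dot = sum(wi * ui for wi, ui in zip(w, u))
                w = [wi - dot * ui for wi, ui in zip(w, u)]
            norm = sum(wi * wi for wi in w) ** 0.5
            if norm == 0:
                break
            v = [wi / norm for wi in w]
        components.append(v)

    # Coordinates of each centered row along each component.
    return [[sum(c[j] * u[j] for j in range(d)) for u in components]
            for c in centered]


# Nine 4-dimensional points that only vary along the first two axes.
points = [[10.0 * a, float(b), 0.0, 0.0] for a in (-1, 0, 1) for b in (-1, 0, 1)]
projected = pca_project(points)
print(projected[0])
```

Each 4-dimensional point comes back as a 2-dimensional one that can be drawn in a scatterplot. The sign of each axis is arbitrary, which is normal for PCA.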

This strategy helped me a lot in understanding the inner workings of deep neural networks. That is why I chose this visualization.

Most of the visualizations here have been included as-is inside an iframe. This may cause rendering issues in your browser, and some content may scroll out of the viewport.

Assignments

There were a lot of projects during this course. Many of them were improvements on an earlier project from the same course. Below are the 3 main assignments we did.

Grading Breakdown with BarChart (My First D3 app)

Creating Tree Layout

Creating Scatterplot with D3

Project

We did a major visualization project. I was partnered with Jacob Miller, and we created a D3-based library named d3-hyperbolic. The project can be found on GitHub, and the documentation for using the library is hosted here.

This project is not yet complete, and neither Jacob nor I plan to keep working on it.

The project report can be found here.

Here are 2 apps, built with the library we created, for visualization in hyperbolic space.

Draw circles and lines in hyperbolic space

Force directed graph layout in hyperbolic space

Paper Review and Question Answering

Besides the programming work, we also read a number of visualization papers and wrote long and short answers to questions about them.