Tuesday, March 7, 2023

Data Viz lecture for the undergrad data science seminar BYU-City Tech

Last Friday was my turn to lecture at the data science undergraduate seminar offered to City Tech and Brigham Young University (BYU) students.

I taught a Data Viz course to Yeshiva University students a couple of times, and it was really fascinating to learn so many important aspects of the field that are not frequently mentioned in ML/Data Science.

The perspectives and work of Edward Tufte, Tamara Munzner, Jeffrey Heer, and Michael Bostock are deep and extensive, so summarizing them in a single lecture is a real challenge. I did not want to focus solely on the use of Python libraries for visualization (matplotlib or seaborn), Tableau, or D3. I wanted to stress that what matters is what we want to communicate and how effectively the visualization communicates it.

I based my talk on Jeffrey Heer's materials from his data viz course at the University of Washington. One of his first lectures is about "the value of visualization". I asked the students why we create visualizations, and they were on point with many of their comments:

  • Answer questions
  • Make decisions
  • See data in context
  • Expand memory
  • Present an argument or tell a story
My favorite one: Inspire!!!


Data encoding fundamentals were mentioned while showing Bertin's diagram from 1967.


We finally got to one of the best visualizations in history: Minard's chart of Napoleon's disastrous march to Moscow.
image credit: https://www.openculture.com/2019/07/napoleons-disastrous-invasion-of-russia-explained-in-an-1869-data-visualization.html



How many variables are we encoding? Notice the temperature, the size of the army... what else do you see?
We also mentioned how a visualization of the O-ring data for the NASA Challenger mission had tragic consequences.
I must confess it has been hard for me to figure out a cohesive and succinct way of writing about such a fascinating topic without feeling that I am leaving relevant information out. You might be seeing more posts about this topic in the future. I'll be taking Edward Tufte's online workshop, so more important ideas will be coming! 
For now, we can start exploring examples of visualization idioms for multidimensional data such as parallel coordinates. 
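To make the parallel coordinates idea a bit more concrete, here is a minimal sketch in plain Python (no plotting library). It assumes a tiny made-up dataset with hypothetical variable names, rescales every variable to the same 0 to 1 range, and draws one vertical axis per variable and one polyline per record as an SVG string. It is only an illustration of the idiom, not a polished chart.

```python
# Toy, made-up records purely for illustration (not real measurements).
# The variable names are hypothetical.
rows = [
    {"height": 1.0, "weight": 10.0, "age": 5.0},
    {"height": 2.0, "weight": 30.0, "age": 3.0},
    {"height": 3.0, "weight": 20.0, "age": 1.0},
]
axes = ["height", "weight", "age"]


def normalize(rows, axes):
    """Min-max scale every variable to [0, 1] so all axes share one height."""
    lo = {a: min(r[a] for r in rows) for a in axes}
    hi = {a: max(r[a] for r in rows) for a in axes}
    scaled = []
    for r in rows:
        scaled.append([
            # A constant column has no range, so place it mid-axis.
            (r[a] - lo[a]) / (hi[a] - lo[a]) if hi[a] > lo[a] else 0.5
            for a in axes
        ])
    return scaled


def to_svg(rows, axes, width=300, height=200, pad=20):
    """Draw one vertical axis per variable and one polyline per record."""
    step = (width - 2 * pad) / (len(axes) - 1)
    xs = [pad + i * step for i in range(len(axes))]
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    for x in xs:
        parts.append(
            f'<line x1="{x:.1f}" y1="{pad}" x2="{x:.1f}" '
            f'y2="{height - pad}" stroke="black"/>'
        )
    for values in normalize(rows, axes):
        # SVG y grows downward, so flip: 1.0 maps to the top of the axis.
        pts = " ".join(
            f"{x:.1f},{height - pad - v * (height - 2 * pad):.1f}"
            for x, v in zip(xs, values)
        )
        parts.append(
            f'<polyline points="{pts}" fill="none" stroke="steelblue"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


svg = to_svg(rows, axes)
```

Saving `svg` to a file with an `.svg` extension and opening it in a browser shows three crossing lines. Where the lines cross between two axes, the two variables tend to move in opposite directions for those records, which is exactly the kind of pattern this idiom is meant to reveal.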
Check out the cool library of examples on D3: https://observablehq.com/@d3/gallery
Until next time!
