Advanced visualization concepts
MIDS W209: Information Visualization

Partially based on slides from Tamara Munzner

What We Are Going to Learn

  • Traditional Analytics methods
    • Clustering
    • Regression
    • Classification
    • Dimensionality reduction
    • Recommendation systems
  • Visualization + ML
  • Business Intelligence Tools
  • Deploying your visualizations
University Of California at Berkeley logo

Clustering

http://scikit-learn.org/stable/modules/clustering.html#clustering

When to Use It

  • When you want to find similar items
  • Depends on your distance metric
  • When you have too many items and you want to aggregate

Examples

  • Customer segmentation
  • Grouping experiment outcomes
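
A minimal sketch of how clustering depends on the distance metric, using plain k-means with Euclidean distance in standard-library Python. The customer data, the `kmeans` helper, and the choice of starting centroids are assumptions made for illustration; scikit-learn (linked above) offers production-grade implementations.

```python
from math import dist  # Euclidean distance between two points


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute means."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of its members (keep the old one if a cluster is empty)
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [min(range(len(centroids)), key=lambda i: dist(p, centroids[i])) for p in points]
    return centroids, labels


# Hypothetical customers: (visits per month, average basket in dollars)
customers = [(1, 20), (2, 25), (1, 22), (10, 200), (12, 210), (11, 190)]
centroids, labels = kmeans(customers, centroids=[customers[0], customers[3]])
print(labels)     # cluster index per customer
print(centroids)  # one aggregate "typical customer" per segment
```

Swapping `dist` for another metric changes which items count as similar, and therefore changes the segments.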

Regression

http://scikit-learn.org/stable/modules/linear_model.html#passive-aggressive-algorithms

When to Use It

  • Present (identify/compare) tendency
  • Predict values
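
Both uses can be sketched with ordinary least squares on one variable: the slope presents the tendency and the fitted line predicts new values. The car ages and prices below are made-up numbers, and `fit_line` is a hypothetical helper written for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: car age in years vs. asking price in thousands
ages = [1, 2, 3, 4, 5]
prices = [20, 18, 16, 14, 12]

slope, intercept = fit_line(ages, prices)
predicted = slope * 6 + intercept  # estimated price of a 6-year-old car
print(slope, intercept, predicted)
```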

Regression

http://blockbuilder.org/tmcw/3931800 by tmcw

Examples

  • Stock prices
  • Drug response

What car to buy?

User: person buying a car

Task: What's the best car to buy?

Data: all cars on sale

Normal procedure

Ask friends and family

Problem

That's inferring statistics from a sample of n = 1

Better approach

Data-based decisions

https://tucarro.com

Jeep Willys

  • Colombia bought many Jeeps after the war
  • They are a sort of mountain taxi
  • There is a trend to pimp them up
Colombian Jeep Willys. Photo by John Alexis Guerra Gómez

Classification

Classification algorithms
https://scikit-learn.org/stable/modules/ensemble.html#random-forests

When to Use It

  • Present (identify/compare) tendency
  • Present (identify/compare) groups
  • Aggregate
  • Predict values

Examples

  • Photo categorization
  • Sentiment analysis
  • Spam filtering
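
As a rough illustration of the spam filtering case, here is a k-nearest-neighbors classifier in standard-library Python. Note that this is not the random forest method linked above; it was chosen only because it fits in a few lines. The two features and all training messages are invented for the example.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical features per message: (number of links, number of exclamation marks)
train = [
    ((0, 0), "ham"), ((1, 0), "ham"), ((0, 1), "ham"),
    ((5, 4), "spam"), ((6, 5), "spam"), ((4, 6), "spam"),
]

print(knn_predict(train, (5, 5)))
print(knn_predict(train, (0, 1)))
```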

Dimensionality Reduction

When to Use It

  • Attribute filtering
  • Categorize documents (topic modeling)
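
One of the simplest forms of attribute filtering is dropping attributes that barely vary, since they cannot help distinguish items. The sketch below assumes a small made-up table of cars and a hypothetical `filter_attributes` helper; it does not cover projection methods or topic modeling.

```python
from statistics import pvariance


def filter_attributes(rows, names, min_variance):
    """Keep only the columns whose population variance reaches min_variance."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) >= min_variance]
    kept_names = [names[i] for i in keep]
    kept_rows = [[row[i] for i in keep] for row in rows]
    return kept_names, kept_rows


# Hypothetical car table: every car has 4 wheels, so that attribute carries no information
names = ["price", "wheels", "mileage"]
rows = [[20, 4, 10], [18, 4, 30], [12, 4, 80]]

kept_names, kept_rows = filter_attributes(rows, names, min_variance=0.1)
print(kept_names)
print(kept_rows)
```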

Recommendation Systems

When to Use It

  • Large catalog with user preference history
  • If you like "A" and "B", maybe you will like "C"

Types

  • Collaborative filtering
  • Content-based systems
  • Hybrids
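
The "if you like A and B, maybe you will like C" idea can be sketched as a very small collaborative filter: score unseen items by how much the users who chose them overlap with your own history. The histories and the `recommend` helper are assumptions for illustration; real systems use far richer models.

```python
from collections import Counter


def recommend(target_items, histories, top_n=1):
    """Suggest items liked by users whose history overlaps with the target's."""
    target = set(target_items)
    scores = Counter()
    for items in histories:
        overlap = len(target & set(items))
        if overlap == 0:
            continue  # this user shares nothing with the target
        for item in set(items) - target:
            scores[item] += overlap  # weight by how similar the user is
    return [item for item, _ in scores.most_common(top_n)]


# Hypothetical preference histories of other users
histories = [["A", "B", "C"], ["A", "B", "C"], ["A", "D"], ["E", "F"]]

print(recommend(["A", "B"], histories))
print(recommend(["A", "B"], histories, top_n=2))
```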

Examples

  • Amazon
  • Facebook
  • Google
  • Yahoo
  • Netflix Prize

How to Use the Algorithms


Vis for ML

Tools

  • LIME
  • Shapley values
  • ...
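
To give a feel for what Shapley values report, the sketch below computes them exactly for a toy model by averaging each feature's marginal contribution over every feature ordering. The price model, baseline, and instance are all hypothetical, and this brute-force approach only works for a handful of features; dedicated libraries rely on approximations.

```python
from itertools import permutations


def shapley_values(value, features):
    """Average marginal contribution of each feature over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value(present)
            present.add(f)
            totals[f] += value(present) - before
    return {f: total / len(orderings) for f, total in totals.items()}


def price_model(age, mileage):
    """Hypothetical car price model (in thousands)."""
    return 25 - 2 * age - 0.1 * mileage


baseline = {"age": 0, "mileage": 0}   # reference car
instance = {"age": 3, "mileage": 50}  # car whose prediction we want to explain


def value(present):
    """Prediction when only the 'present' features take the instance's values."""
    inputs = {f: (instance[f] if f in present else baseline[f]) for f in baseline}
    return price_model(**inputs)


contributions = shapley_values(value, list(baseline))
print(contributions)
```

The contributions add up to the difference between the explained prediction and the baseline prediction, which is what makes them convenient to chart.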

BI Tools

Tools

  • Tableau
  • Power BI
  • MicroStrategy
  • Looker

Interview: Alex Baldenko (MassMutual)

Alex Baldenko has worked as a lead data scientist, led teams of data scientists, and now leads the data analytics team inside of MassMutual's data science group. They use reliable BI tools to build enterprise-grade analytical systems.


Cloud Deployment

Where

  • AWS
  • GCP
  • Azure
  • Linode
  • Heroku
  • ...

Tools/Technologies

Paradigm: Infrastructure as code.

  • Ansible
  • Chef
  • Puppet
  • Terraform
  • CloudFormation
  • Kubernetes
  • Docker
  • Vagrant
  • ...

Ah!

What/Why/How

A lot of focus on what and how... but why?


Other tools to check

Trifacta

Flourish

Datawrapper


What We Learned

  • Traditional Analytics methods
    • Clustering
    • Regression
    • Classification
    • Dimensionality reduction
    • Recommendation systems
  • Visualization + ML
  • Business Intelligence Tools
  • Deploying your visualizations