MGMT 17300: Data Mining Lab

Bringing It All Together: From Data to Insights to Decisions

Professor: Davi Moreira

August 01, 2024

Overview

  • Motivation

  • The Importance of Context

  • Six General Principles

  • Choosing the Chart

  • Less is More

  • Hierarchy Among Data

  • Not All Data Are Equally Important

  • Telling Your Story

  • Dashboards

Motivation

Covid-19

The Forest and the Trees

Forest

We have explored many implementation details in recent days, focusing on individual aspects of each analysis.

Today, we want to take a step back to think less about the detail and more about the process.

After all, every data analysis has a purpose. How can we achieve it more effectively?

Essential Elements of Data Communication

Let’s break down the data communication process into six general principles:

  1. Context matters
  2. Visualization derives from data
  3. Separate signal from noise
  4. Impose hierarchy among data
  5. Beauty also counts
  6. Your analysis should tell a story

Context

Every analysis has a goal and an audience.

  • It’s important to separate data exploration from the final analysis. Don’t fall into the temptation of showing everything you did.

  • Adapt the report to your audience. Decision-makers aren’t always interested in execution details.

  • So what? Keep a specific learning objective in mind. It will guide which information is relevant for your report.

Isolated numbers don’t tell us much. To make evidence-based decisions, establish a basis for comparison that fits the goal of your report.

Context Can Come from New Information…

Context Add

…or Reinforce Existing Information

Context Obs

Choosing the Chart

  • Use graphs instead of tables!

  • What type of data?

  • How many dimensions?

  • Most reports are consumed in 2D media. Showing more than two dimensions can confuse the reader.

  • Be careful with scales!

Scales Can Be Misleading

Scale Fail

Avoid Dual Axes

Dual Axis

Or Triple Axes!

Triple Axis

Fewer Pie Charts…

Pie vs Bar

What?!

About Pie Charts

Oof

Chart Fail

Less Is More

Eliminating Noise


  • The more information in your visualization, the greater the cognitive load.

  • Your objective must be to reduce your audience’s cognitive costs.


Data-Ink Ratio Formula

\[ \text{Data-Ink Ratio} = \frac{\text{Data-Ink}}{\text{Total ink used to print the graphic}} \]
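
As a purely illustrative calculation (the ink amounts are hypothetical, not measured from any chart in this lecture), suppose a graphic uses 100 units of ink, of which 40 encode data. Removing 50 units of non-data ink, such as borders and gridlines, leaves the data-ink unchanged but doubles the ratio:

\[ \frac{40}{100} = 0.40 \quad \longrightarrow \quad \frac{40}{100 - 50} = \frac{40}{50} = 0.80 \]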

Data-Ink Ratio



Data-Ink

Step-by-Step Cleanup


Cleanup 1

Eliminating the Border


Cleanup 2

Cleaning the Grids


Cleanup 3

Removing the Points


Cleanup 4

Processing the Axes


Cleanup 5

Adjusting the Label


Cleanup 6

Adjusting Colors

Cleanup 7

Before and After


Before and After
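
The cleanup steps above could be sketched in ggplot2 roughly as follows. This is a minimal sketch, not the code behind the lecture's figures: the data frame `df`, its values, the chart title, and the color choice are all invented for illustration, and it assumes the ggplot2 package is installed.

```r
library(ggplot2)

# Hypothetical monthly data, invented for illustration only
df <- data.frame(
  month = 1:12,
  value = c(12, 14, 13, 17, 19, 22, 21, 24, 23, 27, 29, 31)
)

# "Before": a chart with a panel border, gridlines, and point markers
before <- ggplot(df, aes(x = month, y = value)) +
  geom_line() +
  geom_point() +
  theme_bw()

# "After": the same data with non-data ink removed
after <- ggplot(df, aes(x = month, y = value)) +
  geom_line(color = "steelblue") +                      # adjust color; no point markers
  scale_x_continuous(breaks = seq(1, 12, by = 3)) +     # fewer axis ticks
  labs(x = NULL, y = NULL, title = "Value by month") +  # fold axis labels into the title
  theme_minimal() +                                     # theme without a panel border
  theme(panel.grid = element_blank())                   # drop the gridlines
```

Printing `before` and `after` side by side gives a comparison in the spirit of the slide above; each `+` in the second chart corresponds to one cleanup decision, so steps can be added or dropped individually.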

Hierarchy Among Data

Count the Number 3s

Hierarchy 1

Hierarchy 2

Ways to Draw Attention

Hierarchy 3

Highlighting with Colors

Hierarchy Colors

Returning to Our Example

Cleanup 9

Cleanup 10

Use Colors Strategically

Colors
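
One way to apply this idea in ggplot2 is to color only the element you want the audience to see first and mute everything else. The sketch below is an assumption-laden illustration: the categories, totals, highlighted category, and colors are hypothetical and do not come from the lecture's figures.

```r
library(ggplot2)

# Hypothetical category totals, invented for illustration only
df <- data.frame(
  category = c("A", "B", "C", "D", "E"),
  total    = c(23, 41, 18, 35, 27)
)

# Flag the single category the audience should look at first
df$highlight <- df$category == "B"

p <- ggplot(df, aes(x = reorder(category, total), y = total, fill = highlight)) +
  geom_col() +
  # One saturated color for the highlighted bar, grey for the rest; no legend needed
  scale_fill_manual(values = c(`TRUE` = "firebrick", `FALSE` = "grey80"), guide = "none") +
  coord_flip() +
  labs(x = NULL, y = NULL, title = "Category B leads") +
  theme_minimal() +
  theme(panel.grid = element_blank())
```

Because the grey bars recede, the reader's eye lands on the highlighted bar before anything else, which is the hierarchy effect discussed in this section.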

Not All Data Are Equally Important

Emphasizing the Main Point

Emphasis 1

Emphasis 2

Emphasis 3

Telling Your Story

Bringing It All Together

Let’s tell a story starting from the chart below, applying step by step the adaptations we’ve discussed. What is it telling you?

Final 1

Final 2

Final 3

Final 4

Final 5

Final Narrative

Story 1

Story 2

Story 3

Story 4

Story 5

Story 6

Story 7

Story 8

Before and After

Story 9

Application: COVID-19 Evolution

COVID-19 Evolution

Moving Average

Deaths in New York
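
A moving average smooths day-to-day noise so the trend is easier to read. The sketch below uses base R only; the daily counts are invented for illustration (they are not the New York data), and the 7-day trailing window is an assumption, since the source does not state which window the figure uses.

```r
# Hypothetical daily counts, invented for illustration only
daily <- c(2, 5, 3, 8, 13, 9, 16, 20, 18, 25, 22, 30, 27, 35)

# Trailing 7-day moving average: each value is the mean of that day and
# the six days before it, so the first six entries are NA
ma7 <- as.numeric(stats::filter(daily, rep(1 / 7, 7), sides = 1))
```

Plotting `ma7` as a line over the raw `daily` values (drawn in a muted color) keeps the signal in front and the noise in the background.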

Additional Material

Dashboards

Flexdashboard

The goal of flexdashboard is to facilitate the creation of interactive dashboards with R Markdown.

Flexdashboard: Features

  • Support for a wide variety of components, including htmlwidgets; base, lattice, and grid graphics; tabular data; gauges and value boxes; and text annotations.

  • Flexible and easy to specify layouts based on rows and columns. Components are intelligently resized to fill the browser and adapted for mobile display.

  • Storyboard layouts to present sequences of visualizations and related commentary.

Flexdashboard: Installation and Use

First, install the package from CRAN:

install.packages("flexdashboard")

Then, to create a flexdashboard, open a new R Markdown document with the output format flexdashboard::flex_dashboard. In RStudio, you can do this via File > New File > R Markdown...
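
For reference, the front matter of such a document might look like the fragment below. The title is a placeholder chosen for illustration; only the `output` line is required to select the flexdashboard format.

```markdown
---
title: "My Dashboard"
output: flexdashboard::flex_dashboard
---
```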

Flexdashboard: Layout



  • Dashboards are divided into columns and rows, with output components delineated using level 3 markdown headers (###).

  • By default, dashboards are laid out in a single column, with charts stacked vertically and sized to fill the available height of the browser.
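
A minimal single-column skeleton, assuming two charts, could look like this. The title and chart names are placeholders, and the HTML comments mark where the R chunks producing each output would go.

```markdown
---
title: "My Dashboard"
output: flexdashboard::flex_dashboard
---

### Chart 1

<!-- R chunk that draws the first chart goes here -->

### Chart 2

<!-- R chunk that draws the second chart goes here -->
```

With the default settings, the two components are stacked vertically and share the available browser height.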

  • Depending on the nature of your dashboard (number of components, ideal component height, etc.), you might prefer a scrolling layout where components occupy their natural height and the browser scrolls when additional vertical space is needed.

  • You can specify this attribute via the vertical_layout: scroll option.
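
In the front matter, the option is nested under the output format, as in this fragment (the title is a placeholder):

```markdown
---
title: "My Dashboard"
output:
  flexdashboard::flex_dashboard:
    vertical_layout: scroll
---
```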

  • You can also choose to orient the dashboards by row instead of by column by specifying orientation: rows.
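
A row-oriented skeleton might look like the following sketch. The title and chart names are placeholders; each `Row` heading (underlined with dashes) starts a new row, and the level 3 components beneath it are laid out side by side.

```markdown
---
title: "My Dashboard"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
---

Row
-------------------------------------

### Chart 1

### Chart 2

Row
-------------------------------------

### Chart 3
```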

Flexdashboard: Storyboard



  • Storyboards are an alternative to the row and column-based layout schemes.

  • They are suitable for presenting a sequence of data visualizations and related commentary.

  • To create a storyboard layout, add storyboard: true to the dashboard’s YAML header and include a set of level 3 dashboard components (###). Each component receives its own frame in the storyboard, with the section title used as the navigation caption.
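
A storyboard skeleton along these lines is sketched below. The title, frame captions, and commentary text are placeholders; the `***` separator, which flexdashboard uses to split a frame's main content from its commentary, is optional.

```markdown
---
title: "My Storyboard"
output:
  flexdashboard::flex_dashboard:
    storyboard: true
---

### First frame: caption shown in the navigation bar

<!-- R chunk that draws the first visualization goes here -->

***

Commentary that appears alongside the first frame.

### Second frame: caption shown in the navigation bar

<!-- R chunk that draws the second visualization goes here -->
```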

Dashboard: Additional Material

Summary

Main Takeaways from this lecture:

  • Data Communication Principles:

    • Context matters: Tailor your analysis to the audience and goal.
    • Focus on the story: Highlight insights, not the process.
    • Beauty and clarity: Simplify visuals, use appropriate colors, and remove unnecessary elements.
  • Visualization Best Practices:

    • Use graphs instead of tables where possible.
    • Avoid misleading scales and excessive dimensions.
    • Prioritize hierarchy and emphasize key data points.
  • Effective Dashboards:

    • Utilize tools like flexdashboard to create interactive layouts.
    • Structure information logically, adapting to mobile and web use.
    • Storyboard layouts help narrate data insights step-by-step.
  • Final Message:

    • Less is more. Reduce complexity to communicate data effectively.
    • Always keep your audience’s decision-making needs at the forefront.

Thank you!