
Data Visualization

The Art of Surprise and Metaphor

by
Arvind Venkatadri

arvind.venkatadri@gmail.com

Written: July 13 2022
Updated: August 14 2024


What makes Human Experience?

How would we begin to describe this experience?

  • Where / When?
  • Who?
  • How?
  • How Big? How small? How frequent? How sudden?
  • And... How Surprising! How Shocking! How Sad... How Wonderful!!!
    So our Questions and our Surprise lead us to create Human Experiences.

https://www.anecdote.com/2014/09/story-framework/


Is This a Surprise?

  • Do Emotions lead to Data?
  • How do I measure/quantify my emotions?

The Element of Surprise?

Jane Austen knew a lot about human information processing as these snippets from Pride and Prejudice (published in 1813 -- over 200 years ago) show:

  • She was a woman of mean understanding, little information, and uncertain temper.
  • Catherine and Lydia had information for them of a different sort.
  • When this information was given, and they had all taken their seats, Mr. Collins was at leisure to look around him and admire,...
  • You could not have met with a person more capable of giving you certain information on that head than myself, for I have been connected with his family in a particular manner from my infancy.
  • This information made Elizabeth smile, as she thought of poor Miss Bingley.
  • This information, however, startled Mrs. Bennet ...

Claude Shannon and Information

Claude Shannon, the father of modern telecom

  • Defined information/data as "quantified surprise"

$$\text{info} = -\sum p \, \log(p)$$

  • "There was traffic in Bangalore today" (meh)
  • "There was hail in Bangalore today" (😮)
  • "There was snow in Bangalore today" (😱)

https://plus.maths.org/content/information-surprise
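
A minimal Python sketch of "quantified surprise". The probabilities are made-up illustrative values, not measurements, and the base-2 logarithm (giving bits) is an assumption, since the slide does not name a base. The rarer the event, the larger its surprisal −log2(p); the formula above is the average surprisal over all outcomes.

```python
import math

def surprisal(p: float) -> float:
    """Surprise, in bits, of an event that occurs with probability p."""
    return -math.log2(p)

def entropy(probs: list[float]) -> float:
    """Average surprise: -sum(p * log2(p)), skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical daily probabilities, chosen only for illustration.
events = {"traffic": 0.99, "hail": 0.01, "snow": 0.000001}

for name, p in events.items():
    print(f"{name:8s} surprisal = {surprisal(p):6.2f} bits")

# A fair coin has two equally likely outcomes.
print(entropy([0.5, 0.5]))
```

Traffic carries almost no information because it is nearly certain; snow carries the most because it almost never happens.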


Human Experience is....Data??


But Can't We "Generate" Data Ourselves? Sure we can!

  • We create Hypotheses, and then Experiments
  • A Kitchen Experiment
  • Inputs are: Ingredients, Recipes, Processes
  • Outputs are: Taste, Texture, Colour, Quantity!!
Used without permission from https://safetyculture.com/topics/design-of-experiments/

What is the Result of an Experiment?

All experiments give us data about some phenomena of interest

  • We obtain data about the things that happen: Outputs
  • What makes things happen?: Inputs
  • How?: Factors
  • When?: Factors
  • How much "output" is caused by how much "input"? Effect Size
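
One simple way to put a number on "how much output per unit of input" is the slope of a least-squares line. This is only a sketch: the kitchen data below are hypothetical, and the slope is just one of several measures that statisticians call an effect size.

```python
def slope(xs, ys):
    """Least-squares slope: change in output per unit change in input."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical kitchen experiment: grams of sugar (input)
# against a taster's sweetness score (output).
sugar_g = [10, 20, 30, 40]
sweetness = [2.1, 3.9, 6.2, 7.8]

effect = slope(sugar_g, sweetness)
print(f"Each extra gram of sugar adds about {effect:.3f} to the sweetness score")
```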

All Experiments stem from Human Curiosity, a Hypothesis, and a Desire to Find out and Talk about Something

"Factors" and "Effect Size" are technical terms from statistics!


A Famous Lady and her Famous Experiment

In 1853, Turkey declared war on Russia. After the Russian Navy destroyed a Turkish squadron in the Black Sea, Great Britain and France joined with Turkey. In September of the following year, the British landed on the Crimean Peninsula and set out, with the French and Turks, to take the Russian naval base at Sevastopol.

What followed was a tragicomedy of errors -- failure of supply, failed communications, international rivalries. Conditions in the armies were terrible, and disease ate through their ranks. They finally did take Sevastopol a year later, after a ghastly assault. It was ugly business all around. Well over half a million soldiers lost their lives during the Crimean War.

Deaths in Crimea (annual rates per 1,000 soldiers)

Month  Year  Disease.rate  Wounds.rate  Other.rate
Apr    1854           1.4          0.0         7.0
May    1854           6.2          0.0         4.6
Jun    1854           4.7          0.0         2.5
Jul    1854         150.0          0.0         9.6
Aug    1854         328.5          0.4        11.9
Sep    1854         312.2         32.1        27.7
Oct    1854         197.0         51.7        50.1
Nov    1854         340.6        115.8        42.8
Dec    1854         631.5         41.7        48.0
Jan    1855        1022.8         30.7       120.0

What Makes a Good Vegetable Variable, then?

Each kind of variable answers to a different Interrogative Pronoun

  • "Whose was it?" "His who is gone."
  • "Who shall have it?" "He who will come."
  • "Where was the sun?" "Over the oak."
  • "Where was the shadow?" "Under the elm."
  • "How was it stepped?" "North by ten and by ten, east by five and by five, south by two and by two, west by one and by one, and so under."
  • "What shall we give for it?" "All that is ours."
  • "Why should we give it?" "For the sake of the trust."

The Memoirs of Sherlock Holmes: The Musgrave Ritual


Types of Variables

Using Interrogative Pronouns

  • Nominal: What? Who? Where? (Just names)

  • Ordinal: Which Types? What Sizes? How Big? (Factors, Dimensions)

  • Interval: How Often? (Numbers, Facts)

  • Ratio: How many? How much? How heavy? (Numbers, Facts)
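
The four types can be sketched as an ordered scale, where each type allows the operations of the types before it plus one more. This is a sketch of the standard levels-of-measurement idea; the operation labels and the `allowed_operations` helper are illustrative names, not part of the slides.

```python
from enum import Enum

class Scale(Enum):
    NOMINAL = 1   # just names: What? Who? Where?
    ORDINAL = 2   # ordered categories: Which Types? What Sizes?
    INTERVAL = 3  # numbers whose differences are meaningful
    RATIO = 4     # numbers with a true zero: How many? How much?

# What each scale adds on top of the scales before it.
OPERATIONS = {
    Scale.NOMINAL: "count, mode",
    Scale.ORDINAL: "rank, median",
    Scale.INTERVAL: "difference, mean",
    Scale.RATIO: "ratio ('twice as much')",
}

def allowed_operations(scale: Scale) -> list[str]:
    """Operations that make sense for a variable measured on `scale`."""
    return [OPERATIONS[s] for s in Scale if s.value <= scale.value]

print(allowed_operations(Scale.ORDINAL))
```

A Ratio variable such as a death rate supports every operation, while a Nominal variable such as a name can only be counted.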


Another Way of Looking at Variables


Types of Variables in Nightingale Data

  • Nominal: None
  • Ordinal: (Factors, Dimensions)
    • HOW? War, Disease, Other!!
  • Interval: (Numbers, Facts)
    • WHEN? Year, Month
  • Ratio: (Numbers, Facts)
    • HOW MANY? Rates of Deaths (War, Disease, Other)
Month  Year  Disease.rate  Wounds.rate  Other.rate
Apr    1854           1.4            0         7.0
May    1854           6.2            0         4.6
Jun    1854           4.7            0         2.5


Tidy Data

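Nightingale's data table had dimensions (i.e. types of deaths) coded into column names. This is not considered tidy: in tidy data each variable has its own column, so the type of death would be one column and the rate another. Some software packages do use this wide-form data, though.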
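
A minimal sketch of the wide-to-long reshaping, in plain Python and using the first three rows of the table. The function name `to_long` and the new column names `Cause` and `Rate` are illustrative choices, not from the slides.

```python
# Wide form: one column per type of death.
wide = [
    {"Month": "Apr", "Year": 1854, "Disease.rate": 1.4, "Wounds.rate": 0.0, "Other.rate": 7.0},
    {"Month": "May", "Year": 1854, "Disease.rate": 6.2, "Wounds.rate": 0.0, "Other.rate": 4.6},
    {"Month": "Jun", "Year": 1854, "Disease.rate": 4.7, "Wounds.rate": 0.0, "Other.rate": 2.5},
]

def to_long(rows, id_cols=("Month", "Year"), names_to="Cause", values_to="Rate"):
    """Move the column names into a `Cause` column, one row per observation."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col in id_cols:
                continue
            long_row = {k: row[k] for k in id_cols}
            long_row[names_to] = col.removesuffix(".rate")
            long_row[values_to] = value
            long_rows.append(long_row)
    return long_rows

tidy = to_long(wide)
for row in tidy:
    print(row)
```

Three wide rows become nine long rows, and the type of death is now an ordinary variable that a plot can map to colour.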


And Visualization?


Nightingale's Rose


Nightingale created a remarkable and original graphical display to show us just what had really gone on in the War. It was a Polar-Area Diagram that showed how people had died during the period from July, 1854, through the end of the following year.

Nightingale's graph is like a pie chart, cut into twelve equal angles. These slices advance in a clockwise direction, one each month. The radius shows how many deaths occurred in that month. We see little short slices in April, May and June of 1854. After the troops land in the Crimea, the slices begin reaching far outward in the radial direction.

There's more: Each slice has three sections, one for deaths from wounds in battle, one for "other causes", and one for disease.

Once you see Nightingale's graph, the terrible picture is clear. The Russians were a minor enemy. The real enemies were cholera, typhus, and dysentery. Once the military looked at that eloquent graph, the modern army hospital system was inevitable.

"Engines of Our Ingenuity", https://www.uh.edu/engines/epi1712.htm
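
A small sketch of the geometry behind a polar-area diagram, using the Disease.rate column from the Deaths in Crimea table. It assumes that the area of each wedge, rather than its radius, is proportional to the value, which is how polar-area diagrams are usually described; the radius then grows with the square root of the value. The function name `wedge_radius` is illustrative.

```python
import math

# Disease.rate for Apr 1854 through Jan 1855.
disease_rate = [1.4, 6.2, 4.7, 150.0, 328.5, 312.2, 197.0, 340.6, 631.5, 1022.8]

WEDGE_ANGLE = 2 * math.pi / 12  # twelve equal slices, one per month

def wedge_radius(value: float, angle: float = WEDGE_ANGLE) -> float:
    """Radius of a circular sector whose area equals `value`.

    Sector area = (angle / 2) * r**2, so r = sqrt(2 * value / angle).
    """
    return math.sqrt(2 * value / angle)

for rate in disease_rate:
    print(f"rate {rate:7.1f} -> radius {wedge_radius(rate):6.2f}")
```

Under this assumption, a month with four times the death rate gets a wedge only twice as long.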

Yes, But Why? Shapes and Data Viz Culture

  • Each Data Viz answers a Question
  • We can digest information more easily when it is pictorial
  • Our Working Memories are both short-term and limited in capacity
  • So a picture abstracts away the details and presents us with an overall summary, an insight, or a story that is easy to retain and easy to recall
  • Data Viz uses shapes that carry strong cultural memories and impressions for us
  • These cultural memories help us use data viz in a universal way, to appeal to a wide variety of audiences


Visualization: A Metaphor from Data -> Geometry

  • How did we arrive at shapes, colours, lines, points...from data?

  • All Statistical Graphs do a Kalidasa

  • They use metaphors to map data variables and computed stats to geometrical aspects, also known as aesthetics

  • They may also compute stuff with data before plotting, as we shall see

  • Shapes have Cultural Significance

  • We may all have a Gene for Geometry


Shapes in Data Viz

Geometric Aesthetics in Data Viz


  • Commonly used aesthetics in data visualization: position, shape, size, color, line width, line type...

  • Some of these aesthetics can represent both continuous and discrete data (position, size, line width, color)

  • While others can usually only represent discrete data (shape, line type).
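
The rule in the list above can be written down as a small lookup table. This is a sketch only: the `unsuitable` helper and the example plot specification are hypothetical, and real plotting libraries apply their own rules.

```python
# Which kinds of data each aesthetic can represent, per the list above.
AESTHETICS = {
    "position":   {"continuous", "discrete"},
    "size":       {"continuous", "discrete"},
    "line width": {"continuous", "discrete"},
    "color":      {"continuous", "discrete"},
    "shape":      {"discrete"},
    "line type":  {"discrete"},
}

def unsuitable(mapping):
    """Return the aesthetics asked to show a kind of data they cannot represent."""
    return [
        aesthetic
        for aesthetic, (variable, kind) in mapping.items()
        if kind not in AESTHETICS[aesthetic]
    ]

# Hypothetical plot specification: aesthetic -> (variable, kind of data).
plot = {
    "position": ("Disease.rate", "continuous"),
    "color": ("Cause", "discrete"),
    "shape": ("Year", "continuous"),  # shapes have no natural magnitude
}

print(unsuitable(plot))
```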


What's missing?

  • Area?
  • Angle? Direction? Slope?
  • Transparency? (Alpha) And further?
  • Surface Texture? Soft/Hard/Rough/Smooth?
  • Density of Material?
  • Can it evaporate? Or Vanish?
  • Taste?
  • Smell?

Each of the geometries works differently


So Let us see some Data Visualizations


Amounts and Counts

  • Variable: Ordinal / Nominal
  • Stat: count
  • Geometry:
    • x = Ord/Nom; y = count = height; colour = Ord/Nom
  • Questions:
    • How many artists work with each Type of Material?
    • Broken up by Gender or Nationality?
    • Do girls outnumber boys (meh!), or do Introverts outnumber Extraverts, in Srishti 4:1?
    • And from metros?
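
The count stat can be sketched with Python's `collections.Counter`. The records below are hypothetical placeholders, not real survey data; each count would become the height of one bar, and counting by pairs gives the breakdown by a second variable.

```python
from collections import Counter

# Hypothetical records: (material, gender) for each artist.
artists = [
    ("clay", "F"), ("clay", "M"), ("paper", "F"),
    ("clay", "F"), ("metal", "M"), ("paper", "F"),
]

# Stat: count per material (the bar heights).
counts = Counter(material for material, _ in artists)

# Broken up by gender (the coloured segments within each bar).
by_group = Counter(artists)

for material, n in counts.most_common():
    print(f"{material:6s} {'#' * n} {n}")
```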


Distributions

  • Variable: Interval / Ratio
  • Stat: bin and count
  • Geometry:
    • x = bins (Quant ranges)
    • y = count
    • colour = nothing, or Nom/Ord
  • Questions:
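
The bin-and-count stat behind a histogram can be sketched as follows, using the Disease.rate column from the Deaths in Crimea table. The bin width of 250 is an arbitrary choice for illustration.

```python
from collections import Counter

# Disease.rate for Apr 1854 through Jan 1855.
disease_rate = [1.4, 6.2, 4.7, 150.0, 328.5, 312.2, 197.0, 340.6, 631.5, 1022.8]

BIN_WIDTH = 250  # arbitrary, chosen only for illustration

# Stat: assign each value to a bin, then count the values in each bin.
bin_counts = Counter(int(rate // BIN_WIDTH) for rate in disease_rate)

for b in range(max(bin_counts) + 1):
    low, high = b * BIN_WIDTH, (b + 1) * BIN_WIDTH
    print(f"[{low:4d}, {high:4d}) {'#' * bin_counts[b]}")
```

A different bin width gives a different picture of the same data, which is why histograms are worth redrawing with a few widths.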


Grouped Distributions

  • Variable: Interval/Ratio + Nominal/Ordinal
  • Stat: sort(boxplot), bin(violin)
  • Geometry:
    • x = Int/Ratio,
    • y = Nom/Ord, and
    • colour = Nom/Ord
  • Question:
I say what I mean, and I mean what I say
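
The sorting stat behind a boxplot can be sketched as a five-number summary per group, here for the three causes in the Deaths in Crimea table. Quartile conventions differ between tools; the "inclusive" method below is one assumption among several, and `five_number_summary` is an illustrative name.

```python
import statistics

# Monthly rates, Apr 1854 through Jan 1855, grouped by cause.
rates = {
    "Disease": [1.4, 6.2, 4.7, 150.0, 328.5, 312.2, 197.0, 340.6, 631.5, 1022.8],
    "Wounds":  [0.0, 0.0, 0.0, 0.0, 0.4, 32.1, 51.7, 115.8, 41.7, 30.7],
    "Other":   [7.0, 4.6, 2.5, 9.6, 11.9, 27.7, 50.1, 42.8, 48.0, 120.0],
}

def five_number_summary(values):
    """Minimum, lower quartile, median, upper quartile, maximum."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

summaries = {cause: five_number_summary(v) for cause, v in rates.items()}
for cause, summary in summaries.items():
    print(cause, summary)
```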



Relationships

  • Variable: Interval/Ratio + Nominal/Ordinal
  • Stat: none
  • Geometry:
    • x = Int/Ratio,
    • y = Int/Ratio, and
    • colour = Nom/Ord
    • shape = Nom/Ord
  • Question:
    • How does one Interval/Ratio variable vary with respect to another?
    • Is the number of friendships dependent upon my wallet size?
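
A scatter plot needs no stat, but the question "how does one variable vary with another?" is often summarised with a correlation coefficient. The sketch below computes Pearson's r by hand for two columns of the Deaths in Crimea table; the helper name `pearson` is illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Monthly rates, Apr 1854 through Jan 1855.
disease_rate = [1.4, 6.2, 4.7, 150.0, 328.5, 312.2, 197.0, 340.6, 631.5, 1022.8]
other_rate = [7.0, 4.6, 2.5, 9.6, 11.9, 27.7, 50.1, 42.8, 48.0, 120.0]

r = pearson(disease_rate, other_rate)
print(f"r = {r:.2f}")
```

A value near +1 or −1 suggests the points fall close to a straight line; it says nothing about which variable causes which.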


Change, Evolution, and Flow


Networks

  • Variables: Nominal/Ordinal
  • Stat: none
  • Geometry:
    • Node (shape) = Nom
    • Colour = Ord
    • Edge Line width = Count (computed) or Quant
  • Question:
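
When the edge line width is a computed count, the computation can be sketched like this. The conversation log and the names in it are hypothetical placeholders; each pair is sorted so that A talking to B and B talking to A count as the same edge.

```python
from collections import Counter

# Hypothetical log of who talked to whom.
conversations = [
    ("Asha", "Bala"), ("Bala", "Asha"),
    ("Asha", "Chitra"), ("Bala", "Asha"),
]

# Undirected edges: sort each pair, then count repeats.
edge_weights = Counter(tuple(sorted(pair)) for pair in conversations)

for (a, b), weight in edge_weights.items():
    print(f"{a} -- {b}: line width {weight}")
```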


Hierarchies

  • Variables: Nominal/Ordinal
  • Stat: none
  • Geometry:
    • x = Nom/Ord
    • y = Nom/Ord
  • Question:
    • How much does one Qual variable connect with another?
    • Who reports to whom, or is dependent upon whom?
    • Who cannot pass the buck any further?


Maps

  • Variables: Quant + Qual (Nominal/Ordinal)
  • Stat: none
  • Geometry:
    • x = Quant
    • y = Quant
    • colour = Nom/Ord
  • Question:
    • What does each area represent, qualitatively? (Religion, Prosperity, Voting Pattern...)
    • Which are the most affluent, or crime-prone neighbourhoods?
    • Who watches Brooklyn 99 and who watches Bojack Horseman?


What Else Could We have Discussed?

  • Coordinates: Cartesian or Polar?
    • A Bar Chart in Polar is... a Pie!!!
  • Small Multiples: Making multiple smaller graphs
  • Charts for:
    • Survey Data (Likert Plots)
    • Multiple Qual variables (Mosaic Plots)
    • Ranks (Bump Charts / Dumbbell Charts)
    • Ratings (Radar Plots)
  • Scales: Matching Axes to the Units of the variables (percentage, currency..)
  • Colour Palettes
  • Annotations on Charts

Conclusion

  • We question the world and form Hypotheses out of surprise
  • Hypotheses lead us to define Questions
  • Questions lead to Variables and Data
  • Questions with Variables lead to Graphs
  • With Graphs, we can write Stories that can drive Decisions, Designs, Policies, and...Art!!

Thanks!

Questions? Comments?

Slides created via the R packages ⚔️ xaringan + 😎 gadenbuie/xaringanExtra + ⚔️ the tidyverse
