# Choosing the right graph for your data

There are many types of graphs, which use different methods to visually encode data and present quantitative relationships. It is critical to choose the correct type of graph for your data so that these relationships are clear. More often than not, the simplest graphing option will be the best.

Some data relationships can be shown by more than one type of graph. In this situation, use the graph type that is most familiar to your particular readership. Readers are more likely to engage with, and be persuaded by, the message of your data if it is presented in a way that matches their intuitive understanding of data relationships – for example, that horizontal lines represent measurements over time.

Aim to use the same type of graph, and consistent design features, for similar kinds of data within a document or series of related documents. This will help readers to interpret the content and avoid confusion.

## Types of quantitative relationship

Quantitative relationships fall into the following categories:

• ordinal or nominal comparisons – differences across a list of ordered or unordered values for a set of items, groups or categories
• time series – how something changes over time (eg yearly)
• part to whole – ratio of each part to the whole, expressed as either percentages (with a total of 100%) or proportions of an absolute total (eg total income)
• deviation – difference between 2 sets of values (typically a set of measures compared with a baseline or prior measurement of the same variable)
• distribution – counts of values per interval of a continuous variable
• correlation – a set of 2 measurements that vary together (eg height and weight)
• geographic or spatial – comparison of data across a map (see Maps).
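
The arithmetic behind three of these categories (part to whole, deviation and distribution) can be sketched in a few lines of Python. This is only an illustration: the function names and the sample figures are hypothetical and are not taken from any real dataset.

```python
def part_to_whole(parts):
    """Return each part's percentage share of the total (shares sum to 100%)."""
    total = sum(parts.values())
    return {name: 100 * value / total for name, value in parts.items()}


def deviation(values, baseline):
    """Return the difference between each value and its baseline counterpart."""
    return [v - b for v, b in zip(values, baseline)]


def distribution(values, start, width, n_intervals):
    """Count values per interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_intervals
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_intervals:  # values outside the range are ignored
            counts[i] += 1
    return counts


# Made-up sample data, for illustration only
shares = part_to_whole({"coal": 30, "oil": 40, "gas": 25, "renewables": 5})
rain_diff = deviation([80, 45, 60], [70, 50, 60])
bp_counts = distribution([112, 118, 121, 125, 139, 141],
                         start=110, width=10, n_intervals=4)
```

Here `shares` holds percentages of the total, `rain_diff` holds positive and negative differences from the baseline, and `bp_counts` holds one count per 10-unit interval.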

## Selecting the most appropriate type of graph

The following table (adapted from Few 2012) matches data relationships to graph types to help you choose the right graph for your data. More detailed explanations for these selections are provided under each graph type in Types of graphs and plots. Graph types shown in bold are preferable to other options for the given type of data.

Type of data and relationship | Recommended graph type | Notes for use

Ordinal or nominal items, groups or categories

Compares data values across independent items, groups or categories (eg unemployment rates for each Australian state and territory)

Horizontal bar graph

• Order bars by size of data values to emphasise differences
• Use clustered bars for subcategories of groups, but limit clusters to 3 or 4 subcategories to enable comparisons across groups

Vertical bar graph

Dot plot

• Dots represent single data values for each item or group; a column of dots can represent summary values for each group
• Can be mistaken for scatter plots or time-series graphs – consider using a bar graph instead

Time series

Shows how data values for 1 or more measures change over time (eg population-adjusted breast cancer diagnoses recorded in Australia every year, for a 20-year period)

Line graph (for large time series)

• Use to highlight trends or patterns in a measure over time
• Use for datasets that include data for more than about 8 time points
• Lines connect consecutive data values
• Lines always follow a horizontal direction, with time intervals on the x axis increasing from left to right, and the measurement variable plotted on the y axis
• Only connect consecutive values – intervals with missing data must be shown as a break in the line(s)

Vertical bar graph (for small time series)

• Use for time-series data with a small number of time points – about 8 or fewer
• Use to emphasise specific data values, rather than an overall pattern or trend

Dot plot

• Dots represent data values at each time point. If connected, these dots form a line graph
• Can be mistaken for scatter plots – consider using a bar graph or line graph instead

Part to whole (ie proportions of a total)

Shows how data values relate to, compare with or make up a total measure at 1 or more points in time (eg proportion of Australia’s total primary energy supply attributable to each major fuel type)

Horizontal bar graph

• Use to show the value (ie percentage or proportion of an absolute total) of each part for a single population
• This type of data is often shown as a pie graph, which is not generally recommended

Horizontal stacked bar graph

• Use to show proportions of a total measure for multiple populations
• Parts must add to 100% if values are percentages, or to the absolute total for other scales

Vertical stacked bar graph

• Use to show proportions of a total measure over time, for about 8 or fewer time points
• Use to emphasise changes in the relative size of parts over time

Stacked area graph

• Use to show proportions of a total measure over time, for more than about 8 time points
• Use to emphasise changes in the relative size of parts over time

Deviation

Shows the difference between data values and a baseline (eg differences between actual rainfall and predicted or previous-year rainfall for each month of a year)

Vertical bar graph

• Use when your goal is to highlight deviations between measurements and some meaningful baseline or reference
• Bars (ie data values) above the reference or x axis indicate positive differences from the baseline; bars below indicate negative differences
• The y axis can measure absolute differences or percentage change between data values and the reference

Line graph

• Use to show differences from a baseline or reference over time, when the dataset includes data for more than about 8 time points
• See above points for line graphs

Single frequency or distribution data

Shows how frequency or count values are distributed over the range of a measure (eg range of blood pressure measurements for men)

Histogram (for measures with a small range)

• Use a vertical bar graph to show frequency or count values across the range of a measure with few intervals
• Use as an alternative to a frequency polygon when individual data values must be emphasised

Frequency polygon (for measures with a large range)

• Use to show frequency or count values across the range of a measure with many intervals
• Use to emphasise the shape of a distribution

Strip plot

• Use to show the distribution of a measure for a small population
• If multiple measurements are recorded for the same value on the distribution, these points should be stacked or shown in a denser tone than other (nonrepeated) points

Box plot (horizontal or vertical)

• Use to summarise a measure’s distribution, rather than all individual data values
• May be unfamiliar to readers – consider plotting a simple histogram instead

Distribution of the same measure across multiple time points or categories

Shows how frequency or count values are distributed over the range of a measure, for more than 1 population (eg range of blood pressure measurements for men with 5 different medical conditions)

Vertical box plot

• Use to summarise multiple distributions of the same measure
• May be unfamiliar to readers – consider plotting summary values (eg medians of the distribution) as a bar graph for multiple groups or populations, or a line graph with or without upper and lower bounds for multiple distributions over time

Strip plot

• Multiple distributions are plotted side by side against the same y axis
• White space should separate each distribution
• See above points for strip plots

Line graph with upper and lower bounds

• Use to show distributions with a large number of time points – not multiple, discrete populations
• Median values for the distributions at each time point are connected to form a line
• The largest and smallest values for the distribution at each time point are connected to form (typically invisible) lines above and below the median line – the areas between the median line and these upper and lower bounds are shaded
• Upper and lower bounds may be an unfamiliar feature for readers – consider whether their inclusion adds meaning and whether this outweighs potential misperceptions among readers

Correlated measures

Shows an association between 2 measures or variables (eg children’s age and height)

Scatter plot

• Each dot or data point represents a subject’s measurements on the x-axis and y-axis variables
• Use to show that data points form a meaningful shape that indicates the type (or lack) of association between 2 variables
• Consider including a trend line to highlight the type and strength of association
• Depending on the audience, readers may be unable to interpret scatter plots – consider whether side-by-side horizontal bar graphs would better communicate the association

Side-by-side horizontal bar graph

• Use to show an association between 2 measures when scatter plots are unfamiliar to readers
• Most effective for showing linear associations
• Two aligned bar graphs display each subject’s measurement on the first and second measures
• Order the bars by size on one of the graphs to emphasise the association between the 2 measures
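
The selection logic in the table can be sketched as a small lookup function. This is a simplified illustration, not part of the original guidance: the function name, its parameters and the hard cut-off at 8 time points are all assumptions (the table says "about 8"). It returns only the first-listed graph type for each case, and it ignores reader familiarity, the size of a measure's range, and geographic data, which is covered under Maps.

```python
TIME_POINT_THRESHOLD = 8  # the table says "about 8"; a hard cut-off is an assumption


def recommend_graph(relationship, time_points=None, populations=1):
    """Return the first-listed graph type in the table for the given data.

    relationship: one of "nominal", "time series", "part to whole",
                  "deviation", "distribution", "correlation"
    time_points:  number of time points, or None if the data are not over time
    populations:  number of populations or groups being compared
    """
    over_time = time_points is not None
    many_points = over_time and time_points > TIME_POINT_THRESHOLD

    if relationship == "nominal":
        return "horizontal bar graph"
    if relationship == "time series":
        if not over_time:
            raise ValueError("time series data need a number of time points")
        return "line graph" if many_points else "vertical bar graph"
    if relationship == "part to whole":
        if over_time:
            return "stacked area graph" if many_points else "vertical stacked bar graph"
        return "horizontal stacked bar graph" if populations > 1 else "horizontal bar graph"
    if relationship == "deviation":
        return "line graph" if many_points else "vertical bar graph"
    if relationship == "distribution":
        if many_points:
            return "line graph with upper and lower bounds"
        if over_time or populations > 1:
            return "vertical box plot"
        return "histogram"
    if relationship == "correlation":
        return "scatter plot"
    raise ValueError(f"unknown relationship: {relationship!r}")


# Example calls
print(recommend_graph("time series", time_points=20))   # line graph
print(recommend_graph("part to whole", populations=3))  # horizontal stacked bar graph
```

Treat the result as a starting point only. The notes for each graph type still apply, and a more familiar alternative may suit a particular readership better.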

Download our quick guide for easy reference: What type of graph is best for my data?