Adjusting figure options, optimizing them for mobile devices.


You might have already noticed it: The dot plot you produced in the last chapter still needs some tweaks. There doesn’t seem to be enough space between the arrows, and the last label (Netherlands) doesn’t even show. Also, you want the image to fit the aspect ratio of a mobile device better, so you’re going to change this with another set of chunk options.

 

Summary


The International Labour Organization (ILO) has many data sets on working conditions. For example, one can look at how weekly working hours have been decreasing in many countries of the world, while monetary compensation has risen. In this report, the reduction in weekly working hours in European countries is analysed, and a comparison between 1996 and 2006 is made. All analysed countries have seen a decrease in weekly working hours since 1996 – some more than others.

Preparations


library(dplyr)
library(ggplot2)
library(forcats)

Analysis


Data

The herein used data can be found in the statistics database of the ILO.For the purpose of this course, it has been slightly preprocessed.

load(url("http://s3.amazonaws.com/assets.datacamp.com/production/course_5807/datasets/ilo_data.RData"))

The loaded data contains 380 rows.

# Some summary statistics
ilo_data %>%
group_by(year) %>%
summarize(mean_hourly_compensation = mean(hourly_compensation),
mean_working_hours = mean(working_hours))
## # A tibble: 27 x 3
## year mean_hourly_compensation mean_working_hours
## <fct> <dbl> <dbl>
## 1 1980 9.27 34.0
## 2 1981 8.69 33.6
## 3 1982 8.36 33.5
## 4 1983 7.81 33.9
## 5 1984 7.54 33.7
## 6 1985 7.79 33.7
## 7 1986 9.70 34.0
## 8 1987 12.1 33.6
## 9 1988 13.2 33.7
## 10 1989 13.1 33.5
## # … with 17 more rows

As can be seen from the above table, the average weekly working hours of European countries have been descreasing since 1980.

Preprocessing

The data is now filtered so it only contains the years 1996 and 2006 – a good time range for comparison.

ilo_data <- ilo_data %>%
filter(year == "1996" | year == "2006")
# Reorder country factor levels
ilo_data <- ilo_data %>%
# Arrange data frame first, so last is always 2006
arrange(year) %>%
# Use the fct_reorder function inside mutate to reorder countries by working hours in 2006
mutate(country = fct_reorder(country,
working_hours,
last))

Results

In the following, a plot that shows the reduction of weekly working hours from 1996 to 2006 in each country is produced.

First, a custom theme is defined. Then, the plot is produced.

# Compute temporary data set for optimal label placement
median_working_hours <- ilo_data %>%
group_by(country) %>%
summarize(median_working_hours_per_country = median(working_hours)) %>%
ungroup()
# Have a look at the structure of this data set
str(median_working_hours)
## Classes 'tbl_df', 'tbl' and 'data.frame': 17 obs. of 2 variables:
## $ country : Factor w/ 30 levels "Netherlands",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ median_working_hours_per_country: num 27 27.8 28.4 31 30.9 ...
# Plot
ggplot(ilo_data) +
geom_path(aes(x = working_hours, y = country),
arrow = arrow(length = unit(1.5, "mm"), type = "closed")) +
# Add labels for values (both 1996 and 2006)
geom_text(
aes(x = working_hours,
y = country,
label = round(working_hours, 1),
hjust = ifelse(year == "2006", 1.4, -0.4)
),
# Change the appearance of the text
size = 3,
family = "AppleGothic",
color = "gray25"
) +
# Add labels for country
geom_text(data = median_working_hours,
aes(y = country,
x = median_working_hours_per_country,
label = country),
vjust = 2,
family = "AppleGothic",
color = "gray25") +
# Add titles
labs(
title = "People work less in 2006 compared to 1996",
subtitle = "Working hours in European countries, development since 1996",
caption = "Data source: ILO, 2017"
) +
# Apply your theme
theme_ilo() +
# Remove axes and grids
theme(
axis.ticks = element_blank(),
axis.title = element_blank(),
axis.text = element_blank(),
panel.grid = element_blank(),
# Also, let's reduce the font size of the subtitle
plot.subtitle = element_text(size = 9)
) +
# Reset coordinate system
coord_cartesian(xlim = c(25, 41))

An interesting correlation

The results of another analysis are shown here, even though they cannot be reproduced with the data at hand.

As you can see, there’s also an interesting relationship. The more people work, the less compensation they seem to receive, which seems kind of unfair. This is quite possibly related to other proxy variables like overall economic stability and performance of a country.

'R > [R] 데이터 시각화' 카테고리의 다른 글

[R] Facets()을 활용한 데이터 시각화  (0) 2019.11.24

+ Recent posts