chapter2_Plotting_Points

All content is from DataCamp.


In chapter 2 students will build on the leaflet map they created in chapter 1 to create an interactive web map of every four-year college in California. After plotting hundreds of points on an interactive leaflet map, students will learn to customize the markers on their leaflet map. This chapter will also show how to color code markers based on a factor variable.

Chapter 1. Cleaning up the Base Map

If you are storing leaflet maps in objects, there will come a time when you need to remove markers or reset the view. You can accomplish these tasks with the following functions.

clearMarkers() - Remove one or more features from a map
clearBounds() - Clear bounds and automatically determine bounds based on map elements

To remove the markers and to reset the bounds of our m map we would:

{r}
m <- m  %>% 
        addMarkers(lng = dc_hq$lon, lat = dc_hq$lat) %>% 
        setView(lat = 50.9, lng = 4.7, zoom = 5)

m  %>% 
    clearMarkers() %>% 
    clearBounds()

The leaflet map of DataCamp's headquarters has been printed for you.

{r}
# Store leaflet hq map in an object called map
# Plot DataCamp's NYC HQ
pkgs <- c("tidyverse", "leaflet", "htmlwidgets", "webshot")
sapply(pkgs, require, character.only = TRUE)

dc_hq <- data.frame(hq = c("DataCamp - NYC", "DataCamp - Belgium"),
                    lon = c(-74.0, 4.72),
                    lat = c(40.7, 50.9))

map <- leaflet() %>%
    addProviderTiles("CartoDB") %>%
    # Use dc_hq to add the hq column as popups
    addMarkers(lng = dc_hq$lon, lat = dc_hq$lat,
               popup = dc_hq$hq)

# Center the view of map on the Belgium HQ with a zoom of 5
map_zoom <- map %>%
    setView(lat = 50.881363, lng = 4.717863, zoom = 5)

{r}
# Remove markers, reset bounds, and store the updated map in the map_clear object
map_clear <- map %>%
        clearMarkers() %>% 
        clearBounds()

# Print the cleared map
map_clear

Chapter 2. Exploring the IPEDS Data

In Chapters 2 and 3, we will be using a subset of the IPEDS data that focuses on public, private, and for-profit four-year institutions. The United States also has many institutions that are classified as two-year colleges or vocational institutions, which are not included in this course. Our subset has five variables on 3,146 colleges.

The sector_label column in the ipeds data frame indicates whether a college is public, private, or for-profit. In the console, use the group_by() and the count() functions from the dplyr package to determine which sector of college is most common.

The tidyverse package, which includes dplyr, has been loaded for you. In your workspace, you also have access to the ipeds dataframe.

Which sector of college is most common in the IPEDS data?
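
As a minimal sketch of the group_by() and count() approach described above: the toy_ipeds data frame below is a made-up stand-in, not the real ipeds data, so the counts it prints say nothing about the actual answer.

```r
library(dplyr)

# Hypothetical stand-in for the ipeds data frame (rows are invented)
toy_ipeds <- tibble(
  name         = c("College A", "College B", "College C", "College D"),
  sector_label = c("Public", "Private", "Private", "For-Profit")
)

# Count colleges per sector and put the most common sector first
sector_counts <- toy_ipeds %>%
  group_by(sector_label) %>%
  count() %>%
  arrange(desc(n))

sector_counts
```

Running the same pipeline on the real ipeds data frame would put the most common sector in the first row.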

The data comes from Tableau Public. You can download the data directly here.

In [63]:
# data cleansing function
# I wrote this function myself
data_cleansing <- function(data) {
  library(dplyr)
  
  # Keep only the columns needed for mapping
  data <- data %>%
    select(Name,
           `Longitude location of institution`,
           `Latitude location of institution`,
           `State abbreviation`,
           `Sector of institution`)
  
  names(data) <- c("name", "lng", "lat", "state", "sector_label")
  
  data$sector_label[grepl('Private', data$sector_label)] <- 'private'
  data$sector_label[grepl('Public', data$sector_label)] <- 'public'
  
  return(data)
}
In [11]:
library(rio)
# step 1. data import
data <- import("data/IPEDS_data.xlsx")

# step 2. data cleansing
ipeds <- data_cleansing(data = data)
glimpse(ipeds)
Observations: 1,534
Variables: 5
$ name         <chr> "Alabama A & M University", "University of Alabama at Bi…
$ lng          <dbl> -86.56850, -86.80917, -86.17401, -86.63842, -86.29568, -…
$ lat          <dbl> 34.78337, 33.50223, 32.36261, 34.72282, 32.36432, 33.214…
$ state        <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "…
$ sector_label <chr> "public", "public", "private", "public", "public", "publ…

Chapter 3. Exploring the IPEDS Data II

Most analyses require data wrangling. Luckily, there are many functions in the tidyverse that facilitate data frame cleaning. For example, the drop_na() function will remove observations with missing values. By default, drop_na() will check all columns for missing values and will remove all observations with one or more missing values.

{r}
miss_ex <- tibble(
             animal = c("dog", "cat", "rat", NA),
             name   = c("Woodruf", "Stryker", NA, "Morris"),
             age    = c(1:4))
miss_ex

miss_ex %>% 
     drop_na() %>% 
     arrange(desc(age))

# A tibble: 2 x 3
  animal    name   age
   <chr>   <chr> <dbl>
1    cat Stryker     2
2    dog Woodruf     1
In [14]:
# Remove colleges with missing sector information
library(tidyverse)
ipeds2 <- ipeds %>% drop_na()
glimpse(ipeds2)
Observations: 1,534
Variables: 5
$ name         <chr> "Alabama A & M University", "University of Alabama at Bi…
$ lng          <dbl> -86.56850, -86.80917, -86.17401, -86.63842, -86.29568, -…
$ lat          <dbl> 34.78337, 33.50223, 32.36261, 34.72282, 32.36432, 33.214…
$ state        <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "…
$ sector_label <chr> "public", "public", "private", "public", "public", "publ…
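
Called without arguments, drop_na() checks every column, which is broader than the "missing sector information" named in the comment. The sketch below contrasts that with passing a column name to drop_na(), using a made-up toy data frame rather than the real ipeds data.

```r
library(dplyr)
library(tidyr)

# Hypothetical example data with gaps in two different columns
toy <- tibble(
  name         = c("College A", "College B", "College C"),
  lng          = c(-118.2, NA, -122.4),
  sector_label = c("public", "private", NA)
)

# Drops every row that has a missing value in any column
all_complete <- toy %>% drop_na()

# Drops only the rows where sector_label is missing
sector_known <- toy %>% drop_na(sector_label)

nrow(all_complete)
nrow(sector_known)
```

In the run shown above, the row count is 1,534 both before and after drop_na(), so no rows were removed and either form would give the same result on that file.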
In [18]:
# Count the number of four-year colleges in each state
ipeds2 %>% group_by(state) %>% count() %>% head(6)
state       n
Alabama     28
Alaska      4
Arizona     8
Arkansas    20
California  93
Colorado    20
In [23]:
# Create a list of US States in descending order by the number of colleges in each state
ipeds2 %>% 
    group_by(state) %>% 
    count() %>% 
    arrange(desc(n)) %>% 
    head(6)
state          n
New York       122
Pennsylvania   114
California     93
Texas          70
Ohio           60
Massachusetts  59

Chapter 4. California Colleges

Now it is your turn to map all of the colleges in a state. In this exercise, we'll apply our example of mapping Maine's colleges to California's colleges. The first step is to set up your data by filtering the ipeds data frame to include only colleges in California. For reference, you will find how we accomplished this with the colleges in Maine below.

{r}
maine_colleges <- 
    ipeds %>% 
        filter(state == "ME")

maine_colleges

# A tibble: 21 x 5
                     name       lng      lat state sector_label
                    <chr>     <dbl>    <dbl> <chr>        <chr>
1           Bates College -70.20333 44.10530    ME      Private
2         Bowdoin College -69.96524 43.90690    ME      Private
In [28]:
## Create Dataframe called 'ca' with data on only colleges in California
ca <- ipeds2 %>% filter(state == "California")
glimpse(ca)
Observations: 93
Variables: 5
$ name         <chr> "Azusa Pacific University", "Biola University", "Califor…
$ lng          <dbl> -117.8880, -118.0173, -122.4165, -117.4259, -118.1257, -…
$ lat          <dbl> 34.13087, 33.90482, 37.77477, 33.92857, 34.13927, 34.225…
$ state        <chr> "California", "California", "California", "California", …
$ sector_label <chr> "private", "private", "private", "private", "private", "…
In [40]:
# Use `addMarkers` to plot all of the colleges in `ca` on the `map` leaflet map
library(leaflet)
{r}
map <- leaflet() %>% addProviderTiles("CartoDB")
map %>% 
    addMarkers(lng = ca$lng, lat = ca$lat)

Chapter 5. The City of Colleges

Based on our map of California colleges it appears that there is a cluster of colleges in and around the City of Angels (i.e., Los Angeles). Let's take a closer look at these institutions on our leaflet map.

The coordinates for the center of LA are provided for you in the la_coords data frame.

{r}
la_coords <- data.frame(lat = 34.05223, lon = -118.2437)

Once you create a map focused on LA, try panning and zooming the map. Can you find the cluster of colleges East of LA known as the Claremont Colleges?

When there are hundreds of markers, do you find the pin markers helpful or do they get in your way?

The coordinates of LA have been provided in the la_coords data frame and the ca data frame of California colleges and the map have been loaded for you.

{r}
la_coords <- data.frame(lat = 34.05223, lon = -118.2437) 

# Center the map on LA 
map %>% 
   addMarkers(data = ca) %>% 
   setView(lat = la_coords$lat, lng = la_coords$lon, zoom = 12)
{r}
# Set the zoom level to 8 and store in the map_zoom object
map_zoom <-
    map %>%
    addMarkers(data = ca) %>%
    setView(lat = la_coords$lat, lng = la_coords$lon, zoom = 8)

map_zoom

Chapter 6. Circle Markers

Circle markers are notably different from pin markers:

- We can control their size
- They do not "stand-up" on the map
- We can more easily change their color

There are many ways to customize circle markers and the design of your leaflet map. To get started we will focus on the following arguments.

{r}
addCircleMarkers(map, lng = NULL, lat = NULL, 
                 radius = 10, color = "#03F", popup = NULL)

The first argument map takes a leaflet object, which we will pipe directly into addCircleMarkers(). lng and lat are the coordinates we are mapping. The other arguments can customize the appearance and information presented by each marker.

The ca data frame and the leaflet object map have been loaded for you.

{r}
# Clear the markers from the map 
map2 <- map %>% 
            clearMarkers()
{r}
# Use addCircleMarkers() to plot each college as a circle
map2 %>%
    addCircleMarkers(lng = ca$lng, lat = ca$lat)
{r}
# Change the radius of each circle to be 2 pixels and the color to red
map2 %>% 
    addCircleMarkers(lng = ca$lng, lat = ca$lat,
                     radius = 2, color = "red")

Chapter 7. Making our Map Pop

Similar to building a plot with ggplot2 or manipulating data with dplyr, your map needs to be stored in an object if you reference it later in your code.

Speaking of dplyr, the %>% operator can pipe data into the function chain that creates a leaflet map.

{r}
ipeds %>% 
    leaflet()  %>% 
        addTiles() %>% 
        addCircleMarkers(popup = ~name, color = "#FF0000")

Piping makes our code more readable and allows us to refer to variables using the ~ operator rather than repeatedly specifying the data frame.
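
To make the ~ shorthand concrete, the sketch below builds the same circle markers twice: once with formulas that look up columns in the piped data frame, and once spelling out the data frame each time. The toy_colleges data frame and its coordinates are invented for illustration.

```r
library(leaflet)

# Hypothetical data; names and coordinates are made up
toy_colleges <- data.frame(
  name = c("College A", "College B"),
  lng  = c(-118.24, -122.42),
  lat  = c(34.05, 37.77)
)

# Formula interface: ~name, ~lng and ~lat refer to columns of the piped data
map_formula <- toy_colleges %>%
  leaflet() %>%
  addTiles() %>%
  addCircleMarkers(lng = ~lng, lat = ~lat, popup = ~name, color = "#FF0000")

# Same markers without formulas: the data frame is repeated for each argument
map_explicit <- leaflet() %>%
  addTiles() %>%
  addCircleMarkers(lng = toy_colleges$lng, lat = toy_colleges$lat,
                   popup = toy_colleges$name, color = "#FF0000")
```

Both objects should draw the same two markers; the formula version is simply shorter once several arguments draw on the same data frame.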

The color argument in addCircleMarkers() takes the name of a color or a hex code. For example, red or #FF0000.

map has been printed for you. Notice the circle markers are gone!

{r}
# Add circle markers with popups for college names
map %>% 
    addCircleMarkers(data = ca, radius = 2, popup = ~name)
{r}
# Change circle color to #2cb42c and store map in map_color object
map_color <- map %>% 
    addCircleMarkers(data = ca, radius = 2, color = "#2cb42c", popup = ~name)

# Print map_color
map_color

Chapter 8. Building a Better Pop-up

With the paste0() function and a few html tags, we can customize our popups. paste0() converts its arguments to characters and combines them into a single string without separating the arguments.

{r}
addCircleMarkers(popup = ~paste0(name,
                                 "<br/>",
                                 sector_label))

We can use the <br/> tag to create a line break so that each element appears on a separate line.

To distinguish different data elements, we can italicize the name of each college by wrapping the name variable in <i></i> tags.

{r}
addCircleMarkers(popup = ~paste0("<i>",
                                 name,
                                 "</i>", 
                                 "<br/>", 
                                 sector_label))
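
It can help to look at the strings paste0() produces before handing them to leaflet. The sketch below uses a made-up two-row data frame and base R only; paste() is shown next to it purely to contrast its default space separator.

```r
# Hypothetical two-row data frame standing in for the college data
toy <- data.frame(
  name         = c("College A", "College B"),
  sector_label = c("public", "private")
)

# paste0() is vectorized: one HTML string per row, with no separator inserted
popup_html <- paste0("<i>", toy$name, "</i>", "<br/>", toy$sector_label)
popup_html

# paste() inserts a space between its arguments by default
spaced <- paste(toy$name, toy$sector_label)
spaced
```

Each element of popup_html is the HTML that a popup for the corresponding row would render.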
In [51]:
# Clear the bounds and markers on the map object and store in map2
map2 <- map %>% 
        clearMarkers() %>% 
        clearBounds()
{r}
# Add circle markers with popups that display both the institution name and sector
map2 %>% 
    addCircleMarkers(data = ca, radius = 2, 
                     popup = ~paste0(name, "<br/>", sector_label))
{r}
# Make the institution name in each popup bold
map2 %>% 
    addCircleMarkers(data = ca, radius = 2, 
                     popup = ~paste0("<b>", name, "</b>", "<br/>", sector_label))

Chapter 9. Swapping Popups for Labels

Popups are great, but they require a little extra effort. That is when labels come to our aid. Using the label argument in the addCircleMarkers() function we can get more information about one of our markers with a simple hover!

{r}
ipeds %>% 
    leaflet()  %>% 
    addProviderTiles("CartoDB")  %>% 
    addCircleMarkers(label = ~name, radius = 2)

Labels are especially helpful when mapping more than a few locations as they provide quick access to detail about what each marker represents.

{r}
# Add circle markers with labels identifying the name of each college
map %>% 
    addCircleMarkers(data = ca, radius = 2, label = ~name)
In [55]:
# Use paste0 to add sector information to the label inside parentheses 
map %>% 
    addCircleMarkers(data = ca, radius = 2, label = ~paste0(name, " (", sector_label, ")"))
Assuming "lng" and "lat" are longitude and latitude, respectively

Chapter 10. Creating a Color Palette using colorFactor

So far we have only used color to customize the style of our map. With colorFactor() we can create a color palette that maps colors to the levels of a factor variable.

{r}
pal <- 
   colorFactor(palette = c("blue", "red", "green"), 
               levels = c("Public", "Private", "For-Profit"))

m %>% 
    addCircleMarkers(color = ~pal(sector_label))

Why might we not want to use this particular color palette?
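
One likely reason is that red and green are hard to tell apart for readers with red-green color blindness. As a sketch of an alternative, the palette below swaps in three colors from the Okabe-Ito set, which was designed with color blindness in mind. The specific colors and the name pal_cb are my own choices, not part of the course.

```r
library(leaflet)

# Blue, orange and bluish green from the Okabe-Ito palette
# (illustrative choice, not the palette used in the course)
pal_cb <- colorFactor(palette = c("#0072B2", "#E69F00", "#009E73"),
                      levels  = c("Public", "Private", "For-Profit"))

# pal_cb() is a function: it returns one color per sector label
sector_colors <- pal_cb(c("Public", "Private", "For-Profit"))
sector_colors
```

The resulting pal_cb can be used in the same way as pal, for example color = ~pal_cb(sector_label).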

If you are interested in using a continuous variable to color a map see colorNumeric().

{r}
pal <- colorNumeric(palette = "RdBu", domain = c(25:50))

ipeds %>% 
    leaflet() %>% 
        addProviderTiles("CartoDB")  %>% 
        addCircleMarkers(radius = 2, color = ~pal(lat))
{r}

# Make a color palette called pal for the values of `sector_label` using `colorFactor()`
# Colors are "red" for "public" and "blue" for "private", the two labels produced by data_cleansing()
pal <- colorFactor(palette = c("red", "blue"), 
                   levels = c("public", "private"))

# Add circle markers that color colleges using pal() and the values of sector_label
map2 <- 
    map %>% 
        addCircleMarkers(data = ca, radius = 2, 
                         color = ~pal(sector_label), 
                         label = ~paste0(name, " (", sector_label, ")"))

# Print map2
map2

Chapter 11. A Legendary Map

Adding information to our map using color is great, but it is only helpful if we remember what the colors represent. With addLegend() we can add a legend to remind us.

There are several arguments we can use to customize the legend to our liking, including opacity, title, and position. To create a legend for our colorNumeric() example, we would do the following.

{r}
pal <- colorNumeric(palette = "RdBu", domain = c(25:50))

ipeds %>% 
    leaflet() %>% 
        addProviderTiles("CartoDB")  %>% 
        addCircleMarkers(radius = 2,
                         color = ~pal(lat)) %>% 
         addLegend(pal = pal,
                   values = c(25:50),
                   opacity = 0.75,
                   title = "Latitude",
                   position = "topleft")
{r}
# Make a color palette called pal for the values of `sector_label` using `colorFactor()`
# Colors are "red" for "public" and "blue" for "private", the two labels produced by data_cleansing()
pal <- colorFactor(palette = c("red", "blue"), 
                   levels = c("public", "private"))

# Customize the legend
map2 %>% 
    addLegend(pal = pal, 
              values = c("public", "private"),
              # opacity of .5, title of Sector, and position of topright
              opacity = 0.5, title = "Sector", position = "topright")

The final code follows.
In [96]:
# Load the required packages
pkgs <- c("tidyverse", "leaflet", "htmlwidgets", "webshot", "rio")
sapply(pkgs, require, character.only = TRUE)

# step 1. data import
data <- import("data/IPEDS_data.xlsx")

# step 2. data cleansing
ipeds <- data_cleansing(data = data)

# step 3. visualization
pal <- colorFactor(palette = c("red", "blue"), 
                   levels = c("public", "private"))
map_circle <- ipeds %>% 
    leaflet() %>% 
    addProviderTiles("CartoDB") %>% 
    addCircleMarkers(radius = 2, 
                     color = ~pal(sector_label), 
                     label = ~paste0(name, " (", sector_label, ")")) %>% 
    addLegend(pal = pal, 
              values = c("public", "private"),
              # opacity of .5, title of Sector, and position of topright
              opacity = 0.5, title = "Sector", position = "topright")

# saving leaflet
## create .html and .png
## save html to png
saveWidget(map_circle, "chapter2_map_circle.html", selfcontained = FALSE)
webshot("chapter2_map_circle.html", file = "chapter2_map_circle.png",
        cliprect = "viewport")
  tidyverse     leaflet htmlwidgets     webshot         rio
       TRUE        TRUE        TRUE        TRUE        TRUE
Assuming "lng" and "lat" are longitude and latitude, respectively


