Lectures

Change the overall appearance of your report

In the YAML header, more than just the title and output format can be specified. You can also customize things such as syntax highlighting or the overall appearance by specifying a custom theme.

output: 
  html_document:
    theme: cosmo
    highlight: monochrome

Add a table of contents

Another cool feature of RMarkdown reports (whether HTML or PDF) is an automatically generated table of contents (TOC), which you can customize with several settings. You add a TOC with toc: true and specify whether it should be floating (that is, whether it moves along as you scroll) with toc_float. The depth of your TOC is set with toc_depth.

output: 
  html_document:
    theme: cosmo
    highlight: monochrome
    toc: true
    toc_float: false
    toc_depth: 4

More YAML hacks

There are many more customizations for RMarkdown reports, and most of them can be configured via the YAML header. Before you dig deeper into custom stylesheets, let’s enable code folding with code_folding: …. This will allow your readers to completely hide all code chunks or show them.

output: 
  html_document:
    theme: cosmo
    highlight: monochrome
    toc: true
    toc_float: false
    toc_depth: 4
    number_sections: true
    code_folding: hide

Change style attributes of text elements

With CSS, it’s easy to change the appearance of text in your report. In this exercise, you’re going to change the font to a font with serifs, in accordance with the style of your plots. You’re also going to try out a few other CSS selectors in order to change some colors and font sizes in your report. For example, the font of the R code elements is currently a little on the larger side, compared to the surrounding prose. You’ll use CSS to reduce their size. Here, all of your CSS should go inside the <style> tags above the Summary. In the next exercise, you’ll learn how to reference an external CSS file using the YAML header. If you need more help regarding the styling of text, you can refer to the Mozilla Developer reference.

<style>
body, h1, h2, h3, h4 {
    font-family: "Bookman", serif;
}

body {
    color: #333333;
}
a, a:hover {
    color: red;
}
pre {
    font-size: 10px;
}
</style>

Reference the style sheet

See the new pane in the exercise interface called styles.css? As mentioned in the previous exercise, you can reference an external CSS file in the YAML header of your RMarkdown document like so:

title: "Test"
output:
  html_document:
    css: styles.css

Your CSS from before is now contained in styles.css. It’s time to reference styles.css in your YAML header so that the CSS rules are applied to your report.

Beautify a table with kable

You’ve just heard it: there are two ways to beautify a table with kable: either directly in code chunks by calling the knitr::kable() function, or in the YAML header. Here you will try out the former:

my_data_frame %>% knitr::kable()
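As a minimal sketch of the code-chunk approach, the snippet below builds a small made-up data frame and passes it to knitr::kable(). The data frame, its column names, and the caption are placeholders for illustration, not course data; digits and caption are optional arguments of kable().

```r
library(knitr)

# Hypothetical example data; replace with your own data frame
my_data_frame <- data.frame(
  country = c("Belgium", "United States"),
  share   = c(12.345, 67.891)
)

# Round numeric columns to one digit and add a caption
my_table <- kable(my_data_frame, digits = 1, caption = "An example table")
my_table
```

Inside an RMarkdown code chunk, the printed result is rendered as a formatted table in the knitted report.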

Example

http://rpubs.com/Evan_Jung/Customized_Report

chapter2_Plotting_Points

All the contents are from DataCamp


In chapter 2, students will build on the leaflet map they created in chapter 1 to create an interactive web map of every four-year college in California. After plotting hundreds of points on an interactive leaflet map, students will learn to customize the markers on their leaflet map. This chapter will also show how to color code markers based on a factor variable.

Chapter 1. Cleaning up the Base Map

If you are storing leaflet maps in objects, there will come a time when you need to remove markers or reset the view. You can accomplish these tasks with the following functions.

clearMarkers() - Remove one or more features from a map
clearBounds() - Clear bounds and automatically determine bounds based on map elements

To remove the markers and to reset the bounds of our m map we would:

{r}
m <- m  %>% 
        addMarkers(lng = dc_hq$lon, lat = dc_hq$lat) %>% 
        setView(lat = 50.9, lng = 4.7, zoom = 5)

m  %>% 
    clearMarkers() %>% 
    clearBounds()

The leaflet map of DataCamp's headquarters has been printed for you.

{r}
# Store leaflet hq map in an object called map
# Plot DataCamp's NYC HQ
pkgs <- c("tidyverse", "leaflet", "htmlwidgets", "webshot")
sapply(pkgs, require, character.only = TRUE)

dc_hq <- data.frame(hq = c("DataCamp - NYC", "DataCamp - Belgium"),
                    lon = c(-74.0, 4.72),
                    lat = c(40.7, 50.9))

map <- leaflet() %>%
    addProviderTiles("CartoDB") %>%
    # Use dc_hq to add the hq column as popups
    addMarkers(lng = dc_hq$lon, lat = dc_hq$lat,
               popup = dc_hq$hq)

# Center the view of map on the Belgium HQ with a zoom of 5
map_zoom <- map %>%
    setView(lat = 50.881363, lng = 4.717863, zoom = 5)

{r}
# Remove markers, reset bounds, and store the updated map in the map_clear object
map_clear <- map %>%
        clearMarkers() %>% 
        clearBounds()

# Print the cleared map
map_clear

Chapter 2. Exploring the IPEDS Data

In Chapters 2 and 3, we will be using a subset of the IPEDS data that focuses on public, private, and for-profit four-year institutions. The United States also has many institutions that are classified as two-year colleges or vocational institutions, which are not included in this course. Our subset has five variables on 3,146 colleges.

The sector_label column in the ipeds data frame indicates whether a college is public, private, or for-profit. In the console, use the group_by() and the count() functions from the dplyr package to determine which sector of college is most common.

The tidyverse package, which includes dplyr, has been loaded for you. In your workspace, you also have access to the ipeds dataframe.

Which sector of college is most common in the IPEDS data?
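A minimal sketch of how this question could be answered with group_by() and count(). The ipeds_demo data frame below is a made-up stand-in for ipeds (its rows are invented for illustration), so the counts it prints say nothing about the real data.

```r
library(dplyr)

# Hypothetical miniature stand-in for the ipeds data frame
ipeds_demo <- data.frame(
  name         = c("College A", "College B", "College C", "College D"),
  sector_label = c("Public", "Private", "Private", "For-Profit")
)

# Count colleges per sector and sort so the most common sector comes first
sector_counts <- ipeds_demo %>%
  group_by(sector_label) %>%
  count() %>%
  arrange(desc(n))

sector_counts
```

Running the same pipeline on ipeds puts the most common sector in the first row.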

The data comes from Tableau Public. You can download it directly here.

{r}
# Data cleansing function
# I wrote this helper myself; it is not part of the course
data_cleansing <- function(data) {
  library(dplyr)
  
  data <- data %>% select(Name, 'Longitude location of institution', 'Latitude location of institution', 'State abbreviation', 'Sector of institution')
  
  names(data) <- c("name", "lng", "lat", "state", "sector_label")
  
  data$sector_label[grepl('Private', data$sector_label)] <- 'private'
  data$sector_label[grepl('Public', data$sector_label)] <- 'public'
  
  return(data)
}
{r}
library(rio)
# step 1. data import
data <- import("data/IPEDS_data.xlsx")

# step 2. data cleansing
ipeds <- data_cleansing(data = data)
glimpse(ipeds)
Observations: 1,534
Variables: 5
$ name         <chr> "Alabama A & M University", "University of Alabama at Bi…
$ lng          <dbl> -86.56850, -86.80917, -86.17401, -86.63842, -86.29568, -…
$ lat          <dbl> 34.78337, 33.50223, 32.36261, 34.72282, 32.36432, 33.214…
$ state        <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "…
$ sector_label <chr> "public", "public", "private", "public", "public", "publ…

Chapter 3. Exploring the IPEDS Data II

Most analyses require data wrangling. Luckily, there are many functions in the tidyverse that facilitate data frame cleaning. For example, the drop_na() function will remove observations with missing values. By default, drop_na() will check all columns for missing values and will remove all observations with one or more missing values.

{r}
miss_ex <- tibble(
             animal = c("dog", "cat", "rat", NA),
             name   = c("Woodruf", "Stryker", NA, "Morris"),
             age    = c(1:4))
miss_ex

miss_ex %>% 
     drop_na() %>% 
     arrange(desc(age))

# A tibble: 2 x 3
  animal    name   age
   <chr>   <chr> <dbl>
1    cat Stryker     2
2    dog Woodruf     1
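Because drop_na() checks every column only by default, it can also be restricted to selected columns. The sketch below reuses the miss_ex example and names the animal column explicitly; the object name animal_known is chosen here for illustration.

```r
library(tidyr)
library(tibble)

miss_ex <- tibble(
  animal = c("dog", "cat", "rat", NA),
  name   = c("Woodruf", "Stryker", NA, "Morris"),
  age    = 1:4
)

# Only rows with a missing value in `animal` are dropped;
# the "rat" row survives although its `name` is missing
animal_known <- drop_na(miss_ex, animal)
animal_known
```

This is useful when only some columns, such as the coordinates or the sector, must be complete for the analysis at hand.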
{r}
# Remove colleges with missing sector information
library(tidyverse)
ipeds2 <- ipeds %>% drop_na()
glimpse(ipeds2)
Observations: 1,534
Variables: 5
$ name         <chr> "Alabama A & M University", "University of Alabama at Bi…
$ lng          <dbl> -86.56850, -86.80917, -86.17401, -86.63842, -86.29568, -…
$ lat          <dbl> 34.78337, 33.50223, 32.36261, 34.72282, 32.36432, 33.214…
$ state        <chr> "Alabama", "Alabama", "Alabama", "Alabama", "Alabama", "…
$ sector_label <chr> "public", "public", "private", "public", "public", "publ…
{r}
# Count the number of four-year colleges in each state
ipeds2 %>% group_by(state) %>% count() %>% head(6)
state          n
Alabama       28
Alaska         4
Arizona        8
Arkansas      20
California    93
Colorado      20
{r}
# Create a list of US States in descending order by the number of colleges in each state
ipeds2 %>% 
    group_by(state) %>% 
    count() %>% 
    arrange(desc(n)) %>% 
    head(6)
state            n
New York       122
Pennsylvania   114
California      93
Texas           70
Ohio            60
Massachusetts   59

Chapter 4. California Colleges

Now it is your turn to map all of the colleges in a state. In this exercise, we'll apply our example of mapping Maine's colleges to California's colleges. The first step is to set up your data by filtering the ipeds data frame to include only colleges in California. For reference, you will find how we accomplished this with the colleges in Maine below.

{r}
maine_colleges <- 
    ipeds %>% 
        filter(state == "ME")

maine_colleges

# A tibble: 21 x 5
                     name       lng      lat state sector_label
                    <chr>     <dbl>    <dbl> <chr>        <chr>
1           Bates College -70.20333 44.10530    ME      Private
2         Bowdoin College -69.96524 43.90690    ME      Private
{r}
## Create Dataframe called 'ca' with data on only colleges in California
ca <- ipeds2 %>% filter(state == "California")
glimpse(ca)
Observations: 93
Variables: 5
$ name         <chr> "Azusa Pacific University", "Biola University", "Califor…
$ lng          <dbl> -117.8880, -118.0173, -122.4165, -117.4259, -118.1257, -…
$ lat          <dbl> 34.13087, 33.90482, 37.77477, 33.92857, 34.13927, 34.225…
$ state        <chr> "California", "California", "California", "California", …
$ sector_label <chr> "private", "private", "private", "private", "private", "…
{r}
# Use `addMarkers` to plot all of the colleges in `ca` on the `map` leaflet map
library(leaflet)
{r}
map <- leaflet() %>% addProviderTiles("CartoDB")
map %>% 
    addMarkers(lng = ca$lng, lat = ca$lat)

Chapter 5. The City of Colleges

Based on our map of California colleges, it appears that there is a cluster of colleges in and around the City of Angels (i.e., Los Angeles). Let's take a closer look at these institutions on our leaflet map.

The coordinates for the center of LA are provided for you in the la_coords data frame.

{r}
la_coords <- data.frame(lat = 34.05223, lon = -118.2437)

Once you create a map focused on LA, try panning and zooming the map. Can you find the cluster of colleges East of LA known as the Claremont Colleges?

When there are hundreds of markers, do you find the pin markers helpful or do they get in your way?

The la_coords data frame with the coordinates of LA, the ca data frame of California colleges, and the map object have been loaded for you.

{r}
la_coords <- data.frame(lat = 34.05223, lon = -118.2437) 

# Center the map on LA 
map %>% 
   addMarkers(data = ca) %>% 
   setView(lat = la_coords$lat, lng = la_coords$lon, zoom = 12)
{r}
# Set the zoom level to 8 and store in the map_zoom object
map_zoom <-
    map %>%
    addMarkers(data = ca) %>%
     setView(lat = la_coords$lat, lng = la_coords$lon, zoom = 8)

map_zoom

Chapter 6. Circle Markers

Circle markers are notably different from pin markers:

We can control their size
They do not "stand-up" on the map
We can more easily change their color

There are many ways to customize circle markers and the design of your leaflet map. To get started we will focus on the following arguments.

{r}
addCircleMarkers(map, lng = NULL, lat = NULL, 
                 radius = 10, color = "#03F", popup = NULL)

The first argument map takes a leaflet object, which we will pipe directly into addCircleMarkers(). lng and lat are the coordinates we are mapping. The other arguments can customize the appearance and information presented by each marker.

The ca data frame and the leaflet object map have been loaded for you.

{r}
# Clear the markers from the map 
map2 <- map %>% 
            clearMarkers()
{r}
# Use addCircleMarkers() to plot each college as a circle
map2 %>%
    addCircleMarkers(lng = ca$lng, lat = ca$lat)
{r}
# Change the radius of each circle to be 2 pixels and the color to red
map2 %>% 
    addCircleMarkers(lng = ca$lng, lat = ca$lat,
                     radius = 2, color = "red")

Chapter 7. Making our Map Pop

Similar to building a plot with ggplot2 or manipulating data with dplyr, your map needs to be stored in an object if you reference it later in your code.

Speaking of dplyr, the %>% operator can pipe data into the function chain that creates a leaflet map.

{r}
ipeds %>% 
    leaflet()  %>% 
        addTiles() %>% 
        addCircleMarkers(popup = ~name, color = "#FF0000")

Piping makes our code more readable and allows us to refer to variables using the ~ operator rather than repeatedly specifying the data frame.
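The difference between the two styles can be sketched side by side. The colleges data frame below is a made-up two-row stand-in for ipeds, and the object names map_explicit and map_formula are chosen for illustration only.

```r
library(leaflet)

# Hypothetical two-row stand-in for the ipeds data frame
colleges <- data.frame(
  name = c("College A", "College B"),
  lng  = c(-118.24, -122.42),
  lat  = c(34.05, 37.77)
)

# Without piping the data: every argument names the data frame explicitly
map_explicit <- leaflet() %>%
  addTiles() %>%
  addCircleMarkers(lng = colleges$lng, lat = colleges$lat,
                   popup = colleges$name)

# With the data piped in: ~ looks the columns up in `colleges`
map_formula <- colleges %>%
  leaflet() %>%
  addTiles() %>%
  addCircleMarkers(lng = ~lng, lat = ~lat, popup = ~name)
```

Both objects should draw the same two markers; the second form simply avoids repeating the data frame name.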

The color argument in addCircleMarkers() takes the name of a color or a hex code. For example, red or #FF0000.

map has been printed for you. Notice the circle markers are gone!

{r}
# Add circle markers with popups for college names
map %>% 
    addCircleMarkers(data = ca, radius = 2, popup = ~name)
{r}
# Change circle color to #2cb42c and store map in map_color object
map_color <- map %>% 
    addCircleMarkers(data = ca, radius = 2, color = "#2cb42c", popup = ~name)

# Print map_color
map_color

Chapter 8. Building a Better Pop-up

With the paste0() function and a few html tags, we can customize our popups. paste0() converts its arguments to characters and combines them into a single string without separating the arguments.

{r}
addCircleMarkers(popup = ~paste0(name,
                                 "<br/>",
                                 sector_label))

We can use the <br/> tag to create a line break to have each element appear on a separate line.

To distinguish different data elements, we can make the name of each college italics by wrapping the name variable in <i></i> tags.

{r}
addCircleMarkers(popup = ~paste0("<i>",
                                 name,
                                 "</i>", 
                                 "<br/>", 
                                 sector_label))
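To see what paste0() produces before handing it to a map, it can be run on plain vectors. In this sketch the two vectors are made-up stand-ins for the name and sector_label columns.

```r
# Hypothetical values standing in for the name and sector_label columns
name         <- c("College A", "College B")
sector_label <- c("Public", "Private")

# paste0() is vectorized: it builds one popup string per college
popups <- paste0("<i>", name, "</i>", "<br/>", sector_label)
popups
# "<i>College A</i><br/>Public"  "<i>College B</i><br/>Private"
```

Each string is the HTML that leaflet renders inside the corresponding popup.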
{r}
# Clear the bounds and markers on the map object and store in map2
map2 <- map %>% 
        clearMarkers() %>% 
        clearBounds()
{r}
# Add circle markers with popups that display both the institution name and sector
map2 %>% 
    addCircleMarkers(data = ca, radius = 2, 
                     popup = ~paste0(name, "<br/>", sector_label))
{r}
# Make the institution name in each popup bold
map2 %>% 
    addCircleMarkers(data = ca, radius = 2, 
                     popup = ~paste0("<b>", name, "</b>", "<br/>", sector_label))

Chapter 9. Swapping Popups for Labels

Popups are great, but they require a little extra effort. That is when labels come to our aid. Using the label argument in the addCircleMarkers() function, we can get more information about one of our markers with a simple hover!

{r}
ipeds %>% 
    leaflet()  %>% 
    addProviderTiles("CartoDB")  %>% 
    addCircleMarkers(label = ~name, radius = 2)

Labels are especially helpful when mapping more than a few locations as they provide quick access to detail about what each marker represents.

{r}
# Add circle markers with labels identifying the name of each college
map %>% 
    addCircleMarkers(data = ca, radius = 2, label = ~name)
{r}
# Use paste0 to add sector information to the label inside parentheses 
map %>% 
    addCircleMarkers(data = ca, radius = 2, label = ~paste0(name, " (", sector_label, ")"))
Assuming "lng" and "lat" are longitude and latitude, respectively

Chapter 10. Creating a Color Palette using colorFactor

So far we have only used color to customize the style of our map. With colorFactor() we can create a color palette that maps colors to the levels of a factor variable.

{r}
pal <- 
   colorFactor(palette = c("blue", "red", "green"), 
               levels = c("Public", "Private", "For-Profit"))

m %>% 
    addCircleMarkers(color = ~pal(sector_label))

Why might we not want to use this particular color palette?

If you are interested in using a continuous variable to color a map see colorNumeric().

{r}
pal <- colorNumeric(palette = "RdBu", domain = c(25:50))

ipeds %>% 
    leaflet() %>% 
        addProviderTiles("CartoDB")  %>% 
        addCircleMarkers(radius = 2, color = ~pal(lat))
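It can help to look at a palette on its own before using it in a map. The sketch below assumes two lowercase sector levels chosen for illustration; it shows that colorFactor() returns a function, and that calling this function yields one color string per input value.

```r
library(leaflet)

# A palette function for two sector levels (levels chosen for illustration)
pal <- colorFactor(palette = c("red", "blue"),
                   levels = c("public", "private"))

# Calling the palette function returns one hex color string per input value
sector_colors <- pal(c("public", "private", "public"))
sector_colors
```

This is what happens behind color = ~pal(sector_label): the formula evaluates pal() on the column and passes the resulting colors to the markers.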
{r}

# Make a color palette called pal for the values of `sector_label` using `colorFactor()`
# Colors are "red" and "blue" for "public" and "private" colleges, respectively
pal <- colorFactor(palette = c("red", "blue"), 
                   levels = c("public", "private"))

# Add circle markers that color colleges using pal() and the values of sector_label
map2 <- 
    map %>% 
        addCircleMarkers(data = ca, radius = 2, 
                         color = ~pal(sector_label), 
                         label = ~paste0(name, " (", sector_label, ")"))

# Print map2
map2

Chapter 11. A Legendary Map

Adding information to our map using color is great, but it is only helpful if we remember what the colors represent. With addLegend() we can add a legend to remind us.

There are several arguments we can use to customize the legend to our liking, including opacity, title, and position. To create a legend for our colorNumeric() example, we would do the following.

{r}
pal <- colorNumeric(palette = "RdBu", domain = c(25:50))

ipeds %>% 
    leaflet() %>% 
        addProviderTiles("CartoDB")  %>% 
        addCircleMarkers(radius = 2,
                         color = ~pal(lat)) %>% 
         addLegend(pal = pal,
                   values = c(25:50),
                   opacity = 0.75,
                   title = "Latitude",
                   position = "topleft")
{r}
# Make a color palette called pal for the values of `sector_label` using `colorFactor()`
# Colors are "red" and "blue" for "public" and "private" colleges, respectively
pal <- colorFactor(palette = c("red", "blue"), 
                   levels = c("public", "private"))

# Customize the legend
map2 %>% 
    addLegend(pal = pal, 
              values = c("public", "private"),
              # opacity of .5, title of Sector, and position of topright
              opacity = 0.5, title = "Sector", position = "topright")

The final code follows.
{r}
# Load the required packages
pkgs <- c("tidyverse", "leaflet", "htmlwidgets", "webshot", "rio")
sapply(pkgs, require, character.only = TRUE)

# step 1. data import
data <- import("data/IPEDS_data.xlsx")

# step 2. data cleansing
ipeds <- data_cleansing(data = data)

# step 3. visualization
pal <- colorFactor(palette = c("red", "blue"), 
                   levels = c("public", "private"))
map_circle <- ipeds %>% 
    leaflet() %>% 
    addProviderTiles("CartoDB") %>% 
    addCircleMarkers(radius = 2, 
                     color = ~pal(sector_label), 
                     label = ~paste0(name, " (", sector_label, ")")) %>% 
    addLegend(pal = pal, 
              values = c("public", "private"),
              # opacity of .5, title of Sector, and position of topright
              opacity = 0.5, title = "Sector", position = "topright")

# saving leaflet
## create .html and .png
## save html to png
saveWidget(map_circle, "chapter2_map_circle.html", selfcontained = FALSE)
webshot("chapter2_map_circle.html", file = "chapter2_map_circle.png",
        cliprect = "viewport")
  tidyverse     leaflet htmlwidgets     webshot         rio
       TRUE        TRUE        TRUE        TRUE        TRUE
Assuming "lng" and "lat" are longitude and latitude, respectively



Chapter1_Setting_Up_Interactive_Web_Maps

Chapter 1 will introduce students to the htmlwidgets package and the leaflet package. Following this introduction, students will build their first interactive web map using leaflet. Through the process of creating this first map students will be introduced to many of the core features of the leaflet package, including adding different map tiles, setting the center point and zoom level, plotting single points based on latitude and longitude coordinates, and storing leaflet maps as objects. Chapter 1 will conclude with students geocoding DataCamp’s headquarters, and creating a leaflet map that plots the headquarters and displays a popup describing the location.

1. Creating an Interactive Web Map

Similar to the packages in the tidyverse, the leaflet package makes use of the pipe operator (i.e., %>%) from the magrittr package to chain function calls together. This means we can pipe the result of one function into another without having to store the intermediate output in an object. For example, one way to find every car in the mtcars data set with a mpg >= 25 is to pipe the data through a series of functions.

{r}
mtcars  %>% 
    mutate(car = rownames(.))  %>% 
    select(car, mpg)  %>% 
    filter(mpg >= 25)

To create a web map in R, you will chain together a series of function calls using the %>% operator. Our first function, leaflet(), will initialize the htmlwidget; then we will add a map tile using the addTiles() function.

{r}
# Load the leaflet library
library(leaflet)
{r}
# Create a leaflet map with default map tile using addTiles()
library(htmlwidgets)
leaflet() %>% addTiles()

2. Provider Tiles

In the previous exercise, addTiles() added the default OpenStreetMap (OSM) tile to your leaflet map. Map tiles weave multiple map images together. The map tiles presented adjust when a user zooms or pans the map enabling the interactive features you experimented with in exercise 2.

The leaflet package comes with more than 100 map tiles that you can use. These tiles are stored in a list called providers and can be added to your map using addProviderTiles() instead of addTiles().

The leaflet and tidyverse libraries have been loaded for you.

{r}
pkgs <- c("tidyverse", "leaflet")
sapply(pkgs, require, character.only = TRUE)
Loading required package: tidyverse
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.1.0     ✔ purrr   0.2.5
✔ tibble  1.4.2     ✔ dplyr   0.7.8
✔ tidyr   0.8.2     ✔ stringr 1.3.1
✔ readr   1.3.1     ✔ forcats 0.3.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
tidyverse   leaflet
     TRUE      TRUE
{r}
# Print the providers list included in the leaflet library
providers[1:5]
$OpenStreetMap
'OpenStreetMap'
$OpenStreetMap.Mapnik
'OpenStreetMap.Mapnik'
$OpenStreetMap.BlackAndWhite
'OpenStreetMap.BlackAndWhite'
$OpenStreetMap.DE
'OpenStreetMap.DE'
$OpenStreetMap.CH
'OpenStreetMap.CH'
{r}
# Print only the names of the map tiles in the providers list 
names(providers)
  1. 'OpenStreetMap'
  2. 'OpenStreetMap.Mapnik'
  3. 'OpenStreetMap.BlackAndWhite'
  4. 'OpenStreetMap.DE'
  5. 'OpenStreetMap.CH'
  6. 'OpenStreetMap.France'
  7. 'OpenStreetMap.HOT'
  8. 'OpenStreetMap.BZH'
  9. 'OpenInfraMap'
  10. 'OpenInfraMap.Power'
  11. 'OpenInfraMap.Telecom'
  12. 'OpenInfraMap.Petroleum'
  13. 'OpenInfraMap.Water'
  14. 'OpenSeaMap'
  15. 'OpenPtMap'
  16. 'OpenTopoMap'
  17. 'OpenRailwayMap'
  18. 'OpenFireMap'
  19. 'SafeCast'
  20. 'Thunderforest'
  21. 'Thunderforest.OpenCycleMap'
  22. 'Thunderforest.Transport'
  23. 'Thunderforest.TransportDark'
  24. 'Thunderforest.SpinalMap'
  25. 'Thunderforest.Landscape'
  26. 'Thunderforest.Outdoors'
  27. 'Thunderforest.Pioneer'
  28. 'OpenMapSurfer'
  29. 'OpenMapSurfer.Roads'
  30. 'OpenMapSurfer.AdminBounds'
  31. 'OpenMapSurfer.Grayscale'
  32. 'Hydda'
  33. 'Hydda.Full'
  34. 'Hydda.Base'
  35. 'Hydda.RoadsAndLabels'
  36. 'MapBox'
  37. 'Stamen'
  38. 'Stamen.Toner'
  39. 'Stamen.TonerBackground'
  40. 'Stamen.TonerHybrid'
  41. 'Stamen.TonerLines'
  42. 'Stamen.TonerLabels'
  43. 'Stamen.TonerLite'
  44. 'Stamen.Watercolor'
  45. 'Stamen.Terrain'
  46. 'Stamen.TerrainBackground'
  47. 'Stamen.TopOSMRelief'
  48. 'Stamen.TopOSMFeatures'
  49. 'Esri'
  50. 'Esri.WorldStreetMap'
  51. 'Esri.DeLorme'
  52. 'Esri.WorldTopoMap'
  53. 'Esri.WorldImagery'
  54. 'Esri.WorldTerrain'
  55. 'Esri.WorldShadedRelief'
  56. 'Esri.WorldPhysical'
  57. 'Esri.OceanBasemap'
  58. 'Esri.NatGeoWorldMap'
  59. 'Esri.WorldGrayCanvas'
  60. 'OpenWeatherMap'
  61. 'OpenWeatherMap.Clouds'
  62. 'OpenWeatherMap.CloudsClassic'
  63. 'OpenWeatherMap.Precipitation'
  64. 'OpenWeatherMap.PrecipitationClassic'
  65. 'OpenWeatherMap.Rain'
  66. 'OpenWeatherMap.RainClassic'
  67. 'OpenWeatherMap.Pressure'
  68. 'OpenWeatherMap.PressureContour'
  69. 'OpenWeatherMap.Wind'
  70. 'OpenWeatherMap.Temperature'
  71. 'OpenWeatherMap.Snow'
  72. 'HERE'
  73. 'HERE.normalDay'
  74. 'HERE.normalDayCustom'
  75. 'HERE.normalDayGrey'
  76. 'HERE.normalDayMobile'
  77. 'HERE.normalDayGreyMobile'
  78. 'HERE.normalDayTransit'
  79. 'HERE.normalDayTransitMobile'
  80. 'HERE.normalNight'
  81. 'HERE.normalNightMobile'
  82. 'HERE.normalNightGrey'
  83. 'HERE.normalNightGreyMobile'
  84. 'HERE.basicMap'
  85. 'HERE.mapLabels'
  86. 'HERE.trafficFlow'
  87. 'HERE.carnavDayGrey'
  88. 'HERE.hybridDay'
  89. 'HERE.hybridDayMobile'
  90. 'HERE.pedestrianDay'
  91. 'HERE.pedestrianNight'
  92. 'HERE.satelliteDay'
  93. 'HERE.terrainDay'
  94. 'HERE.terrainDayMobile'
  95. 'FreeMapSK'
  96. 'MtbMap'
  97. 'CartoDB'
  98. 'CartoDB.Positron'
  99. 'CartoDB.PositronNoLabels'
  100. 'CartoDB.PositronOnlyLabels'
  101. 'CartoDB.DarkMatter'
  102. 'CartoDB.DarkMatterNoLabels'
  103. 'CartoDB.DarkMatterOnlyLabels'
  104. 'HikeBike'
  105. 'HikeBike.HikeBike'
  106. 'HikeBike.HillShading'
  107. 'BasemapAT'
  108. 'BasemapAT.basemap'
  109. 'BasemapAT.grau'
  110. 'BasemapAT.overlay'
  111. 'BasemapAT.highdpi'
  112. 'BasemapAT.orthofoto'
  113. 'nlmaps'
  114. 'nlmaps.standaard'
  115. 'nlmaps.pastel'
  116. 'nlmaps.grijs'
  117. 'nlmaps.luchtfoto'
  118. 'NASAGIBS'
  119. 'NASAGIBS.ModisTerraTrueColorCR'
  120. 'NASAGIBS.ModisTerraBands367CR'
  121. 'NASAGIBS.ViirsEarthAtNight2012'
  122. 'NASAGIBS.ModisTerraLSTDay'
  123. 'NASAGIBS.ModisTerraSnowCover'
  124. 'NASAGIBS.ModisTerraAOD'
  125. 'NASAGIBS.ModisTerraChlorophyll'
  126. 'NLS'
  127. 'JusticeMap'
  128. 'JusticeMap.income'
  129. 'JusticeMap.americanIndian'
  130. 'JusticeMap.asian'
  131. 'JusticeMap.black'
  132. 'JusticeMap.hispanic'
  133. 'JusticeMap.multi'
  134. 'JusticeMap.nonWhite'
  135. 'JusticeMap.white'
  136. 'JusticeMap.plurality'
  137. 'Wikimedia'
{r}
# Use str_detect() to determine if the name of each provider tile contains the string "CartoDB"
str_detect(names(providers), "CartoDB")
  1. FALSE
  2. FALSE
  3. FALSE
  4. FALSE
  5. FALSE
  6. FALSE
  7. FALSE
  8. FALSE
  9. FALSE
  10. FALSE
  11. FALSE
  12. FALSE
  13. FALSE
  14. FALSE
  15. FALSE
  16. FALSE
  17. FALSE
  18. FALSE
  19. FALSE
  20. FALSE
  21. FALSE
  22. FALSE
  23. FALSE
  24. FALSE
  25. FALSE
  26. FALSE
  27. FALSE
  28. FALSE
  29. FALSE
  30. FALSE
  31. FALSE
  32. FALSE
  33. FALSE
  34. FALSE
  35. FALSE
  36. FALSE
  37. FALSE
  38. FALSE
  39. FALSE
  40. FALSE
  41. FALSE
  42. FALSE
  43. FALSE
  44. FALSE
  45. FALSE
  46. FALSE
  47. FALSE
  48. FALSE
  49. FALSE
  50. FALSE
  51. FALSE
  52. FALSE
  53. FALSE
  54. FALSE
  55. FALSE
  56. FALSE
  57. FALSE
  58. FALSE
  59. FALSE
  60. FALSE
  61. FALSE
  62. FALSE
  63. FALSE
  64. FALSE
  65. FALSE
  66. FALSE
  67. FALSE
  68. FALSE
  69. FALSE
  70. FALSE
  71. FALSE
  72. FALSE
  73. FALSE
  74. FALSE
  75. FALSE
  76. FALSE
  77. FALSE
  78. FALSE
  79. FALSE
  80. FALSE
  81. FALSE
  82. FALSE
  83. FALSE
  84. FALSE
  85. FALSE
  86. FALSE
  87. FALSE
  88. FALSE
  89. FALSE
  90. FALSE
  91. FALSE
  92. FALSE
  93. FALSE
  94. FALSE
  95. FALSE
  96. FALSE
  97. TRUE
  98. TRUE
  99. TRUE
  100. TRUE
  101. TRUE
  102. TRUE
  103. TRUE
  104. FALSE
  105. FALSE
  106. FALSE
  107. FALSE
  108. FALSE
  109. FALSE
  110. FALSE
  111. FALSE
  112. FALSE
  113. FALSE
  114. FALSE
  115. FALSE
  116. FALSE
  117. FALSE
  118. FALSE
  119. FALSE
  120. FALSE
  121. FALSE
  122. FALSE
  123. FALSE
  124. FALSE
  125. FALSE
  126. FALSE
  127. FALSE
  128. FALSE
  129. FALSE
  130. FALSE
  131. FALSE
  132. FALSE
  133. FALSE
  134. FALSE
  135. FALSE
  136. FALSE
  137. FALSE
{r}
# Use str_detect() to print only the provider tile names that include the string "CartoDB"
names(providers)[str_detect(names(providers), "CartoDB")]
  1. 'CartoDB'
  2. 'CartoDB.Positron'
  3. 'CartoDB.PositronNoLabels'
  4. 'CartoDB.PositronOnlyLabels'
  5. 'CartoDB.DarkMatter'
  6. 'CartoDB.DarkMatterNoLabels'
  7. 'CartoDB.DarkMatterOnlyLabels'
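The same filtering can also be sketched in base R, without stringr. This is an alternative to the str_detect() approach above, not part of the course material; the object name carto_tiles is chosen for illustration.

```r
library(leaflet)

# value = TRUE makes grep() return the matching names instead of their positions
carto_tiles <- grep("CartoDB", names(providers), value = TRUE)
carto_tiles
```

Any of the returned names can be passed to addProviderTiles().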

3. Adding a Custom Map Tile

Did any tile names look familiar? If you have worked with mapping software, you may recognize the names ESRI or CartoDB.

We created our first leaflet map using the default OSM map tile.

{r}
leaflet() %>% 
    addTiles()

We will primarily use CartoDB provider tiles, but feel free to try others, like Esri. To add a custom provider tile to our map we will use the addProviderTiles() function. The first argument to addProviderTiles() is your leaflet map, which allows us to pipe leaflet() output directly into addProviderTiles(). The second argument is provider, which accepts any of the map tiles included in the providers list.

Familiarize yourself with the SCRIPT.R and HTML VIEWER tabs. Click back and forth to type your code and view your maps.

{r}
leaflet() %>% 
    addProviderTiles("Esri")
{r}
leaflet() %>% 
    addProviderTiles("CartoDB.PositronNoLabels")

4. A Map with a View I

You may have noticed that, by default, maps are zoomed out to the farthest level. Rather than manually zooming and panning, we can load the map centered on a particular point using the setView() function.

{r}
leaflet()  %>% 
    addProviderTiles("CartoDB")  %>% 
    setView(lat = 40.7, lng = -74.0, zoom = 10)

Currently, DataCamp has offices at the following locations:

350 5th Ave, Floor 77, New York, NY 10118

Martelarenlaan 38, 3010 Kessel-Lo, Belgium

These addresses were converted to coordinates using the geocode() function in the ggmap package. The coordinates are given as (longitude, latitude):

NYC: (-73.98575, 40.74856)
Belgium: (4.717863, 50.881363)

{r}
leaflet()  %>% 
    addProviderTiles("CartoDB")  %>% 
    setView(lng = -73.98575, lat = 40.74856, zoom = 6)
{r}
hc_dq <- data.frame(hq = c("DataCamp - NYC", "DataCamp - Belgium"), 
                   lon = c(-74.0, 4.72), 
                   lat = c(40.7, 50.9))
leaflet() %>% 
    addProviderTiles("CartoDB.PositronNoLabels") %>% 
    setView(lng = hc_dq$lon[2], lat = hc_dq$lat[2], zoom = 4)

5. A Map with a Narrower View

We can limit users' ability to pan away from the map's focus using the options argument in the leaflet() function. By setting minZoom and dragging, we can create an interactive web map that will always be focused on a specific area.

{r}
leaflet(options = 
        leafletOptions(minZoom = 14, dragging = FALSE))  %>% 
  addProviderTiles("CartoDB")  %>% 
  setView(lng = -73.98575, lat = 40.74856, zoom = 14)

Alternatively, if we want our users to be able to drag the map while ensuring that they do not stray too far, we can set the map's maximum boundaries by specifying two diagonal corners of a rectangle.

You'll use dc_hq to create a map with the "CartoDB" provider tile that is centered on DataCamp's Belgium office.

{r}
leaflet(options = leafletOptions(
                    # Set minZoom and dragging 
                    minZoom = 12, dragging = TRUE))  %>% 
  addProviderTiles("CartoDB")  %>% 

  # Set default zoom level 
  setView(lng = hc_dq$lon[2], lat = hc_dq$lat[2], zoom = 14) %>% 

  # Set max bounds of map 
  setMaxBounds(lng1 = hc_dq$lon[2] + .05, 
               lat1 = hc_dq$lat[2] + .05, 
               lng2 = hc_dq$lon[2] - .05, 
               lat2 = hc_dq$lat[2] - .05)

6. Mark it

So far we have been creating maps with a single layer: a base map. We can add layers to this base map, similar to how you add layers to a plot in ggplot2. One of the most common layers to add to a leaflet map is location markers, which you can add by piping the result of addTiles() or addProviderTiles() into the addMarkers() function.

For example, if we plot DataCamp's NYC HQ by passing the coordinates to addMarkers() as numeric vectors with one element, our web map will place a blue drop pin at the coordinate. In chapters 2 and 3, we will review some options for customizing these markers.

{r}
leaflet()  %>% 
    addProviderTiles("CartoDB")  %>% 
    addMarkers(lng = -73.98575, lat = 40.74856)

The dc_hq tibble is available in your workspace.

{r}
# Plot DataCamp's NYC HQ
dc_hq <- data.frame(hq = c("DataCamp - NYC", "DataCamp - Belgium"), 
                    lon = c(-74.0, 4.72), 
                    lat = c(40.7, 50.9))

leaflet() %>% 
    addProviderTiles("CartoDB") %>% 
    addMarkers(lng = dc_hq$lon[1], lat = dc_hq$lat[1])
{r}
# Plot DataCamp's NYC HQ with zoom of 12    
leaflet() %>% 
    addProviderTiles("CartoDB") %>% 
    addMarkers(lng = -73.98575, lat = 40.74856)  %>% 
    setView(lng = -73.98575, lat = 40.74856, zoom = 12)
{r}
# Plot both DataCamp's NYC and Belgium locations
leaflet() %>% 
    addProviderTiles("CartoDB") %>% 
    addMarkers(lng = dc_hq$lon, lat = dc_hq$lat)
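
As an alternative to indexing columns with $, leaflet also accepts a data frame through its data argument, after which columns can be referenced with the ~ formula notation. The following is a minimal sketch that rebuilds the same two-office data frame so it can run on its own; the object name m is only illustrative.

```r
library(leaflet)

# Same two offices as above, with rounded coordinates
dc_hq <- data.frame(hq  = c("DataCamp - NYC", "DataCamp - Belgium"),
                    lon = c(-74.0, 4.72),
                    lat = c(40.7, 50.9))

# Pass the data frame once, then refer to its columns with ~
m <- leaflet(data = dc_hq) %>%
  addProviderTiles("CartoDB") %>%
  addMarkers(lng = ~lon, lat = ~lat)

m
```

Both styles should place the same two markers; the formula style mainly saves repetition when several layers draw on the same data frame.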

7. Adding Popups and Storing your Map

To make our map more informative, we can add popups. To add popups that appear when a marker is clicked, we specify the popup argument of the addMarkers() function. Once we have a map we would like to preserve, we can store it in an object. Then we can pipe this object into functions to add or edit the map's layers.

{r}
dc_nyc <- 
    leaflet() %>% 
        addTiles() %>% 
        addMarkers(lng = -73.98575, lat = 40.74856, 
                   popup = "DataCamp - NYC") 

dc_nyc %>% 
    setView(lng = -73.98575, lat = 40.74856, 
            zoom = 2)

Let's try adding popups to both DataCamp location markers and storing our map in an object.

{r}
# Load the packages needed to build and save the map
pkgs <- c("tidyverse", "leaflet", "htmlwidgets", "webshot")
sapply(pkgs, require, character.only = TRUE)

dc_hq <- data.frame(hq = c("DataCamp - NYC", "DataCamp - Belgium"), 
                   lon = c(-74.0, 4.72), 
                   lat = c(40.7, 50.9))

# Store leaflet hq map in an object called map
map <- leaflet() %>%
          addProviderTiles("CartoDB") %>%
          # Use dc_hq to add the hq column as popups
          addMarkers(lng = dc_hq$lon, lat = dc_hq$lat,
                     popup = dc_hq$hq)

# Center the view of map on the Belgium HQ with a zoom of 5 
map_zoom <- map %>%
      setView(lat = 50.881363, lng = 4.717863,
              zoom = 5)

# Print map_zoom
# map_zoom

# Save the map as an HTML file, then take a PNG screenshot of that file
# (webshot needs PhantomJS to be installed)
saveWidget(map_zoom, "chapter1_mapZoom.html", selfcontained = FALSE)
webshot("chapter1_mapZoom.html", file = "chapter1_mapZoom.png",
        cliprect = "viewport")
Loading required package: tidyverse
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.1.0       ✔ purrr   0.2.5  
✔ tibble  2.0.1       ✔ dplyr   0.8.0.1
✔ tidyr   0.8.2       ✔ stringr 1.3.1  
✔ readr   1.3.1       ✔ forcats 0.3.0  
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
Loading required package: leaflet
Loading required package: htmlwidgets
Loading required package: webshot
  tidyverse     leaflet htmlwidgets     webshot 
       TRUE        TRUE        TRUE        TRUE 
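
Popup text does not have to be a plain label. Leaflet renders popup strings as HTML, so a short tag or two can add structure. The sketch below builds one HTML string per office with paste0(); the names popup_html and map_html are chosen here for illustration only.

```r
library(leaflet)

dc_hq <- data.frame(hq  = c("DataCamp - NYC", "DataCamp - Belgium"),
                    lon = c(-74.0, 4.72),
                    lat = c(40.7, 50.9))

# Build one HTML string per office: bold name, then the coordinates
popup_html <- paste0("<b>", dc_hq$hq, "</b><br/>",
                     "lng: ", dc_hq$lon, ", lat: ", dc_hq$lat)

# Attach the HTML strings as popups, one per marker
map_html <- leaflet() %>%
  addProviderTiles("CartoDB") %>%
  addMarkers(lng = dc_hq$lon, lat = dc_hq$lat, popup = popup_html)

map_html
```

If popup text ever comes from untrusted input, it should be escaped before being passed to popup.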



Exploring_Two_Variables

Exploring Two or More Variables

Evan Jung January 17, 2019

Intro to Multivariate Analysis

Key Terms

  • Contingency Tables - A tally of counts between two or more categorical variables

  • Hexagonal Binning - A plot of two numeric variables with the records binned into hexagons

  • Contour plots - A plot showing the density of two numeric variables like a topographical map.

  • Violin plots - Similar to a boxplot, but showing the density estimate.
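
To make the first key term concrete, here is a minimal sketch of a contingency table in base R. It uses the built-in mtcars data purely for illustration, since that data set is not part of the original material.

```r
# Built-in mtcars data, used here only as an illustration:
# count cars by number of cylinders (rows) and number of gears (columns)
tab <- table(cylinders = mtcars$cyl, gears = mtcars$gear)
tab

# Add row and column totals to the table
addmargins(tab)
```

Each cell is the number of records that share that combination of categories.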

The appropriate type of multivariate analysis depends on the nature of the data: numeric versus categorical.

Hexagonal Binning and Contours (Plotting Numeric Versus Numeric Data)

  • kc_tax contains the tax-assessed values for residential properties in King County, Washington.

## Observations: 498,249
## Variables: 3
## $ TaxAssessedValue <dbl> NA, 206000, 303000, 361000, 459000, 223000, 2...
## $ SqFtTotLiving    <dbl> 1730, 1870, 1530, 2000, 3150, 1570, 1770, 115...
## $ ZipCode          <dbl> 98117, 98002, 98166, 98108, 98108, 98032, 981...

The problem with scatterplots

Scatterplots are fine when the number of data values is relatively small. With very large data sets, however, the points overlap so densely that it becomes difficult to see the relationship. We will compare a scatterplot with other plots later.

Hexagon binning plot

This plot visualizes the relationship between finished square feet (SqFtTotLiving) and tax-assessed value (TaxAssessedValue).

## 
## Attaching package: 'gridExtra'

## The following object is masked from 'package:dplyr':
## 
##     combine


Let’s compare the two plots. Unlike a scatterplot, a hexagon binning plot groups the records into hexagonal bins and colors each hexagon according to the number of records in that bin. Now we can clearly see the positive relationship between the two variables.
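
The code behind the hexagon binning plot is not shown in this post. A minimal sketch of how such a plot could be drawn with ggplot2 follows. It assumes the hexbin package is installed (geom_hex() depends on it) and uses simulated data as a stand-in for kc_tax, so the numbers are illustrative and not the King County values.

```r
library(ggplot2)  # geom_hex() additionally needs the hexbin package installed

# Simulated stand-in for kc_tax (assumption: the real data is not bundled here)
set.seed(1)
n <- 5000
sim <- data.frame(SqFtTotLiving = runif(n, min = 500, max = 4000))
sim$TaxAssessedValue <- 150 * sim$SqFtTotLiving + rnorm(n, sd = 80000)

# Bin the records into hexagons; darker fill means more records in the bin
p_hex <- ggplot(sim, aes(x = SqFtTotLiving, y = TaxAssessedValue)) +
  geom_hex(colour = "white") +
  scale_fill_gradient(low = "white", high = "black") +
  theme_bw() +
  labs(x = "Finished Square Feet", y = "Tax Assessed Value")

p_hex
```

Replacing geom_hex() with geom_point() on the same data gives the scatterplot to compare against.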

Density2d

The geom_density2d function overlays contours on a scatterplot to visualize the relationship between two variables. The contours are essentially a topographical map of the two variables. Each contour band represents a specific density of points, increasing as one nears a “peak”.
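
The contour plot code is also not shown in this post. As a rough sketch under the same assumptions as before (simulated data standing in for kc_tax, illustrative object names), contours can be layered over a scatterplot like this:

```r
library(ggplot2)  # geom_density2d() uses the MASS package for the density estimate

# Simulated stand-in for kc_tax (assumption: the real data is not bundled here)
set.seed(1)
n <- 5000
sim <- data.frame(SqFtTotLiving = runif(n, min = 500, max = 4000))
sim$TaxAssessedValue <- 150 * sim$SqFtTotLiving + rnorm(n, sd = 80000)

# Faint points underneath, density contours on top
p_contour <- ggplot(sim, aes(x = SqFtTotLiving, y = TaxAssessedValue)) +
  geom_point(alpha = 0.1) +
  geom_density2d(colour = "white") +
  theme_bw() +
  labs(x = "Finished Square Feet", y = "Tax Assessed Value")

p_contour
```

Lowering the point alpha keeps the contours readable when there are many records.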


Conclusion

These plots are related to correlation analysis. So, when we plot two variables against each other, we have to think ahead about which two variables are related.


All content comes from the book Practical Statistics for Data Scientists.
