Count plots

In this exercise, we'll return to exploring our dataset that contains the responses to a survey sent out to young people. We might suspect that young people spend a lot of time on the internet, but how much do they report using the internet each day? Let's use a count plot to break down the number of survey responses in each category and then explore whether it changes based on age.

 


As a reminder, to create a count plot, we'll use the catplot() function and specify the name of the categorical variable to count (x=____), the Pandas DataFrame to use (data=____), and the type of plot (kind="count").
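
As a rough check on what a count plot displays, the bar heights correspond to the number of rows in each category. The sketch below reproduces those numbers with pandas alone; the `toy` DataFrame and its values are hypothetical and are not taken from the survey file.

```python
import pandas as pd

# Hypothetical toy responses, NOT taken from the real survey file.
toy = pd.DataFrame({
    "Internet usage": [
        "few hours a day",
        "most of the day",
        "few hours a day",
        "less than an hour a day",
        "few hours a day",
    ]
})

# Number of rows per category: the quantity a count plot draws as bar heights.
counts = toy["Internet usage"].value_counts()
print(counts)
```

Comparing a table like this against the plot is a quick way to confirm that the chart shows row counts rather than a summary of some other column.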

 


Seaborn has been imported as sns and matplotlib.pyplot has been imported as plt.

 


# Import Matplotlib, Seaborn, and pandas
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
url = 'https://assets.datacamp.com/production/repositories/3996/datasets/ab13162732ae9ca1a9a27e2efd3da923ed6a4e7b/young-people-survey-responses.csv'
survey_data = pd.read_csv(url)
print(survey_data)
      Unnamed: 0  Music  Techno  Movies  History  Mathematics  Pets  Spiders  \
0              0    5.0     1.0     5.0      1.0          3.0   4.0      1.0
1              1    4.0     1.0     5.0      1.0          5.0   5.0      1.0
2              2    5.0     1.0     5.0      1.0          5.0   5.0      1.0
3              3    5.0     2.0     5.0      4.0          4.0   1.0      5.0
4              4    5.0     2.0     5.0      3.0          2.0   1.0      1.0
...          ...    ...     ...     ...      ...          ...   ...      ...
1005        1005    5.0     3.0     5.0      4.0          1.0   4.0      2.0
1006        1006    4.0     4.0     5.0      4.0          5.0   5.0      1.0
1007        1007    4.0     1.0     4.0      2.0          3.0   5.0      2.0
1008        1008    5.0     2.0     5.0      3.0          1.0   4.0      3.0
1009        1009    5.0     3.0     5.0      2.0          2.0   5.0      1.0

      Loneliness  Parents' advice           Internet usage  Finances   Age  \
0            3.0              4.0          few hours a day       3.0  20.0
1            2.0              2.0          few hours a day       3.0  19.0
2            5.0              3.0          few hours a day       2.0  20.0
3            5.0              2.0          most of the day       2.0  22.0
4            3.0              3.0          few hours a day       4.0  20.0
...          ...              ...                      ...       ...   ...
1005         4.0              4.0          few hours a day       3.0  20.0
1006         1.0              4.0  less than an hour a day       3.0  27.0
1007         4.0              4.0          most of the day       1.0  18.0
1008         3.0              3.0          most of the day       3.0  25.0
1009         3.0              3.0          few hours a day       5.0  21.0

      Siblings  Gender  Village - town
0          1.0  female         village
1          2.0  female            city
2          2.0  female            city
3          1.0  female            city
4          1.0  female         village
...        ...     ...             ...
1005       1.0  female            city
1006       5.0    male         village
1007       0.0  female            city
1008       1.0  female            city
1009       1.0    male         village

[1010 rows x 16 columns]

Question

  • Use sns.catplot() to create a count plot using the survey_data DataFrame with "Internet usage" on the x-axis.
# Create count plot of internet usage
sns.catplot(x="Internet usage", data=survey_data,
            kind="count")
# Show plot
plt.show()

Question

  • Make the bars horizontal instead of vertical.
# Change the orientation of the plot
sns.catplot(y="Internet usage", data=survey_data,
            kind="count")
# Show plot
plt.show()

Question

  • Create column subplots based on "Age Category", which separates respondents into those that are younger than 21 vs. 21 and older.
# Create Age Category column by condition
import numpy as np
survey_data['Age Category'] = np.where(survey_data['Age'] >= 21, "21+", "Less than 21")
print(survey_data[['Age', 'Age Category']].head())
    Age  Age Category
0  20.0  Less than 21
1  19.0  Less than 21
2  20.0  Less than 21
3  22.0           21+
4  20.0  Less than 21
# Create column subplots based on age category
sns.catplot(y="Internet usage", data=survey_data,
            kind="count", col='Age Category')
# Show plot
plt.show()
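
One detail of the np.where() rule used for "Age Category" is worth checking: a missing age compares as False against 21, so it ends up labelled "Less than 21". The sketch below shows this on hypothetical values and offers one possible alternative with np.select(); the "Unknown" label and the `toy` DataFrame are assumptions made for illustration, and whether this matters depends on whether the real Age column contains missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical ages, including one missing value.
toy = pd.DataFrame({"Age": [20.0, 22.0, np.nan]})

# Same rule as the exercise: NaN >= 21 evaluates to False, so the missing
# age is labelled "Less than 21".
toy["Age Category"] = np.where(toy["Age"] >= 21, "21+", "Less than 21")

# Alternative: give missing ages their own label ("Unknown" is an assumed name).
# np.select picks the first condition that is True for each row.
toy["Age Category (NaN-aware)"] = np.select(
    [toy["Age"].isna(), toy["Age"] >= 21],
    ["Unknown", "21+"],
    default="Less than 21",
)
print(toy)
```

With the second column, respondents of unknown age would appear in their own subplot instead of inflating the younger group.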

 

All the Contents are from DataCamp

Let's apply subgroups to line plots using Seaborn in Python.

Let's continue to look at the mpg dataset. We've seen that the average miles per gallon for cars has increased over time, but how has the average horsepower for cars changed over time? And does this trend differ by country of origin?

 


# Import Matplotlib, Seaborn, and pandas
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
url = 'https://assets.datacamp.com/production/repositories/3996/datasets/e0b285b89bdbfbbe8d81123e64727ff150d544e0/mpg.csv'
mpg = pd.read_csv(url)
print(mpg)
      mpg  cylinders  displacement  horsepower  weight  acceleration  \
0    18.0          8         307.0       130.0    3504          12.0
1    15.0          8         350.0       165.0    3693          11.5
2    18.0          8         318.0       150.0    3436          11.0
3    16.0          8         304.0       150.0    3433          12.0
4    17.0          8         302.0       140.0    3449          10.5
..    ...        ...           ...         ...     ...           ...
393  27.0          4         140.0        86.0    2790          15.6
394  44.0          4          97.0        52.0    2130          24.6
395  32.0          4         135.0        84.0    2295          11.6
396  28.0          4         120.0        79.0    2625          18.6
397  31.0          4         119.0        82.0    2720          19.4

     model_year  origin                       name
0            70     usa  chevrolet chevelle malibu
1            70     usa          buick skylark 320
2            70     usa         plymouth satellite
3            70     usa              amc rebel sst
4            70     usa                ford torino
..          ...     ...                        ...
393          82     usa            ford mustang gl
394          82  europe                  vw pickup
395          82     usa              dodge rampage
396          82     usa                ford ranger
397          82     usa                 chevy s-10

[398 rows x 9 columns]

Step 1. Turn off Confidence Intervals on the Plot

Use relplot() and the mpg DataFrame to create a line plot with "model_year" on the x-axis and "horsepower" on the y-axis. Turn off the confidence intervals on the plot.

 


# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
# Create line plot of model year vs. horsepower
# (seaborn 0.12+ deprecates `ci`; errorbar=None is the newer equivalent)
sns.relplot(x='model_year',
            y='horsepower',
            data=mpg,
            kind='line',
            ci=None)
# Show plot
plt.show()
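
Because several cars share each model year, the line is drawn through an aggregate of the y values for each x value, which by default is the mean. The sketch below computes that aggregate directly with pandas; the `toy` rows are hypothetical and only reuse the column names of the mpg data.

```python
import pandas as pd

# Hypothetical toy rows with the same column names as the mpg data.
toy = pd.DataFrame({
    "model_year": [70, 70, 71, 71, 72, 72],
    "horsepower": [130.0, 165.0, 150.0, 140.0, 198.0, 220.0],
})

# Mean horsepower per model year: the values the line passes through
# when the default estimator (the mean) is used.
yearly_mean = toy.groupby("model_year")["horsepower"].mean()
print(yearly_mean)
```

With the confidence interval turned off, these per-year means are all that the plot shows.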

Add Style and Color

Create different lines for each country of origin ("origin") that vary in both line style and color.

# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
# Change to create subgroups for country of origin
sns.relplot(x="model_year",
            y="horsepower",
            data=mpg, kind="line",
            ci=None, style='origin', hue='origin')
# Show plot
plt.show()
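
Splitting by "origin" means the aggregation is done separately for each country, giving one line per subgroup. A minimal pandas sketch of the same grouping follows; the `toy` values are hypothetical.

```python
import pandas as pd

# Hypothetical toy rows with the same column names as the mpg data.
toy = pd.DataFrame({
    "model_year": [70, 70, 70, 71, 71, 71],
    "origin": ["usa", "usa", "japan", "usa", "japan", "japan"],
    "horsepower": [130.0, 150.0, 95.0, 100.0, 88.0, 92.0],
})

# One column per origin: each column corresponds to one line in the plot.
by_origin = (
    toy.groupby(["model_year", "origin"])["horsepower"]
    .mean()
    .unstack("origin")
)
print(by_origin)
```

Reading down a column of this table is the numeric counterpart of following one coloured line across the years.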

Add Markers

Add markers for each data point to the lines.

# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
# Add a marker for each data point on the lines
sns.relplot(x="model_year", y="horsepower",
            data=mpg, kind="line",
            ci=None, style="origin",
            hue="origin",
            markers=True)
# Show plot
plt.show()

Now that we've added subgroups, we can see that this downward trend in horsepower was more pronounced among cars from the USA.

 


Visualizing standard deviation with line plots

In the last exercise, we looked at how the average miles per gallon achieved by cars has changed over time. Now let's use a line plot to visualize how the distribution of miles per gallon has changed over time. Seaborn has been imported as sns and matplotlib.pyplot has been imported as plt.

# Import Matplotlib, Seaborn, and pandas
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
url = 'https://assets.datacamp.com/production/repositories/3996/datasets/e0b285b89bdbfbbe8d81123e64727ff150d544e0/mpg.csv'
mpg = pd.read_csv(url)
print(mpg)
# Make the shaded area show the standard deviation instead of the
# confidence interval for the mean.
# (seaborn 0.12+ deprecates `ci`; errorbar="sd" is the newer equivalent)
sns.relplot(x="model_year", y="mpg",
            data=mpg,
            kind="line",
            ci="sd")
# Show plot
plt.show()
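
With the standard deviation setting, the shaded band describes the spread of the observations around each yearly mean, roughly the mean plus or minus one standard deviation. The sketch below computes such a band with pandas on hypothetical values; it uses pandas' sample standard deviation, which is an assumption about the exact formula rather than a statement of Seaborn's internals.

```python
import pandas as pd

# Hypothetical toy rows with the same column names as the mpg data.
toy = pd.DataFrame({
    "model_year": [70, 70, 70, 71, 71, 71],
    "mpg": [18.0, 15.0, 21.0, 25.0, 27.0, 29.0],
})

# Mean and standard deviation of mpg for each model year.
summary = toy.groupby("model_year")["mpg"].agg(["mean", "std"])

# Edges of a band one standard deviation either side of the mean.
summary["lower"] = summary["mean"] - summary["std"]
summary["upper"] = summary["mean"] + summary["std"]
print(summary)
```

A wider band in a given year indicates more variation in fuel efficiency among that year's cars, not more uncertainty about the mean.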

 


Interpreting line plots

In this exercise, we'll continue to explore Seaborn's mpg dataset, which contains one row per car model and includes information such as the year the car was made, its fuel efficiency (measured in "miles per gallon" or "M.P.G"), and its country of origin (USA, Europe, or Japan).

How has the average miles per gallon achieved by these cars changed over time? Let's use line plots to find out!

# Import Matplotlib, Seaborn, and pandas
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
url = 'https://assets.datacamp.com/production/repositories/3996/datasets/e0b285b89bdbfbbe8d81123e64727ff150d544e0/mpg.csv'
mpg = pd.read_csv(url)
print(mpg)
# Create line plot
sns.relplot(x='model_year', y='mpg',
            data=mpg,
            kind='line')
# Show plot
plt.show()

Question

  • Which of the following is NOT a correct interpretation of this line plot?

Possible Answers

  1. The average miles per gallon has increased over time.

  2. The distribution of miles per gallon is smaller in 1973 compared to 1977.

  3. We can be 95% confident that the average miles per gallon for all cars in 1970 is between 16 and 20 miles per gallon.

  4. This plot assumes that our data is a random sample of all cars in the US, Europe, and Japan.

The answer is 2. The shaded region represents a confidence interval for the mean, not the spread of the individual observations, so it cannot be used to compare the distribution of miles per gallon between years.
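
To make the distinction concrete, the sketch below compares a 95% bootstrap confidence interval for the mean with the spread of the observations (mean plus or minus one standard deviation). It is an illustration under assumptions: the sample values are a small illustrative set for a single model year, and the percentile bootstrap with 1000 resamples is a simple stand-in for, not a copy of, Seaborn's own computation.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Small illustrative sample of mpg values for a single model year.
values = np.array([18.0, 15.0, 18.0, 16.0, 17.0, 15.0, 14.0, 14.0, 14.0, 15.0,
                   15.0, 14.0, 15.0, 14.0, 24.0, 22.0, 18.0, 21.0, 27.0, 26.0])

# Bootstrap: resample with replacement and record the mean each time.
boot_means = np.array([
    rng.choice(values, size=values.size, replace=True).mean()
    for _ in range(1000)
])

# Percentile interval covering the middle 95% of the resampled means.
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])

mean = values.mean()
sd = values.std(ddof=1)  # sample standard deviation of the observations

print(f"mean = {mean:.2f}")
print(f"95% bootstrap CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
print(f"mean +/- 1 sd: ({mean - sd:.2f}, {mean + sd:.2f})")
```

The confidence interval is noticeably narrower than the one-standard-deviation range because it describes uncertainty about the average, which shrinks as the sample grows, while the standard deviation describes how much individual cars differ.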

 
