Count plots
In this exercise, we'll return to exploring our dataset that contains the responses to a survey sent out to young people. We might suspect that young people spend a lot of time on the internet, but how much do they report using the internet each day? Let's use a count plot to break down the number of survey responses in each category and then explore whether it changes based on age.
As a reminder, to create a count plot, we'll use the catplot() function and specify the name of the categorical variable to count (x=____), the Pandas DataFrame to use (data=____), and the type of plot (kind="count").
Seaborn has been imported as sns and matplotlib.pyplot has been imported as plt.
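To make the three arguments concrete, here is a minimal sketch on a small made-up DataFrame. The name toy_survey and its rows are hypothetical stand-ins for the real survey data, and the Agg backend is an assumption chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data standing in for the real survey responses
toy_survey = pd.DataFrame({
    "Internet usage": [
        "few hours a day", "few hours a day", "most of the day",
        "less than an hour a day", "few hours a day",
    ]
})

# x= names the categorical column, data= is the DataFrame,
# kind="count" asks catplot() for a count plot
g = sns.catplot(x="Internet usage", data=toy_survey, kind="count")

# catplot() returns a FacetGrid; with a single facet its Axes is g.ax
x_label = g.ax.get_xlabel()
print(x_label)

plt.close("all")
```

In an interactive session you would typically drop the backend line and call plt.show() instead of closing the figure.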
# Import Matplotlib and Seaborn
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

url = 'https://assets.datacamp.com/production/repositories/3996/datasets/ab13162732ae9ca1a9a27e2efd3da923ed6a4e7b/young-people-survey-responses.csv'
survey_data = pd.read_csv(url)
print(survey_data)
Output (abbreviated to the first and last five rows):

      Unnamed: 0  Music  Techno  Movies  History  Mathematics  Pets  Spiders  \
0              0    5.0     1.0     5.0      1.0          3.0   4.0      1.0
1              1    4.0     1.0     5.0      1.0          5.0   5.0      1.0
2              2    5.0     1.0     5.0      1.0          5.0   5.0      1.0
3              3    5.0     2.0     5.0      4.0          4.0   1.0      5.0
4              4    5.0     2.0     5.0      3.0          2.0   1.0      1.0
...          ...    ...     ...     ...      ...          ...   ...      ...
1005        1005    5.0     3.0     5.0      4.0          1.0   4.0      2.0
1006        1006    4.0     4.0     5.0      4.0          5.0   5.0      1.0
1007        1007    4.0     1.0     4.0      2.0          3.0   5.0      2.0
1008        1008    5.0     2.0     5.0      3.0          1.0   4.0      3.0
1009        1009    5.0     3.0     5.0      2.0          2.0   5.0      1.0

      Loneliness  Parents' advice           Internet usage  Finances   Age  \
0            3.0              4.0          few hours a day       3.0  20.0
1            2.0              2.0          few hours a day       3.0  19.0
2            5.0              3.0          few hours a day       2.0  20.0
3            5.0              2.0          most of the day       2.0  22.0
4            3.0              3.0          few hours a day       4.0  20.0
...          ...              ...                      ...       ...   ...
1005         4.0              4.0          few hours a day       3.0  20.0
1006         1.0              4.0  less than an hour a day       3.0  27.0
1007         4.0              4.0          most of the day       1.0  18.0
1008         3.0              3.0          most of the day       3.0  25.0
1009         3.0              3.0          few hours a day       5.0  21.0

      Siblings  Gender Village - town
0          1.0  female        village
1          2.0  female           city
2          2.0  female           city
3          1.0  female           city
4          1.0  female        village
...        ...     ...            ...
1005       1.0  female           city
1006       5.0    male        village
1007       0.0  female           city
1008       1.0  female           city
1009       1.0    male        village

[1010 rows x 16 columns]
Question
- Use sns.catplot() to create a count plot using the survey_data DataFrame with "Internet usage" on the x-axis.
# Create a count plot of internet usage
sns.catplot(x="Internet usage", data=survey_data, kind="count")

# Show plot
plt.show()
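The bar heights in a count plot are category frequencies, so it can help to check them numerically. The sketch below uses pandas' value_counts() on a hypothetical toy_survey DataFrame (made-up rows, not the real survey data). Note that value_counts() skips missing values by default.

```python
import pandas as pd

# Hypothetical toy data, including one missing response
toy_survey = pd.DataFrame({
    "Internet usage": [
        "few hours a day", "few hours a day", "most of the day",
        "less than an hour a day", "few hours a day", None,
    ]
})

# Frequency of each category; the missing response is not counted
usage_counts = toy_survey["Internet usage"].value_counts()
print(usage_counts)
```

Running the same call on survey_data["Internet usage"] should give the numbers behind the bars in the plot above.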

Question
- Make the bars horizontal instead of vertical.
# Change the orientation of the plot
sns.catplot(y="Internet usage", data=survey_data, kind="count")

# Show plot
plt.show()

Question
- Create column subplots based on "Age Category", which separates respondents into those that are younger than 21 vs. 21 and older.
# Create Age Category column by condition
import numpy as np

# Note: a missing Age compares as False, so such rows land in "Less than 21"
survey_data['Age Category'] = np.where(survey_data['Age'] >= 21, "21+", "Less than 21")
print(survey_data[['Age', 'Age Category']].head())
    Age  Age Category
0  20.0  Less than 21
1  19.0  Less than 21
2  20.0  Less than 21
3  22.0           21+
4  20.0  Less than 21
# Create column subplots based on age category
sns.catplot(y="Internet usage", data=survey_data, kind="count", col='Age Category')

# Show plot
plt.show()
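Each column subplot counts responses within one age group. As a rough numeric counterpart, the sketch below builds the same two-way breakdown with pd.crosstab() on a hypothetical toy_survey DataFrame. The rows are made up for illustration, and the age split mirrors the np.where() rule used above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real survey responses
toy_survey = pd.DataFrame({
    "Age": [20.0, 19.0, 22.0, 25.0, 18.0, 21.0],
    "Internet usage": [
        "few hours a day", "few hours a day", "most of the day",
        "few hours a day", "less than an hour a day", "most of the day",
    ],
})

# Same rule as above: 21 and older vs. younger than 21
toy_survey["Age Category"] = np.where(toy_survey["Age"] >= 21, "21+", "Less than 21")

# Rows are usage categories, columns are age groups, cells are response counts
usage_by_age = pd.crosstab(toy_survey["Internet usage"], toy_survey["Age Category"])
print(usage_by_age)
```

Each column of the resulting table corresponds to one subplot, and each cell to the length of one bar.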

All content is from DataCamp.