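Summary: This article explores a dataset of college graduates' incomes, mainly by plotting bar charts of per-category statistics such as median income, unemployment rate, and employment rate.
Dataset: College Graduates' Income (download link).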
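Field name | Type | Description |
---|---|---|
Major_code | Integer | Major code. |
Major | String | Major name. |
Major_category | String | Category the major belongs to. |
Total | Integer | Total number of people. |
Employed | Integer | Number employed. |
Employed_full_time_year_round | Integer | Number employed full-time, year-round. |
Unemployed | Integer | Number unemployed. |
Unemployment_rate | Float | Unemployment rate. |
Median | Integer | Median income. |
P25th | Integer | 25th percentile of income. |
P75th | Float | 75th percentile of income. |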
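The types in the table can be checked programmatically once the file is loaded. The following is a minimal sketch, assuming a one-row stand-in for the CSV: `csv_text`, `toy`, `expected`, and `problems` are illustrative names, and only four of the eleven columns are included.

```python
import io

import pandas as pd

# A one-row stand-in for the real CSV, limited to four columns for brevity.
csv_text = "Major_code,Major,P25th,P75th\n1100,GENERAL AGRICULTURE,34000,80000.0\n"
toy = pd.read_csv(io.StringIO(csv_text))

# Map each numeric column to the dtype check the field table implies.
expected = {
    "Major_code": pd.api.types.is_integer_dtype,
    "P25th": pd.api.types.is_integer_dtype,
    "P75th": pd.api.types.is_float_dtype,
}

# Collect the columns whose parsed dtype disagrees with the table.
problems = [col for col, check in expected.items() if not check(toy[col])]
print(problems)  # prints []
```

Running the same check against the full DataFrame would flag any column that pandas parsed differently from the documented type.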
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import os
import warnings
warnings.filterwarnings("ignore")
df = pd.read_csv("大學畢業生收入數據集.csv")
print(df.head())
Output:
   Major_code                                  Major  ...  P25th    P75th
0        1100                    GENERAL AGRICULTURE  ...  34000  80000.0
1        1101  AGRICULTURE PRODUCTION AND MANAGEMENT  ...  36000  80000.0
2        1102                 AGRICULTURAL ECONOMICS  ...  40000  98000.0
3        1103                        ANIMAL SCIENCES  ...  30000  72000.0
4        1104                           FOOD SCIENCE  ...  38500  90000.0
df.info()
Output:
RangeIndex: 173 entries, 0 to 172
Data columns (total 11 columns):
 #   Column                         Non-Null Count  Dtype
---  ------                         --------------  -----
 0   Major_code                     173 non-null    int64
 1   Major                          173 non-null    object
 2   Major_category                 173 non-null    object
 3   Total                          173 non-null    int64
 4   Employed                       173 non-null    int64
 5   Employed_full_time_year_round  173 non-null    int64
 6   Unemployed                     173 non-null    int64
 7   Unemployment_rate              173 non-null    float64
 8   Median                         173 non-null    int64
 9   P25th                          173 non-null    int64
 10  P75th                          173 non-null    float64
dtypes: float64(2), int64(7), object(2)
print(df.duplicated().sum())
Output (no duplicated rows):
0
print(df.isnull().sum())
Output (no missing values in any column):
Major_code                       0
Major                            0
Major_category                   0
Total                            0
Employed                         0
Employed_full_time_year_round    0
Unemployed                       0
Unemployment_rate                0
Median                           0
P25th                            0
P75th                            0
dtype: int64
describe = df.describe()
print(describe)
Output:
        Major_code         Total  ...         P25th          P75th
count   173.000000  1.730000e+02  ...    173.000000     173.000000
mean   3879.815029  2.302566e+05  ...  38697.109827   82506.358382
std    1687.753140  4.220685e+05  ...   9414.524761   20805.330126
min    1100.000000  2.396000e+03  ...  24900.000000   45800.000000
25%    2403.000000  2.428000e+04  ...  32000.000000   70000.000000
50%    3608.000000  7.579100e+04  ...  36000.000000   80000.000000
75%    5503.000000  2.057630e+05  ...  42000.000000   95000.000000
max    6403.000000  3.123510e+06  ...  78000.000000  210000.000000

[8 rows x 9 columns]
The full describe table, including the columns elided above, can be inspected in the IDE's variable explorer.
Major_category_counts = df["Major_category"].value_counts()
print(Major_category_counts)
interval = Major_category_counts.index.tolist()  # category labels, in the same order as the counts
rects = plt.bar(range(1, 17), Major_category_counts)
for rect in rects:  # rects holds one bar per category (16 in total)
    height = rect.get_height()
    # Write the count just above the top of each bar
    plt.text(rect.get_x() + rect.get_width() / 2, height, str(height), size=12, ha="center", va="bottom")
plt.xticks(range(1, 17), interval, rotation=90)
plt.title("Number of Branches by Major Category")
plt.ylabel("Counts")
plt.show()
Output:
Engineering                            29
Education                              16
Humanities & Liberal Arts              15
Biology & Life Science                 14
Business                               13
Health                                 12
Computers & Mathematics                11
Agriculture & Natural Resources        10
Physical Sciences                      10
Social Science                          9
Psychology & Social Work                9
Arts                                    8
Industrial Arts & Consumer Services     7
Law & Public Policy                     5
Communications & Journalism             4
Interdisciplinary                       1
Name: Major_category, dtype: int64
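Figure: bar chart "Number of Branches by Major Category".
Conclusion: Engineering disciplines have a long history of development, which likely explains why Engineering has more branches (29) than any other major category.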
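The counting step can be made less fragile by taking the tick labels from the same `value_counts()` result that supplies the bar heights, instead of typing the category names by hand. A minimal sketch, assuming a small made-up frame in place of the real CSV (`toy`, `labels`, and `heights` are illustrative names):

```python
import pandas as pd

# Made-up rows standing in for the real dataset.
toy = pd.DataFrame({
    "Major_category": [
        "Engineering", "Engineering", "Arts",
        "Education", "Engineering", "Arts",
    ],
})

# value_counts() sorts categories by descending frequency.
counts = toy["Major_category"].value_counts()

# Labels and heights come from the same Series, so they cannot drift apart.
labels = counts.index.tolist()
heights = counts.tolist()

print(labels)   # ['Engineering', 'Arts', 'Education']
print(heights)  # [3, 2, 1]
```

With the real data, `labels` and `heights` could be passed directly to `plt.bar` and `plt.xticks`.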
averageMoney = []
for category in interval:
    in_category = df["Major_category"] == category  # rows belonging to this category
    # Mean of the per-major median incomes within the category
    averageMoney.append(df.loc[in_category, "Median"].mean())
plt.bar(range(1, 17), averageMoney)
plt.xticks(range(1, 17), interval, rotation=90)
plt.title("Average Annual salary by Major Category")
plt.ylabel("Moneys")
plt.show()
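Figure: bar chart "Average Annual salary by Major Category".
Conclusion: Engineering majors are closely tied to fields such as artificial intelligence and automation, which likely explains their relatively high average salary. Computers & Mathematics majors have good prospects, but pay is generally modest at small companies and comparatively high at large ones.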
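The per-category average of `Median` can also be computed in one step with `groupby`, which avoids looping over every row for each category. A minimal sketch on made-up numbers (the frame `toy` and the values in it are illustrative, not taken from the dataset):

```python
import pandas as pd

# Made-up rows; the real data has 173 majors in 16 categories.
toy = pd.DataFrame({
    "Major_category": ["Engineering", "Engineering", "Arts", "Arts", "Education"],
    "Median": [70000, 60000, 30000, 34000, 40000],
})

# Mean of the per-major median incomes, highest category first.
mean_median = (
    toy.groupby("Major_category")["Median"]
    .mean()
    .sort_values(ascending=False)
)
print(mean_median)
```

Note that this is a mean of medians: it describes the typical major within a category, not the typical graduate, because every major counts equally regardless of its size.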
averageUnemployRate = []
for category in interval:
    in_category = df["Major_category"] == category  # rows belonging to this category
    # Mean of the per-major unemployment rates within the category
    averageUnemployRate.append(df.loc[in_category, "Unemployment_rate"].mean())
plt.bar(range(1, 17), averageUnemployRate)
plt.xticks(range(1, 17), interval, rotation=90)
plt.title("Average Unemployment Rate by Major Category")
plt.ylabel("Rate")
plt.show()
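Figure: bar chart "Average Unemployment Rate by Major Category".
Conclusion: Careers in the arts are highly volatile and the demands placed on practitioners are comparatively strict, so Arts majors show a relatively high unemployment rate.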
averageEmployRate = []
for category in interval:
    in_category = df["Major_category"] == category  # rows belonging to this category
    # Per-major employment rate (Employed / Total), averaged over the category
    rate = df.loc[in_category, "Employed"] / df.loc[in_category, "Total"]
    averageEmployRate.append(rate.mean())
plt.bar(range(1, 17), averageEmployRate)
plt.xticks(range(1, 17), interval, rotation=90)
plt.title("Average Employment Rate by Major Category")
plt.ylabel("Rate")
plt.show()
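Figure: bar chart "Average Employment Rate by Major Category".
Conclusion: Relatively speaking, Computers & Mathematics majors have a fairly high employment rate, likely owing to the field's growth prospects.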
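The employment rate here is an unweighted average of per-major rates, so a small major counts as much as a large one. A pooled rate (total employed divided by total graduates in the category) can give a different picture. The sketch below contrasts the two on made-up numbers; `toy`, `unweighted`, and `pooled` are illustrative names.

```python
import pandas as pd

# Made-up rows: one large and one small major per category.
toy = pd.DataFrame({
    "Major_category": ["Computers & Mathematics"] * 2 + ["Arts"] * 2,
    "Total": [1000, 100, 400, 600],
    "Employed": [900, 50, 300, 420],
})

# (a) Unweighted: average the per-major rates, as the article's loop does.
per_major_rate = toy["Employed"] / toy["Total"]
unweighted = per_major_rate.groupby(toy["Major_category"]).mean()

# (b) Pooled: sum people first, then divide once per category.
sums = toy.groupby("Major_category")[["Employed", "Total"]].sum()
pooled = sums["Employed"] / sums["Total"]

print(pd.DataFrame({"unweighted": unweighted, "pooled": pooled}))
```

In this toy data the Computers & Mathematics rate is 0.70 unweighted but about 0.86 pooled, because the small major with a 50% rate pulls the unweighted mean down. Which definition is appropriate depends on whether the question is about majors or about graduates.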
averageFullTimeRate = []
for category in interval:
    in_category = df["Major_category"] == category  # rows belonging to this category
    # Share of employed graduates who work full-time, year-round, averaged over the category
    rate = df.loc[in_category, "Employed_full_time_year_round"] / df.loc[in_category, "Employed"]
    averageFullTimeRate.append(rate.mean())
plt.bar(range(1, 17), averageFullTimeRate)
plt.xticks(range(1, 17), interval, rotation=90)
plt.title("Average Full-Time Rate by Major Category")
plt.ylabel("Rate")
plt.show()
Figure: bar chart "Average Full-Time Rate by Major Category".
averageNum = []
for category in interval:
    in_category = df["Major_category"] == category  # rows belonging to this category
    # Average number of people per major within the category
    averageNum.append(df.loc[in_category, "Total"].mean())
plt.bar(range(1, 17), averageNum)
plt.xticks(range(1, 17), interval, rotation=90)
plt.title("Average Total Numbers by Major Category")
plt.ylabel("Counts")
plt.show()
Figure: bar chart "Average Total Numbers by Major Category".
EUratio = []
for i in range(len(interval)):
    # Ratio of the category's average employment rate to its average unemployment rate
    EUratio.append(averageEmployRate[i] / averageUnemployRate[i])
plt.bar(range(1, 17), EUratio)
plt.xticks(range(1, 17), interval, rotation=90)
plt.title("Employment-Unemployment Ratio by Major Category")
plt.ylabel("Ratio")
plt.show()
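Figure: bar chart "Employment-Unemployment Ratio by Major Category".
Conclusion: Relatively speaking, agriculture has a low barrier to entry, so it combines a high employment rate with a low unemployment rate.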
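The ratio step can be sketched with label-indexed Series, so that each category's employment rate is divided by the unemployment rate carrying the same label rather than the same list position. The rates below are made up for illustration and are not the dataset's values.

```python
import pandas as pd

# Made-up average rates per category (illustrative only).
employ = pd.Series({
    "Agriculture & Natural Resources": 0.80,
    "Arts": 0.75,
    "Engineering": 0.78,
})
unemploy = pd.Series({
    "Agriculture & Natural Resources": 0.04,
    "Arts": 0.10,
    "Engineering": 0.05,
})

# Division aligns on the category labels, then sort from highest to lowest.
ratio = (employ / unemploy).sort_values(ascending=False)
top_category = ratio.index[0]

print(ratio)
print(top_category)
```

Because the unemployment rate is the small denominator, this ratio is driven mostly by it: a small change in unemployment moves the ratio far more than the same change in employment.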
When reposting, please cite this article's address: http://specialneedsforspecialkids.com/yun/121287.html