Code
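%matplotlib inline
import matplotlib
import pandas as pd

# Read only the columns we need from the CSV.
df = pd.read_csv('/Users/yss/Downloads/Most-Recent-Cohorts-All-Data-Elements.csv', usecols=['INSTNM', 'REGION', 'ADM_RATE', 'SAT_AVG', 'COSTT4_A'])
savedf = df  # keep a reference to the unfiltered data

# Keep only rows with a positive admission rate and a positive average SAT score.
cleandf = df[df.ADM_RATE > 0]
df = cleandf
cleandf = df[df.SAT_AVG > 0]
df = cleandf.copy()  # copy so the assignments below do not act on a slice of the original

# Scale the average SAT score down to roughly the 0-2 range.
def sat(sat):
    try:
        t = sat/1000
    except ValueError:
        t = 0
    return t

# Scale the annual cost down to roughly the 0-2 range.
def expense(tuition):
    try:
        t = tuition/50000
    except ValueError:
        t = 0
    return t

# Select by column name rather than position: read_csv keeps the file's
# column order, not the order of the usecols list.
df['SAT_AVG'] = df['SAT_AVG'].apply(sat)
df['COSTT4_A'] = df['COSTT4_A'].apply(expense)

# Average each metric per region and draw a stacked bar chart.
x = df[['REGION', 'SAT_AVG', 'ADM_RATE', 'COSTT4_A']]
y = x.set_index('REGION')
z = y.groupby('REGION').mean()
z.plot.bar(stacked=True)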
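Code walkthrough
Each part of the code above is explained below.
First, pandas reads only the columns we specify into a DataFrame, and we then filter it to keep the valid rows: those with a positive ADM_RATE and a positive SAT_AVG.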
df = pd.read_csv('/Users/yss/Downloads/Most-Recent-Cohorts-All-Data-Elements.csv', usecols=['INSTNM', 'REGION', 'ADM_RATE', 'SAT_AVG', 'COSTT4_A'])
savedf = df
cleandf = df[df.ADM_RATE > 0]
df = cleandf
cleandf = df[df.SAT_AVG > 0]
df = cleandf.copy()  # copy so later assignments do not act on a slice of the original
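As a minimal sketch of what the two filters do, the example below applies them as a single combined mask to a tiny made-up table. The `demo` rows are invented for illustration and are not taken from the real CSV. It also shows that rows with a missing value are dropped, because a comparison against NaN evaluates to False.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the real CSV (values are illustrative only).
demo = pd.DataFrame({
    'INSTNM': ['A', 'B', 'C', 'D'],
    'ADM_RATE': [0.5, np.nan, 0.0, 0.8],
    'SAT_AVG': [1100.0, 1200.0, 1000.0, np.nan],
})

# Both conditions in one boolean mask; NaN > 0 is False, so rows with
# missing values are filtered out together with the zero rows.
mask = (demo['ADM_RATE'] > 0) & (demo['SAT_AVG'] > 0)
clean = demo[mask].copy()  # .copy() gives an independent DataFrame

print(clean['INSTNM'].tolist())
```

Only institution A passes both conditions here; B and D have a missing value and C has an admission rate of zero.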
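Next we define two functions that rescale the data: SAT_AVG is divided by 1000 and COSTT4_A by 50000. This brings both columns to the same order of magnitude as ADM_RATE, so the bars in the chart are comparable and easier to read.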
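def sat(sat):
    try:
        t = sat/1000
    except ValueError:
        t = 0
    return t

def expense(tuition):
    try:
        t = tuition/50000
    except ValueError:
        t = 0
    return t

# Select by column name rather than position: read_csv keeps the file's
# column order, not the order of the usecols list.
df['SAT_AVG'] = df['SAT_AVG'].apply(sat)
df['COSTT4_A'] = df['COSTT4_A'].apply(expense)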
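The same scaling can also be written as plain column arithmetic. The sketch below uses a small made-up `demo` table (not the real data) to check that dividing the whole column gives the same result as applying a function element by element; the `score` parameter name is chosen here for illustration.

```python
import pandas as pd

# Made-up values, for illustration only.
demo = pd.DataFrame({
    'SAT_AVG': [1000.0, 1250.0],
    'COSTT4_A': [25000.0, 50000.0],
})

def sat(score):
    # Element-wise version, as used with .apply()
    return score / 1000

via_apply = demo['SAT_AVG'].apply(sat)    # one function call per element
vectorized = demo['SAT_AVG'] / 1000       # one operation on the whole column
scaled_cost = demo['COSTT4_A'] / 50000

print(via_apply.equals(vectorized))
print(scaled_cost.tolist())
```

Missing values stay NaN under either form, since dividing NaN by a number does not raise an exception.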
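After this processing, we use the REGION field as the index, which also becomes the x-axis variable of the chart. We group by REGION, compute the mean of each group, and finally draw the chart with z.plot.bar(stacked=True).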
x = df[['REGION', 'SAT_AVG', 'ADM_RATE', 'COSTT4_A']]
y = x.set_index('REGION')
z = y.groupby('REGION').mean()
z.plot.bar(stacked=True)
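To make the grouping step concrete, here is a small sketch on made-up, already-scaled values (the `demo` table and its region codes are invented for illustration). It groups directly on the REGION column, which yields the same kind of per-region table as setting the index first.

```python
import pandas as pd

# Made-up, already-scaled values for three institutions in two regions.
demo = pd.DataFrame({
    'REGION': [1, 1, 2],
    'SAT_AVG': [1.0, 1.2, 1.1],
    'ADM_RATE': [0.5, 0.7, 0.6],
    'COSTT4_A': [0.4, 0.6, 0.8],
})

# One row per region, one column per metric, each cell the group mean.
z = demo.groupby('REGION').mean()
print(z)
```

Calling `z.plot.bar(stacked=True)` on a table shaped like this draws one bar per region, with the three metric means stacked on top of each other.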
Problems encountered
- After creating a virtual environment with conda and then installing matplotlib with pip, importing it raised the following exception:
Traceback (most recent call last):
  File "mp.py", line 2, in <module>
    import matplotlib.pyplot as plt
  File "/Users/yss/.anaconda/anaconda3/envs/py3.4/lib/python3.4/site-packages/matplotlib/pyplot.py", line 115, in <module>
    _backend_mod, new_figure_manager, draw_if_interactive, _show = pylab_setup()
  File "/Users/yss/.anaconda/anaconda3/envs/py3.4/lib/python3.4/site-packages/matplotlib/backends/__init__.py", line 62, in pylab_setup
    [backend_name], 0)
  File "/Users/yss/.anaconda/anaconda3/envs/py3.4/lib/python3.4/site-packages/matplotlib/backends/backend_macosx.py", line 17, in <module>
    from matplotlib.backends import _macosx
RuntimeError: Python is not installed as a framework. The Mac OS X backend will not be able to function correctly if Python is not installed as a framework. See the Python documentation for more information on installing Python as a framework on Mac OS X. Please either reinstall Python as a framework, or try one of the other backends. If you are using (Ana)Conda please install python.app and replace the use of 'python' with 'pythonw'. See 'Working with Matplotlib on OSX' in the Matplotlib FAQ for more information.
The fix described on Stack Overflow resolves this: switch matplotlib to the TkAgg backend.
Specifically, run this shell command:
echo backend: TkAgg > ~/.matplotlib/matplotlibrc
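The shell redirect fails if the ~/.matplotlib directory does not exist yet. As a rough sketch, the same one-line config file can be written from Python, creating the directory first. `write_backend` and its `config_dir` parameter are hypothetical names introduced here for illustration; they are not part of matplotlib.

```python
from pathlib import Path

def write_backend(config_dir, backend='TkAgg'):
    """Hypothetical helper: create config_dir if needed and write a one-line matplotlibrc."""
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    rc_path = config_dir / 'matplotlibrc'
    # Overwrites any existing file, just like `>` in the shell command.
    rc_path.write_text('backend: {}\n'.format(backend))
    return rc_path

# Usage (not executed here): write_backend(Path.home() / '.matplotlib')
```

Like the shell command, this replaces an existing matplotlibrc, so any other settings in that file would need to be merged by hand.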