Python Matplotlib Data Visualization for Beginners (2)

Key points:

Data collection and cleaning
Using the various plotting functions to draw charts from the data

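Data collection

This example uses the Washington, D.C. bike rental data from UCI (the Bike Sharing Dataset). The dataset page is: https://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset

Because the data is published as an online zip archive, the code has to specify the path where the download is saved. Keeping a large download in memory would waste resources, so a temporary directory is usually created to hold the files.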

import pandas as pd # load the data into a DataFrame
import urllib.request # download data from a URL (Python 3)
import tempfile # create a temporary directory so large intermediate files live on disk rather than in memory; a directory made with mkdtemp() is not removed automatically, so it is deleted explicitly below
import shutil # file operations (used to remove the temporary directory)
import zipfile # read and extract zip archives

# Download the data
temp_dir = tempfile.mkdtemp() # temporary directory for downloading and extracting the online zip archive
data_source = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00275/Bike-Sharing-Dataset.zip'
zipname = temp_dir + '/Bike-Sharing-Dataset.zip' # full path of the downloaded file
urllib.request.urlretrieve(data_source,zipname) # download the archive
with zipfile.ZipFile(zipname,'r') as zip_ref: # open the archive with a ZipFile object
    zip_ref.extractall(temp_dir) # extract everything into the temporary directory

# Clean the data
daily_path = temp_dir + "/day.csv"
daily_data = pd.read_csv(daily_path) # read the CSV
daily_data['dteday']=pd.to_datetime(daily_data['dteday']) # convert the date strings to datetime
drop_list = ['instant','season','yr','mnth','holiday','workingday','weathersit','atemp','hum'] # columns not needed here
daily_data.drop(drop_list, inplace =True, axis=1)

shutil.rmtree(temp_dir) # delete the temporary directory

# Inspect the data
daily_data.head()

The resulting table looks like this:

image.png
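
As a variation on the download step above, here is a minimal sketch that uses `tempfile.TemporaryDirectory`, which removes the directory automatically when its `with` block ends. The helper name `read_member_from_zip` and the tiny locally built archive (with made-up values) are assumptions for illustration; the archive stands in for the downloaded file so the sketch runs without network access.

```python
import os
import tempfile
import zipfile

def read_member_from_zip(zip_path, member_name):
    """Extract one member of a zip archive into a temporary directory and return its text.

    Hypothetical helper, not part of the original article.
    """
    # TemporaryDirectory deletes the directory and its contents when the with-block exits
    with tempfile.TemporaryDirectory() as temp_dir:
        with zipfile.ZipFile(zip_path, 'r') as zip_ref:
            zip_ref.extract(member_name, temp_dir)
        with open(os.path.join(temp_dir, member_name), encoding='utf-8') as f:
            text = f.read()
    return text, temp_dir

# Build a tiny archive locally; the values are invented placeholders
with tempfile.TemporaryDirectory() as work_dir:
    demo_zip = os.path.join(work_dir, 'demo.zip')
    with zipfile.ZipFile(demo_zip, 'w') as zf:
        zf.writestr('day.csv', 'dteday,cnt\n2000-01-01,1\n')
    csv_text, used_dir = read_member_from_zip(demo_zip, 'day.csv')

print(csv_text)
print(os.path.exists(used_dir))  # reports whether the temporary directory still exists
```

With this pattern there is no separate `shutil.rmtree` call to forget, at the cost of having to finish all file reads inside the `with` block.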

Global style configuration

Style settings can be kept in a separate block (or module), so the look of every chart can be configured in one place later.

import matplotlib
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np
%matplotlib inline

# Global figure configuration
# Figure size
matplotlib.rc('figure',figsize=(14,7)) # rc = runtime configuration
#matplotlib.rc('figure',facecolor='red')
# Global font size
matplotlib.rc('font',size=14)
# Background grid (off)
matplotlib.rc('axes',grid=False)
# Axes background color
matplotlib.rc('axes',facecolor='white')
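
To see what these calls do, the sketch below reads the values back from `matplotlib.rcParams` and then overrides one of them temporarily with `matplotlib.rc_context`. Selecting the `Agg` backend is an assumption made so the snippet also runs as a plain script without a display; in a notebook the `%matplotlib inline` line above plays that role.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for running outside a notebook

# Same style of global settings as above
matplotlib.rc('figure', figsize=(14, 7))
matplotlib.rc('font', size=14)

# matplotlib.rc writes into the global rcParams dictionary
font_size_global = matplotlib.rcParams['font.size']
fig_size_global = list(matplotlib.rcParams['figure.figsize'])

# rc_context applies an override only inside the with-block
with matplotlib.rc_context({'font.size': 8}):
    font_size_inside = matplotlib.rcParams['font.size']

font_size_after = matplotlib.rcParams['font.size']
print(font_size_global, fig_size_global, font_size_inside, font_size_after)
```

A temporary override like this can be handy when a single chart needs a different look without disturbing the global configuration.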

Visualization

Once you understand the data and the relationships within it, different chart types can be used to visualize those relationships and support the point you want to make.

Charting with matplotlib breaks down into two parts:

Define a base chart function (a reusable template)
Pass the data to that function to produce the chart

Example: scatter plot

Define the scatter plot function template

# Scatter plot function template, kept separate so it can be reused
def scatterplot(x_data,y_data,x_label,y_label,title,ax = None):
    if ax is None:
        fig,ax = plt.subplots() # fig is the Figure object and ax is the Axes object; a new pair is created only when no ax is passed in
    
    # Hide the top and right axis lines
    ax.spines['top'].set_color('none')
    ax.spines['right'].set_color('none')
    ax.scatter(x_data,y_data,s=10,color="blue",alpha=0.7)
    
    # Add the title and axis labels
    ax.set_title(title)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)

Draw the chart from the data

# Draw the chart from the data; the point to make: temperature and rental counts are positively correlated
scatterplot(x_data = daily_data['temp'],
           y_data = daily_data['cnt'],
           x_label = 'normalized temperature',
           y_label = 'check outs', # number of rentals
           title = 'numbers of check outs vs temperature')

image.png
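
The optional `ax` argument is what makes the template reusable inside a larger figure. The sketch below is one possible use: it places two scatter plots side by side by passing in axes created with `plt.subplots(1, 2)`. The function is a compact copy of the template, the data is synthetic, and the `Agg` backend is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for running outside a notebook
from matplotlib import pyplot as plt
import numpy as np

def scatterplot(x_data, y_data, x_label, y_label, title, ax=None):
    # Compact copy of the scatter plot template
    if ax is None:
        _, ax = plt.subplots()
    ax.spines['top'].set_color('none')
    ax.spines['right'].set_color('none')
    ax.scatter(x_data, y_data, s=10, color="blue", alpha=0.7)
    ax.set_title(title)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)

# Synthetic data, for illustration only
rng = np.random.default_rng(0)
x = rng.random(50)
y_a = 2 * x + rng.normal(0, 0.1, 50)
y_b = -x + rng.normal(0, 0.1, 50)

# One figure, two axes; each call draws into the axes it is given
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(14, 5))
scatterplot(x, y_a, 'x', 'y_a', 'positive trend', ax=ax_left)
scatterplot(x, y_b, 'x', 'y_b', 'negative trend', ax=ax_right)
fig.tight_layout()
```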

Chart function templates

Below are function templates for several chart types; for most of them the calls that pass in data are not listed.
1. Dual-axis line chart

# Dual-axis line chart
def lineplot(x_data,x_label,y1_data,y1_label,y1_color,y2_data,y2_label,y2_color,title):
    ax1=plt.subplot() # a figure can contain more than one axes
    ax1.plot(x_data,y1_data,color=y1_color)
    
    ax2=ax1.twinx() # key call: ax2 shares the x-axis with ax1 and gets its own y-axis on the right
    ax2.plot(x_data,y2_data,color=y2_color)
    
    # Hide the top axis line
    ax1.spines['top'].set_color('none')
    ax2.spines['top'].set_color('none')

    # Add the title and axis labels
    ax1.set_title(title)
    ax1.set_xlabel(x_label)
    ax1.set_ylabel(y1_label)
    ax2.set_ylabel(y2_label)
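
Since the call is not listed, here is a hedged usage sketch for the dual-axis template. The two series are synthetic and deliberately on very different scales, which is the situation where a second y-axis helps. The function body repeats the template; the explicit `plt.figure()` call and the `Agg` backend are assumptions added so the snippet starts from a fresh figure and runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for running outside a notebook
from matplotlib import pyplot as plt
import numpy as np

def lineplot(x_data, x_label, y1_data, y1_label, y1_color,
             y2_data, y2_label, y2_color, title):
    ax1 = plt.subplot()
    ax1.plot(x_data, y1_data, color=y1_color)
    ax2 = ax1.twinx()  # second y-axis sharing the same x-axis
    ax2.plot(x_data, y2_data, color=y2_color)
    ax1.spines['top'].set_color('none')
    ax2.spines['top'].set_color('none')
    ax1.set_title(title)
    ax1.set_xlabel(x_label)
    ax1.set_ylabel(y1_label)
    ax2.set_ylabel(y2_label)

# Synthetic series on very different scales, for illustration only
days = np.arange(30)
temperature = 0.5 + 0.3 * np.sin(days / 5)
rentals = 3000 + 1500 * np.sin(days / 5)

plt.figure()  # start from a fresh figure, since plt.subplot() draws into the current one
lineplot(x_data=days, x_label='day',
         y1_data=temperature, y1_label='temperature', y1_color='red',
         y2_data=rentals, y2_label='rentals', y2_color='blue',
         title='temperature and rentals over time')
fig = plt.gcf()
```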

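2. Histogram

# Histogram
def histoplot(data,x_label,y_label,title):
    _,ax = plt.subplots() # subplots() returns (figure, axes); _ discards the figure, which is not used here
    res = ax.hist(data,color = '#539caf',bins=10) # res holds the details of the plot: bin counts, bin edges, and the bar patches
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    ax.set_title(title)
    return res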
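
The value returned by `ax.hist` is a tuple of the bin counts, the bin edges, and the drawn bar patches. The short sketch below unpacks it for a small made-up list; `bins=3` is chosen only to keep the result easy to inspect, and the `Agg` backend is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for running outside a notebook
from matplotlib import pyplot as plt

data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]  # made-up values

_, ax = plt.subplots()
counts, bin_edges, patches = ax.hist(data, color='#539caf', bins=3)

print(counts)       # number of values falling in each bin
print(bin_edges)    # one more edge than there are bins
print(len(patches)) # one bar per bin
```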

3. Overlaid histogram

# Overlaid histogram, for comparing the distributions of two datasets
def overlaid_histoplot(data1,data1_name,data2,data2_name,x_label,y_label,title):
    max_nbins = 10
    data_range = [min(min(data1),min(data2)),max(max(data1),max(data2))] # common start and end for both datasets
    bins = np.linspace(data_range[0],data_range[1],max_nbins+1) # shared bin edges; linspace avoids the floating-point drift of arange with a fractional step
    
    _,ax = plt.subplots()
    ax.hist(data1,bins=bins,color = 'black',alpha=0.7,label=data1_name)
    ax.hist(data2,bins=bins,color ='blue',alpha=0.7,label=data2_name)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    ax.set_title(title)
    ax.legend(loc='best')
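
The important step in this template is that both datasets are binned with the same edges, otherwise the two histograms cannot be compared bar by bar. A minimal numeric sketch of that idea follows, using `np.linspace` to build the edges and `np.histogram` to count; the two arrays are made up for illustration.

```python
import numpy as np

# Made-up samples covering different ranges
data1 = np.array([0.1, 0.2, 0.2, 0.4])
data2 = np.array([0.3, 0.5, 0.9, 1.0])

max_nbins = 10
lo = min(data1.min(), data2.min())
hi = max(data1.max(), data2.max())

# max_nbins bins need max_nbins + 1 edges, spanning both datasets
bins = np.linspace(lo, hi, max_nbins + 1)

# Both datasets are counted against the same edges
counts1, _ = np.histogram(data1, bins=bins)
counts2, _ = np.histogram(data2, bins=bins)

print(bins)
print(counts1)
print(counts2)
```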

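4. Bar chart

# Bar chart: data preparation, to see how rentals are distributed across the days of the week
mean_data=daily_data[['weekday','cnt']].groupby('weekday').agg(['mean','std'])
mean_data.columns = mean_data.columns.droplevel() # drop the outer 'cnt' level, leaving the columns 'mean' and 'std'

#mean_data.head() # inspect mean_data

# Define the bar chart function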
def barplot(x_data,y_data,x_label,y_label,title):
    _,ax = plt.subplots()
    
    ax.bar(x_data,y_data,align='center')
    
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    ax.set_title(title)

barplot(x_data=mean_data.index.values, # after groupby, weekday is the index of mean_data rather than a column
       y_data=mean_data['mean'],
       x_label='day of weekdays',
       y_label='number of mean check outs',
       title="check outs by day of week")
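
The aggregation above also computes the standard deviation, which the bar chart itself does not use. As one possible extension (not part of the original template), the sketch below runs the same `groupby` on a tiny made-up table and passes the `std` column to `ax.bar` as `yerr` to draw error bars. The `Agg` backend is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for running outside a notebook
from matplotlib import pyplot as plt
import pandas as pd

# Tiny made-up table with the same column names as the article's data
df = pd.DataFrame({'weekday': [0, 0, 1, 1], 'cnt': [10, 20, 30, 50]})

mean_data = df[['weekday', 'cnt']].groupby('weekday').agg(['mean', 'std'])
mean_data.columns = mean_data.columns.droplevel()  # ('cnt', 'mean') -> 'mean', etc.

_, ax = plt.subplots()
ax.bar(mean_data.index.values, mean_data['mean'],
       yerr=mean_data['std'], capsize=4, align='center')
ax.set_xlabel('day of week')
ax.set_ylabel('mean check outs')

print(mean_data)
```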

5. Stacked bar chart

# Stacked bar chart

# Data preparation: compare the rentals of registered and casual users across the days of the week
mean_data_by_type =daily_data[['weekday','registered','casual']].groupby('weekday').mean()
mean_data_by_type['total']=mean_data_by_type['registered']+mean_data_by_type['casual']
mean_data_by_type['reg_%']=mean_data_by_type['registered']/mean_data_by_type['total']
mean_data_by_type['cas_%']=mean_data_by_type['casual']/mean_data_by_type['total']

#mean_data_by_type.head() # inspect mean_data_by_type

# Define the stacked bar chart function
def stackedbarplot(x_data,x_label,y_data_list,y_data_name,y_label,colors,title):
    _,ax = plt.subplots()
    # Draw each series on top of the running total of the series before it
    bottom = np.zeros(len(x_data))
    for i in range(0,len(y_data_list)):
        ax.bar(x_data,y_data_list[i],bottom=bottom,color=colors[i],align='center',label=y_data_name[i])
        bottom = bottom + np.asarray(y_data_list[i])
    
    # Add the axis labels, title and legend
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    ax.set_title(title)
    ax.legend()
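
A hedged usage sketch for stacking follows. With more than two series, each bar has to start at the sum of all series beneath it, so this version keeps a running total for `bottom`. It also returns `ax`, which the template does not, purely so the result can be inspected. The three made-up series and their names are assumptions for illustration, as is the `Agg` backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for running outside a notebook
from matplotlib import pyplot as plt
import numpy as np

def stackedbarplot(x_data, x_label, y_data_list, y_data_name, y_label, colors, title):
    _, ax = plt.subplots()
    bottom = np.zeros(len(x_data))  # running total of the series drawn so far
    for y_data, color, name in zip(y_data_list, colors, y_data_name):
        ax.bar(x_data, y_data, bottom=bottom, color=color, align='center', label=name)
        bottom = bottom + np.asarray(y_data)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    ax.set_title(title)
    ax.legend()
    return ax  # returned only so the bars can be inspected below

# Made-up shares for two categories and three series
x = [0, 1]
series = [[0.5, 0.25], [0.25, 0.5], [0.25, 0.25]]
ax = stackedbarplot(x_data=x, x_label='category',
                    y_data_list=series,
                    y_data_name=['first', 'second', 'third'],
                    y_label='share', colors=['black', 'blue', 'gray'],
                    title='stacked shares')

# Each bar's lower edge: ax.patches lists the bars series by series
print([patch.get_y() for patch in ax.patches])
```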

Exercise

image.png

Last edited: 2020.01.30 16:00:41
