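One day, while catching up on IT trends, I (Brother Shan) came across someone online using Python and R to analyze their WeChat friends. That piqued my interest, so I gave it a try myself. I don't know R, though, so I did the plotting in Python. (Reference: http://www.sohu.com/a/154250476_467794)
I started out on Windows, but I couldn't get Jieba and Wordcloud installed there, so I switched to a Mac.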
Step 1: Install itchat
The convenient way is pip: pip install itchat
Step 2: Fetch your WeChat friends' data and save it as JSON
import itchat
import json

if __name__ == '__main__':
    # Save the fetched data as JSON so that later debugging runs
    # don't have to connect to WeChat every time.
    itchat.login()  # Pops up a QR code; scan it to log in to WeChat
    friends = itchat.get_friends(update=True)[0:]  # List of friend records
    with open("C:\\Users\\Samuel\\Desktop\\friends.json", encoding="UTF-8", mode="w") as f:
        json.dump(friends, fp=f)  # Save as JSON

![gender.png](http://upload-images.jianshu.io/upload_images/6409065-b6c686e33427cdd5.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
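The point of saving to JSON is to avoid logging in again on every run. A minimal sketch of that caching idea follows; `load_or_fetch` and `fake_fetch` are hypothetical names, and `fake_fetch` returns made-up records standing in for `itchat.get_friends(update=True)`.

```python
import json
import os
import tempfile


def load_or_fetch(cache_path, fetch):
    """Return cached records if the file exists; otherwise call fetch() and cache the result."""
    if os.path.exists(cache_path):
        with open(cache_path, encoding="UTF-8") as f:
            return json.load(f)
    records = fetch()
    with open(cache_path, encoding="UTF-8", mode="w") as f:
        # ensure_ascii=False keeps Chinese text readable in the file
        json.dump(records, f, ensure_ascii=False)
    return records


def fake_fetch():
    # Hypothetical stand-in for itchat.get_friends(update=True); made-up records.
    return [{"Sex": 1, "Signature": ""},
            {"Sex": 2, "Signature": "hello"}]


cache_file = os.path.join(tempfile.mkdtemp(), "friends.json")
first = load_or_fetch(cache_file, fake_fetch)   # no file yet: fetches and writes the cache
second = load_or_fetch(cache_file, fake_fetch)  # file exists: reads it, no fetch
print(first == second)
```

In a real run you would pass a function that logs in and calls itchat in place of `fake_fetch`, and delete the cache file whenever you want fresh data.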
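Now the fun part: analyzing the gender ratio
This code is for the Mac. Windows handles garbled Chinese text a little differently from the Mac; everything else is the same.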
# coding:utf-8
import json
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties


# Fixes Chinese text rendering on the Mac; not needed if your labels are in English
def getChineseFont():
    return FontProperties(fname='/System/Library/Fonts/PingFang.ttc')


with open("/Users/sam/Desktop/friends.json", encoding="UTF-8", mode="r") as f:
    friends = json.load(fp=f)

male = female = other = 0
# friends[0] is skipped: itchat returns your own account as the first entry
for friend in friends[1:]:
    sex = friend["Sex"]
    if sex == 1:
        male += 1
    elif sex == 2:
        female += 1
    else:
        other += 1

# Total number of friends
total = len(friends[1:])

# Print the gender ratio of your friends (male / female / unknown)
print("男性好友: %.2f%%" % (float(male) / total * 100))
print("女性好友: %.2f%%" % (float(female) / total * 100))
print("不明性別好友: %.2f%%" % (float(other) / total * 100))

# On Windows, set a Chinese font globally instead; then `fontproperties`
# does not need to be passed to the calls below.
# plt.rcParams['font.sans-serif'] = ['SimHei']
plt.xticks((0, 1, 2), ('其它', '男', '女'), fontproperties=getChineseFont())
plt.title('微信朋友圈性別比例分析', fontproperties=getChineseFont())
# Bar positions are passed positionally (the keyword is `x` in current matplotlib)
plt.bar((0, 1, 2), height=(other / total * 100, male / total * 100, female / total * 100),
        color=('yellow', 'blue', 'red'))
plt.ylabel("百分比 %", fontproperties=getChineseFont())
plt.show()
Output:
男性好友: 49.50%
女性好友: 38.25%
不明性別好友: 12.25%
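The same tally can be written more compactly with `collections.Counter`. This is only a sketch on made-up sample records; `gender_percentages` and `LABELS` are hypothetical names, and the mapping (1 = male, 2 = female, anything else = unknown) follows the code above.

```python
from collections import Counter

# Made-up sample records; only the "Sex" field matters here.
sample_friends = [{"Sex": 1}, {"Sex": 1}, {"Sex": 2}, {"Sex": 0}]

LABELS = {1: "male", 2: "female"}  # any other value is counted as unknown


def gender_percentages(records):
    """Return the percentage of male, female, and unknown entries."""
    total = len(records)
    if total == 0:
        return {}
    counts = Counter(LABELS.get(r.get("Sex"), "unknown") for r in records)
    return {label: counts[label] / total * 100
            for label in ("male", "female", "unknown")}


result = gender_percentages(sample_friends)
print(result)
```

With real data you would pass `friends[1:]` so that your own account is left out of the count.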
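Next up: a custom word cloud of WeChat friends' signatures
This one is fun. The analysis and plots in the original reference article are too complicated, there's no source code, and it's all in R, so we'll skip those. Instead, let's find out which words show up most often in everyone's signatures and turn them into a word cloud.
Many signatures (Signature) contain what were originally emoji, which leaves behind irrelevant tokens such as emoji, span, and class; these need to be stripped out first. Symbols such as <>/= also need to go, which takes a simple regex. After that, join everything together into a single text string. Enough talk, here's the code.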
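The cleanup step can be tried on its own before running the full script. The sketch below makes a few assumptions: `clean_signature` is a hypothetical helper, the sample string is a guess at what emoji markup looks like in a signature, and the double quote is added to the symbol set even though the description above only lists <>/=.

```python
import re

# Leftover markup words from emoji tags
MARKUP_WORDS = re.compile(r"span|class|emoji")
# Hex emoji codes such as 1f4aa, plus stray symbols (the quote is an added assumption)
CODES_AND_SYMBOLS = re.compile(r"1f\d+\w*|[<>/=\"]")


def clean_signature(signature):
    """Strip emoji markup leftovers and symbols from one signature."""
    text = MARKUP_WORDS.sub("", signature.strip())
    text = CODES_AND_SYMBOLS.sub("", text)
    return text.strip()


# Assumed example of a signature containing emoji markup
sample = 'Keep going <span class="emoji emoji1f4aa"></span>'
print(clean_signature(sample))
```

Note that this blunt approach also eats those words inside ordinary text (an English signature containing "classic" would lose "class"), which is acceptable for a quick word cloud but not for careful analysis.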
First, install JieBa and WordCloud:
pip install jieba
pip install wordcloud
# -*- coding:utf-8 -*-
import json
import re

import jieba
import matplotlib.pyplot as plt
import numpy as np
import PIL.Image as Image
from wordcloud import WordCloud, ImageColorGenerator

# Load the JSON file
with open("/Users/sam/Desktop/friends.json", encoding="UTF-8", mode="r") as f:
    friends = json.load(fp=f)

# Clean each signature: drop emoji markup words, hex emoji codes, and stray symbols
rep = re.compile(r"1f\d+\w*|[<>/=]")
siglist = []
for i in friends:
    signature = i["Signature"].strip().replace("span", "").replace("class", "").replace("emoji", "")
    signature = rep.sub("", signature)
    siglist.append(signature)
text = "".join(siglist)

# Use jieba to segment the combined text into words
wordlist = jieba.cut(text, cut_all=True)
word_space_split = " ".join(wordlist)

# An image serves as the template: its shape is the mask for the layout,
# and its colors are used to recolor the words.
coloring = np.array(Image.open("/Users/sam/Desktop/wechat.jpg"))
my_wordcloud = WordCloud(background_color="white", max_words=2000,
                         mask=coloring, max_font_size=60, random_state=42, scale=2,
                         font_path="/Library/Fonts/Songti.ttc").generate(word_space_split)
image_colors = ImageColorGenerator(coloring)
plt.imshow(my_wordcloud.recolor(color_func=image_colors))
plt.axis("off")
plt.show()
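The word cloud shows frequency visually; if you also want the actual counts, a quick tally over the space-separated words is enough. This is a sketch with made-up tokens standing in for the jieba output, and `top_words` is a hypothetical helper; single-character tokens are dropped on the assumption that they are mostly particles.

```python
from collections import Counter

# Stand-in for the space-separated string built from jieba's output; made-up tokens.
word_space_split = "努力 生活 努力 快樂 生活 努力 的"


def top_words(text, n=3, min_length=2):
    """Return the n most common tokens that have at least min_length characters."""
    tokens = [t for t in text.split() if len(t) >= min_length]
    return Counter(tokens).most_common(n)


print(top_words(word_space_split))
```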
That's it! Mission accomplished! Mwah!