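Given the ID of any NetEase Cloud Music playlist, this project fetches the information and lyrics of every song in the playlist and generates a word cloud image from the word frequencies in the lyrics. It also saves the song information and lyrics to a local database; see the code for details.
GitHub repository: lyricWordCloud.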
Word cloud image
QQ20180404-182638.png
1. Fetch the playlist's song list by playlist ID
def get163SongList(song_url, headers):
    # Request the playlist endpoint; the tracks are under result -> tracks in the JSON body
    res = requests.request('GET', song_url, headers=headers)
    song_list = res.json()['result']['tracks']
    return song_list
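The function above returns the raw track list. As a minimal sketch of what might be done with it, the helper below pulls a few fields out of a response shaped like `{'result': {'tracks': [...]}}`. The helper name `extract_tracks` and the per-track keys `id` and `name` are assumptions for illustration; they are not confirmed by the original code.

```python
def extract_tracks(payload):
    """Return (id, name) pairs from a playlist response dict.

    Assumes each track is a dict with 'id' and 'name' keys (hypothetical).
    Missing keys yield None instead of raising.
    """
    tracks = payload.get('result', {}).get('tracks', [])
    return [(t.get('id'), t.get('name')) for t in tracks]


# Hand-written sample data standing in for a real response
sample = {'result': {'tracks': [{'id': 1, 'name': 'Song A'},
                                {'id': 2, 'name': 'Song B'}]}}
print(extract_tracks(sample))
```

Using `.get` with defaults means an unexpected or empty response produces an empty list rather than a `KeyError`.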
2. Fetch the lyrics for each song
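def getSongLyric(headers, lyric_url):
    res = requests.request('GET', lyric_url, headers=headers)
    data = res.json()  # parse the response body once
    # print(data)
    if 'lrc' in data:
        lyric = data['lrc']['lyric']
        # Remove timestamp characters: every digit, ':', '.', '[' and ']'
        lyric_without_time = re.sub(r'[\d:.\[\]]', '', lyric)
        return lyric_without_time
    else:
        # No 'lrc' key in the response: treat the song as having no lyrics
        return ''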
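The character-class regex above also deletes digits, colons and periods that belong to the lyric text itself. A possible alternative, sketched here under the assumption that timestamps look like `[mm:ss.xx]`, removes only the bracketed time tags. The names `TIME_TAG` and `strip_time_tags` are hypothetical.

```python
import re

# Matches tags such as [00:12.34], [1:02] or [00:12:34] (assumed formats)
TIME_TAG = re.compile(r'\[\d{1,2}:\d{2}(?:[.:]\d{1,3})?\]')


def strip_time_tags(lyric):
    """Remove bracketed time tags and leave the rest of the text untouched."""
    return TIME_TAG.sub('', lyric)


sample = "[00:01.50]Line 1, take 2\n[00:05.00]Line 2"
print(strip_time_tags(sample))
```

With this pattern, digits inside a lyric line survive, whereas the character-class version would strip them.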
3. Generate the word cloud from word frequencies
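print('Generating the word cloud from word frequencies!')
# f is assumed to hold the combined lyric text collected in the earlier steps
# Drop the 作詞 (lyricist) and 作曲 (composer) credit labels
f1 = f.replace('作詞','')
f2 = f1.replace('作曲','')
# Segment the text with jieba and join the words with spaces for WordCloud
cut_text = " ".join(jieba.cut(f2, cut_all=False, HMM=True))
# print(cut_text)
# color_mask = plt.imread("dy.png")
# color_mask = np.array(Image.open(os.path.join(os.path.dirname(__file__), "aa.jpg")))
wc = WordCloud(
    font_path="aaa.ttf",  # a font with CJK glyphs is needed to render Chinese words
    # mask=color_mask,
    max_words=100,
    width=2000,
    height=1200,
    margin=2,
)
wordcloud = wc.generate(cut_text)
wordcloud.to_file(os.path.join(os.path.dirname(__file__), "h11.jpg"))
print('Opening the word cloud image')
plt.imshow(wordcloud)
plt.axis("off")
plt.show()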
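`wc.generate` counts word frequencies internally. To inspect or filter those counts first, one option is to count the tokens explicitly, as in the sketch below. The helper `top_words` and its `stopwords` parameter are illustrative assumptions, not part of the original project. The token list stands in for the output of `jieba.cut`.

```python
from collections import Counter


def top_words(tokens, stopwords=(), limit=100):
    """Count tokens, skipping blanks and stopwords, and keep the most common."""
    cleaned = (tok.strip() for tok in tokens)
    counts = Counter(t for t in cleaned if t and t not in stopwords)
    return dict(counts.most_common(limit))


# Hand-written tokens standing in for list(jieba.cut(text))
tokens = ['love', 'you', 'love', ' ', '作曲', 'you', 'love']
freqs = top_words(tokens, stopwords={'作曲'})
print(freqs)
```

A mapping like `freqs` can be passed to `WordCloud.generate_from_frequencies` in place of calling `generate` on the joined text. Filtering labels such as 作曲 through a stopword set also replaces the chained `replace` calls.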
Modules used
import requests
from bs4 import BeautifulSoup
import sqlite3
import sys
import re
import os
from wordcloud import WordCloud
import matplotlib.pyplot as plt
import jieba
from PIL import Image
import numpy as np
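The module list includes `sqlite3`, which is what lets the project keep song information and lyrics in a local database. The original storage code is not shown here, so the following is only a minimal sketch. The table name `songs`, its columns, and the helper `save_songs` are all hypothetical.

```python
import sqlite3


def save_songs(conn, songs):
    """Insert (id, name, lyric) rows, replacing rows that share an id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS songs ("
        "id INTEGER PRIMARY KEY, name TEXT, lyric TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO songs (id, name, lyric) VALUES (?, ?, ?)",
        songs,
    )
    conn.commit()


# An in-memory database keeps the example self-contained
conn = sqlite3.connect(":memory:")
save_songs(conn, [(1, "Song A", "first lyric"), (2, "Song B", "")])
song_count = conn.execute("SELECT COUNT(*) FROM songs").fetchone()[0]
print(song_count)
```

Using `INSERT OR REPLACE` keyed on the song ID means re-running the scraper on the same playlist updates existing rows instead of duplicating them.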
The result looks like this:
image