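Preface:
Hi everyone, this is 魔王!
The 2023 Mid-Autumn Festival and National Day are almost here, and the good news is that together they make an 8-day holiday!
This long break gives many people an excellent chance to relax,
and many can hardly wait to act on their long-suppressed urge to travel,
so plenty of friends have already started planning their itineraries.
Today we will analyze the travel guide data on Qunar (去哪兒)
to see how to eat, stay, and play well at a reasonable price.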
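Environment
Interpreter version >>> python 3.8
Code editor >>> pycharm 2021.2
Modules used
requests >>> mainly used to send HTTP requests / third-party module
parsel >>> mainly used to parse the response text so that content can be matched with re, xpath, or css / third-party module
csv >>> used to write the results to a CSV file / built-in module
Installing third-party modules:
Press Win + R, type cmd, then run the install command: pip install <module name>
(If installation is slow, you can switch to a mirror hosted in mainland China.)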
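Data source analysis
1. Define the requirements
This time we pick the travel months October to December and a trip cost in the 1000 ~ 2999 yuan range.
2. Inspect the network traffic
Press F12 to open the developer tools, click search, and enter the data you are looking for.
Find the data link: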
https://travel.qunar.com/travelbook/list.htm?page=1&order=hot_heat&&month=10_11_12&avgPrice=2
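The link above carries the filters as query parameters, so further result pages can presumably be reached by changing `page`. The following is a minimal sketch under that assumption. `build_list_url` is a hypothetical helper that is not part of the original code, and `avgPrice=2` is assumed to be the value the site uses for the 1000 ~ 2999 price band chosen above.

```python
from urllib.parse import urlencode

BASE_URL = 'https://travel.qunar.com/travelbook/list.htm'


def build_list_url(page, months='10_11_12', avg_price=2):
    """Build the list-page URL for one result page (hypothetical helper)."""
    params = {
        'page': page,            # result page number, starting at 1
        'order': 'hot_heat',     # sort order taken from the captured link
        'month': months,         # travel months joined with underscores
        'avgPrice': avg_price,   # price band code taken from the captured link
    }
    return f'{BASE_URL}?{urlencode(params)}'


# URLs for the first three result pages
urls = [build_list_url(page) for page in range(1, 4)]
```

Each URL in `urls` could then be passed to `requests.get()` in turn, ideally with a short pause between requests.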
Code implementation
Import modules
import requests
import parsel
import csv
Request the data
Simulate a browser with a User-Agent header (it can be copied as-is).
response.text: the response body as text
response.json(): the response body parsed as JSON
response.content: the response body as raw bytes
We use requests.get() to send a GET request to the given URL and receive the response.
url = f'https://travel.qunar.com/travelbook/list.htm?page=1&order=hot_heat&&month=10_11_12&&avgPrice=2'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36'
}
response = requests.get(url, headers=headers)
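Parse
First take the response text and turn it into a selector object.
selector = parsel.Selector(response.text)
CSS selectors extract content by tag attributes; use the Elements panel of the developer tools to locate the tags that hold the data.
lis = selector.css('.list_item')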
for li in lis:
    # Basic fields of one travel note; .get() returns None when a selector matches nothing
    title = li.css('.tit a::text').get()
    user_name = li.css('.user_name a::text').get()
    date = li.css('.date::text').get()
    days = li.css('.days::text').get()
    photo_nums = li.css('.photo_nums::text').get()
    fee = li.css('.fee::text').get()
    people = li.css('.people::text').get()
    trip = li.css('.trip::text').get()
    # Join all text under .places, then split it into the "途经" (via) and "行程" (itinerary) parts
    places = ''.join(li.css('.places ::text').getall()).split('行程')
    # '途经:' must match the label text exactly as it appears on the page
    place_1 = places[0].replace('途经:', '')
    place_2 = places[-1].replace(':', '')
    # The last path segment of the href is the note id used in the detail-page URL
    href = li.css('.tit a::attr(href)').get().split('/')[-1]
    link = f'https://travel.qunar.com/travelbook/note/{href}'
    dit = {
        '標題': title,
        '昵稱': user_name,
        '日期': date,
        '耗時': days,
        '照片': photo_nums,
        '費用': fee,
        '人員': people,
        '標簽': trip,
        '途徑': place_1,
        '行程': place_2,
        '詳情頁': link,
    }
    print(title, user_name, date, days, photo_nums, fee, people, trip, place_1, place_2, link, sep=' | ')
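One detail of the `places` handling in the loop is easy to miss: when the text contains no '行程' label, `split` returns a single element, so `places[-1]` is the same string as `places[0]` and the itinerary field repeats the via text. The sketch below isolates that step in a hypothetical helper, `split_places`, and returns an empty itinerary in that case. The sample string is made up for illustration, and the label '途经:' is an assumption about how the page spells it.

```python
def split_places(text):
    """Split the combined '.places' text into (via, itinerary) strings."""
    parts = text.split('行程')
    via = parts[0].replace('途经:', '').strip()
    # With no '行程' label there is only one part, so report an empty
    # itinerary instead of repeating the via text.
    itinerary = parts[-1].replace(':', '').strip() if len(parts) > 1 else ''
    return via, itinerary


sample = '途经:北京>天津行程:故宫>天安门'  # made-up sample text
via, itinerary = split_places(sample)
```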
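Save
Create the file and the writer before the loop, then write one row per entry inside the loop.
f = open('data.csv', mode='w', encoding='utf-8', newline='')
csv_writer = csv.DictWriter(f, fieldnames=[
    '標題',
    '昵稱',
    '日期',
    '耗時',
    '照片',
    '費用',
    '人員',
    '標簽',
    '途徑',
    '行程',
    '詳情頁',
])
csv_writer.writeheader()
# Inside the for loop, after dit has been built:
# csv_writer.writerow(dit)
# After the loop, close the file so that every row is flushed to disk:
# f.close()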
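The save step can also be wrapped in a small function so that the header and the rows are always written together. This is only a sketch: `write_rows` and `FIELDNAMES` are hypothetical names, the row is made-up sample data, and an in-memory buffer stands in for the real file so the example runs without touching the disk.

```python
import csv
import io

FIELDNAMES = ['標題', '昵稱', '日期', '耗時', '照片', '費用',
              '人員', '標簽', '途徑', '行程', '詳情頁']


def write_rows(file_obj, rows):
    """Write the header followed by one CSV line per dict in rows."""
    writer = csv.DictWriter(file_obj, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# One made-up row with every field empty except the title
row = dict.fromkeys(FIELDNAMES, '')
row['標題'] = 'sample title'

buffer = io.StringIO()
write_rows(buffer, [row])
```

With a real file, the same function could be called inside `with open('data.csv', mode='w', encoding='utf-8', newline='') as f:`, which closes the file automatically once the rows are written.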
Data visualization
Import modules and load the data
import pandas as pd
df = pd.read_csv('data.csv')
df.head()
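The charts that follow read columns named '年份' (year) and '月份' (month), which are not among the fields written to data.csv, so they presumably have to be derived from '日期' first. Here is a minimal sketch of that step, assuming the date text contains a `YYYY-MM-DD` date. That format, the sample value, and the helper name `split_year_month` are all assumptions rather than something the original code shows.

```python
import re

# Assumed date layout: four-digit year, month, day separated by hyphens
DATE_PATTERN = re.compile(r'(\d{4})-(\d{1,2})-\d{1,2}')


def split_year_month(date_text):
    """Return (year, month) as ints, or (None, None) if no date is found."""
    match = DATE_PATTERN.search(str(date_text))
    if match is None:
        return None, None
    return int(match.group(1)), int(match.group(2))


year, month = split_year_month('2023-10-01 出发')  # made-up sample value
```

With the DataFrame loaded, the two columns could then be filled with something like `df['年份'] = df['日期'].map(lambda d: split_year_month(d)[0])`, and the same with index `1` for `'月份'`.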
Year distribution
from pyecharts import options as opts
from pyecharts.charts import Pie
# Note: data.csv as saved above has no '年份' column; it is assumed to have been derived from '日期' beforehand
num = df['年份'].value_counts().to_list()
info = df['年份'].value_counts().index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],  # [label, count] pairs
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="年份分布情況"),  # "Year distribution"
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))  # label each slice as "name: count"
    # .render("pie_scroll_legend.html")
)
c.render_notebook()
Month distribution
from pyecharts import options as opts
from pyecharts.charts import Pie
# Note: data.csv as saved above has no '月份' column; it is assumed to have been derived from '日期' beforehand
num = df['月份'].value_counts().to_list()
info = df['月份'].value_counts().index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],  # [label, count] pairs
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="月份分布情況"),  # "Month distribution"
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))  # label each slice as "name: count"
    # .render("pie_scroll_legend.html")
)
c.render_notebook()
Trip duration
from pyecharts import options as opts
from pyecharts.charts import Pie
num = df['耗時'].value_counts().to_list()
info = df['耗時'].value_counts().index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],  # [label, count] pairs
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="出行時間情況"),  # "Trip duration"
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))  # label each slice as "name: count"
    # .render("pie_scroll_legend.html")
)
c.render_notebook()
Cost distribution
from pyecharts import options as opts
from pyecharts.charts import Pie
num = df['費用'].value_counts().to_list()
info = df['費用'].value_counts().index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],  # [label, count] pairs
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="費用分布情況"),  # "Cost distribution"
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))  # label each slice as "name: count"
    # .render("pie_scroll_legend.html")
)
c.render_notebook()
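Counting the raw '費用' values gives one pie slice per distinct amount, which may be hard to read when most entries differ by a few yuan. One option is to group the amounts into bands before counting. The sketch below assumes the fee text contains the amount as its first run of digits. The sample value, the 500-yuan band width, and the helper names `parse_fee` and `fee_band` are assumptions, not part of the original code.

```python
import re


def parse_fee(fee_text):
    """Pull the first integer out of a fee string; None if there is none."""
    match = re.search(r'\d+', str(fee_text))
    return int(match.group()) if match else None


def fee_band(fee, width=500):
    """Label the band of the given width that a fee falls into."""
    if fee is None:
        return 'unknown'
    low = fee // width * width
    return f'{low}-{low + width - 1}'


band = fee_band(parse_fee('人均1500元'))  # made-up sample value
```

Applied to the DataFrame, something like `df['費用'].map(lambda t: fee_band(parse_fee(t)))` would yield band labels that can be counted with `value_counts()` in the same way as the raw column.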
Travel companion distribution
from pyecharts import options as opts
from pyecharts.charts import Pie
num = df['人員'].value_counts().to_list()
info = df['人員'].value_counts().index.to_list()
c = (
    Pie()
    .add(
        "",
        [list(z) for z in zip(info, num)],  # [label, count] pairs
        center=["40%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="人員分布情況"),  # "Travel companion distribution"
        legend_opts=opts.LegendOpts(type_="scroll", pos_left="80%", orient="vertical"),
    )
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))  # label each slice as "name: count"
    # .render("pie_scroll_legend.html")
)
c.render_notebook()
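Closing words
Finally, thank you for reading my article. This flight ends here.
I hope this article was helpful and that you learned something from it.
Even the stars that hide away are working hard to shine, so keep going too (let's work hard together).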