Environment
Site scraped: Tianqi weather site (天气网), http://lishi.tianqi.com/
Python version: 2.7
Operating system: Windows 7
Required packages (install them if they are missing)
Package | Description | Official URL
bs4 | A Python library for extracting data from HTML or XML files | https://pypi.python.org/pypi/beautifulsoup4
requests | An HTTP library built on urllib and released under the Apache2 License | https://pypi.python.org/pypi/requests/
xlwt | Handles Excel files; this article stores the fetched weather data in an xls file | https://pypi.python.org/pypi/xlwt
# -*- coding=utf-8 -*-
from bs4 import BeautifulSoup
import requests
import xlwt
import os


# Fetch the weather data for a single month
def getListByUrl(url):
    res = requests.get(url)
    soup = BeautifulSoup(res.text, "html.parser")
    weathers = soup.select("#tool_site")
    title = weathers[1].select("h3")[0].text
    weatherInfors = weathers[1].select("ul")
    weatherList = list()
    for weatherInfor in weatherInfors:
        # Each <ul> is one row; each <li> inside it is one column
        singleWeather = list()
        for li in weatherInfor.select('li'):
            singleWeather.append(li.text)
        weatherList.append(singleWeather)
    print(title)
    return weatherList, title


# @par:addressUrl     URL of the region whose data is fetched
# @par:excelSavePath  path where the data is saved
def getListByAddress(addressUrl, excelSavePath):
    # url = "http://lishi.tianqi.com/beijing/index.html"
    url = addressUrl
    res = requests.get(url)
    soup = BeautifulSoup(res.text, "html.parser")
    dates = soup.select(".tqtongji1 ul li a")
    workbook = xlwt.Workbook(encoding='utf-8')
    for d in dates:
        # One worksheet per month, named after the month's page title
        weatherList, title = getListByUrl(d["href"])
        booksheet = workbook.add_sheet(title, cell_overwrite_ok=True)
        for i, row in enumerate(weatherList):
            for j, col in enumerate(row):
                booksheet.write(i, j, col)
    workbook.save(excelSavePath)


if __name__ == "__main__":
    addressName = raw_input("Enter the city whose weather you want to fetch:\n")
    addresses = BeautifulSoup(requests.get('http://lishi.tianqi.com/').text, "html.parser")
    queryAddress = addresses.find_all('a', text=addressName)
    if len(queryAddress):
        savePath = raw_input("Data found for this city. Enter the path where the weather data should be saved (leave blank to use the default c:/weather/" + addressName + ".xls):\n")
        if not savePath.strip():
            if not os.path.exists('c:/weather'):
                os.makedirs('c:/weather')
            savePath = "c:/weather/" + addressName + ".xls"
        for q in queryAddress:
            getListByAddress(q["href"], savePath)
        print("Weather data saved to: " + savePath)
    else:
        print("No data exists for this city")
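The nested loops in getListByAddress map each parsed row to a spreadsheet row and each li text to a column. A minimal sketch of that mapping, using made-up sample rows in place of data scraped from the site (iter_cells and the sample values are hypothetical and not part of the original script):

```python
def iter_cells(weather_list):
    """Yield (row_index, column_index, value) in the order the script writes cells."""
    for i, row in enumerate(weather_list):
        for j, col in enumerate(row):
            yield i, j, col


# Hypothetical sample standing in for one month's parsed <ul>/<li> rows.
sample_rows = [
    ["date", "high", "low"],
    ["2016-01-01", "3", "-5"],
]

for i, j, value in iter_cells(sample_rows):
    # In the original script this is where booksheet.write(i, j, value) is called.
    print((i, j, value))
```

Keeping the index generation separate from the xlwt call makes it possible to check the layout without touching the network or writing a file.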
This code was written on Windows. To run it on Linux, add a shebang line (#!) that points to your Python environment at the top of the file.
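On Linux the shebang goes on the very first line, above the encoding declaration. The sketch below assumes that python on the PATH resolves to a Python 2.7 interpreter. It also adds a hypothetical helper, default_weather_dir, because the hard-coded c:/weather is not a drive path on Linux: there, os.makedirs('c:/weather') would create a directory literally named "c:" under the current working directory. The home-directory fallback is an assumption, not something the original script does.

```python
#!/usr/bin/env python
# -*- coding=utf-8 -*-
import os
import sys


def default_weather_dir():
    """Return a default output directory suited to the current platform."""
    if sys.platform.startswith("win"):
        # Same default as the original script.
        return "c:/weather"
    # Assumption: on other systems, fall back to ~/weather.
    return os.path.join(os.path.expanduser("~"), "weather")


print(default_weather_dir())
```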
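What the code does:
Input:
City name; if the site has no data for it, the program prints "No data exists for this city"
Save path; if left blank, the data is saved to "c:/weather/<city name>.xls" by default
Output:
An xls file containing the city's weather data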
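The save-path rule described above can be isolated into a small pure function. This is a sketch under stated assumptions: resolve_save_path and its default_dir parameter are hypothetical names, and unlike the original script it strips surrounding whitespace from a path the user did type.

```python
def resolve_save_path(user_input, city_name, default_dir="c:/weather"):
    """Return the path the workbook should be saved to."""
    typed = user_input.strip()
    if typed:
        # The user supplied a path: use it (whitespace-trimmed).
        return typed
    # Blank input: fall back to <default_dir>/<city name>.xls
    return default_dir + "/" + city_name + ".xls"


print(resolve_save_path("", "beijing"))
print(resolve_save_path("d:/weather/bj.xls", "beijing"))
```

Separating this decision from the raw_input call keeps the prompt code short and lets the rule be checked without running the scraper.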
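Demo
This program was developed in the PyCharm IDE and is run directly from PyCharm here. You can run it from another editor instead (note: because of encoding issues, running it directly from the DOS prompt raises an error).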
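The article does not say which error the DOS prompt raises. One common cause in Python 2 on Windows is printing text that contains characters the console's code page cannot represent. A hedged sketch of one possible workaround, assuming the text is a unicode string (to_console_text is a hypothetical helper and is not part of the original script):

```python
import sys


def to_console_text(text, encoding=None):
    """Return text with characters the console cannot encode replaced by '?'."""
    # Fall back to UTF-8 when stdout declares no encoding (for example when piped).
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    # Round-trip through the target encoding; unencodable characters become "?".
    return text.encode(encoding, "replace").decode(encoding)


print(to_console_text(u"weather data saved"))
```

This only avoids encoding failures when printing; it does not address other encoding differences, such as how the typed city name is decoded.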
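Import getWeatherByQuery.py into PyCharm and run it, as shown in the figure below:
Following the prompt, enter a city. The program detects that the city exists and asks for a save path.
Enter the path to save to here (or simply press Enter). I entered "d:\weather\bj.xls" (make sure this directory exists, because the program does not create it automatically).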
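The script only creates the default c:/weather directory; a custom path fails at save time if its directory is missing. A minimal sketch of one way to remove that restriction, assuming it is acceptable to create the directory on the user's behalf (ensure_parent_dir is a hypothetical helper, not part of the original script):

```python
import os


def ensure_parent_dir(file_path):
    """Create the directory that will hold file_path if it does not exist yet."""
    parent = os.path.dirname(os.path.abspath(file_path))
    if not os.path.isdir(parent):
        os.makedirs(parent)
    return parent
```

Calling ensure_parent_dir(savePath) before getListByAddress would let a path such as "d:\weather\bj.xls" work even when d:\weather does not exist yet.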
The program then exports the data for each month.
When it finishes, the result looks like the figure below:
The data in Excel looks like the figure below: