MacBook-Air:~ huangyong$ python3
Python 3.6.1 (default, Apr  4 2017, 09:40:21)
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request as ur
>>> s=ur.urlopen('https://www.zhihu.com')
>>> sl=s.read()
# print(sl) omitted
>>> from bs4 import BeautifulSoup
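>>> bsObj = BeautifulSoup(sl, 'html.parser')
# Reuse sl here: the first s.read() already consumed the response, so a second s.read() returns b''. Calling BeautifulSoup(sl) without naming a parser triggers a warning.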
>>> print(bsObj.h1)
<h1 class="logo hide-text">知乎</h1>
bs4 is used to break HTML code into its component parts.
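To make "breaking HTML into parts" concrete, here is a minimal sketch that pulls out an <h1> using only the standard library's html.parser module (the module that bs4's 'html.parser' option builds on), not bs4 itself. The H1Collector class and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class H1Collector(HTMLParser):
    """Hypothetical helper: records the attributes and text of every <h1>."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.headings = []  # list of (attrs_dict, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag == 'h1':
            self.in_h1 = True
            self.headings.append((dict(attrs), ''))

    def handle_data(self, data):
        # Only keep text that appears inside an open <h1>.
        if self.in_h1:
            attrs, text = self.headings[-1]
            self.headings[-1] = (attrs, text + data)

    def handle_endtag(self, tag):
        if tag == 'h1':
            self.in_h1 = False


# Made-up sample markup, modelled on the <h1> printed above.
html = '<html><body><h1 class="logo hide-text">知乎</h1><p>hello</p></body></html>'
collector = H1Collector()
collector.feed(html)
collector.close()
print(collector.headings)
```

bs4 does this bookkeeping for you and exposes the result as attributes such as bsObj.h1, which is why the transcript needs only one line.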
>>> f=open('test.txt','w+')
If test.txt does not exist, it is created automatically. Reading and writing files in Python is very simple.
>>> f.write(sl.decode('utf-8'))
This saves the entire page. In text mode, f.write() only accepts strings, so the bytes must be decoded first or they cannot be written.
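The bytes-versus-string point can be checked with a small sketch. It assumes a temporary directory instead of the current one, and page_bytes is made-up sample data standing in for the result of s.read(). The with-block is a choice made here so the file is closed automatically; the transcript above leaves f open.

```python
import os
import tempfile

# Stand-in for the bytes returned by s.read(); made-up sample data.
page_bytes = '<h1 class="logo hide-text">知乎</h1>'.encode('utf-8')

# Write into a fresh temporary directory so nothing existing is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')

write_error = None
# 'w+' creates the file if it is missing; the with-block closes it on exit.
with open(path, 'w+', encoding='utf-8') as f:
    try:
        f.write(page_bytes)              # bytes into a text-mode file
    except TypeError as exc:
        write_error = exc                # text-mode write() rejects bytes
    f.write(page_bytes.decode('utf-8'))  # decode first, then write
    f.seek(0)                            # '+' also allows reading back
    saved = f.read()

print(type(write_error).__name__)
print(saved)
```

Opening the file in binary mode ('wb') and writing sl directly would be the other way around the same problem.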
Make sure you use the right version of pip or easy_install for your Python version (these may be named pip3 and easy_install3 respectively if you’re using Python 3).
The difference between pip and pip3 is that one installs packages for python2.* and the other for python3.*.
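One way to sidestep the pip/pip3 ambiguity is to run pip as a module of the interpreter you actually intend to use. The sketch below only builds and prints that command; it assumes the package wanted is beautifulsoup4 (the PyPI package that provides bs4), and the fallback name 'python3' is an assumption for the rare case where sys.executable is empty.

```python
import sys

# Path of the interpreter running this script; fall back to a plain name
# if it is unavailable (assumption: 'python3' is on the PATH).
interpreter = sys.executable or 'python3'
print(sys.version_info.major, interpreter)

# Running pip with -m ties the install to that same interpreter,
# whichever of pip or pip3 happens to be first on the PATH.
cmd = [interpreter, '-m', 'pip', 'install', 'beautifulsoup4']
print(' '.join(cmd))

# To actually run it:
#   import subprocess
#   subprocess.run(cmd, check=True)
```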