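- I think 裘宗燕 (Qiu Zongyan)'s book on data structures and algorithms described in Python (数据结构与算法:Python语言描述) is quite well written, so I went looking online for its courseware. The files cannot easily be downloaded as one package, so I put together a crude script to cut down on the repetitive work.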
Course website (Python语言描述):
http://www.math.pku.edu.cn/teachers/qiuzy/ds_python/courseware/index.htm
- Run the script with python3; it creates a folder in the current directory and stores the downloaded files there.
In addition, these well-reviewed books cover data structures and algorithms in Python (both are in English):
- Data Structures and Algorithms in Python (PDF download link)
- Problem Solving with Algorithms and Data Structures using Python (online reading link)
# -*- coding: utf-8 -*-
"""
Created on Sat Jul 9 15:28:39 2016
@author: 樹中湖
"""
import os
import re
import urllib.request

from bs4 import BeautifulSoup

# Fetch the courseware index page and parse it.
url = 'http://www.math.pku.edu.cn/teachers/qiuzy/ds_python/courseware/index.htm'
html = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser', from_encoding='utf-8')
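
# Alternative: nodes = soup.find_all('td', {'align': 'left'})
# Keep the first left-aligned <td> of each table row; rows without one are skipped.
nodes_tr = soup.find_all('tr')
nodes = []
for node in nodes_tr:
    cells = node.find_all('td', align='left')
    if cells:
        nodes.append(cells[0])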
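
urls = {}        # row title -> list of <a> tags pointing at .py / .pdf files
code_subs = {}   # row title -> text that precedes the '代码文件' ("code file") label
url_head = 'http://www.math.pku.edu.cn/teachers/qiuzy/ds_python/courseware/'
for node in nodes:
    # The cell text is split on the full-width comma; the first piece names the row.
    parts = node.get_text().split(',')
    key = parts[0]
    # Match hrefs ending in .py or .pdf. The earlier pattern used a character
    # class, which matched a dot followed by any one of the characters "(py)df".
    value = node.find_all('a', href=re.compile(r'\.(py|pdf)$'))
    urls[key] = value
    for part in parts:
        pos = part.find('代码文件')
        if pos != -1:
            code_subs[key] = part[0:pos]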
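
from multiprocessing.dummy import Pool  # thread-backed Pool; freeze_support was unused

# Create the output folder in the current directory (no error if it already exists).
os.makedirs('裘宗燕python', exist_ok=True)
os.chdir('裘宗燕python')
func = urllib.request.urlretrieve
with Pool(4) as pool:
    for key in urls:
        # One sub-folder per table row, named after the row's leading text.
        os.makedirs(key, exist_ok=True)
        os.chdir(key)
        to_crawl = []
        for link in urls[key]:
            url_temp = link.attrs['href']
            text_temp = link.get_text()
            # '代码文件' ("code file") is a generic label; prefer the row's own description.
            if text_temp == '代码文件':
                text_temp = code_subs.get(key, text_temp)
            to_crawl.append((url_head + url_temp, url_temp))
            # Save the raw <a> tag in a small text file named after the link text.
            with open('%s.txt' % text_temp, 'w', encoding='utf-8') as f:
                f.write(str(link))
            print('%s' % text_temp, 'is done!')
        # Download this folder's files on four worker threads.
        pool.starmap(func, to_crawl)
        os.chdir('..')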
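
The script above moves in and out of folders with os.chdir while a thread pool downloads files. As a rough alternative sketch, the same idea can be written with explicit paths so the working directory never changes. The helper names `plan_downloads` and `download_all`, the `fetch` parameter, and the case-insensitive extension match are assumptions made for illustration; they are not part of the original script.

```python
import os
import re
from multiprocessing.dummy import Pool  # thread pool with the multiprocessing API
from urllib.parse import urljoin

# Hypothetical helpers; only the .py/.pdf extensions come from the script above.
WANTED = re.compile(r'\.(py|pdf)$', re.IGNORECASE)


def plan_downloads(base_url, hrefs, out_dir):
    """Return (url, local_path) pairs for hrefs that end in .py or .pdf."""
    jobs = []
    for href in hrefs:
        if WANTED.search(href):
            # Save under out_dir using only the file name part of the href.
            local_path = os.path.join(out_dir, os.path.basename(href))
            jobs.append((urljoin(base_url, href), local_path))
    return jobs


def download_all(jobs, fetch, workers=4):
    """Create the target folders, then call fetch(url, path) on a thread pool."""
    for _, local_path in jobs:
        folder = os.path.dirname(local_path)
        if folder:
            os.makedirs(folder, exist_ok=True)
    with Pool(workers) as pool:
        return pool.starmap(fetch, jobs)
```

With real downloads, `fetch` could be `urllib.request.urlretrieve`, for example `download_all(plan_downloads(url_head, hrefs, '裘宗燕python'), urllib.request.urlretrieve)`, where `hrefs` is a list of href strings taken from the parsed page. Passing the fetch function in as an argument also makes it possible to try the logic offline with a stand-in function.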