numpy.dot computes the dot product of two vectors.
Dot product: multiply the corresponding entries of the two vectors, then add up the products.
Example:
a = [1 ,2, 3, 4]
b = [2 ,3, 4, 5]
The dot product of a and b is then 1*2 + 2*3 + 3*4 + 4*5 = 40.
We can compute it with numpy.dot:
import numpy
a = [1, 2, 3, 4]
b = [2, 3, 4, 5]
numpy.dot(a, b)  # 40
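As a quick check of the definition, the sketch below computes the same dot product by hand and with numpy.dot. The variable names manual and with_numpy are only illustrative.

```python
import numpy

a = [1, 2, 3, 4]
b = [2, 3, 4, 5]

# Multiply matching entries, then add the products together.
manual = sum(x * y for x, y in zip(a, b))

# numpy.dot accepts plain Python lists and gives the same value.
with_numpy = numpy.dot(a, b)

print(manual, with_numpy)
```

Both values should agree, since numpy.dot on two 1-D inputs is exactly this sum of products.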
Multiplying an array by a matrix:
[figure not available]
Multiplying a matrix by a matrix:
[figure not available]
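The two cases can also be tried directly in code. This is a minimal sketch with made-up values (v, m, and n are not from the original figures): a 1-D array times a 2-D matrix, and a 2-D matrix times a 2-D matrix.

```python
import numpy

v = numpy.array([1, 2])              # shape (2,)
m = numpy.array([[1, 2, 3],
                 [4, 5, 6]])         # shape (2, 3)

# Array times matrix: each output entry is the dot product of v
# with one column of m, so the result has shape (3,).
vm = numpy.dot(v, m)

n = numpy.array([[1, 0],
                 [0, 1],
                 [2, 2]])            # shape (3, 2)

# Matrix times matrix: (2, 3) dot (3, 2) gives (2, 2); entry [i][j]
# is the dot product of row i of m with column j of n.
mn = numpy.dot(m, n)

print(vm)
print(mn)
```

In both cases the inner dimensions must match: the length of v (or the column count of m) has to equal the row count of the matrix on the right.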
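Example: compute the score of every medal-winning country, where a gold medal is worth 4 points, a silver medal 2 points, and a bronze medal 1 point. The output should be a DataFrame containing each country's name and score: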
import numpy
from pandas import DataFrame, Series

def numpy_dot():
    countries = ['Russian Fed.', 'Norway', 'Canada', 'United States',
                 'Netherlands', 'Germany', 'Switzerland', 'Belarus',
                 'Austria', 'France', 'Poland', 'China', 'Korea',
                 'Sweden', 'Czech Republic', 'Slovenia', 'Japan',
                 'Finland', 'Great Britain', 'Ukraine', 'Slovakia',
                 'Italy', 'Latvia', 'Australia', 'Croatia', 'Kazakhstan']
    gold = [13, 11, 10, 9, 8, 8, 6, 5, 4, 4, 4, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    silver = [11, 5, 10, 7, 7, 6, 3, 0, 8, 4, 1, 4, 3, 7, 4, 2, 4, 3, 1, 0, 0, 2, 2, 2, 1, 0]
    bronze = [9, 10, 5, 12, 9, 5, 2, 1, 5, 7, 1, 2, 2, 6, 2, 4, 3, 1, 2, 1, 0, 6, 2, 1, 0, 1]
    # YOUR CODE HERE
    return olympic_points_df
Fill in the correct code at '# YOUR CODE HERE':
- Organize the data into a DataFrame:
data = {'country_name': countries, 'gold': gold, 'silver': silver, 'bronze': bronze}
base_data_df = DataFrame(data)
- Select the gold, silver, and bronze columns, in that order, and take their dot product with [4, 2, 1]. This gives each country's score, which is stored in a new points column:
base_data_df['points'] = base_data_df[['gold', 'silver', 'bronze']].dot([4, 2, 1])
Printing base_data_df at this point gives:
    bronze    country_name  gold  silver  points
0        9    Russian Fed.    13      11      83
1       10          Norway    11       5      64
2        5          Canada    10      10      65
3       12   United States     9       7      62
4        9     Netherlands     8       7      55
5        5         Germany     8       6      49
6        2     Switzerland     6       3      32
7        1         Belarus     5       0      21
8        5         Austria     4       8      37
9        7          France     4       4      31
10       1          Poland     4       1      19
11       2           China     3       4      22
12       2           Korea     3       3      20
13       6          Sweden     2       7      28
14       2  Czech Republic     2       4      18
15       4        Slovenia     2       2      16
16       3           Japan     1       4      15
17       1         Finland     1       3      11
18       2   Great Britain     1       1       8
19       1         Ukraine     1       0       5
20       0        Slovakia     1       0       4
21       6           Italy     0       2      10
22       2          Latvia     0       2       6
23       1       Australia     0       2       5
24       0         Croatia     0       1       2
25       1      Kazakhstan     0       0       1
- With the data above in place, all that remains is to select the country_name and points columns:
olympic_points_df = base_data_df[['country_name','points']]
The result:
      country_name  points
0     Russian Fed.      83
1           Norway      64
2           Canada      65
3    United States      62
4      Netherlands      55
5          Germany      49
6      Switzerland      32
7          Belarus      21
8          Austria      37
9           France      31
10          Poland      19
11           China      22
12           Korea      20
13          Sweden      28
14  Czech Republic      18
15        Slovenia      16
16           Japan      15
17         Finland      11
18   Great Britain       8
19         Ukraine       5
20        Slovakia       4
21           Italy      10
22          Latvia       6
23       Australia       5
24         Croatia       2
25      Kazakhstan       1
OK, this is exactly what we want.
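Put together, the steps above might look like the sketch below. To keep it short it uses only the first three countries from the lists above; the intermediate name medals and the variable result are illustrative additions, not part of the original exercise.

```python
from pandas import DataFrame

def numpy_dot():
    # Shortened data: only the first three countries from the full lists.
    countries = ['Russian Fed.', 'Norway', 'Canada']
    gold = [13, 11, 10]
    silver = [11, 5, 10]
    bronze = [9, 10, 5]

    data = {'country_name': countries, 'gold': gold,
            'silver': silver, 'bronze': bronze}
    base_data_df = DataFrame(data)

    # Select the medal columns in a fixed order so they line up with [4, 2, 1].
    medals = base_data_df[['gold', 'silver', 'bronze']]
    base_data_df['points'] = medals.dot([4, 2, 1])

    # Keep only the two columns the exercise asks for.
    olympic_points_df = base_data_df[['country_name', 'points']]
    return olympic_points_df

result = numpy_dot()
print(result)
```

With the full lists substituted back in, the function should return the 26-row table shown above.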
When collecting the gold, silver, and bronze data, some people build a DataFrame straight from the raw lists,
data = {'gold': gold,
        'silver': silver,
        'bronze': bronze}
base_data_df = DataFrame(data)
and then call base_data_df.dot([4, 2, 1]) directly. The result is wrong.
Why?
Let's print the base_data_df built this way:
    bronze  gold  silver
0        9    13      11
1       10    11       5
2        5    10      10
3       12     9       7
4        9     8       7
5        5     8       6
6        2     6       3
7        1     5       0
8        5     4       8
9        7     4       4
10       1     4       1
11       2     3       4
12       2     3       3
13       6     2       7
14       2     2       4
15       4     2       2
16       3     1       4
17       1     1       3
18       2     1       1
19       1     1       0
20       0     1       0
21       6     0       2
22       2     0       2
23       1     0       2
24       0     0       1
25       1     0       0
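See the difference? The columns come out in the order bronze, gold, silver rather than gold, silver, bronze, because the column order of a DataFrame built from a dict is not guaranteed. The weights [4, 2, 1] are therefore applied to the wrong columns (4 to bronze, 2 to gold, 1 to silver), and the result is wrong.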
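The pitfall and two possible ways around it can be sketched as follows. Recent pandas versions generally keep a dict's insertion order, so this example forces the bronze, gold, silver order with the columns argument to reproduce the table above. Only the first three rows are used, and the names wrong, by_position, weights, and by_label are illustrative.

```python
from pandas import DataFrame, Series

# Columns deliberately ordered bronze, gold, silver, as in the table above.
df = DataFrame({'bronze': [9, 10, 5], 'gold': [13, 11, 10], 'silver': [11, 5, 10]},
               columns=['bronze', 'gold', 'silver'])

# Pitfall: a plain list is matched to columns by position,
# so 4 lands on bronze, 2 on gold, and 1 on silver.
wrong = df.dot([4, 2, 1])

# Option 1: select the columns in the intended order first.
by_position = df[['gold', 'silver', 'bronze']].dot([4, 2, 1])

# Option 2: pass the weights as a Series; DataFrame.dot then aligns
# the Series index with the column names, whatever their order.
weights = Series({'gold': 4, 'silver': 2, 'bronze': 1})
by_label = df.dot(weights)

print(wrong.tolist(), by_position.tolist(), by_label.tolist())
```

Option 1 is what the solution above does. Option 2 does not depend on column order at all, at the cost of requiring the weight labels to match the column names exactly.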