1. List
Use sum(list) to add up the elements of a list; depending on the data, you may first need to convert the elements to numbers.
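import csv
# csv.reader returns an iterator, which cannot be sliced, so convert it to a list first
with open("world_alcohol.csv", "r") as f:
    world_alcohol = list(csv.reader(f))
years = []
for row in world_alcohol[1:]:  # skip the header row
    years.append(row[0])
# the values are read as strings, so convert them to float before summing
total = sum(float(i) for i in years)
avg_year = total / len(years)
print(avg_year)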
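To illustrate why the conversion is needed, here is a minimal sketch that uses a hypothetical in-memory list of year strings instead of world_alcohol.csv. The sample values are assumptions made for illustration only.

```python
# Hypothetical sample values standing in for the first CSV column.
years = ["1986", "1987", "1989"]

# sum() starts from the integer 0, so adding a str to it raises TypeError.
try:
    sum(years)
    failed = False
except TypeError:
    failed = True

# Converting each element to float first makes the sum work.
total = sum(float(y) for y in years)
avg_year = total / len(years)
print(failed, total)
```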
2. Numpy
Use NumPy to read a CSV file into a numpy.ndarray
import numpy as np
world_alcohol = np.genfromtxt("world_alcohol.csv", delimiter = ",")
print (type(world_alcohol))
# <class 'numpy.ndarray'>
Use NumPy to convert a list into a vector (one-dimensional array), or a list of lists into a matrix (two-dimensional array)
import numpy as np
vector = np.array([10, 20, 30])
matrix = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])
print (vector)
print (matrix)
Check the shape of an array
import numpy as np
vector = np.array([10, 20, 30])
matrix = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])
vector_shape = vector.shape
matrix_shape = matrix.shape
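As a quick check of what shape returns, the sketch below prints the shapes of the same vector and matrix, plus ndim as an extra attribute not used in the notes above.

```python
import numpy as np

vector = np.array([10, 20, 30])
matrix = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# shape is a tuple with one entry per dimension
print(vector.shape)              # (3,)
print(matrix.shape)              # (3, 3)
# ndim gives the number of dimensions
print(vector.ndim, matrix.ndim)  # 1 2
```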
Each value in a NumPy array has to have the same data type.
import numpy as np
numbers = np.array([1, 2, 3, 4])
# an integer dtype; the exact width (e.g. int64) depends on the platform
print(numbers.dtype)
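A small sketch of what the single-dtype rule means in practice: when the input list mixes types, NumPy converts every element to one common type. The sample values are made up for illustration.

```python
import numpy as np

# Integers mixed with a float are all stored as floats.
floats = np.array([1, 2, 3.5])
# Numbers mixed with a string are all stored as strings.
mixed = np.array([1, "two", 3.0])

print(floats.dtype)      # float64
print(mixed.dtype.kind)  # U (Unicode string)
print(mixed[0])          # the integer 1 is now the string "1"
```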
About NaN and NA
When NumPy can't convert a value to a numeric data type like float or integer, it uses a special nan value that stands for Not a Number. NumPy assigns an na value, which stands for Not Available, when the value doesn't exist. nan and na values are types of missing data.
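The following sketch shows how this looks with np.genfromtxt and its default float dtype. It reads a hypothetical three-line CSV from an in-memory io.StringIO buffer rather than world_alcohol.csv, so the column names and values are assumptions.

```python
import io
import numpy as np

# Hypothetical CSV content: a header row, text in column 2, one empty field.
sample = io.StringIO("Year,Region,Value\n1986,Asia,0.5\n1987,Europe,\n")
data = np.genfromtxt(sample, delimiter=",")
print(data)

# The header row cannot be parsed as floats, so every entry is nan.
header_is_nan = np.isnan(data[0]).all()
# Text ("Asia") and the empty field both come back as nan as well.
text_is_nan = np.isnan(data[1, 1])
empty_is_nan = np.isnan(data[2, 2])
print(header_is_nan, text_is_nan, empty_is_nan)
```

Because nan does not compare equal to itself, np.isnan is used here instead of ==.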
Read everything as strings
To avoid NaN and NA values, read all of the content as strings by setting dtype, and skip the header row by setting skip_header.
import numpy as np
world_alcohol = np.genfromtxt("world_alcohol.csv", dtype = "U75", skip_header = 1, delimiter = ",")
print (world_alcohol)
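The same idea can be tried without the file. This sketch assumes a hypothetical two-row CSV held in an io.StringIO buffer, reads it as strings, and then converts one column to float once it is known to be numeric.

```python
import io
import numpy as np

# Hypothetical CSV content standing in for world_alcohol.csv.
sample = io.StringIO("Year,Region,Value\n1986,Asia,0.5\n1987,Europe,1.2\n")
data = np.genfromtxt(sample, dtype="U75", skip_header=1, delimiter=",")

print(data.shape)  # (2, 3): the header row was skipped
print(data[0])     # every field is a string

# A column that holds only numbers can be converted afterwards.
years = data[:, 0].astype(float)
print(years.mean())  # 1986.5
```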
Numpy Slicing
import numpy as np
matrix = np.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 45]
])
print(matrix[:, 1])
# all rows, second column
# prints [10 25 40]
print(matrix[:, 0:2])
# all rows, first and second columns
# prints
# [[ 5 10]
#  [20 25]
#  [35 40]]
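A few more slicing forms on the same matrix, as a sketch. The first index selects rows and the second selects columns, and the end of a range is exclusive.

```python
import numpy as np

matrix = np.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 45]
])

single = matrix[1, 2]       # second row, third column
row = matrix[1, :]          # the whole second row
block = matrix[1:3, 0:2]    # rows 2-3, columns 1-2

print(single)  # 30
print(row)     # [20 25 30]
print(block)
```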
Use NumPy to build a Boolean array, then use it to select rows
import numpy as np
matrix = np.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 45]
])
# one True/False value per row: [False, True, False]
second_column_25 = (matrix[:, 1] == 25)
print(matrix[second_column_25, :])
# prints [[20 25 30]]
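Boolean arrays can also be combined. The sketch below, which goes slightly beyond the single comparison above, joins two conditions with the element-wise & operator; each comparison needs its own parentheses.

```python
import numpy as np

matrix = np.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 45]
])

# Rows whose second column is at least 25 AND whose third column is below 45.
mask = (matrix[:, 1] >= 25) & (matrix[:, 2] < 45)
selected = matrix[mask, :]
print(mask)
print(selected)  # [[20 25 30]]

# A Boolean array with the matrix's own shape selects individual elements.
large = matrix[matrix > 30]
print(large)     # [35 40 45]
```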
Replacing Values
s1986 = (world_alcohol[:, 0] == "1986")
world_alcohol[s1986, 0] = "2014"
sWine = (world_alcohol[:, 3] == "Wine")
world_alcohol[sWine, 3] = "Grog"
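The replacement pattern above can be tried on a small stand-in array. The rows below are hypothetical and only mimic the year and beverage columns; they are not taken from world_alcohol.csv.

```python
import numpy as np

# Hypothetical string data: column 0 is the year, column 1 the beverage.
sample = np.array([
    ["1986", "Wine"],
    ["1987", "Beer"],
    ["1986", "Beer"],
])

# Build a Boolean array for the rows to change, then assign through it.
s1986 = (sample[:, 0] == "1986")
sample[s1986, 0] = "2014"

s_wine = (sample[:, 1] == "Wine")
sample[s_wine, 1] = "Grog"
print(sample)
```

NumPy string arrays have a fixed width, so a replacement longer than that width is silently truncated. With the "U75" dtype used when reading the file, values of up to 75 characters fit.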