Computing distance
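In [25]: import math
    ...: import numpy as np
    ...:
    ...: def distance(x, y):
    ...:     # Euclidean distance between two equal-length sequences
    ...:     m_x = np.array(x)
    ...:     m_y = np.array(y)
    ...:     result = np.sum((m_x - m_y) ** 2)  # sum of squared differences
    ...:     return math.sqrt(float(result))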
Test:
In [172]: distance([5,10],[7,8])
Out[172]: 2.8284271247461903
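Note that numpy's data types are not the same as Python's native types: numpy wraps values in its own types, which is why distance() converts the result with float() before calling math.sqrt. See here and here.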
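To make the difference between numpy types and native types concrete, here is a minimal sketch that reuses the vectors from the test above. The variable names (diff, squared_sum, as_float) are only illustrative, and the exact integer type numpy reports depends on the platform.

```python
import math
import numpy as np

diff = np.array([5, 10]) - np.array([7, 8])
squared_sum = np.sum(diff ** 2)  # a numpy scalar, not a Python int

print(type(squared_sum))                    # a numpy integer type (platform dependent)
print(isinstance(squared_sum, np.integer))  # True
print(isinstance(squared_sum, int))         # False on Python 3

as_float = float(squared_sum)  # explicit conversion to a native float
print(type(as_float))          # <class 'float'>
print(math.sqrt(as_float))
```

The explicit float() call keeps the return value a plain Python float, which is usually easier to pass on to code that does not know about numpy.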
The scipy library includes more than a dozen different distance functions:
>>> import scipy.spatial.distance
>>> help(scipy.spatial.distance)
...
braycurtis -- the Bray-Curtis distance.
canberra -- the Canberra distance.
chebyshev -- the Chebyshev distance.
cityblock -- the Manhattan distance.
correlation -- the Correlation distance.
cosine -- the Cosine distance.
dice -- the Dice dissimilarity (boolean).
euclidean -- the Euclidean distance.
hamming -- the Hamming distance (boolean).
jaccard -- the Jaccard distance (boolean).
kulsinski -- the Kulsinski distance (boolean).
mahalanobis -- the Mahalanobis distance.
matching -- the matching dissimilarity (boolean).
minkowski -- the Minkowski distance.
rogerstanimoto -- the Rogers-Tanimoto dissimilarity (boolean).
russellrao -- the Russell-Rao dissimilarity (boolean).
seuclidean -- the normalized Euclidean distance.
sokalmichener -- the Sokal-Michener dissimilarity (boolean).
sokalsneath -- the Sokal-Sneath dissimilarity (boolean).
sqeuclidean -- the squared Euclidean distance.
wminkowski -- the weighted Minkowski distance.
yule -- the Yule dissimilarity (boolean).
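As a rough illustration of how a few of the functions listed above are called, the sketch below applies them to the same two points used in the earlier test. The alias sp_distance and the variable names are assumptions made for this example. Which functions are available depends on the installed scipy version, so only some long-standing ones are used here.

```python
from scipy.spatial import distance as sp_distance

u, v = [5, 10], [7, 8]

d_euclidean = sp_distance.euclidean(u, v)      # straight-line distance
d_manhattan = sp_distance.cityblock(u, v)      # sum of absolute differences
d_chebyshev = sp_distance.chebyshev(u, v)      # largest absolute difference
d_sqeuclidean = sp_distance.sqeuclidean(u, v)  # squared Euclidean distance

print(d_euclidean, d_manhattan, d_chebyshev, d_sqeuclidean)
```

Here the Euclidean result should agree with the hand-written distance() function, which makes it a convenient cross-check.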
The sklearn library, a well-known Python machine learning library, also provides distance computation:
sklearn.metrics.pairwise.pairwise_distances
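A minimal sketch of pairwise_distances, assuming scikit-learn is installed. It computes the distance between every pair of rows in a matrix at once. The sample matrix X and the variable names are made up for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import pairwise_distances

# Each row is one point; the first two rows are the points from the earlier test.
X = np.array([[5, 10], [7, 8], [0, 0]])

# D[i, j] holds the distance between row i and row j.
D = pairwise_distances(X, metric="euclidean")
D_manhattan = pairwise_distances(X, metric="manhattan")

print(D.shape)  # (3, 3)
print(D[0, 1])  # distance between [5, 10] and [7, 8]
```

Compared with calling a distance function in a loop, this returns the full distance matrix in one call, which is the form many clustering and nearest-neighbour routines expect.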