Problems Encountered While Implementing "Machine Learning in Action"

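These are notes on the problems I ran into, and how I solved them, while implementing the k-nearest neighbors algorithm from Chapter 2 of "Machine Learning in Action".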

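Error: only 2 non-keyword arguments accepted.
Cause: I carelessly left out a pair of square brackets.
The call should have been array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]]), but I omitted the outermost pair of brackets, so each row was passed to array() as a separate positional argument.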
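To make the mistake concrete, here is a minimal sketch that reproduces it. The exact exception type and message depend on the NumPy version (older releases report the message quoted above, newer ones raise a TypeError about positional arguments), so the sketch catches both. The variable names are my own.

```python
import numpy as np

# Wrong: without the outer brackets, each row is a separate positional argument
try:
    np.array([1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1])
    failed = False
except (TypeError, ValueError) as exc:
    failed = True
    print("np.array rejected the call:", exc)

# Right: one nested list, passed as a single argument
group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
print(group.shape)
```

The corrected call yields a 2-D array with one row per training point.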

from numpy import *
import operator

def createDataSet():
    # Four 2-D training points and their class labels
    group = array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
    labels = ['A','A','B','B']
    return group,labels

def classify0(inX,dataSet,labels,k):
    dataSetSize = dataSet.shape[0]                  # number of training points
    diffMat = tile(inX,(dataSetSize,1)) - dataSet   # inX minus every training point
    sqDiffMat = diffMat**2
    sqDistances = sqDiffMat.sum(axis = 1)           # sum of squared differences per row
    distances = sqDistances**0.5                    # Euclidean distances
    sortedDistIndicies = distances.argsort()        # indices ordered nearest first
    classCount = {}
    for i in range(k):                              # vote among the k nearest points
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel,0) + 1
    # items() works on both Python 2 and 3; iteritems() exists only on Python 2
    sortedClassCount = sorted(classCount.items(),
                              key=operator.itemgetter(1),reverse=True)
    return sortedClassCount[0][0]
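For comparison, here is a rough Python 3 sketch of the same classifier. It is not the book's code: the names create_data_set and classify_knn are my own, and it swaps tile() for NumPy broadcasting and the dictionary vote for collections.Counter.

```python
from collections import Counter

import numpy as np


def create_data_set():
    # Same four training points and labels as createDataSet()
    group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
    labels = ['A', 'A', 'B', 'B']
    return group, labels


def classify_knn(in_x, data_set, labels, k):
    # Broadcasting subtracts in_x from every row, so tile() is not needed
    diff = data_set - np.asarray(in_x)
    distances = np.sqrt((diff ** 2).sum(axis=1))
    # Indices of the k nearest training points
    nearest = distances.argsort()[:k]
    # Majority vote among their labels
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


group, labels = create_data_set()
print(classify_knn([0, 0], group, labels, 3))      # B
print(classify_knn([1.0, 1.2], group, labels, 3))  # A
```

With k=3, the point [0, 0] has two 'B' neighbours and one 'A' neighbour, so the vote goes to 'B'.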

Notes on a few parts I did not understand at first:

1. .shape[0]

>>> import numpy as np
>>> w = np.zeros((5,6))
>>> w
array([[ 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0.]])

>>> w.shape[0]
5

>>> w.shape[1]
6

w is a matrix with 5 rows and 6 columns.

w.shape[0] returns the number of rows of w.

w.shape[1] returns the number of columns of w.
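As a small sketch of what shape actually holds: it is an ordinary tuple with one entry per axis, so a one-dimensional array has shape[0] but no shape[1]. The array v is just an illustrative value.

```python
import numpy as np

w = np.zeros((5, 6))
rows, cols = w.shape          # shape is a plain tuple: (5, 6)
print(rows, cols)

v = np.array([1.0, 2.0, 3.0])
print(v.shape)                # one-element tuple for a 1-D array
print(v.ndim, len(v.shape))   # both give the number of axes

try:
    v.shape[1]
    has_second_axis = True
except IndexError:
    has_second_axis = False   # a 1-D array has no shape[1]
print(has_second_axis)
```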

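2. tile()

What does numpy.tile() do? It builds a new array by repeating the input array a given number of times along each axis.

For example, with a = np.array([0,1,2]), np.tile(a,(2,1)) repeats a once along the x axis (let's call it that; it is the last axis, the columns), which means no actual copying, so the result is still [0,1,2]. It then repeats that result twice along the y direction (the rows), which finally gives

array([[0,1,2],
       [0,1,2]])

Similarly: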

>>> b = np.array([[1, 2], [3, 4]])
>>> np.tile(b, 2) # repeat twice along the x axis
array([[1, 2, 1, 2],
       [3, 4, 3, 4]])
>>> np.tile(b, (2, 1)) # repeat once along the x axis (no copying), then twice along the y axis
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]])

Examples

>>> a = np.array([0, 1, 2])
>>> np.tile(a, 2)
array([0, 1, 2, 0, 1, 2])
>>> np.tile(a, (2, 2))
array([[0, 1, 2, 0, 1, 2],
       [0, 1, 2, 0, 1, 2]])
>>> np.tile(a, (2, 1, 2))
array([[[0, 1, 2, 0, 1, 2]],
       [[0, 1, 2, 0, 1, 2]]])

>>> c = np.array([1,2,3,4])
>>> np.tile(c,(4,1))
array([[1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4]])
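The last example is the pattern classify0 relies on: tile(inX, (dataSetSize, 1)) stacks the query point once per training row so the two arrays can be subtracted. A minimal sketch follows, with an illustrative query point of my own choosing; it also checks that plain broadcasting gives the same differences.

```python
import numpy as np

data_set = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
in_x = np.array([0.5, 0.5])   # illustrative query point

# tile() stacks in_x once per training row, matching data_set's shape
tiled = np.tile(in_x, (data_set.shape[0], 1))
diff_tiled = tiled - data_set

# Broadcasting yields the same differences without building the copy
diff_broadcast = in_x - data_set

print(tiled.shape)
print(np.array_equal(diff_tiled, diff_broadcast))
```

Either form works; tile() makes the row-by-row copy explicit, while broadcasting avoids allocating it.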

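3. .sum(axis=1)

It looks simple enough, but once an axis argument is passed to sum, as in sum(a, axis=0) or .sum(axis=1), it gets a little confusing.

One explanation I came across says that the sum we normally use defaults to axis=0, which is just ordinary addition, and that adding axis=1 sums each row vector of a matrix.

For example:

import numpy as np

np.sum([[0,1,2],[2,1,3]],axis=1)

The result is: array([3, 6])

The axis=1 part is right, but the default is not axis=0. With no axis argument (axis=None), sum adds up every element, while axis=0 sums down each column. My own experiments below show the difference: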

a = np.array([[0, 2, 1]])

print(a.sum())
print(a.sum(axis=0))
print(a.sum(axis=1))

The results are: 3, [0 2 1], [3]

b = np.array([0, 2, 1])

print(b.sum())
print(b.sum(axis=0))
print(b.sum(axis=1))

The results are: 3, 3, and a runtime error: 'axis' entry is out of bounds

So a one-dimensional array has only axis 0; there is no axis 1.

c = np.array([[0, 2, 1], [3, 5, 6], [0, 1, 1]])

print(c.sum())
print(c.sum(axis=0))
print(c.sum(axis=1))

The results are: 19, [3 8 8], [ 3 14 2]
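One way to read these results is that the axis argument names the axis that gets collapsed. Below is a short sketch on the same array c; the keepdims line is an extra I added to show that the per-row sums can also be kept as a column.

```python
import numpy as np

c = np.array([[0, 2, 1], [3, 5, 6], [0, 1, 1]])

total = c.sum()                       # axis=None (the default): every element
col_sums = c.sum(axis=0)              # collapse the rows: one sum per column
row_sums = c.sum(axis=1)              # collapse the columns: one sum per row
kept = c.sum(axis=1, keepdims=True)   # same row sums, kept as a (3, 1) column

print(total, col_sums, row_sums, kept.shape)
```

In classify0, sqDiffMat.sum(axis = 1) is the per-row case: it produces one squared distance for each training point.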

Source: https://www.cnblogs.com/ltxblog/p/8710187.html
