This post is my summary of week 3 of Udacity's Deep Learning Foundations course. The theme: before learning TensorFlow, build a small miniflow of your own. This week's work gave me a basic feel for how TensorFlow works. The project is on GitHub at https://github.com/zhuanxuhit/nd101 , and you are welcome to follow it.
We know that the usual steps for building a neural network are:
- normalization
- learning hyperparameters
- initializing weights
- forward propagation
- calculate error
- backpropagation
When implementing the steps above in TensorFlow, the workflow is generally:
- Define the graph of nodes and edges.
- Propagate values through the graph.
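For example, here is what those two steps look like in TensorFlow itself. This is a minimal sketch, assuming the TensorFlow 1.x graph-and-session API that was current when this course ran:

```python
import tensorflow as tf

# 1. Define the graph of nodes and edges.
x = tf.constant(4.0)
y = tf.constant(5.0)
f = x * y

# 2. Propagate values through the graph by running it in a session.
with tf.Session() as sess:
    print(sess.run(f))  # 20.0
```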
Next, to implement miniflow, we will first define nodes and the graph, and then implement forward propagation and backpropagation.
## 1. Node
Let's start with the concept of a node by looking at a simple neural network:
The network above is one big graph. Every node has inputs and outputs, and each node computes its output from its inputs, so we first define Node:
```python
class Node(object):
    def __init__(self, inbound_nodes=[]):
        self.inbound_nodes = inbound_nodes
        self.outbound_nodes = []
        for n in self.inbound_nodes:
            n.outbound_nodes.append(self)
        self.value = None
```
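A tiny, hypothetical illustration of how the constructor wires up edges: when a node lists another node as an input, it also registers itself as that node's output.

```python
a = Node()
b = Node([a])

print(b.inbound_nodes)   # contains a (declared explicitly)
print(a.outbound_nodes)  # contains b (filled in by b's constructor)
```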
With this minimal Node in place, the next step is to implement forward propagation.
Forward propagation
為了計(jì)算一個(gè)node婿屹,需要知道它的輸入灭美,而輸入又依賴于其他節(jié)點(diǎn)的輸出,這種為了計(jì)算當(dāng)前節(jié)點(diǎn)而求其所有前置節(jié)點(diǎn)的技術(shù)叫拓?fù)渑判騮opological sort
Shown as a diagram:
上面為了計(jì)算最后的Node F昂利,我們給出了一個(gè)可行的計(jì)算順序届腐,我們此處直接給出一個(gè)算法:Kahn's Algorithm,代碼如下:
```python
def topological_sort(feed_dict):
    input_nodes = [n for n in feed_dict.keys()]

    G = {}
    nodes = [n for n in input_nodes]
    while len(nodes) > 0:
        n = nodes.pop(0)
        if n not in G:
            G[n] = {'in': set(), 'out': set()}
        for m in n.outbound_nodes:
            if m not in G:
                G[m] = {'in': set(), 'out': set()}
            G[n]['out'].add(m)
            G[m]['in'].add(n)
            nodes.append(m)

    L = []
    S = set(input_nodes)
    while len(S) > 0:
        n = S.pop()
        if isinstance(n, Input):
            n.value = feed_dict[n]
        L.append(n)
        for m in n.outbound_nodes:
            G[n]['out'].remove(m)
            G[m]['in'].remove(n)
            # if no other incoming edges add to S
            if len(G[m]['in']) == 0:
                S.add(m)
    return L


def forward_pass(output_node, sorted_nodes):
    for n in sorted_nodes:
        n.forward()
    return output_node.value
```
下面我們來實(shí)現(xiàn)一些簡單的Node類型铁坎,第一個(gè)是Input類型:
```python
class Input(Node):
    def __init__(self):
        Node.__init__(self)

    def forward(self, value=None):
        if value is not None:
            self.value = value
```
Next is Mul:
```python
class Mul(Node):
    def __init__(self, *inputs):
        Node.__init__(self, inputs)

    def forward(self):
        # Multiply the values of all inbound nodes together.
        product = 1.0
        for n in self.inbound_nodes:
            product *= n.value
        self.value = product
```
It is used like this:
```python
x, y, z = Input(), Input(), Input()
f = Mul(x, y, z)

feed_dict = {x: 4, y: 5, z: 10}

graph = topological_sort(feed_dict)
output = forward_pass(f, graph)

# should output 200
print("{} * {} * {} = {} (according to miniflow)".format(feed_dict[x], feed_dict[y], feed_dict[z], output))
```
4 * 5 * 10 = 200.0 (according to miniflow)
下面我們來實(shí)現(xiàn)下稍微復(fù)雜點(diǎn)的Node類型:Linear Node
```python
class Linear(Node):
    def __init__(self, inputs, weights, bias):
        Node.__init__(self, [inputs, weights, bias])

    def forward(self):
        inputs = self.inbound_nodes[0].value
        weights = self.inbound_nodes[1].value
        bias = self.inbound_nodes[2].value
        total = 0
        for i in range(len(inputs)):
            total += inputs[i] * weights[i]
        self.value = total + bias
```
With the Linear node we can now run the following:
```python
inputs, weights, bias = Input(), Input(), Input()
f = Linear(inputs, weights, bias)

feed_dict = {
    inputs: [6, 20, 4],
    weights: [0.5, 0.25, 1.5],
    bias: 2
}

graph = topological_sort(feed_dict)
output = forward_pass(f, graph)
print(output)
```
16.0
Besides the Linear node, we can also define a Sigmoid node.
```python
import numpy as np


class Sigmoid(Node):
    def __init__(self, node):
        Node.__init__(self, [node])

    def _sigmoid(self, x):
        return 1. / (1. + np.exp(-x))

    def forward(self):
        input_value = self.inbound_nodes[0].value
        self.value = self._sigmoid(input_value)
```
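For reference, the function this node computes is the logistic sigmoid; its derivative, which we will need later for the backward pass, has a convenient closed form:

$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \sigma'(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr)$$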
With the nodes defined, the next step is to decide how to measure how good the output is.
## 2. Defining the cost function
我們在訓(xùn)練神經(jīng)網(wǎng)絡(luò)的時(shí)候,需要有個(gè)目標(biāo)碌奉,就是盡可能的讓輸出準(zhǔn)確短曾,怎么衡量呢赐劣?我們可以通過均方誤差 (MSE)來衡量嫉拐,這也可以用一個(gè)MSENode來建模
```python
class MSE(Node):
    def __init__(self, y, a):
        Node.__init__(self, [y, a])

    def forward(self):
        y = self.inbound_nodes[0].value.reshape(-1, 1)
        a = self.inbound_nodes[1].value.reshape(-1, 1)
        m = len(y)
        total = 0.
        for (yi, ai) in zip(y, a):
            total += np.square(yi - ai)
        self.value = total / m
```
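A quick, hypothetical check of the node, reusing `topological_sort` and `forward_pass` from above:

```python
y, a = Input(), Input()
cost = MSE(y, a)

feed_dict = {y: np.array([1., 2., 3.]), a: np.array([1.5, 2., 3.])}

graph = topological_sort(feed_dict)
output = forward_pass(cost, graph)
print(output)  # (1.5 - 1)^2 / 3, roughly 0.0833
```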
## 3. Backpropagation
現(xiàn)在我們有了衡量輸出好壞的函數(shù)婉徘,我們需要的是怎么能快速的讓輸出盡可能的好,這就要引出Gradient Descent化撕,梯度即slope斜率植阴,我們通過它來定義我們優(yōu)化的方向掠手,更詳細(xì)的可以看文章停下來思考下神經(jīng)網(wǎng)絡(luò)
With the notion of a gradient in hand, let's look at a neural network graph:
上面我們?yōu)榱擞?jì)算MESE對于w1的梯度纯蛾,我們沿著圖中的紅色線走翻诉,給出了梯度的計(jì)算方式,這種計(jì)算方式就是微積分中的鏈?zhǔn)椒▌t蛾派,能讓我們計(jì)算任意一個(gè)變量的梯度洪乍,下面我們給出梯度的計(jì)算代碼,相比較之前的Node中,多了一個(gè)backward函數(shù)弃酌,看下面的實(shí)現(xiàn):
```python
import numpy as np


class Node(object):
    def __init__(self, inbound_nodes=[]):
        self.inbound_nodes = inbound_nodes
        self.value = None
        self.outbound_nodes = []
        # Keys are the input nodes, values are the partials of this node
        # with respect to that input.
        self.gradients = {}
        for node in inbound_nodes:
            node.outbound_nodes.append(self)

    def forward(self):
        raise NotImplementedError

    def backward(self):
        raise NotImplementedError


class Input(Node):
    def __init__(self):
        Node.__init__(self)

    def forward(self):
        pass

    def backward(self):
        self.gradients = {self: 0}
        # An Input node's gradient is the sum of the gradients
        # passed back from all of its output nodes.
        for n in self.outbound_nodes:
            grad_cost = n.gradients[self]
            self.gradients[self] += grad_cost * 1


class Linear(Node):
    def __init__(self, X, W, b):
        Node.__init__(self, [X, W, b])

    def forward(self):
        X = self.inbound_nodes[0].value
        W = self.inbound_nodes[1].value
        b = self.inbound_nodes[2].value
        self.value = np.dot(X, W) + b

    def backward(self):
        self.gradients = {n: np.zeros_like(n.value) for n in self.inbound_nodes}
        for n in self.outbound_nodes:
            grad_cost = n.gradients[self]
            # y = XW + b
            # Compute the gradient of y with respect to each input node.
            # dy/dX = W
            self.gradients[self.inbound_nodes[0]] += np.dot(grad_cost, self.inbound_nodes[1].value.T)
            # dy/dW = X
            self.gradients[self.inbound_nodes[1]] += np.dot(self.inbound_nodes[0].value.T, grad_cost)
            # dy/db = 1
            self.gradients[self.inbound_nodes[2]] += np.sum(grad_cost, axis=0, keepdims=False)


class Sigmoid(Node):
    def __init__(self, node):
        # The base class constructor.
        Node.__init__(self, [node])

    def _sigmoid(self, x):
        return 1. / (1. + np.exp(-x))

    def forward(self):
        input_value = self.inbound_nodes[0].value
        self.value = self._sigmoid(input_value)

    def backward(self):
        # Initialize the gradients to 0.
        self.gradients = {n: np.zeros_like(n.value) for n in self.inbound_nodes}
        for n in self.outbound_nodes:
            # Get the partial of the cost with respect to this node.
            grad_cost = n.gradients[self]
            sigmoid = self.value
            # d(sigmoid)/dx = sigmoid * (1 - sigmoid)
            self.gradients[self.inbound_nodes[0]] += sigmoid * (1 - sigmoid) * grad_cost


class MSE(Node):
    def __init__(self, y, a):
        # Call the base class' constructor.
        Node.__init__(self, [y, a])

    def forward(self):
        y = self.inbound_nodes[0].value.reshape(-1, 1)
        a = self.inbound_nodes[1].value.reshape(-1, 1)
        self.m = self.inbound_nodes[0].value.shape[0]
        self.diff = y - a
        self.value = np.mean(self.diff**2)

    def backward(self):
        self.gradients[self.inbound_nodes[0]] = (2 / self.m) * self.diff
        self.gradients[self.inbound_nodes[1]] = (-2 / self.m) * self.diff


def topological_sort(feed_dict):
    input_nodes = [n for n in feed_dict.keys()]

    G = {}
    nodes = [n for n in input_nodes]
    while len(nodes) > 0:
        n = nodes.pop(0)
        if n not in G:
            G[n] = {'in': set(), 'out': set()}
        for m in n.outbound_nodes:
            if m not in G:
                G[m] = {'in': set(), 'out': set()}
            G[n]['out'].add(m)
            G[m]['in'].add(n)
            nodes.append(m)

    L = []
    S = set(input_nodes)
    while len(S) > 0:
        n = S.pop()
        if isinstance(n, Input):
            n.value = feed_dict[n]
        L.append(n)
        for m in n.outbound_nodes:
            G[n]['out'].remove(m)
            G[m]['in'].remove(n)
            # if no other incoming edges add to S
            if len(G[m]['in']) == 0:
                S.add(m)
    return L


def forward_and_backward(graph):
    # Forward pass
    for n in graph:
        n.forward()
    # Backward pass
    # see: https://docs.python.org/2.3/whatsnew/section-slices.html
    for n in graph[::-1]:
        n.backward()
```
With all of the nodes and functions above defined, we can now run the following:
```python
X, W, b = Input(), Input(), Input()
y = Input()
f = Linear(X, W, b)
a = Sigmoid(f)
cost = MSE(y, a)

X_ = np.array([[-1., -2.], [-1, -2]])
W_ = np.array([[2.], [3.]])
b_ = np.array([-3.])
y_ = np.array([1, 2])

feed_dict = {
    X: X_,
    y: y_,
    W: W_,
    b: b_,
}

graph = topological_sort(feed_dict)
forward_and_backward(graph)

# return the gradients for each Input
gradients = [t.gradients[t] for t in [X, y, W, b]]

print(gradients)
```
```
[array([[ -3.34017280e-05,  -5.01025919e-05],
       [ -6.68040138e-05,  -1.00206021e-04]]), array([[ 0.9999833],
       [ 1.9999833]]), array([[  5.01028709e-05],
       [  1.00205742e-04]]), array([ -5.01028709e-05])]
```
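As a quick sanity check on the backward pass (not part of the original lesson), we can compare an analytic gradient against a finite-difference estimate. The helper below is a hypothetical sketch built on the classes above:

```python
def numerical_gradient(cost_node, input_node, graph, eps=1e-6):
    # Finite-difference estimate of d(cost)/d(input_node.value).
    grad = np.zeros_like(input_node.value)
    it = np.nditer(input_node.value, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        original = input_node.value[idx]
        # Evaluate the cost at value + eps and value - eps.
        input_node.value[idx] = original + eps
        for n in graph:
            n.forward()
        cost_plus = cost_node.value
        input_node.value[idx] = original - eps
        for n in graph:
            n.forward()
        cost_minus = cost_node.value
        input_node.value[idx] = original  # restore
        grad[idx] = (cost_plus - cost_minus) / (2 * eps)
        it.iternext()
    return grad

# e.g. compare against the analytic gradient for W computed above:
# print(numerical_gradient(cost, W, graph))
# print(W.gradients[W])
```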
## 4. Stochastic Gradient Descent (SGD)
I never quite understood what SGD was until recently.
If we compute the gradient over the full dataset before every parameter update, we can easily run out of memory. So the strategy is: pick a subset of the data, compute the gradients on just that batch, and update the parameters right away.
That gives us the following code:
```python
def sgd_update(trainables, learning_rate=1e-2):
    for n in trainables:
        # Move each trainable parameter a small step against its gradient.
        n.value -= learning_rate * n.gradients[n]
```
```python
from sklearn.datasets import load_boston
from sklearn.utils import shuffle, resample

# Load data
data = load_boston()
X_ = data['data']
y_ = data['target']

# Normalize data
X_ = (X_ - np.mean(X_, axis=0)) / np.std(X_, axis=0)

n_features = X_.shape[1]
n_hidden = 10
W1_ = np.random.randn(n_features, n_hidden)
b1_ = np.zeros(n_hidden)
W2_ = np.random.randn(n_hidden, 1)
b2_ = np.zeros(1)

# Neural network
X, y = Input(), Input()
W1, b1 = Input(), Input()
W2, b2 = Input(), Input()

l1 = Linear(X, W1, b1)
s1 = Sigmoid(l1)
l2 = Linear(s1, W2, b2)
cost = MSE(y, l2)

feed_dict = {
    X: X_,
    y: y_,
    W1: W1_,
    b1: b1_,
    W2: W2_,
    b2: b2_
}

epochs = 10
# Total number of examples
m = X_.shape[0]
batch_size = 11
steps_per_epoch = m // batch_size

graph = topological_sort(feed_dict)
trainables = [W1, b1, W2, b2]

print("Total number of examples = {}".format(m))

# Step 4
for i in range(epochs):
    loss = 0
    for j in range(steps_per_epoch):
        # Step 1
        # Randomly sample a batch of examples
        X_batch, y_batch = resample(X_, y_, n_samples=batch_size)

        # Reset value of X and y Inputs
        X.value = X_batch
        y.value = y_batch

        # Step 2
        forward_and_backward(graph)

        # Step 3
        sgd_update(trainables)

        loss += graph[-1].value

    print("Epoch: {}, Loss: {:.3f}".format(i+1, loss/steps_per_epoch))
```
Total number of examples = 506
Epoch: 1, Loss: 133.910
Epoch: 2, Loss: 36.332
Epoch: 3, Loss: 22.353
Epoch: 4, Loss: 26.704
Epoch: 5, Loss: 23.121
Epoch: 6, Loss: 23.491
Epoch: 7, Loss: 21.393
Epoch: 8, Loss: 15.300
Epoch: 9, Loss: 13.391
Epoch: 10, Loss: 15.651
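Once training is done, the same graph can be run forward-only to get predictions. A hypothetical sketch, reusing the variables defined above:

```python
# Run a forward-only pass over the whole (normalized) dataset.
X.value = X_
y.value = y_
for n in graph:
    n.forward()

predictions = l2.value  # outputs of the last Linear node
print(predictions[:5].ravel())
```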
## Summary
That's all of miniflow. We first defined Node, then defined the relations between nodes to get a graph, computed the output with forward propagation, measured how good the output is with MSE, used the chain rule to compute gradients and update the parameters so the cost keeps shrinking, and finally used SGD to make the computation faster.