`Keras` Source Code Analysis

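Throughout this document, wherever `pass` appears in a code listing, it marks source code that has been omitted for readability. `pass` is not part of the original source, so readers should not be puzzled by it.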

[TOC]

`Keras` Overview

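We start from a simple fully connected classifier to look at the design principles of Keras and to read its source code. The Keras website gives the following example of a simple fully connected network in The Sequential model API: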

import keras
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras import backend as K

model = Sequential()
model.add(Dense(32, input_shape=(500,)))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
      loss='categorical_crossentropy',
      metrics=['accuracy'])
model.fit(train['data'], train['label'], batch_size=32, 
      nb_epoch=10, verbose=1)

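Here, the code for the Sequential model lives in keras/models.py.
The backend code lives in keras/backend.
The core source of the network's central concept, the layer (Layer), is in keras/engine/topology.py. Dense is just a subclass of the Layer class, and every other layer is a subclass of the same kind, so developers can implement the layers they need by subclassing Layer.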

Taken as a whole, the Keras source tree is organized as follows:

.
│  activations.py
│  callbacks.py
│  constraints.py
│  initializations.py
│  metrics.py
│  models.py
│  objectives.py
│  optimizers.py
│  regularizers.py
│  __init__.py
│  
├─applications
│      # some typical applications
│      ...
│      
├─backend
│      # Theano and Tensorflow backends
│      # tensorflow_backend.py and theano_backend.py define functions with the same names,
│      # so after `from keras import backend as K`, callers need not care about the differences between Tensorflow and Theano
│      common.py
│      tensorflow_backend.py
│      theano_backend.py
│      __init__.py
│      
├─datasets
│      # scripts that download datasets
│      ...
|
├─engine
│      topology.py # the basis of Keras Layer, Input, Merge, and Container
│      training.py
│      __init__.py
│      
├─layers
│      # in effect, applications of engine:
│      # the different layers are implemented by subclassing Layer from engine/topology.py
│      convolutional.py
│      __init__.py
│      ...
│      
├─preprocessing
│      image.py
│      sequence.py
│      text.py
│      __init__.py
│      
├─utils
│      data_utils.py
│      generic_utils.py
│      io_utils.py
│      layer_utils.py
│      np_utils.py
│      test_utils.py
│      visualize_util.py
│      __init__.py
│      
└─wrappers
        scikit_learn.py
        __init__.py

backend Design

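First, a word about the backend design. Keras originally had only a Theano backend; it now also supports Tensorflow.
Keras is easy to extend to new backends because the backends expose functions with the same names. In effect, this adds one more layer of wrapping on top of Tensorflow and Theano.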

The functions wrapped by the two files backend/theano_backend.py and backend/tensorflow_backend.py are listed in the documentation under Backend functions.

These same-named functions do essentially what their names say, for example:

# keras/backend/tensorflow_backend.py
def maximum(x, y):
    return tf.maximum(x, y)

# keras/backend/theano_backend.py
def maximum(x, y):
    return T.maximum(x, y)

The difference between Tensorflow and Theano is essentially what the example above shows. Because so many functions share the same names, calling the backend only requires:

from keras import backend as K
K.function_name(args)

These shared functions show which building blocks are basic and indispensable for deep learning, which may offer some inspiration for chip design.
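
The same-name idea can be sketched without either library. The two classes below are hypothetical stand-ins, not Keras code: each exposes a `maximum` with the same signature, and the caller only ever sees the alias `K`.

```python
# Hypothetical stand-ins for two numeric backends; neither is Keras code.
class ListBackend:
    """Operates on plain Python lists."""
    @staticmethod
    def maximum(x, y):
        return [max(a, b) for a, b in zip(x, y)]


class TupleBackend:
    """Operates on tuples, mimicking a second library with its own types."""
    @staticmethod
    def maximum(x, y):
        return tuple(max(a, b) for a, b in zip(x, y))


_BACKENDS = {'list': ListBackend, 'tuple': TupleBackend}


def load_backend(name):
    """Return the backend whose functions the rest of the code will call."""
    return _BACKENDS[name]


# The calling code is identical for both backends: K.maximum(...)
K = load_backend('list')
result_list = K.maximum([1, 5], [3, 2])
K = load_backend('tuple')
result_tuple = K.maximum((1, 5), (3, 2))
```

Only `load_backend` knows which implementation is active; everything written against `K` stays unchanged when the backend is swapped.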

`class Layer(object)` Design

class Sequential(Model) inherits from the Model class in keras/engine/training.py. Model inherits from the Container class in keras/engine/topology.py, in the same directory, and Container inherits from the Layer class in that same file.

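In other words, the Sequential model is really a special case of the general model (functional API). The general model in turn is a special case of a container (Container), which is itself a special case of a layer (Layer), so it is worth understanding the design of Layer first.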

`class Node(object)`

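The Layer class is closely tied to the Node class: two Layers are connected through a Node. Each Layer has two lists, inbound_nodes and outbound_nodes, whose elements are Nodes that bind the input and output tensors.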

Whenever a Layer receives a new input tensor, a Node is appended to layer.inbound_nodes. Likewise, when an output tensor is consumed by another Layer, a new node is appended to layer.outbound_nodes. A Node plays a role similar to argument passing between functions.

class Node(object):
    def __init__(self, outbound_layer,
                 inbound_layers, node_indices, tensor_indices,
                 input_tensors, output_tensors,
                 input_masks, output_masks,
                 input_shapes, output_shapes):
        '''
        Constructor
        outbound_layer
            The output Layer this Node is bound to; that is, this Node is in outbound_layer's inbound_nodes.
        inbound_layers
            The input Layers; this Node is an element of their outbound_nodes.

        The loop below adds this Node to every input Layer it is bound to.
        The Node is also registered on the output Layer.
        '''
        for layer in inbound_layers:
            if layer is not None:
                layer.outbound_nodes.append(self)
        outbound_layer.inbound_nodes.append(self)

    @classmethod  # makes create_node a class method rather than an instance method, so it can be called directly as Node.create_node()
    def create_node(cls, outbound_layer,
                    inbound_layers, node_indices=None, tensor_indices=None):
        '''
        inbound_layers
            All input Node information (tensors, masks, shapes) is read from the inbound_nodes of inbound_layers.
        outbound_layer
            With the information gathered from inbound_layers, a new Node is created and attached to outbound_layer.
        Returns a Node instance for outbound_layer.
        '''
        pass
        return cls(outbound_layer,
                   inbound_layers, node_indices, tensor_indices,
                   input_tensors, output_tensors,
                   input_masks, output_masks,
                   input_shapes, output_shapes)

    def get_config(self):
        '''
        Returns the names of the inbound and outbound layers, plus the node and tensor indices
        '''
        pass
        return {'outbound_layer': self.outbound_layer.name if self.outbound_layer else None,
                'inbound_layers': inbound_names,
                'node_indices': self.node_indices,
                'tensor_indices': self.tensor_indices}

Node and Layer hold references to each other, so the Nodes come into being together with the Layers they connect and never need to be created by hand.
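
The mutual bookkeeping can be reproduced in a few lines. This is a stripped-down sketch that keeps only the two lists and the constructor loop shown above; the tensors, masks, shapes, and indices of the real classes are left out, and the layer names are made up.

```python
class Layer(object):
    """Toy layer that only tracks its connections."""
    def __init__(self, name):
        self.name = name
        self.inbound_nodes = []    # nodes that feed this layer
        self.outbound_nodes = []   # nodes that consume this layer's output


class Node(object):
    """Toy node linking input layers to one output layer."""
    def __init__(self, outbound_layer, inbound_layers):
        self.outbound_layer = outbound_layer
        self.inbound_layers = inbound_layers
        # Register this node on both sides of the connection.
        for layer in inbound_layers:
            if layer is not None:
                layer.outbound_nodes.append(self)
        outbound_layer.inbound_nodes.append(self)


dense_1 = Layer('dense_1')
dense_2 = Layer('dense_2')
# Connecting dense_1 -> dense_2 creates exactly one shared node.
node = Node(outbound_layer=dense_2, inbound_layers=[dense_1])
```

After the call, the same `node` object is reachable from both layers, which is what lets a graph walk move in either direction.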

`class Layer(object)`

The Layer class mainly consists of the following properties, instance methods, and class methods.

Main Properties

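- input_spec: a list of class InputSpec. Each element describes a requirement on the input, such as the number of dimensions ndim and the data type dtype.
- trainable: a bool that flags whether this Layer's weights are updated during training.
- input_shape, output_shape
- inbound_nodes, outbound_nodes: the lists of Nodes that sit between Layers.
- input, output: the input and output tensors.
- trainable_weights, non_trainable_weights, weights: the lists of trainable and non-trainable variables; weights is their concatenation. They are properties implemented as functions and return lists.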

Key Methods

I/O-related Methods
- def create_input_layer(self, batch_input_shape, input_dtype=None, name=None)
  
  This function sets the current Layer's batch_input_shape and input_dtype from its arguments, then calls def Input(shape=None, batch_shape=None, name=None, dtype=K.floatx(), sparse=False, tensor=None) to obtain a Keras tensor, x.
```
  x = Input(batch_shape=batch_input_shape,
    dtype=input_dtype, name=name)
  self(x)
```
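  A Keras tensor is a backend tensor augmented with extra information. The Layer then calls itself on the returned Keras tensor, which creates the Node connecting the current Layer to the input Layer that was just created.
  The Keras tensor is actually the output tensor of the InputLayer's inbound Node: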
```
  input_layer = InputLayer(batch_input_shape=batch_shape,
                           name=name, input_dtype=dtype,
                           sparse=sparse,
                           input_tensor=tensor)
  outputs = input_layer.inbound_nodes[0].output_tensors
  if len(outputs) == 1:
      return outputs[0]
  else:
      return outputs
```
  Because the subclass InputLayer never calls this function, there is no circularity.
Methods related to Losses and Updates
- def add_loss(self, losses, inputs=None)
  
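  This function keeps extending the self.losses list: the losses argument is converted to a list and appended to self.losses.
  It then derives a unique id (uid) from the inputs argument to use as a hash key. The uid is built from Python's id() function, which is in some sense analogous to a memory address in C.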
```
  inputs_hash = object_list_uid(inputs)
```
  The losses list is then stored under that hash key:
```
  self._per_input_losses[inputs_hash] += losses
```
- def get_losses_for(self, inputs)
  
  Retrieves the losses that add_loss registered for the given inputs.
- The update functions
  
  Essentially the same as the losses functions, with the keyword losses replaced by updates.
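
The per-input bookkeeping can be sketched as below. `LossBook` is a hypothetical class standing in for the relevant part of Layer, the version of `object_list_uid` here is a simplified guess at the real helper, and using `None` as the key for losses without inputs is an assumption of this sketch.

```python
def object_list_uid(objects):
    """Build a string key from id() of each object (simplified stand-in)."""
    if not isinstance(objects, (list, tuple)):
        objects = [objects]
    return ', '.join(str(abs(id(x))) for x in objects)


class LossBook(object):
    """Hypothetical holder for the loss bookkeeping of a layer."""
    def __init__(self):
        self.losses = []
        self._per_input_losses = {}

    def add_loss(self, losses, inputs=None):
        if not isinstance(losses, (list, tuple)):
            losses = [losses]
        losses = list(losses)
        self.losses += losses                  # flat list of every loss
        key = object_list_uid(inputs) if inputs is not None else None
        self._per_input_losses.setdefault(key, [])
        self._per_input_losses[key] += losses  # losses grouped by input

    def get_losses_for(self, inputs):
        key = object_list_uid(inputs) if inputs is not None else None
        return self._per_input_losses.get(key, [])


x1, x2 = object(), object()   # stand-ins for two input tensors
book = LossBook()
book.add_loss('l2_on_x1', inputs=x1)
book.add_loss(['l1_on_x2', 'l2_on_x2'], inputs=x2)
```

Because the key comes from `id()`, two inputs are treated as the same only when they are the same object, not when they merely compare equal.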
Weight-related Methods
- def weights(self)
  
  Concatenates the trainable and non-trainable weights:
```
  return self.trainable_weights + self.non_trainable_weights
```
- def set_weights(self, weights)
  
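  Pairs the variables in self.weights with the values in the weights argument, a list of numpy.array, as weight_value_tuples, which are then loaded into the backend variables.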
- def get_weights(self)
  
  Returns the current Layer's weights as a list of numpy.array.

The `Dense` Layer as an Example

Dense is the simplest layer in Keras: a fully connected layer. Its code is roughly as follows:

class Dense(Layer):
    def __init__(self, output_dim, init='glorot_uniform',
                 activation=None, weights=None,
                 W_regularizer=None, b_regularizer=None, activity_regularizer=None,
                 W_constraint=None, b_constraint=None,
                 bias=True, input_dim=None, **kwargs):
        pass
        super(Dense, self).__init__(**kwargs)

    def build(self, input_shape):
        pass

    def call(self, x, mask=None):
        output = K.dot(x, self.W)
        if self.bias:
            output += self.b
        return self.activation(output)

    def get_output_shape_for(self, input_shape):
        pass

    def get_config(self):
        pass

First, Dense inherits from its parent class Layer but overrides four functions:

- build: defines the weights. Trainable ones are added to self.trainable_weights, non-trainable ones to self.non_trainable_weights, and those that need updating are added to self.updates as (old_tensor, new_tensor) pairs.
- call: defines the functionality, that is, the actual math.
- get_output_shape_for: tells Keras how the shape changes.
- get_config: returns the Layer's configuration, including output_dim, W_constraint, and so on.

These four functions are polymorphic.

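When Dense is instantiated, its constructor __init__ ends by calling the constructor of the parent class Layer. From then on, whenever Layer invokes one of these polymorphic functions, the subclass's override runs, which gives the subclass its specific behavior.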

According to the official guide Writing your own Keras layers, users do not need to implement get_config; writing the other three polymorphic functions is enough.
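
The override mechanism can be illustrated with a toy base class. Nothing below is Keras code: the base `Layer` is a simplified stand-in whose `__call__` drives the hooks, and `Scale` is a hypothetical layer that works on plain lists.

```python
class Layer(object):
    """Toy base class: __call__ drives the hooks that subclasses override."""
    def __init__(self):
        self.built = False
        self.trainable_weights = []

    def build(self, input_shape):
        pass

    def call(self, x):
        return x

    def get_output_shape_for(self, input_shape):
        return input_shape

    def __call__(self, x):
        if not self.built:
            self.build((len(x),))   # weights are created on first use
            self.built = True
        return self.call(x)         # dispatches to the subclass override


class Scale(Layer):
    """Hypothetical layer that multiplies each input element by a weight."""
    def __init__(self, factor):
        super(Scale, self).__init__()
        self.factor = factor

    def build(self, input_shape):
        self.w = [self.factor] * input_shape[0]
        self.trainable_weights = [self.w]

    def call(self, x):
        return [wi * xi for wi, xi in zip(self.w, x)]


layer = Scale(2.0)
y = layer([1.0, 2.0, 3.0])
```

The base class never mentions `Scale`, yet calling the instance runs the subclass's `build` and `call`; `get_output_shape_for` is inherited unchanged because this layer keeps the shape.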

Math Operations

As shown, a Layer's computation is concentrated in its call function.

Dense's call is shown in the code above; it simply invokes the activation function from keras/activations.py selected by the activation argument.
keras/activations.py provides the following activations:

def softmax(x)
def elu(x, alpha=1.0)
def softplus(x)
def softsign(x)
def relu(x, alpha=0., max_value=None)
def tanh(x)
def sigmoid(x)
def hard_sigmoid(x)
def linear(x)

When the activation argument is not specified, None is passed and linear is used by default.
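
A lookup with this fallback could look roughly like the sketch below. It is written for plain Python lists, the registry `_ACTIVATIONS` and the function `get` are illustrative names, and only three of the activations are reproduced.

```python
import math


def linear(x):
    return x


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(x):
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


_ACTIVATIONS = {'linear': linear, 'relu': relu, 'sigmoid': sigmoid}


def get(identifier):
    """Resolve an activation from None, a name, or a callable."""
    if identifier is None:
        return linear            # no activation requested: identity
    if callable(identifier):
        return identifier        # already a function, use it as is
    return _ACTIVATIONS[identifier]


default_out = get(None)([-1.0, 2.0])
relu_out = get('relu')([-1.0, 2.0])
```

With this arrangement a layer can store `get(activation)` once in its constructor and call it unconditionally in `call`.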

This shows that a Layer can only compute the forward pass; it has no ability to compute the backward pass.

`class Container(Layer)` Design

A Container is a directed acyclic graph of layers, in effect the topology of a Model. The difference between Container and Model lies in training, which is why Model is built as a subclass of Container.

`__init__()` Function

The constructor __init__() builds the computation graph in a "top-down" fashion. It first processes the tensors of the two layers passed in as the input and output arguments:

def __init__(self, input, output, name=None):
    # First turn the `input` and `output` tensors
    # into the `tensor list` form that `Container` uses
    # Container-specific properties.
    if isinstance(input, (list, tuple)):
        self.inputs = list(input)  # Tensor or list of tensors.
    else:
        self.inputs = [input]
    if isinstance(output, (list, tuple)):
        self.outputs = list(output)
    else:
        self.outputs = [output]

    # Build self.output_layers:
    for x in self.outputs:
        layer, node_index, tensor_index = x._keras_history
        # add the output *layer*
        self.output_layers.append(layer)
        # add the *node* index of the output layer
        self.output_layers_node_indices.append(node_index)
        # add the *tensor* index
        self.output_layers_tensor_indices.append(tensor_index)

    # Build self.input_layers:
    for x in self.inputs:
        layer, node_index, tensor_index = x._keras_history
        # It's supposed to be an input layer, so only one node
        # and one tensor output.
        assert node_index == 0
        assert tensor_index == 0
        self.input_layers.append(layer)
        self.input_layers_node_indices.append(node_index)
        self.input_layers_tensor_indices.append(tensor_index)

    # cache handling for `output_layers` and `input_layers`

Graph Construction

Next, the following function, defined inside __init__, builds the computation graph recursively:

    def build_map_of_graph(tensor, seen_nodes=set(), depth=0,
                           layer=None, node_index=None, tensor_index=None):

The resulting computation graph looks roughly like this:

output layer
|
#------------#------------#
|            |            |
node   ...   node   ...   node
                          |
#------------#------------#
|            |            |
layer  ...   layer  ...   layer
|
#------------#------------#
|            |            |
node   ...   node   ...   node
...          ...          ...
#
|
input layer

The "top-down" construction starts at the output layer and checks the nodes in its inbound_nodes list one by one, adding each to the set of nodes visible in the graph (seen_nodes). Not every node between the output layer and the input layer is useful; only those in seen_nodes are needed by the computation graph.

Finally, it walks the inbound_layers of the current node. For each layer it keeps adding to seen_nodes further down, so the computation graph is built recursively.

Also, a newly constructed Container, being a topology without training capability, has exactly one inbound node and no outbound node.
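
The recursive walk can be sketched on toy classes. This is a simplification of what build_map_of_graph does, not a copy of it: the function `collect_seen_nodes` is a made-up name, it follows every inbound node of a layer instead of one node selected by index, and it ignores tensors and depth.

```python
class Layer(object):
    def __init__(self, name):
        self.name = name
        self.inbound_nodes = []


class Node(object):
    def __init__(self, outbound_layer, inbound_layers):
        self.outbound_layer = outbound_layer
        self.inbound_layers = inbound_layers
        outbound_layer.inbound_nodes.append(self)


def collect_seen_nodes(layer, seen_nodes=None):
    """Walk from `layer` back towards the inputs, recording every node met."""
    if seen_nodes is None:
        seen_nodes = set()
    for node in layer.inbound_nodes:
        if node in seen_nodes:
            continue                      # already visited via another path
        seen_nodes.add(node)
        for inbound in node.inbound_layers:
            collect_seen_nodes(inbound, seen_nodes)
    return seen_nodes


inp, hidden, out, unused = (Layer(n) for n in
                            ('input', 'hidden', 'output', 'unused'))
n_in = Node(inp, [])
n_hidden = Node(hidden, [inp])
n_out = Node(out, [hidden])
n_unused = Node(unused, [inp])   # hangs off the input but never reaches `out`
seen = collect_seen_nodes(out)
```

The node of the `unused` layer is never reached from the output, so it stays out of `seen`, which matches the remark that only nodes in seen_nodes belong to the graph.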

Depth: Avoiding Cycles

To keep cycles out of the directed graph, each Node and Layer is described by a depth. Sorting by depth yields the ordered self.layers_by_depth and self.nodes_by_depth.

Using depth to rule out cycles in a directed graph is easy to understand: if a directed cycle existed, the depth of the Nodes and Layers on it would keep growing without bound, the so-called "count-to-infinity problem".
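
A depth assignment of this kind can be sketched as follows. The graph is given as a plain dict from a layer name to the names of the layers feeding it; `assign_depths` and the example graph are illustrative and do not come from the Keras source.

```python
from collections import defaultdict


def assign_depths(output, inbound):
    """`inbound` maps each layer name to the names of the layers feeding it."""
    depths = {}

    def visit(layer, depth):
        # Keep the largest depth seen, so a layer sits below all its consumers.
        depths[layer] = max(depth, depths.get(layer, 0))
        for parent in inbound[layer]:
            visit(parent, depth + 1)

    visit(output, 0)
    by_depth = defaultdict(list)
    for layer, depth in depths.items():
        by_depth[depth].append(layer)
    return depths, dict(by_depth)


# 'a' feeds both 'b' and 'output', so it must end up deeper than 'b'.
inbound = {'output': ['a', 'b'], 'a': ['input'], 'b': ['a'], 'input': []}
depths, layers_by_depth = assign_depths('output', inbound)
evaluation_order = sorted(layers_by_depth, reverse=True)
```

In this sketch a cycle in `inbound` would make `visit` recurse until Python raises a RecursionError, which is the toy counterpart of the depth growing without bound.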

Training-related Property Methods

Update

In the method def updates(self), whether a Layer needs updating is decided by checking for the 'updates' attribute:

@property
def updates(self):
    updates = []
    for layer in self.layers:
        if hasattr(layer, 'updates'):
            # collect the updates according to `layer.inbound_nodes`
            pass
    return updates

Loss

Losses work the same way, by checking the 'losses' attribute and layer.inbound_nodes.

Weights-related Methods

@property
def trainable_weights(self):
    pass

@property
def non_trainable_weights(self):
    pass

Both functions select weights according to the Layer.trainable attribute. The counts of trainable and non-trainable parameters shown by model.summary() come from these two functions.
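
The selection by `trainable` can be sketched like this. Both classes are simplified stand-ins, and the weights are plain strings rather than backend variables.

```python
class Layer(object):
    """Toy layer whose weights are all trainable or all frozen."""
    def __init__(self, weights, trainable=True):
        self._weights = list(weights)
        self.trainable = trainable

    @property
    def trainable_weights(self):
        return self._weights if self.trainable else []

    @property
    def non_trainable_weights(self):
        return [] if self.trainable else self._weights


class Container(object):
    """Toy container that aggregates the weights of its layers."""
    def __init__(self, layers):
        self.layers = layers

    @property
    def trainable_weights(self):
        weights = []
        for layer in self.layers:
            weights += layer.trainable_weights
        return weights

    @property
    def non_trainable_weights(self):
        weights = []
        for layer in self.layers:
            weights += layer.non_trainable_weights
        return weights


frozen = Layer(['W2', 'b2'], trainable=False)
model = Container([Layer(['W1', 'b1']), frozen])
before = list(model.trainable_weights)
frozen.trainable = True          # unfreezing is visible immediately
after = list(model.trainable_weights)
```

Because the lists are recomputed on every access, flipping `trainable` on a layer changes what the container reports without any extra bookkeeping.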

Forward Propagation

`output` Computation

The most important function here is

def run_internal_graph(self, inputs, masks=None):
    # Computes output tensors for new inputs.
    pass

Because depth was recorded when the graph was built, data can flow and be computed depth by depth:

    depth_keys = list(self.nodes_by_depth.keys())
    depth_keys.sort(reverse=True)
    for depth in depth_keys:
        nodes = self.nodes_by_depth[depth]
        for node in nodes:
            # This is always a single layer, never a list.
            pass

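Because the graph is a directed acyclic graph with the particular shape of deep learning tasks, every node is bound to exactly one Layer, never a list of them. That is the basis of the loop above, which walks from deep depth to shallow depth, that is, from the input layer to the output layer.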

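For each layer, forward propagation means taking the already-computed tensors coming from the input side, computing on them, and passing the result to the next layer. The actual computation is done by the Layer's call method: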

output_tensors = to_list(layer.call(computed_tensors,
                                    computed_masks))
output_masks = to_list(layer.compute_mask(computed_tensors,
                                          computed_masks))

After each layer's forward computation, the updates, losses, and cache entries are recorded and _keras_shape is updated, which makes later training easier. When the loop above finishes, the data has flowed from the input layer to the output layer, and the results are collected:

    for x in self.outputs:
        tensor, mask = tensor_map[str(id(x))]
        pass
        output_tensors.append(tensor)
        output_masks.append(mask)
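
The depth-ordered evaluation can be sketched with ordinary functions in place of layers. `run_graph` and the dict layout are assumptions of this sketch; masks, shapes, and the id-based keys of the real tensor_map are left out.

```python
def run_graph(nodes_by_depth, inputs):
    """Evaluate nodes from the deepest depth to depth 0.

    nodes_by_depth: {depth: [(name, fn, [names of inputs])]}
    inputs: {name: value} for the graph inputs
    """
    tensor_map = dict(inputs)            # everything computed so far
    for depth in sorted(nodes_by_depth, reverse=True):
        for name, fn, input_names in nodes_by_depth[depth]:
            # All inputs are available because they live at a deeper depth.
            args = [tensor_map[n] for n in input_names]
            tensor_map[name] = fn(*args)
    return tensor_map


nodes_by_depth = {
    1: [('double', lambda x: 2 * x, ['x'])],
    0: [('plus_one', lambda h: h + 1, ['double'])],
}
outputs = run_graph(nodes_by_depth, {'x': 3})
```

Reading the final values out of `tensor_map` by name corresponds to the collection loop over self.outputs above.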

Method `call` and the Cache Strategy

The call function does not do the main computation itself; that is mostly done by the run_internal_graph method.

However, call uses a neat caching strategy to reduce the number of calls to run_internal_graph (which evaluates the whole graph and is therefore relatively expensive). The Container constructor __init__ sets up three dicts for this purpose:

    self._output_mask_cache = {}
    self._output_tensor_cache = {}
    self._output_shape_cache = {}

They serve as caches that reduce the number of calls to run_internal_graph. Their role is clear in call:

def call(self, input, mask=None):
    pass
    if cache_key in self._output_tensor_cache:
        return self._output_tensor_cache[cache_key]
    else:
        output_tensors, output_masks, output_shapes = self.run_internal_graph(inputs, masks)
        return output_tensors

This cache design might also serve as a reference for chip design.
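
The caching idea can be sketched as below. `CachedGraph` is a hypothetical class, the cache key is built from `id()` of the inputs only (the key in the sketch ignores masks), and the sketch stores the result itself after a miss.

```python
class CachedGraph(object):
    """Toy container that memoizes graph evaluation per input object."""
    def __init__(self, fn):
        self.fn = fn
        self.calls = 0                      # how often the graph really ran
        self._output_tensor_cache = {}

    def run_internal_graph(self, inputs):
        self.calls += 1                     # stands in for the expensive pass
        return self.fn(inputs)

    def call(self, inputs):
        cache_key = ','.join(str(id(x)) for x in inputs)
        if cache_key in self._output_tensor_cache:
            return self._output_tensor_cache[cache_key]
        output = self.run_internal_graph(inputs)
        self._output_tensor_cache[cache_key] = output
        return output


x = [1.0, 2.0]                              # stand-in for an input tensor
graph = CachedGraph(lambda inputs: [sum(t) for t in inputs])
first = graph.call([x])
second = graph.call([x])                    # same object: served from cache
```

Keying on object identity suits symbolic tensors, which are not modified after creation; for mutable data such a cache would return stale results.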

At this point, the three main classes in keras/engine/topology.py:

Node
Layer
Container

have been covered. Other pieces such as Merge are also important, but for the Sequential model alone they are not essential to understanding the Keras code, so readers can consult the source themselves.

`class Model(Container)` Design

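Compared with Container, the defining feature of the Model class is that it can perform backpropagation; in other words, a Model can be trained, which is what the fit function does. Other methods, such as predict, matter less for understanding a deep learning framework. The functions to understand are therefore compile and fit, which are also the two most important ones for users who train models.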

Method `compile`

The compile function validates its inputs:

def compile(self, optimizer, loss, metrics=None, loss_weights=None,
            sample_weight_mode=None, **kwargs):
    pass

Note that when the user calls compile, no training data or labels have been supplied yet, as in the example we started with:

model.compile(optimizer='rmsprop',
      loss='categorical_crossentropy',
      metrics=['accuracy'])

Normalizing Argument Formats

compile checks the arguments that configure the model, including their data types, and normalizes them into a suitable form. For example, loss may be given as a dict or as a list, and compile has to handle both forms.

Preparing the Targets

The target values, against which the predictions are later compared, are stored in self.targets:

    self.targets = []
    for i in range(len(self.outputs)):
        shape = self.internal_output_shapes[i]
        name = self.output_names[i]
        self.targets.append(K.placeholder(ndim=len(shape),
                            name=name + '_target',
                            sparse=K.is_sparse(self.outputs[i]),
                            dtype=K.dtype(self.outputs[i])))

Each one is a placeholder created through the Tensorflow or Theano backend.

`loss` Computation

The per-output losses are weighted by the loss_weights argument and summed into the total loss:

    total_loss = None
    for i in range(len(self.outputs)):
        y_true = self.targets[i]
        y_pred = self.outputs[i]
        weighted_loss = weighted_losses[i]
        sample_weight = sample_weights[i]
        mask = masks[i]
        loss_weight = loss_weights_list[i]
        output_loss = weighted_loss(y_true, y_pred,
                                    sample_weight, mask)
        if len(self.outputs) > 1:
            self.metrics_tensors.append(output_loss)
            self.metrics_names.append(self.output_names[i] + '_loss')
        if total_loss is None:
            total_loss = loss_weight * output_loss
        else:
            total_loss += loss_weight * output_loss
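
Stripped of tensors, the accumulation above is a weighted sum. The sketch below uses plain floats in place of loss tensors; the function name and the numbers are made up for illustration.

```python
def total_loss(output_losses, loss_weights):
    """Weighted sum of per-output losses, accumulated as in the loop above."""
    total = None
    for output_loss, loss_weight in zip(output_losses, loss_weights):
        if total is None:
            total = loss_weight * output_loss
        else:
            total += loss_weight * output_loss
    return total


# Two outputs: the second loss counts for a quarter of its value.
result = total_loss([0.5, 2.0], [1.0, 0.25])
```

Starting from `None` instead of `0` keeps the first term as whatever type the backend produces, which is presumably why the original loop is written that way.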

Finally, for each pair of true and predicted values, the metric function used to compare them is chosen:

    for i in range(len(self.outputs)):
        y_true = self.targets[i]
        y_pred = self.outputs[i]
        output_metrics = nested_metrics[i]

        for metric in output_metrics:
            if metric == 'accuracy' or metric == 'acc':
                output_shape = self.internal_output_shapes[i]
                acc_fn = None
                # choose the accuracy function that matches the output and loss
                if output_shape[-1] == 1 or self.loss_functions[i] == objectives.binary_crossentropy:
                    acc_fn = metrics_module.binary_accuracy
                elif self.loss_functions[i] == objectives.sparse_categorical_crossentropy:
                    acc_fn = metrics_module.sparse_categorical_accuracy
                else:
                    acc_fn = metrics_module.categorical_accuracy

                append_metric(i, 'acc', acc_fn(y_true, y_pred))
            else:
                pass

The metric functions themselves live in keras/metrics.py, which keras/engine/training.py imports as metrics_module:

from .. import metrics as metrics_module

This code is written against the backend. For binary (0, 1) labels, for example, the metric function is:

def binary_accuracy(y_true, y_pred):
    return K.mean(K.equal(y_true, K.round(y_pred)))
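
The same metric can be written for plain Python lists to see what it computes. This is an illustrative re-implementation, not the backend version; Python's built-in `round` is used in place of K.round, so predictions of exactly 0.5 are avoided in the example.

```python
def binary_accuracy(y_true, y_pred):
    """Fraction of predictions that round to the true 0/1 label."""
    matches = [1.0 if t == round(p) else 0.0
               for t, p in zip(y_true, y_pred)]
    return sum(matches) / len(matches)


# Predictions round to 1, 0, 0, 1; the first two match the labels.
acc = binary_accuracy([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.7])
```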

Method `fit`

fit is the function that trains the model with mini-batch backpropagation:

def fit(self, x, y, batch_size=32, nb_epoch=10, verbose=1, callbacks=None,
        validation_split=0., validation_data=None, shuffle=True,
        class_weight=None, sample_weight=None, initial_epoch=0):
    pass

Data Standardization

fit first calls the method _standardize_user_data to process the data:

    x, y, sample_weights = self._standardize_user_data(
        x, y,
        sample_weight=sample_weight,
        class_weight=class_weight,
        check_batch_dim=False,
        batch_size=batch_size)

This yields the standardized data.

Training Function

The validation data validation_data is not covered here, since users do not always supply it. Next, the input data and the training function are prepared:

    # prepare input arrays and training function
    if self.uses_learning_phase and not isinstance(K.learning_phase(), int):
        ins = x + y + sample_weights + [1.]
    else:
        ins = x + y + sample_weights
    # prepare the op `self.train_function`
    self._make_train_function()
    # hand the op `self.train_function` to `_fit_loop`
    f = self.train_function

The uses_learning_phase attribute is inherited from Layer and passed through Container and Model. It flags whether a Layer uses the backend functions K.in_train_phase() or K.in_test_phase().

_make_train_function prepares the inputs and the op for the training function:

def _make_train_function(self):
    pass
    if self.train_function is None:
        # prepare `inputs`
        if self.uses_learning_phase and not isinstance(K.learning_phase(), int):
            inputs = self.inputs + self.targets + self.sample_weights + [K.learning_phase()]
        else:
            inputs = self.inputs + self.targets + self.sample_weights
        # prepare `updates`
        training_updates = self.optimizer.get_updates(self._collected_trainable_weights,
                                                      self.constraints,
                                                      self.total_loss)
        updates = self.updates + training_updates
        # call the backend to build the op for the training function
        self.train_function = K.function(inputs,
                                         [self.total_loss] + self.metrics_tensors,
                                         updates=updates,
                                         **self._function_kwargs)

`Theano` Backend `Function`

The backend function named function instantiates and returns a Keras Function class from the arguments it is given:

def function(inputs, outputs, updates=[], **kwargs):
    pass
    return Function(inputs, outputs, updates=updates, **kwargs)

The Function class has only two methods: __init__ and __call__, the latter making it a "callable class". A Function object can thus be used like a function, which amounts to overloading the call operator. The outputs can then be computed directly with:

outputs = self.train_function(ins)

The Function constructor mainly takes care of the variable updates, keeping one update per variable:

class Function(object):

    def __init__(self, inputs, outputs, updates=[], **kwargs):
        unique_variables_to_update = {}
        for v, nv in updates:
            if v not in unique_variables_to_update:
                unique_variables_to_update[v] = nv
        updates = unique_variables_to_update.items()
        self.function = theano.function(inputs, outputs, updates=updates,
                                        allow_input_downcast=True,
                                        on_unused_input='ignore',
                                        **kwargs)

It also defines the symbolic input/output function self.function through theano.function; for details on this useful function, see the official Theano manual, function - defines theano.function.

The main job of __call__ is to run the computation and return the numeric values of the outputs expressions:

    def __call__(self, inputs):
        assert isinstance(inputs, (list, tuple))
        return self.function(*inputs)
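
The pattern of a callable object that owns its updates can be sketched without Theano. In the hypothetical class below, an ordinary Python function and a dict of state stand in for the compiled graph and its shared variables; only the deduplication loop and the `__call__` shape follow the listing above.

```python
class Function(object):
    """Toy callable that returns outputs and then applies state updates."""
    def __init__(self, fn, state, updates=()):
        unique_updates = {}
        for name, new_value_fn in updates:
            if name not in unique_updates:   # first update per variable wins
                unique_updates[name] = new_value_fn
        self.fn = fn
        self.state = state
        self.updates = list(unique_updates.items())

    def __call__(self, inputs):
        assert isinstance(inputs, (list, tuple))
        outputs = self.fn(self.state, *inputs)
        # Compute all new values from the old state, then apply them together.
        new_values = [(name, f(self.state, *inputs))
                      for name, f in self.updates]
        for name, value in new_values:
            self.state[name] = value
        return outputs


state = {'w': 1.0}
step = Function(
    fn=lambda s, x: s['w'] * x,
    state=state,
    updates=[('w', lambda s, x: s['w'] - 0.5),   # kept
             ('w', lambda s, x: 0.0)],           # duplicate, dropped
)
out_1 = step([4.0])
out_2 = step([4.0])
```

Each call returns a value computed with the weights as they were before the call and leaves the weights changed for the next one, which is how one call to the train function performs one training step.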

`_fit_loop`

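fit ultimately returns the result of _fit_loop, which is the full history of the training run. By this point the original training data has been turned into ins, out_labels holds the labels under which the outputs of f are logged, and the remaining arguments are passed on to the training function _fit_loop: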

    return self._fit_loop(f, ins, out_labels=out_labels,
                          batch_size=batch_size, nb_epoch=nb_epoch,
                          verbose=verbose, callbacks=callbacks,
                          val_f=val_f, val_ins=val_ins, shuffle=shuffle,
                          callback_metrics=callback_metrics,
                          initial_epoch=initial_epoch)

_fit_loop is built around the abstract call f(ins), where f is the op self.train_function built by the backend and ins is the input training data.

The history is collected through keras/callbacks.py:

self.history = cbks.History()

Leaving aside data handling and preparation, the core of training is the following loop, which is crucial:

    for epoch in range(initial_epoch, nb_epoch):
        # record the history information for this epoch
        callbacks.on_epoch_begin(epoch)
        # shuffle the indices, batch-wise if requested
        if shuffle == 'batch':
            index_array = batch_shuffle(index_array, batch_size)
        elif shuffle:
            np.random.shuffle(index_array)
        # get the index ranges of the batches
        batches = make_batches(nb_train_sample, batch_size)
        epoch_logs = {}

The following loop trains on the training set batch by batch. It first prepares a slice of the training data, sized according to the batch:

        for batch_index, (batch_start, batch_end) in enumerate(batches):
            batch_ids = index_array[batch_start:batch_end]
            try:
                if isinstance(ins[-1], float):
                    # do not slice the training phase flag
                    ins_batch = slice_X(ins[:-1], batch_ids) + [ins[-1]]
                else:
                    ins_batch = slice_X(ins, batch_ids)
            except TypeError:
                raise TypeError('TypeError while preparing batch. '
                                'If using HDF5 input data, '
                                'pass shuffle="batch".')

This calls slice_X, which slices both Python lists and numpy arrays. The resulting ins_batch is the input ins for this epoch and this batch.

            batch_logs = {}
            batch_logs['batch'] = batch_index
            batch_logs['size'] = len(batch_ids)
            callbacks.on_batch_begin(batch_index, batch_logs)
            outs = f(ins_batch)
            if not isinstance(outs, list):
                outs = [outs]
            for l, o in zip(out_labels, outs):
                batch_logs[l] = o

            callbacks.on_batch_end(batch_index, batch_logs)

            if batch_index == len(batches) - 1:  # last batch
                # validation
                if do_validation:
                    # replace with self._evaluate
                    val_outs = self._test_loop(val_f, val_ins,
                                               batch_size=batch_size,
                                               verbose=0)
                    if not isinstance(val_outs, list):
                        val_outs = [val_outs]
                    # same labels assumed
                    for l, o in zip(out_labels, val_outs):
                        epoch_logs['val_' + l] = o
        callbacks.on_epoch_end(epoch, epoch_logs)
        if callback_model.stop_training:
            break
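
The skeleton of the epoch and batch loops can be sketched with the standard library alone. `make_batches` below is written from its described behavior, `fit_loop` is a simplified stand-in that drops callbacks and validation, and the `seed` parameter is an addition of this sketch to make the shuffle repeatable.

```python
import random


def make_batches(size, batch_size):
    """Return (start, end) index pairs that cover range(size)."""
    nb_batch = (size + batch_size - 1) // batch_size   # ceiling division
    return [(i * batch_size, min(size, (i + 1) * batch_size))
            for i in range(nb_batch)]


def fit_loop(f, data, batch_size, nb_epoch, shuffle=True, seed=0):
    """Call f on every batch of every epoch and collect what it returns."""
    rng = random.Random(seed)
    index_array = list(range(len(data)))
    history = []
    for epoch in range(nb_epoch):
        if shuffle:
            rng.shuffle(index_array)       # new sample order each epoch
        batch_outs = []
        for batch_start, batch_end in make_batches(len(data), batch_size):
            batch_ids = index_array[batch_start:batch_end]
            ins_batch = [data[i] for i in batch_ids]
            batch_outs.append(f(ins_batch))
        history.append(batch_outs)
    return history


batches = make_batches(10, 4)
# Use the batch length as a stand-in for the training function's output.
history = fit_loop(lambda batch: len(batch), list(range(10)),
                   batch_size=4, nb_epoch=2)
```

The last batch of each epoch is shorter when the data size is not a multiple of the batch size, which is why the batch size is logged per batch in the original loop.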

Last edited: 2017.12.05 14:06:33
