- Initializing to a constant
In TensorFlow, the tf.constant_initializer(value) class generates a tensor whose initial value is the constant value.
The constructor of the constant_initializer class is defined as:
```python
def __init__(self, value=0, dtype=dtypes.float32, verify_shape=False):
    self.value = value
    self.dtype = dtypes.as_dtype(dtype)
    self._verify_shape = verify_shape
```
- value: the constant value to fill with
- dtype: data type
- verify_shape: whether to verify that the shape of value matches the shape of the variable being initialized; defaults to False, so the shape is not verified and may be adjusted
```python
import tensorflow as tf

value = [0, 1, 2, 3, 4, 5, 6, 7]
init = tf.constant_initializer(value)

with tf.Session() as sess:
    x = tf.get_variable('x', shape=[8], initializer=init)
    x.initializer.run()
    print(x.eval())

# output:
# [ 0.  1.  2.  3.  4.  5.  6.  7.]
```
In neural networks, constant initialization is commonly used to initialize bias terms.
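As a rough illustration (a numpy sketch of the idea, not TensorFlow's actual implementation), a constant initializer simply fills the requested shape with the given value, for example a small bias vector:

```python
import numpy as np

# Numpy sketch of constant initialization (not the TF implementation):
# fill the requested shape with a single value, e.g. a bias vector of 0.1s.
def constant_init(shape, value=0.0):
    return np.full(shape, value, dtype=np.float32)

biases = constant_init((4,), value=0.1)
print(biases)  # [0.1 0.1 0.1 0.1]
```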
When initializing a constant tensor with many elements, specifying every value by hand is inconvenient, so TensorFlow provides the tf.zeros_initializer() and tf.ones_initializer() classes, which initialize tensors to all zeros and all ones respectively.
```python
import tensorflow as tf

init_zeros = tf.zeros_initializer()
init_ones = tf.ones_initializer()  # note: the class must be instantiated with ()

with tf.Session() as sess:
    x = tf.get_variable('x', shape=[8], initializer=init_zeros)
    y = tf.get_variable('y', shape=[8], initializer=init_ones)
    x.initializer.run()
    y.initializer.run()
    print(x.eval())
    print(y.eval())

# output:
# [ 0.  0.  0.  0.  0.  0.  0.  0.]
# [ 1.  1.  1.  1.  1.  1.  1.  1.]
```
- Initializing to a normal distribution
Initializing parameters from a normal distribution is the most common choice in neural networks; both the standard normal distribution and the truncated normal distribution are available.
In TensorFlow, the tf.random_normal_initializer() class generates tensors that follow a normal distribution (standard normal by default).
The tf.truncated_normal_initializer() class generates tensors that follow a truncated normal distribution: samples falling more than two standard deviations from the mean are discarded and redrawn.
The constructors of tf.random_normal_initializer and tf.truncated_normal_initializer are defined as:
```python
def __init__(self, mean=0.0, stddev=1.0, seed=None, dtype=dtypes.float32):
    self.mean = mean
    self.stddev = stddev
    self.seed = seed
    self.dtype = _assert_float_dtype(dtypes.as_dtype(dtype))
```
- mean: mean of the normal distribution, default 0
- stddev: standard deviation of the normal distribution, default 1
- seed: random seed; fixing seed makes every run generate the same values
- dtype: data type
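The truncation rule and the role of seed can be sketched without TensorFlow (a numpy illustration of the behavior described above, not the actual TF implementation): samples farther than two standard deviations from the mean are redrawn, and a fixed seed reproduces the same values.

```python
import numpy as np

def truncated_normal(shape, mean=0.0, stddev=1.0, seed=None):
    """Draw normal samples, redrawing any that fall more than
    2 * stddev from the mean (the truncation rule described above)."""
    rng = np.random.RandomState(seed)
    out = rng.normal(mean, stddev, size=shape)
    while True:
        bad = np.abs(out - mean) > 2 * stddev
        if not bad.any():
            return out
        out[bad] = rng.normal(mean, stddev, size=int(bad.sum()))

x = truncated_normal((10,), seed=42)
print(np.all(np.abs(x) <= 2.0))                             # True: within 2 stddev
print(np.array_equal(x, truncated_normal((10,), seed=42)))  # True: same seed, same draws
```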
```python
import tensorflow as tf

init_random = tf.random_normal_initializer(mean=0.0, stddev=1.0, seed=None, dtype=tf.float32)
init_truncated = tf.truncated_normal_initializer(mean=0.0, stddev=1.0, seed=None, dtype=tf.float32)

with tf.Session() as sess:
    x = tf.get_variable('x', shape=[10], initializer=init_random)
    y = tf.get_variable('y', shape=[10], initializer=init_truncated)
    x.initializer.run()
    y.initializer.run()
    print(x.eval())
    print(y.eval())

# output:
# [-0.40236568 -0.35864913 -0.94253045 -0.40153521  0.1552504   1.16989613
#   0.43091929 -0.31410623  0.70080078 -0.9620409 ]
# [ 0.18356581 -0.06860946 -0.55245203  1.08850253 -1.13627422 -0.1006074
#   0.65564936  0.03948414  0.86558545 -0.4964745 ]
```
- Initializing to a uniform distribution
In TensorFlow, the tf.random_uniform_initializer class generates tensors that follow a uniform distribution. Its constructor is defined as:
```python
def __init__(self, minval=0, maxval=None, seed=None, dtype=dtypes.float32):
    self.minval = minval
    self.maxval = maxval
    self.seed = seed
    self.dtype = dtypes.as_dtype(dtype)
```
- minval: lower bound of the range
- maxval: upper bound of the range
- seed: random seed
- dtype: data type
```python
import tensorflow as tf

init_uniform = tf.random_uniform_initializer(minval=0, maxval=10, seed=None, dtype=tf.float32)

with tf.Session() as sess:
    x = tf.get_variable('x', shape=[10], initializer=init_uniform)
    x.initializer.run()
    print(x.eval())

# output:
# [ 6.93343639  9.41196823  5.54009819  1.38017178  1.78720832  5.38881063
#   3.39674473  8.12443542  0.62157512  8.36026382]
```
As the output shows, the generated numbers are not sorted in ascending or descending order; "uniform" here means that each value is drawn independently, with equal probability, from the range between minval and maxval.
Another TensorFlow class that generates a uniform distribution is tf.uniform_unit_scaling_initializer(); its constructor is:
```python
def __init__(self, factor=1.0, seed=None, dtype=dtypes.float32):
    self.factor = factor
    self.seed = seed
    self.dtype = _assert_float_dtype(dtypes.as_dtype(dtype))
```
Although both generate uniform distributions, tf.uniform_unit_scaling_initializer differs from tf.random_uniform_initializer in that no minimum or maximum needs to be specified; the bounds are computed by the formula:
```python
max_val = math.sqrt(3 / input_size) * factor
min_val = -max_val
```
where input_size is the size of the input dimension of the tensor being generated and factor is a scaling coefficient.
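For example, the bound can be computed directly from the formula (a plain-Python sketch; the input_size and factor values here are illustrative):

```python
import math

# Bound used by uniform_unit_scaling_initializer, per the formula above.
def unit_scaling_bound(input_size, factor=1.0):
    return math.sqrt(3.0 / input_size) * factor

# With input_size=12 and factor=1.0, values are drawn from [-0.5, 0.5].
print(unit_scaling_bound(12))       # 0.5
print(unit_scaling_bound(12, 2.0))  # 1.0
```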
```python
import tensorflow as tf

init_uniform_unit = tf.uniform_unit_scaling_initializer(factor=1.0, seed=None, dtype=tf.float32)

with tf.Session() as sess:
    x = tf.get_variable('x', shape=[10], initializer=init_uniform_unit)
    x.initializer.run()
    print(x.eval())

# output:
# [-1.65964031  0.59797513 -0.97036457 -0.68957627  1.69274557  1.2614969
#   1.55491126  0.12639415  0.54466736 -1.56159735]
```
- Initializing to variance-scaled normal or uniform distributions
The tf.variance_scaling_initializer() class can generate either truncated-normal or uniform tensors and exposes additional control parameters. Its constructor:
```python
def __init__(self, scale=1.0, mode="fan_in", distribution="normal",
             seed=None, dtype=dtypes.float32):
    if scale <= 0.:
        raise ValueError("`scale` must be positive float.")
    if mode not in {"fan_in", "fan_out", "fan_avg"}:
        raise ValueError("Invalid `mode` argument:", mode)
    distribution = distribution.lower()
    if distribution not in {"normal", "uniform"}:
        raise ValueError("Invalid `distribution` argument:", distribution)
    self.scale = scale
    self.mode = mode
    self.distribution = distribution
    self.seed = seed
    self.dtype = _assert_float_dtype(dtypes.as_dtype(dtype))
```
- scale: scaling factor
- mode: one of "fan_in", "fan_out", or "fan_avg"; controls how the standard deviation stddev is computed
- distribution: one of "normal" or "uniform"; selects whether the generated tensor follows a truncated normal or a uniform distribution
When distribution is "normal", a truncated normal distribution is generated with standard deviation stddev = sqrt(scale / n), where n depends on the setting of mode:
- mode = "fan_in": n is the number of input units;
- mode = "fan_out": n is the number of output units;
- mode = "fan_avg": n is the average of the numbers of input and output units;
When distribution is "uniform", a uniformly distributed random tensor is generated, with maximum max_value and minimum min_value computed as:
```python
max_value = sqrt(3 * scale / n)
min_value = -max_value
```
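Both rules can be sketched in plain Python (this mirrors the formulas above, not TensorFlow's internals; the fan_in and fan_out values are illustrative):

```python
import math

def scaling_n(fan_in, fan_out, mode="fan_in"):
    # n as selected by `mode`.
    return {"fan_in": fan_in, "fan_out": fan_out,
            "fan_avg": (fan_in + fan_out) / 2.0}[mode]

def normal_stddev(scale, n):
    return math.sqrt(scale / n)      # distribution="normal"

def uniform_bound(scale, n):
    return math.sqrt(3 * scale / n)  # distribution="uniform"

n = scaling_n(fan_in=4, fan_out=16, mode="fan_in")
print(normal_stddev(1.0, n))   # 0.5
print(uniform_bound(1.0, n))   # ~0.866
```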
```python
import tensorflow as tf

init_variance_scaling_normal = tf.variance_scaling_initializer(
    scale=1.0, mode="fan_in", distribution="normal", seed=None, dtype=tf.float32)
init_variance_scaling_uniform = tf.variance_scaling_initializer(
    scale=1.0, mode="fan_in", distribution="uniform", seed=None, dtype=tf.float32)

with tf.Session() as sess:
    x = tf.get_variable('x', shape=[10], initializer=init_variance_scaling_normal)
    y = tf.get_variable('y', shape=[10], initializer=init_variance_scaling_uniform)
    x.initializer.run()
    y.initializer.run()
    print(x.eval())
    print(y.eval())

# output:
# [ 0.55602223  0.36556259  0.39404872 -0.11241052  0.42891756 -0.22287074
#   0.15629818  0.56271428 -0.15364751 -0.03651841]
# [ 0.22965753 -0.1339919  -0.21013224  0.112804   -0.49030468  0.21375734
#   0.24524075 -0.48397955  0.02254289 -0.07996771]
```
- Other initializers
- tf.orthogonal_initializer(): initializes to a random orthogonal matrix; the shape must be at least two-dimensional
- tf.glorot_uniform_initializer(): uniform random values whose range depends on the numbers of input and output units
- tf.glorot_normal_initializer(): truncated-normal random values whose standard deviation depends on the numbers of input and output units
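The dependence on the input and output unit counts follows the standard Glorot (Xavier) rules; as a sketch (illustrative fan_in/fan_out values, not TensorFlow's internals), the uniform variant draws from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), and the normal variant uses stddev = sqrt(2 / (fan_in + fan_out)):

```python
import math

def glorot_uniform_limit(fan_in, fan_out):
    return math.sqrt(6.0 / (fan_in + fan_out))

def glorot_normal_stddev(fan_in, fan_out):
    return math.sqrt(2.0 / (fan_in + fan_out))

print(glorot_uniform_limit(2, 4))   # 1.0
print(glorot_normal_stddev(4, 12))  # ~0.354
```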
```python
import tensorflow as tf

init_orthogonal = tf.orthogonal_initializer(gain=1.0, seed=None, dtype=tf.float32)
init_glorot_uniform = tf.glorot_uniform_initializer()
init_glorot_normal = tf.glorot_normal_initializer()

with tf.Session() as sess:
    x = tf.get_variable('x', shape=[4, 4], initializer=init_orthogonal)
    y = tf.get_variable('y', shape=[10], initializer=init_glorot_uniform)
    z = tf.get_variable('z', shape=[10], initializer=init_glorot_normal)
    x.initializer.run()
    y.initializer.run()
    z.initializer.run()
    print(x.eval())
    print(y.eval())
    print(z.eval())

# output:
# [[ 0.41819954  0.38149482  0.82090431  0.07541249]
#  [ 0.41401231  0.21400851 -0.38360971  0.79726893]
#  [ 0.73776144 -0.62585676 -0.06246936 -0.24517137]
#  [ 0.33077344  0.64572859 -0.41839844 -0.54641217]]
# [-0.11182356  0.01995623 -0.0083192  -0.09200105  0.49967837  0.17083591
#   0.37086374  0.09727859  0.51015782 -0.43838671]
# [-0.50223351  0.18181904  0.43594137  0.3390047   0.61405027  0.02597036
#   0.31719241  0.04096413  0.10962497 -0.13165198]
```