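Why I looked into this:

While studying the code of the gold-medal winner of the “魔鏡杯” (Magic Mirror Cup) risk-control competition, I found that once a model has been built, the hyperopt package can tune its parameters automatically. There is no need to keep entering parameter values by hand and judging the results, which is especially convenient when there are many parameter combinations.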

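Common tools for parameter tuning:

The usual tuning methods are grid search and random search. Grid search scans the whole space, so it is slow. Random search is fast, but it can miss important points in the space and so lacks precision. Bayesian optimization was introduced to address both problems.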
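To make the trade-off concrete, here is a minimal sketch that uses only the standard library. The toy loss, the search square [-1, 1] x [-1, 1] and the evaluation counts are assumptions chosen for illustration, not something taken from hyperopt.

```python
import itertools
import random

def loss(x, y):
    """Toy objective with its minimum at (0.3, -0.2); purely illustrative."""
    return (x - 0.3) ** 2 + (y + 0.2) ** 2

def grid_search(steps):
    """Evaluate every point of a steps x steps grid over [-1, 1] x [-1, 1]."""
    axis = [-1 + 2 * i / (steps - 1) for i in range(steps)]
    return min((loss(x, y), x, y) for x, y in itertools.product(axis, axis))

def random_search(n_evals, seed=0):
    """Evaluate n_evals points drawn uniformly from the same square."""
    rng = random.Random(seed)
    points = ((rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_evals))
    return min((loss(x, y), x, y) for x, y in points)

grid_best = grid_search(steps=11)        # 11 * 11 = 121 evaluations
random_best = random_search(n_evals=30)  # 30 evaluations
print("grid  :", grid_best)
print("random:", random_best)
```

The grid can never do better than its nearest grid point, and its cost grows with every extra dimension, while random search spends a fixed budget but has no guarantee of landing near the optimum.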

hyperopt is a tool that tunes parameters through Bayesian optimization. For algorithms with many parameters, such as XGBoost, it can be used to find good parameter values.

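Usage

fmin is probably the most important function; everything introduced below is something that can be set through fmin. This post walks through the fmin parameters and uses them to search for a model's best parameters.
Hyperopt has four key ingredients: the function to minimize, the search space, the trials database (optional), and the search algorithm (optional).
Let's start with a simple example, then explain and extend it.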

from hyperopt import hp, fmin, rand, tpe, space_eval
space = [hp.uniform('x', 0, 1), hp.normal('y', 0, 1)]
def q(args):
    x, y = args
    return x ** 2 + y ** 2
best = fmin(q, space, algo=rand.suggest, max_evals=100)
print(space_eval(space, best))

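In the example above, fmin searches space for the point that minimizes the return value of the objective q. The code passes rand.suggest (random search); tpe.suggest (Tree of Parzen Estimators) can be passed as algo instead. After 100 evaluations fmin returns a dict keyed by label, with entries such as {'x': 0.000269455723739237}, and space_eval maps that dict back onto the structure of space.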

Minimizing the objective function

First define an objective function that takes one argument and returns a loss value, for example q(x, y) = x^2 + y^2. Note that the function must be a loss to be minimized, not a score to be maximized.

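Search space

Define a parameter space. For example, x takes values in the interval [0, 1] and y is any real number, so the code above declares x with
hp.uniform(label, low, high)
and y with hp.normal(label, mu, sigma).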

Forms a search space can take

A search space can be built from lists, tuples and dictionaries.

from hyperopt import hp
list_space = [
    hp.uniform('a', 0, 1),
    hp.loguniform('b', 0, 1)]
tuple_space = (
    hp.uniform('a', 0, 1),
    hp.loguniform('b', 0, 1))
dict_space = {
    'a': hp.uniform('a', 0, 1),
    'b': hp.loguniform('b', 0, 1)}

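Use the sample function to draw from a parameter space:

from hyperopt.pyll.stochastic import sample
print(sample(list_space))
# => [0.13, .235]   (illustrative shape only; values vary between runs)
# A nested space mixing lists, dicts and plain literals.
nested_space = [
    [{'case': 1, 'a': hp.uniform('a', 0, 1)},
     {'case': 2, 'b': hp.loguniform('b', 0, 1)}],
    'extra_literal_string',
    3]
print(sample(nested_space))
# => [[{'case': 1, 'a': 0.12}, {'case': 2, 'b': 2.3}],
# 'extra_literal_string',
# 3]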

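Available parameter expressions

hp.choice(label, options) returns one of the options, which are given as a list or tuple. The options can themselves be nested expressions, which is how conditional parameters are built.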

num_leaves=hp.choice('num_leaves',range(10,100,10))

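hp.pchoice(label, p_options) returns one of the options with the given probability, where p_options is a list of (probability, option) pairs. The options are therefore not equally likely during the search.
hp.uniform(label, low, high): the parameter is uniformly distributed between low and high.
hp.quniform(label, low, high, q): the parameter takes the value round(uniform(low, high) / q) * q, which suits discrete values.
hp.loguniform(label, low, high) returns a value drawn as exp(uniform(low, high)), so the logarithm of the return value is uniformly distributed. During optimization the variable is constrained to the interval [exp(low), exp(high)].
hp.randint(label, upper) returns a random integer in the half-open interval [0, upper).
hp.normal(label, mu, sigma): normally distributed with mean mu and standard deviation sigma. The range of the return value cannot be bounded.
hp.qnormal(label, mu, sigma, q): round(normal(mu, sigma) / q) * q.
hp.lognormal(label, mu, sigma): exp(normal(mu, sigma)), so the logarithm of the return value is normally distributed.
hp.qlognormal(label, mu, sigma, q): round(exp(normal(mu, sigma)) / q) * q.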

Notes

To enumerate the integers in [1, 100], use choice rather than quniform.
hp.randint(label, upper) returns a random integer in [0, upper); it is typically used for random seed values.
hp.quniform(label, low, high, q), where low and high are the lower and upper bounds of the range, yields rounded values but in float form, so it may return 1.0. If the model requires an integer parameter, convert explicitly with int().
hp.loguniform(label, low, high) returns values in [e^low, e^high] that follow a log-uniform distribution. The values cluster toward the lower end of the range, with a density that falls off as 1/x.
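The clustering toward the lower end can be checked without hyperopt by sampling exp(uniform(low, high)) directly, which is the definition given above. This is a small sketch using only the standard library; the bounds log(1) and log(100) are an illustrative choice.

```python
import math
import random

rng = random.Random(0)
low, high = math.log(1), math.log(100)  # values will fall in [1, 100]

# The definition of a log-uniform draw: exp(uniform(low, high)).
samples = [math.exp(rng.uniform(low, high)) for _ in range(10_000)]

below_10 = sum(s < 10 for s in samples) / len(samples)
below_50 = sum(s < 50 for s in samples) / len(samples)
print(f"share below 10: {below_10:.2f}")  # about half: 10 is the geometric midpoint
print(f"share below 50: {below_50:.2f}")  # about log(50) / log(100), roughly 0.85
```

Half of the draws land in [1, 10] and the other half in [10, 100], which is why loguniform suits parameters such as C or gamma whose useful values span several orders of magnitude.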

A more complex search space

from hyperopt import hp
space = hp.choice('classifier_type', [
    {
        'type': 'naive_bayes',
    },
    {
        'type': 'svm',
        'C': hp.lognormal('svm_C', 0, 1),
        'kernel': hp.choice('svm_kernel', [
            {'ktype': 'linear'},
            {'ktype': 'RBF', 'width': hp.lognormal('svm_rbf_width', 0, 1)},
        ]),
    },
    {
        'type': 'dtree',
        'criterion': hp.choice('dtree_criterion', ['gini', 'entropy']),
        'max_depth': hp.choice('dtree_max_depth',
            [None, hp.qlognormal('dtree_max_depth_int', 3, 1, 1)]),
        'min_samples_split': hp.qlognormal('dtree_min_samples_split', 2, 1, 1),
    },
])

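Search algorithms

algo specifies the search algorithm. The following are currently supported:
① Random search (hyperopt.rand.suggest)
② Simulated annealing (hyperopt.anneal.suggest)
③ TPE (hyperopt.tpe.suggest; the full name is Tree-structured Parzen Estimator Approach)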

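Other useful parameters and methods

Trials

A Trials object simply records, for each evaluation, which parameters were used and what was returned. When it is used, fn returns a dict that contains a status in addition to the loss. Trials keeps its data in a BSON-compatible form, so MongoDB can be used (through MongoTrials) to distribute the computation.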

from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
 
fspace = {
    'x': hp.uniform('x', -5, 5)
}
 
def f(params):
    x = params['x']
    val = x**2
    return {'loss': val, 'status': STATUS_OK}
 
trials = Trials()
best = fmin(fn=f, space=fspace, algo=tpe.suggest, max_evals=50, trials=trials)
 
print('best:', best)
 
print('trials:')
for trial in trials.trials[:2]:
    print(trial)

For results returned with STATUS_OK the loss is taken into account, while results returned with STATUS_FAIL are ignored.

The recorded values can be plotted: a variable against the loss to see how the two relate, or tid against a variable to see how the search positions converge (not convergence in the mathematical sense).
A Trials object exposes the following:

trials.trials - a list of dictionaries representing everything about the search
trials.results - a list of dictionaries returned by 'objective' during the search
trials.losses() - a list of losses (float for each 'ok' trial)
trials.statuses() - a list of status strings

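Finding the best parameters through cross-validation

cross_val_score
For the estimator being evaluated, cross_val_score returns an array holding the score of each of the K folds, and mean() is normally applied to it. Check what the estimator's default scoring is and whether a larger or a smaller value means a better fit. If you specify scoring yourself, be sure you know what its value means. If it is left unspecified, classification estimators generally use accuracy, where larger is better, so the loss given to hyperopt should be the negative of that value, because hyperopt finds the best match by minimizing the loss.

Normalizing or scaling the features can itself be made a choice, to see whether it helps. If it does, best will show normalize as 1.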

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import normalize, scale
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials


iris = load_iris()
X = iris.data
y = iris.target

def hyperopt_train_test(params):
    params = dict(params)  # work on a copy so the caller's dict is untouched
    X_ = X[:]

    # 'normalize' and 'scale' are not KNeighborsClassifier arguments,
    # so pop them before the classifier is built, whatever their value.
    if params.pop('normalize', 0) == 1:
        X_ = normalize(X_)

    if params.pop('scale', 0) == 1:
        X_ = scale(X_)

    clf = KNeighborsClassifier(**params)
    return cross_val_score(clf, X_, y).mean()

space4knn = {
    'n_neighbors': hp.choice('n_neighbors', range(1, 50)),
    'scale': hp.choice('scale', [0, 1]),  # must be choice; do not use quniform
    'normalize': hp.choice('normalize', [0, 1])
}

def f(params):
    acc = hyperopt_train_test(params)
    return {'loss': -acc, 'status': STATUS_OK}  # note the minus sign

trials = Trials()
best = fmin(f, space4knn, algo=tpe.suggest, max_evals=100, trials=trials)
print(best)  # hp.choice entries are reported as indices into their option lists


A test run on sklearn data

from hyperopt import fmin, tpe, hp
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn import svm
from sklearn import datasets

# Three SVM hyperparameters: C is the penalty factor, kernel is the kernel type,
# and gamma is an extra kernel parameter (its meaning depends on the kernel type).
# Unlike a traditional grid search (GridSearch), only the probability distribution of each
# parameter has to be given; there is no need to enumerate concrete values step by step.
parameter_space_svc = {
    # loguniform means the logarithm of the parameter is uniformly distributed
    'C': hp.loguniform("C", np.log(1), np.log(100)),
    'kernel': hp.choice('kernel', ['rbf', 'poly']),
    'gamma': hp.loguniform("gamma", np.log(0.001), np.log(0.1)),
}

# The handwritten digits dataset: 1,797 images of 8x8 pixels, each flattened to 64 features.
# The task is to predict which of the digits 0-9 an image shows.
digits = datasets.load_digits()

#--------------------split into training and test sets--------------------
train_data = digits.data[0:1300]
train_target = digits.target[0:1300]
test_data = digits.data[1300:-1]      # note: the slice [1300:-1] leaves out the final sample
test_target = digits.target[1300:-1]
#--------------------------------------------------------------------------

# Counter, incremented each time a parameter combination is evaluated
count = 0

def function(args):
    print(args)

    # ** unpacks the dict into keyword arguments, which keeps the call short
    clf = svm.SVC(**args)

    # Train the model
    clf.fit(train_data, train_target)

    # Predict on the test set
    prediction = clf.predict(test_data)

    global count
    count = count + 1
    score = accuracy_score(test_target, prediction)
    print("Run %s, test accuracy:" % str(count), score)

    # hyperopt only offers fmin, so to maximize a value, return its negative
    return -score

# algo specifies the search algorithm; the following are currently supported:
# ① Random search (hyperopt.rand.suggest)
# ② Simulated annealing (hyperopt.anneal.suggest)
# ③ TPE (hyperopt.tpe.suggest; full name Tree-structured Parzen Estimator Approach)
# max_evals is the upper limit on evaluations: even if the global optimum has not been found
# by then, the search stops and returns the best result found so far
best = fmin(function, parameter_space_svc, algo=tpe.suggest, max_evals=100)

# best["kernel"] holds an index into the option list, so map it back to the name
kernel_list = ['rbf', 'poly']
best["kernel"] = kernel_list[best["kernel"]]

print("Best parameters:", best)

clf = svm.SVC(**best)
print(clf)

The output looks like this:

{'gamma': 0.0010051585652497248, 'kernel': 'poly', 'C': 29.551164584073586}
Run 1, test accuracy: 0.959677419355
{'gamma': 0.006498482991283678, 'kernel': 'rbf', 'C': 6.626826808981864}
Run 2, test accuracy: 0.834677419355
{'gamma': 0.008192671915044216, 'kernel': 'poly', 'C': 34.48947180442318}
Run 3, test accuracy: 0.959677419355
...
{'gamma': 0.001359874432712413, 'kernel': 'rbf', 'C': 1.6402360233244775}
Run 98, test accuracy: 0.971774193548
{'gamma': 0.0029328466160223813, 'kernel': 'poly', 'C': 1.6328276445108112}
Run 99, test accuracy: 0.959677419355
{'gamma': 0.0015786919481979775, 'kernel': 'rbf', 'C': 4.669133703622153}
Run 100, test accuracy: 0.969758064516
Best parameters: {'gamma': 0.00101162002595069, 'kernel': 'rbf', 'C': 21.12514792460218}
SVC(C=21.12514792460218, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma=0.00101162002595069,
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)
