Using libsvm - part[1]

Purpose

libsvm is a collection of tools for SVMs (Support Vector Machines), created by Chih-Chung Chang and Chih-Jen Lin at National Taiwan University (NTU).

Currently, version 3.22 provides interfaces for MATLAB, Octave, Python, and more. I will try to introduce this powerful toolbox in a practical way.

SVM - Support Vector Machines

  • This concept is hard to explain without heavy math or numerical procedures. Refer to the original libsvm paper by Chih-Jen Lin if you want to know more than how to use it.

  • The idea behind SVMs is to make the original problem linearly separable by applying a non-linear mapping function. The SVM then finds the optimal separating hyperplane in the mapped space, so future data can be classified by checking which side of this hyperplane it falls on. Under the hood, then, an SVM is a tool for CLASSIFICATION and PREDICTION, and its accuracy depends heavily on the choice of mapping method.

  • Basic steps for an SVM procedure:

      1. Select a training set of instance-label pairs: P[i]=(x[i],y[i]), where x[i] holds the quantitative properties of P[i] and y[i] is a binary label for P[i], so y[i] can only take one of two values, such as +1 or -1;
      2. Select a family of mapping functions for the target SVM; its parameters are then obtained by solving an equivalent optimization problem;
      3. Select the hyperplane in the mapped space that marks the boundary between the two values of y;
      4. Classify y[j] for each P[j] in the test set by applying the mapping function to P[j] and checking its position relative to the hyperplane selected in step 3.
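
To make the steps above a bit more concrete, here is a minimal sketch in plain Python. It does not use libsvm and does not solve any optimization problem: the mapping function phi, the hyperplane (w, b), and the four-point XOR-like data set are all hand-picked assumptions, chosen only to show how a non-linear mapping can make a problem linearly separable.

```python
# Illustrative sketch only: phi, w and b are hand-picked assumptions,
# not values computed by libsvm.

def phi(x):
    """Map a 2-D point into 3-D by appending the product of its coordinates."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


def classify(x, w, b):
    """Return +1 or -1 depending on which side of the hyperplane phi(x) lies."""
    score = sum(wi * zi for wi, zi in zip(w, phi(x))) + b
    return 1 if score >= 0 else -1


# XOR-like training set: no straight line in 2-D separates the two labels.
training_set = [((1, 1), -1), ((-1, -1), -1), ((1, -1), 1), ((-1, 1), 1)]

# In the mapped space the third coordinate alone separates the classes.
w, b = (0.0, 0.0, -1.0), 0.0

predictions = [classify(x, w, b) for x, _ in training_set]
print(predictions)  # matches the labels in training_set
```

A real SVM would learn the hyperplane from the training set instead of having it written by hand, and would usually apply the mapping implicitly through a kernel function.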

Using the libsvm package to solve a problem

Install the libsvm package

    1. Download the libsvm package from the Download section of its homepage;
    2. Untar/unzip the tarball/zip file to obtain the source code;
    3. Check the Makefiles inside the package. If you are not familiar with make, think of a Makefile as a list of recipes for turning the source code into binaries;
    4. Run make at the top level if you only need the command-line tools, or run it inside the subdirectories to build other interfaces such as Python;
    • 4.1. If you are blocked by "make: g++: Command not found", install the "gcc-c++" package (Fedora) or another C++ compiler.
        # An installation example on FC rawhide
        [chunwang@localhost matlab]$ uname -r 
        4.11.0-0.rc7.git3.1.fc27.x86_64
    
        # Download packages
        [chunwang@localhost libsvm]$ export LIBSVM_URL="http://www.csie.ntu.edu.tw/~cjlin/cgi-bin/libsvm.cgi?+http://www.csie.ntu.edu.tw/~cjlin/libsvm+tar.gz"
        [chunwang@localhost libsvm]$ wget "$LIBSVM_URL" -O libsvm.tar.gz &>/dev/null; echo $?
        0
    
        # Untar to obtain source code
        [chunwang@localhost libsvm]$ (tar -xvf ./libsvm.tar.gz && rm -f libsvm.tar.gz) &>/dev/null; echo $?
        0
    
        # Check and Make
        [chunwang@localhost libsvm]$ cd libsvm-3.22/
        [chunwang@localhost libsvm-3.22]$ find . -name Makefile
        ./java/Makefile
        ./svm-toy/qt/Makefile
        ./svm-toy/gtk/Makefile
        ./python/Makefile
        ./matlab/Makefile
        ./Makefile
        [chunwang@localhost libsvm-3.22]$ cat ./Makefile|grep all:
        all: svm-train svm-predict svm-scale
        [chunwang@localhost libsvm-3.22]$ rpm -q gcc-c++ || sudo yum install -y gcc-c++
        gcc-c++-7.0.1-0.16.fc27.x86_64
    
        #- Make binary directly
        [chunwang@localhost libsvm-3.22]$ make all &>/dev/null; echo $?
        0
        #- Make for python
        [chunwang@localhost libsvm-3.22]$ cd python/; make &>/dev/null; echo $?; cd ~-
        0
        #- Make for octave
        [chunwang@localhost libsvm-3.22]$ cd matlab/
        [chunwang@localhost matlab]$ octave --eval "make octave" &>/dev/null; echo $?; cd ~-
        0
    

Using libsvm for analysis

    1. Convert the data into the libsvm input format;
    • The example file shipped with the libsvm package shows that the format is very easy to parse:
        [chunwang@localhost libsvm-3.22]$ cat ./heart_scale | head -1 
        +1 1:0.708333 2:1 3:1 4:-0.320755 5:-0.105023 6:-1 7:1 8:-0.419847 9:-1 10:-0.225806 12:1 13:-1
    
        # Line[i] == "y[i] j:x[i][j] ..." where y[i] is +1/-1 and j is the integer feature index (zero-valued features, like index 11 above, may be omitted)
        # A conversion example using AWK
        [chunwang@localhost libsvm-3.22]$ echo 32,-2,+1 | awk -F"," '{print $NF" 1:"$1" 2:"$2}'
        +1 1:32 2:-2
    
    2. Train a model on the processed input file and obtain the result (using heart_scale as an example, with the first 200 lines as the training set).
    • Refer to the Graphic Interface section of the libsvm homepage for more information on the parameters of svm-train
    • A very simple example using the default model (using the binaries directly):
        # Split the original data file into a training set and a test set
        [chunwang@localhost libsvm-3.22]$ head -200 ./heart_scale > ./heart_scale_train
        [chunwang@localhost libsvm-3.22]$ tail -70 ./heart_scale > ./heart_scale_test
    
        # Train the model by optimization
        [chunwang@localhost libsvm-3.22]$ ./svm-train heart_scale_train 
        *
        optimization finished, #iter = 147
        nu = 0.453249
        obj = -75.742327, rho = 0.439634
        nSV = 105, nBSV = 78
        Total nSV = 105
    
        # Predict and store result into target output file
        [chunwang@localhost libsvm-3.22]$ ./svm-predict heart_scale_test heart_scale_train.model heart_scale_test_output
        Accuracy = 81.4286% (57/70) (classification)
    
        # All test results are stored in this output file; line i holds the predicted y[i] for Line[i] == "y[i] j:x[i][j] ..." of the test set
        [chunwang@localhost libsvm-3.22]$ cat ./heart_scale_test_output | sort | uniq
        1
        -1
    
        # Some concepts in the svm-train and svm-predict output:
        iter     : Number of iterations
        nu       : Parameter of the equivalent nu-SVM formulation (not a kernel parameter)
        obj      : Optimal objective value of the dual SVM problem
        rho      : Bias term of the decision function
        nSV      : Number of support vectors
        nBSV     : Number of bounded support vectors
        Accuracy = Correctly predicted data / Total testing data × 100%
    
    • Equivalent procedures with Python or Octave
       # python
    
       [chunwang@localhost python]$ cat ./test.py
       from svmutil import *
    
       y, x = svm_read_problem('../heart_scale')
       model = svm_train(y[:200], x[:200])
       p_label, p_acc, p_val = svm_predict(y[200:], x[200:], model)
    
       --------------------------------------------------------------
    
       # octave
    
       # MATLAB and Octave take x[i] and y[i] as matrices, so the input procedure is different
       >> [label, data] = libsvmread("../heart_scale")    # Read from data file using libsvmread
       >> model = svmtrain(label(1:200,:), data(1:200,:)) # Generate target SVM model
    
       >> svmpredict(label(201:270,:), data(201:270,:), model)     # Predict with test set and SVM model
       Accuracy = 81.4286% (57/70) (classification)
       ans =
    
          1
       ...
    
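For readers who would rather not use AWK, the conversion into the libsvm input format can also be sketched in plain Python. The two helper names below, csv_row_to_libsvm and parse_libsvm_line, are my own illustrative names and are not part of libsvm. The sketch assumes the label is the last comma-separated field, as in the AWK example above.

```python
def csv_row_to_libsvm(row, sep=","):
    """Convert 'x1,x2,...,label' into 'label 1:x1 2:x2 ...' (1-based indices)."""
    fields = [field.strip() for field in row.split(sep)]
    *values, label = fields
    pairs = [f"{index}:{value}" for index, value in enumerate(values, start=1)]
    return " ".join([label] + pairs)


def parse_libsvm_line(line):
    """Split 'label index:value ...' into (label, {index: value})."""
    label, *pairs = line.split()
    features = {}
    for pair in pairs:
        index, value = pair.split(":")
        features[int(index)] = float(value)
    return float(label), features


# Same input as the AWK example.
print(csv_row_to_libsvm("32,-2,+1"))

# A shortened line in the style of heart_scale; absent indices are simply
# missing from the resulting dictionary.
label, features = parse_libsvm_line("+1 1:0.708333 2:1 12:1 13:-1")
print(label, sorted(features))
```

When using the Python interface, svm_read_problem already does the parsing, so a helper like parse_libsvm_line is only useful for inspecting files by hand.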

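The accuracy figure reported by svm-predict is simple arithmetic and can be checked by hand. The sketch below uses a hypothetical helper named accuracy and made-up label lists, built only to reproduce the 57-out-of-70 ratio from the run above; they are not the real heart_scale predictions.

```python
def accuracy(true_labels, predicted_labels):
    """Return the percentage of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)


# Made-up data: 70 test instances, 57 of them predicted correctly.
true_labels = [1] * 70
predicted_labels = [1] * 57 + [-1] * 13

print(f"Accuracy = {accuracy(true_labels, predicted_labels):.4f}% (57/70)")
```

With real data, the true labels would come from the first column of heart_scale_test and the predictions from heart_scale_test_output, one per line.
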
Useful materials

  • [Google Scholar] Chih-Jen Lin - Link
  • [Quora] How to use libsvm in Matlab? - Link