本篇為在University of Birmingham 學習Advanced Nature-Inspired Search and Optimisation課程中的筆記之一
This is one of the notes from the Advanced Nature-Inspired Search and Optimisation course at the University of Birmingham
[toc]
1 問題引出——螺栓螺母的匹配問題
1.1 一個螺栓與n個螺母(Matching one Bolt to n Nuts)
- The everyday problem: given one bolt and a collection of n nuts of different sizes, find a nut match the bolt
- In mathematical form: given an array of n elements, find the first element of which the value equals to x
- Q: How to solve it using an algorithm? How to solve it efficiently?
1.2 n個螺栓與n個螺母(the real Nuts and Bolts Problem)
- A disorganized carpenter has a collection of n nuts of distinct sizes and n bolts and would like to find the corresponding pairs of bolts and nuts. Each nut matches exactly one bolt (and vice versa), and he can only compare nuts to bolts, i.e., he can neither compare nuts to nuts nor bolts to bolts.
- Can you help the carpenter match the nuts and bolts quickly?
- This is called Nuts and Bolts Problem or Lock and Key problem.
- Q: How to formulate the problem? How to solve it using an algorithm? How to solve it efficiently?
2 方案介紹——隨機算法 (Randomised Algorithms)
“For many problems, a randomised algorithm is the simplest, the fastest or both.” —— Prabhakar Raghavan, Vice President of Engineering at Google.
2.1 算法一覽_Categories of algorithms by design paradigm
- 分治算法(Divide and conquer algorithms, e.g., quicksort algorithm, Merge-Sort)
- 數學規(guī)劃算法(Mathematical programming algorithms, e.g., linear programming, Multi-objective programming, Dynamic programming algorithms)
- 搜索和枚舉算法(Search and enumeration algorithms)
- 蠻力算法_Brute force algorithms, enumerating all possible candidate solutions and check
- 改進的蠻力算法_Improved brute force algorithms, e.g., branch and bound algorithms
-
啟發(fā)式算法_Heuristic algorithms
- 局部搜索_Local search, e.g., greedy search
- 隨機算法_Randomised algorithms, which include Evolutionary Computation, etc.
2.2 啟發(fā)式算法與隨機算法_Heuristic Algorithms & Randomised Algorithms
1). 啟發(fā)式在計算機中的解釋:
一種(通常是簡單的)算法油狂,可以在合理的時間內為問題提供足夠好的解決方案 (A (usually simple) algorithm that produces a good enough solution for a problem in a reasonable time frame)
- 解決方案(通常)不是最佳方案肉迫,但令人滿意:
- 更快:替代蠻力(窮舉)搜索
- 權衡最優(yōu)性纺蛆,完整性,準確性或精度以提高速度抬旺。
- 通常用于解決其他算法難以解決的問題,例如蠻力算法
- 包括確定性(例如本地搜索_local search algorithm)和隨機算法(Randomised Algorithm)
2). 隨機算法
- Randomised algorithm: An heuristic algorithm that makes random choices during execution to produce a result_隨機選擇產生結果
- Takes a source of random numbers to make random choices
- Behaviour, e.g., output or running time can vary even on a fixed input
- The goal: design an algorithm and analyse it to show that its behaviour is likely to be good, on every input
3). 隨機算法_Randomised Algorithms & 確定性算法_Deterministic Algorithms
graph LR
input => Algorithm
Algorithm=>output
<center>圖1. Deterministic Algorithms</center>
graph LR
input --> Algorithm
Random_number --> Algorithm
Algorithm-->output
<center>圖2. Randomised Algorithms</center>
3 解決方案——拉斯維加斯算法與蒙特卡洛算法
3.1 一個螺栓與n個螺母(Matching one Bolt to n Nuts)
對于第一個螺栓與n個螺母(Matching one Bolt to n Nuts)的問題腕侄,解決思想為使用隨機數找到問題的解決方案(Using random numbers to find a solution to a problem)
將其抽象為數學問題即為: 給定n個元素的數組殊校,找到其值等于x的第一個元素(The problem: given an array of n elements, find the first element of which the value equals to x)
具有代表性的兩個解決算法分別是拉斯維加斯算法和蒙特卡洛算法
1). 介紹 拉斯維加斯算法_Las Vegas algorithm 與蒙特卡洛算法_Monte Carlo algorithm
# 拉斯維加斯算法_Las Vegas algorithm
begin
repeat
Randomly select one element a out of n elements.
until a == x
end
如上所述,拉斯維加斯算法始終返回正確的結果晶默。上面的代碼說明了此屬性谨娜。變量a是隨機生成的;生成a后磺陡,使用a索引n個元素(數組)中比較x趴梢。如果該索引包含值x,則返回a币他;該算法將重復此過程坞靶,直到找到x。盡管可以保證使用此Las Vegas算法來找到正確的答案蝴悉,但它沒有固定的運行時間彰阴。
# 蒙特卡洛算法_Monte Carlo algorithm
begin
i := 0
repeat
Randomly select one element a out of n elements.
i := i + 1
until (a == x)||(i == k)
end
由于拉斯維加斯直到在數組中找到x才結束,它是以運行次數來賭博拍冠。另一方面尿这,Monte Carlo運行k次,這意味著不一定在執(zhí)行代碼的k次循環(huán)中庆杜,在數組中找到“x”妻味;意味著它可能會找到解決方案,也可能不會欣福。因此,與拉斯維加斯不同的是焦履,蒙特卡洛(Monte Carlo)在正確性上賭博拓劝。
2). 比較 拉斯維加斯算法_Las Vegas algorithm 與蒙特卡洛算法_Monte Carlo algorithm
拉斯維加斯算法_Las Vegas algorithm: 始終提供正確結果的隨機算法,一次運行到另一次運行的唯一變化是運行次數(A randomised algorithm that always gives correct results, the only variation from one run to another is the running time)
蒙特卡洛算法_Monte Carlo algorithm: 一種隨機算法嘉裤,其運行次數是確定性的郑临,但其結果在一定(通常較小)概率下可能不正確屑宠。(A randomised algorithm whose running time is deterministic, but whose results may be incorrect with a certain (typically small) probability.)
-
不同點:
- 蒙特卡洛算法運行固定數量的次數(Monte Carlo algorithm runs for a fixed number of steps)
- 拉斯維加斯算法無限循環(huán)運行厢洞,直到找到正確的結果(Las Vegas algorithm runs in an infinite loop until the correct results are found)
- 可以使用提前終止將拉斯維加斯算法轉換為蒙特卡洛算法(Las Vegas algorithm can be converted into Monte Carlo algorithm using early termination)
3). 總結 拉斯維加斯算法_Las Vegas algorithm 與蒙特卡洛算法_Monte Carlo algorithm
- 元素搜索問題非常簡單,但是
- 對于確定性順序(線性)搜索算法,例如躺翻,從頭開始逐一搜索數組:
- 平均時間復雜度:
- 最壞時間復雜度:
- 平均時間復雜度:
- 對于拉斯維加斯算法_Las Vegas algorithm:
- 平均時間復雜度: 取決于輸入; 如果一半數組包含0,另一半包含1丧叽,則查找第一個元素的值等于1的平均時間復雜度為
- 最壞時間復雜度: 無限(Unbound)
- 平均時間復雜度: 取決于輸入; 如果一半數組包含0,另一半包含1丧叽,則查找第一個元素的值等于1的平均時間復雜度為
- 對于蒙特卡洛算法_Monte Carlo algorithm:
- 平均時間復雜度:
- 最壞時間復雜度:
- 會有找不到元素的可能性。
- 平均時間復雜度:
- 對于確定性順序(線性)搜索算法,例如躺翻,從頭開始逐一搜索數組:
3.2 n個螺栓與n個螺母(the real Nuts and Bolts Problem)
- 一個沒有條理的木匠有n個不同大小的螺母和n個螺栓的集合公你,并希望找到相應的螺栓和螺母配對踊淳。每個螺母正好匹配一個螺栓(反之亦然),并且他只能將螺母與螺栓進行比較陕靠,即迂尝,他既不能將螺母與螺母進行比較,也不能將螺栓與螺栓進行比較剪芥。(A disorganized carpenter has a collection of n nuts of distinct sizes and n bolts and would like to find the corresponding pairs of bolts and nuts. Each nut matches exactly one bolt (and vice versa), and he can only compare nuts to bolts, i.e., he can neither compare nuts to nuts nor bolts to bolts.)
- 如果采用暴力算法(將每個螺母與所有螺栓進行比較以找到匹配的螺栓), 時間復雜度為
- 本節(jié)推薦使用隨機化快速排序_Randomised quicksort algorithm
1). 快速排序_Quicksort algorithm
快速排序: 給定n個數字組成的數組A垄开,按遞增順序對數字進行排序 (Given a array A of n numbers, sort the numbers in increasing order)
# 快速排序_Quicksort algorithm
less, equal, greater := three empty arrays
if length(array) > 1
pivot := select an element of array
for each x in array
if x < pivot then add x to less
if x = pivot then add x to equal
if x > pivot then add x to greater
quicksort(less)
quicksort(greater)
array := concatenate(less, equal, greater)
- 平均時間復雜度:
(n是數組的大小)
{Deterministic quicksort algorithm time complexity:on average for a random permutation array (n is the size of the array)}
- 最壞時間復雜度:
2). 隨機化快速排序_Randomised quicksort algorithm
- 隨機化快速排序算法: 隨機選擇樞軸(Selecting a pivot randomly)
- 平均時間復雜度:
- 最壞時間復雜度:
- 證明_Proof: Probabilistic Analysis and
Randomized Quicksort - n個螺栓與n個螺母問題(the real Nuts and Bolts Problem)的具體方案
4 擴展
4.1 隨機算法的應用
-
數學:
- 數論税肪,例如素數檢驗_Number theory, e.g., primality test
- 計算幾何:圖形算法溉躲,例如最小分割_Computational Geometry: graph algorithms, e.g., minimum cut
- 線性代數:矩陣計算_Linear algebra: matrix computations
-
計算機科學:
- 數據分析:網頁排名_Data analysis: PageRank
- 并行計算:避免死鎖_Parallel computing: Deadlock avoidance
- 優(yōu)化:自然啟發(fā)的優(yōu)化和搜索算法_Optimisation: Nature inspired Optimisation and Search algorithms
計算生物: DNA read alignment
4.2 隨機算法的優(yōu)缺點
- 優(yōu)點
- 簡單性:通常非常容易實現(Usually very easy to implement)
- 性能:通常以高概率產生(接近)最佳解決方案(Usually produce (near-) optimum solutions with high probability)
- 缺點
- 以有限的概率得到錯誤的答案(Getting a wrong answer with a finite probability.)
- 解決方案:重復算法
- 難以分析運行時間和獲得錯誤解決方案的概率(Difficult to analyse the running time and probability of getting an incorrect solution)
- 不可能獲得真正的隨機數(Impossible to get truly random numbers)
- 以有限的概率得到錯誤的答案(Getting a wrong answer with a finite probability.)