今天聽了北美培訓機構(gòu)九章算法的一節(jié)公開課菩混,做點心得筆錄忿墅。
1.首先介紹什么是博弈問題,面試中博弈問題的特點
生活中的博弈問題
- 石頭沮峡、剪刀疚脐、布
- tig \tag\toe
數(shù)學中的博弈問題
- 囚徒困境
面試中的博弈問題
- 這個游戲是兩個人玩
- 這個游戲是兩個人輪流玩
- 玩的時候兩個人都要遵守一個規(guī)則
- 最后 告訴我他們誰先玩完
換句話說 就是:
- 兩個玩家輪流游戲
- 每個人都基于局面有一個可以操作的集合(一般一致)
- 達到某個條件游戲結(jié)束
- 問是否必勝 或者 某位玩家的最大收益
2.引例 :Coins in a line(1)
There are n coins in a line. Two players take turns to take one or two coins from right side until there are no more coins left. The player who take the last coin wins.
could you please decide the first player will win or lose?
稍加分析我們就會發(fā)現(xiàn)
n = 2 return true
n = 3 return false
n 是否是3的倍數(shù)
但是這是怎么來的呢?
3.重要的概念:
前提條件: - 游戲中的玩家是AlphaGo 級別邢疙,意味著一定會采取最聰明的策略
- 先手棍弄,后手 的概念
- 必勝 ? 必斉庇巍呼畸?
- 先手必勝狀態(tài)(= 后手必敗狀態(tài),又叫N-position) next敗
- 先手必敗狀態(tài) (= 后手必勝狀態(tài)颁虐,又叫P-position) previous敗
先手后手是對當時的一個局面來說的蛮原,一個人在不同的局面下先手后手會轉(zhuǎn)換
這里的三個概念的理解非常重要:
在這個圖中,剩3個硬幣是先手必敗狀態(tài)另绩,因為對方一定會選擇可以取勝的策略儒陨。
博弈問題花嘶,如果與動態(tài)規(guī)劃的三個組成取得對應(yīng):狀態(tài)表示,狀態(tài)轉(zhuǎn)移蹦漠,狀態(tài)初始化
4.例子
4.1取石子游戲
Two players(A and B) are playing a game with stones. Player A always plays first, and the two players move in alternating turns. In a single move , a player can remove 2,3,or 5 stones from the game board. If a player is unable to make a move, that player loses the game.
"我一出手后椭员, 就可以置你于先手必敗“ == "我先手必勝”
DP[i] = i 個石子是否先手必勝
DP[i] = !(DP[i-2]&&DP[i-3]&&DP[i-5])
4.2Coins in a line(2)
There are n coins with different value in a line. Two players take turns to take one or two coins from left side until there are no more coins left. The player who take the coins with the most value wins.Could you please decide the first player will win or lose?
DP[i]準確描述應(yīng)為 :還剩i個硬幣時,先手能獲得的最大值笛园。
4.3Coins in a line(3)
There are n coins in a line.Two players take turns to take a coin from one of the ends of the line until there are no more coins. The player with the larger amount of money wins.Could you please decide the first player will win or lose?
這個題的思路和上一題還是一樣隘击,
"我一出手后, 就可以置你于先手必敗“ == "我先手必勝”
4.4
如果在其四個子狀態(tài)中研铆,能找到一個先手必敗狀態(tài)埋同,那其就是先手必勝狀態(tài)
DP[i][j] = !(DP[i-2][j-1]&&DP[i-1][j-2]&&DP[i+1][j-2]&&DP[i-2][j+1])