An Introduction to Quantifiers in Regular Expressions

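Quantifiers in a regular expression specify how many times a part of the pattern must occur for a match. This article covers three kinds of quantifier: greedy, reluctant, and possessive. At first glance the quantifiers X? (greedy), X?? (reluctant), and X?+ (possessive) appear to do the same thing, since each of them matches "X" once or not at all. There are subtle differences between them, which the last part of this article explains.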

[Image: 1.png]

Let's use the greedy quantifiers to build three different regular expressions: a?, a*, and a+. We will see what happens when each one is tested against an empty input string.

Here is the test code first (compile and run it directly from a terminal):

import java.io.Console;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTestHarness {
    public static void main(String[] args){
        // System.console() returns null when no interactive terminal is attached.
        Console console = System.console();
        if (console == null) {
            System.err.println("No console.");
            System.exit(1);
        }
        while (true) {

            // Read a regex from the user and compile it.
            Pattern pattern =
                    Pattern.compile(console.readLine("%nEnter your regex: "));

            // Read the input string and prepare a matcher for it.
            Matcher matcher =
                    pattern.matcher(console.readLine("Enter input string to search: "));

            // Report every match, including zero-length ones.
            boolean found = false;
            while (matcher.find()) {
                console.format("I found the text" +
                                " \"%s\" starting at " +
                                "index %d and ending at index %d.%n",
                        matcher.group(),
                        matcher.start(),
                        matcher.end());
                found = true;
            }
            if(!found){
                console.format("No match found.%n");
            }
        }
    }
}

Enter your regex: a?
Enter input string to search:
I found the text "" starting at index 0 and ending at index 0.

Enter your regex: a*
Enter input string to search:
I found the text "" starting at index 0 and ending at index 0.

Enter your regex: a+
Enter input string to search:
No match found.

Zero-length matches

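In the examples above, the first two searches succeed because the expressions a? and a* both allow zero occurrences of the letter 'a'. Notice that the start and end indices are both 0. The empty string "" has no length, so the pattern matches right at the start position (index 0). A match like this is called a zero-length match. Zero-length matches can occur in the following four cases:
1. In an empty input string.
2. At the beginning of the input string, that is, at index 0 (the position before the first character is an empty string).
3. At the end of the input string (the position after the last character is an empty string).
4. Between any two characters. In "bc", for example, there is an empty string "" between b and c.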

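Take the string "foo" as an example. Its index positions are laid out as follows:

[Image: Paste_Image.png]

Index 0 is the start of the string (before "f") and index 3 is the end (after the last "o"), so a zero-length match can occur at either of those positions, as well as at indices 1 and 2 between the characters.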

Zero-length matches are easy to spot because they start and end at the same index.
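The "same start and end index" rule can be checked in code. The following is a minimal sketch, not part of the original test harness: the class name ZeroLengthMatchDemo and the helper zeroLengthPositions are hypothetical names chosen for illustration. It only uses Pattern.compile, Matcher.find, Matcher.start, and Matcher.end from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZeroLengthMatchDemo {
    // Hypothetical helper: returns the index of every zero-length match
    // of regex in input.
    static List<Integer> zeroLengthPositions(String regex, String input) {
        List<Integer> positions = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            // A zero-length match starts and ends at the same index.
            if (matcher.start() == matcher.end()) {
                positions.add(matcher.start());
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(zeroLengthPositions("a?", ""));    // prints [0]
        System.out.println(zeroLengthPositions("a?", "foo")); // prints [0, 1, 2, 3]
        System.out.println(zeroLengthPositions("a+", "foo")); // prints []
    }
}
```

With the input "foo", a? finds no "a" anywhere, so it reports a zero-length match at every index position from 0 to 3, while a+ requires at least one "a" and reports nothing.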

Let's look at a few more examples, this time with the single letter "a" as input.
Enter your regex: a?
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
I found the text "" starting at index 1 and ending at index 1.

Enter your regex: a*
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.
I found the text "" starting at index 1 and ending at index 1.

Enter your regex: a+
Enter input string to search: a
I found the text "a" starting at index 0 and ending at index 1.

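All three quantifiers find the letter "a", but the first two also report a zero-length match at index 1, that is, just after the last character of the input. Remember that the matcher sees the character "a" as sitting in the cell between index 0 and index 1, and the test harness keeps looping until it can find no more matches.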

Next, enter "ababaaaab" and see what comes out. The output is as follows:
Enter your regex: a?
Enter input string to search: ababaaaab
I found the text "a" starting at index 0 and ending at index 1.
I found the text "" starting at index 1 and ending at index 1.
I found the text "a" starting at index 2 and ending at index 3.
I found the text "" starting at index 3 and ending at index 3.
I found the text "a" starting at index 4 and ending at index 5.
I found the text "a" starting at index 5 and ending at index 6.
I found the text "a" starting at index 6 and ending at index 7.
I found the text "a" starting at index 7 and ending at index 8.
I found the text "" starting at index 8 and ending at index 8.
I found the text "" starting at index 9 and ending at index 9.

Enter your regex: a*
Enter input string to search: ababaaaab
I found the text "a" starting at index 0 and ending at index 1.
I found the text "" starting at index 1 and ending at index 1.
I found the text "a" starting at index 2 and ending at index 3.
I found the text "" starting at index 3 and ending at index 3.
I found the text "aaaa" starting at index 4 and ending at index 8.
I found the text "" starting at index 8 and ending at index 8.
I found the text "" starting at index 9 and ending at index 9.

Enter your regex: a+
Enter input string to search: ababaaaab
I found the text "a" starting at index 0 and ending at index 1.
I found the text "a" starting at index 2 and ending at index 3.
I found the text "aaaa" starting at index 4 and ending at index 8.

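Readers can work out for themselves why these results come out the way they do. As a hint, the letter "b" sits at indices 1, 3, and 8, and both a? and a* report a zero-length match at each of those positions and once more at the end of the input (index 9).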
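One way to reproduce the three transcripts above without typing into the console is to collect every match into a list. This is a minimal sketch under stated assumptions: the class name MatchListDemo, the helper findAll, and the "text[start,end)" output format are all hypothetical choices made for illustration, not part of the original harness.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchListDemo {
    // Hypothetical helper: lists every match as text[start,end).
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group()
                    + "[" + matcher.start() + "," + matcher.end() + ")");
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "ababaaaab";
        for (String regex : new String[] {"a?", "a*", "a+"}) {
            List<String> matches = findAll(regex, input);
            // a? yields 10 matches, a* yields 7, and a+ yields 3.
            System.out.println(regex + " -> " + matches.size() + " " + matches);
        }
    }
}
```

The counts line up with the transcripts: a? takes at most one "a" per match, so the run "aaaa" is split into four matches, whereas a* and a+ take the whole run at once.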

To match a character an exact number of times, put the count inside braces "{}". For example:
a{3} matches "aaa"
Enter your regex: a{3}
Enter input string to search: aa
No match found.

Enter your regex: a{3}
Enter input string to search: aaa
I found the text "aaa" starting at index 0 and ending at index 3.

Enter your regex: a{3}
Enter input string to search: aaaa
I found the text "aaa" starting at index 0 and ending at index 3.

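In the third example, note that once the first three a's have matched, they play no part in any later match: the regular expression carries on trying to match against whatever follows "aaa". Here only a single "a" remains, which is not enough for a second match.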
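The way a{3} consumes the input in non-overlapping chunks of three can be sketched as follows. The class name ExactCountDemo, the helper countMatches, and the extra six-letter input "aaaaaa" are assumptions added for illustration; only the inputs "aa", "aaa", and "aaaa" come from the examples above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExactCountDemo {
    // Hypothetical helper: counts how many times regex matches in input.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("a{3}", "aa"));     // prints 0
        System.out.println(countMatches("a{3}", "aaa"));    // prints 1
        System.out.println(countMatches("a{3}", "aaaa"));   // prints 1
        System.out.println(countMatches("a{3}", "aaaaaa")); // prints 2
    }
}
```

Each search resumes where the previous match ended, so six a's give two matches while four a's give only one.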

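A quantifier applies only to the element immediately before it, so parentheses are needed to repeat a whole subexpression. For example: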
Enter your regex: (dog){3}
Enter input string to search: dogdogdogdogdogdog
I found the text "dogdogdog" starting at index 0 and ending at index 9.
I found the text "dogdogdog" starting at index 9 and ending at index 18.

Enter your regex: dog{3}
Enter input string to search: dogdogdogdogdogdog
No match found.

In the second example the quantifier {3} applies only to the letter "g", so the expression looks for "do" followed by three "g"s. The input contains no such text, so nothing matches.

Here is one more example:
Enter your regex: [abc]{3}
Enter input string to search: abccabaaaccbbbc
I found the text "abc" starting at index 0 and ending at index 3.
I found the text "cab" starting at index 3 and ending at index 6.
I found the text "aaa" starting at index 6 and ending at index 9.
I found the text "ccb" starting at index 9 and ending at index 12.
I found the text "bbc" starting at index 12 and ending at index 15.

Enter your regex: abc{3}
Enter input string to search: abccabaaaccbbbc
No match found.
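The scope of a quantifier in the examples above can be checked with whole-string matching. This is a minimal sketch: the class name QuantifierScopeDemo and the helper matchesWhole are hypothetical, and the inputs "doggg", "abccc", and "cab" are assumptions chosen to show what each pattern does accept. Pattern.matches requires the entire input to match the pattern.

```java
import java.util.regex.Pattern;

class QuantifierScopeDemo {
    // Hypothetical helper: true only if the entire input matches regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // {3} applies to the whole group (dog).
        System.out.println(matchesWhole("(dog){3}", "dogdogdog")); // prints true
        // {3} applies only to the letter g.
        System.out.println(matchesWhole("dog{3}", "dogdogdog"));   // prints false
        System.out.println(matchesWhole("dog{3}", "doggg"));       // prints true
        // {3} applies to the character class: any three of a, b, c.
        System.out.println(matchesWhole("[abc]{3}", "cab"));       // prints true
        // {3} applies only to the letter c.
        System.out.println(matchesWhole("abc{3}", "abccc"));       // prints true
    }
}
```

Without parentheses or a character class, the quantifier binds to the single preceding character, which is why dog{3} and abc{3} find nothing in the inputs used above.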

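Differences between greedy, reluctant, and possessive quantifiers
Greedy quantifiers are called greedy because they try to match as much of the input as possible on the first attempt. If the overall match then fails, the matcher backtracks, giving back one character at a time, until the match succeeds or there is nothing left to give back.
Consider the following examples: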
Enter your regex: .*foo // greedy quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.

Enter your regex: .*?foo // reluctant quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfoo" starting at index 0 and ending at index 4.
I found the text "xxxxxxfoo" starting at index 4 and ending at index 13.

Enter your regex: .*+foo // possessive quantifier
Enter input string to search: xfooxxxxxxfoo
No match found.
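The first example uses a greedy quantifier. The .* part first consumes the entire string "xfooxxxxxxfoo", and the foo part of the expression is then tried against what is left, which is the empty string "", so the match fails. The matcher backtracks: .* now matches "xfooxxxxxxfo" and foo is tried against the remaining "o", which fails again, so it backtracks once more. This repeats until .* has given back the final "foo" and the match succeeds. Because that match runs to the end of the input, there is nothing left to search and matching stops.

The second example uses a reluctant (non-greedy) quantifier, which works the opposite way. .*? starts by consuming nothing, that is, it matches the empty string "" at the start of the input. The foo part is then tried against the first three characters "xfo" and fails. .*? is therefore forced to consume the first character "x", after which foo matches the following "foo", and the first overall match, "xfoo", is found. The search then continues with the remaining input "xxxxxxfoo"; the same rules apply, and the second match is "xxxxxxfoo". Matching ends only once the whole input has been consumed.

The third example uses a possessive quantifier. A possessive quantifier makes a single attempt and never backtracks. In this example .*+ consumes all of "xfooxxxxxxfoo", which leaves only the empty string "" for foo, so the match fails. Because the quantifier never gives anything back, no match is found.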
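The three behaviours described above can be compared side by side in code. The following is a minimal sketch, assuming a hypothetical class QuantifierModeDemo with a hypothetical helper findAll; the patterns and the input string are the ones from the transcripts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierModeDemo {
    // Hypothetical helper: returns the text of every match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: consumes everything, then backtracks to the last "foo".
        System.out.println(findAll(".*foo", input));  // prints [xfooxxxxxxfoo]
        // Reluctant: consumes as little as possible before each "foo".
        System.out.println(findAll(".*?foo", input)); // prints [xfoo, xxxxxxfoo]
        // Possessive: consumes everything and never gives it back.
        System.out.println(findAll(".*+foo", input)); // prints []
    }
}
```

The only difference between the three patterns is the character after the star, yet the results range from one long match, to two shorter ones, to none at all.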

Most of the content above is translated from the regular expressions lesson in The Java™ Tutorials.

Last edited: 2017.12.06 07:12:43
