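Basics
Compared with other languages, Java treats the backslash \ differently. In other languages, \\ means "insert a plain backslash into the regular expression, so do not give it any special meaning", whereas in a Java string literal \\ means "insert a regular-expression backslash, so the character that follows has special meaning". For example, the regular expression for a digit is written \\d in Java and \d in other languages. To match a literal backslash in Java you write \\\\.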
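The escaping rule above can be checked with a small sketch. The class and method names (BackslashDemo, isDigit, isBackslash) are made up for illustration; only Pattern.matches comes from the JDK.

```java
import java.util.regex.Pattern;

public class BackslashDemo {
    // In Java source, "\\d" is the two-character regex \d (one digit).
    static boolean isDigit(String s) {
        return Pattern.matches("\\d", s);
    }

    // "\\\\" is the two-character regex \\, which matches one literal backslash.
    static boolean isBackslash(String s) {
        return Pattern.matches("\\\\", s);
    }

    public static void main(String[] args) {
        System.out.println(isDigit("7"));       // true
        System.out.println(isDigit("d"));       // false
        System.out.println(isBackslash("\\"));  // true: the input is a single backslash
        System.out.println("\\d".length());     // 2: the literal holds a backslash and 'd'
    }
}
```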
Regular expressions in String
String has the following methods that accept a regular expression:
// Returns true if the entire string matches the regular expression, false otherwise
boolean matches(String regex)
// Splits the string around matches of the regular expression; note that the matched parts do not appear in the result
String[] split(String regex)
// Same as above, with limit controlling how many times the pattern is applied
String[] split(String regex, int limit)
// Replaces every substring that matches
String replaceAll(String regex, String replacement)
// Replaces only the first substring that matches
String replaceFirst(String regex, String replacement)
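A short sketch of these String methods on one sample input. The class name StringRegexDemo and the helper methods are hypothetical; the String calls are the ones listed above.

```java
import java.util.Arrays;

public class StringRegexDemo {
    static final String INPUT = "a1b22c333";

    static String splitAll(String s) {
        return Arrays.toString(s.split("\\d+"));
    }

    static String splitLimited(String s, int limit) {
        return Arrays.toString(s.split("\\d+", limit));
    }

    public static void main(String[] args) {
        System.out.println("12345".matches("\\d+"));          // true
        System.out.println("123a".matches("\\d+"));           // false: the whole string must match
        System.out.println(splitAll(INPUT));                  // [a, b, c]
        System.out.println(splitLimited(INPUT, 2));           // [a, b22c333]
        System.out.println(INPUT.replaceAll("\\d+", "#"));    // a#b#c#
        System.out.println(INPUT.replaceFirst("\\d+", "#"));  // a#b22c333
    }
}
```

Note that split drops trailing empty strings when no limit is given, which is why the result has three elements rather than four.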
Pattern and Matcher
Import java.util.regex, then compile the regular expression with the static Pattern.compile() method. It builds a Pattern object from the regex given as a String. Next, pass the string you want to search to the Pattern object's matcher() method, which produces a Matcher object; you can then call the Matcher's methods to perform the match. For example:
Pattern p = Pattern.compile("a*b"); // compile the regular expression
Matcher m = p.matcher("aaaaab"); // the string to search
/**
*Attempts to match the entire region against the pattern.
*If the match succeeds then more information can be obtained via the start, end, and group methods.
*/
boolean b = m.matches();
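The following code does the same thing in one call (it recompiles the pattern on every call, so prefer a compiled Pattern when the regex is reused):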
boolean b = Pattern.matches("a*b", "aaaaab");
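When the same expression is applied to many inputs, one common approach is to compile it once and keep the Pattern in a constant. The sketch below assumes a hypothetical helper countMatches; it is only meant to show the compile-once, match-many shape.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ReusePatternDemo {
    // Compiled once and shared by every call.
    private static final Pattern A_STAR_B = Pattern.compile("a*b");

    // Counts the inputs that match the pattern in full.
    static int countMatches(List<String> inputs) {
        int count = 0;
        for (String s : inputs) {
            if (A_STAR_B.matcher(s).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "aaaaab", "b" and "ab" match; "ba" does not.
        System.out.println(countMatches(Arrays.asList("aaaaab", "b", "ab", "ba"))); // 3
    }
}
```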
CharSequence
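The CharSequence interface abstracts a general definition of a character sequence from the CharBuffer, String, StringBuilder and StringBuffer classes:
interface CharSequence {
    char charAt(int i);
    int length();
    CharSequence subSequence(int start, int end);
    String toString();
}
That is why several of the Pattern and Matcher methods below take a CharSequence as their parameter.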
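Because matcher() accepts any CharSequence, the same code can search a String, a StringBuilder or a CharBuffer without converting it first. A minimal sketch, with a hypothetical containsDigits helper:

```java
import java.nio.CharBuffer;
import java.util.regex.Pattern;

public class CharSequenceDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Accepts any CharSequence implementation.
    static boolean containsDigits(CharSequence input) {
        return DIGITS.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsDigits("abc123"));                       // true
        System.out.println(containsDigits(new StringBuilder("no digits"))); // false
        System.out.println(containsDigits(CharBuffer.wrap("42")));          // true
    }
}
```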
Pattern
Definition of the Pattern class
public final class Pattern
extends Object
implements Serializable
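A Pattern object represents a compiled regular expression. Pattern is declared final, so it cannot be subclassed; its instances are immutable and safe for use by multiple concurrent threads. (Matcher instances, by contrast, are not thread-safe.)
Some commonly used methods:
Compiling a regular expression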
public static Pattern compile(String regex)
public static Pattern compile(String regex, int flags)
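flags is a bit mask of compilation flags. There are 9 of them, all public static final int constants, and several can be combined with the bitwise OR operator |: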
- Pattern.CANON_EQ
- Pattern.CASE_INSENSITIVE(?i)
- Pattern.COMMENTS(?x)
- Pattern.DOTALL(?s)
- Pattern.LITERAL
- Pattern.MULTILINE(?m)
- Pattern.UNICODE_CASE(?u)
- Pattern.UNICODE_CHARACTER_CLASS(?U)
- Pattern.UNIX_LINES(?d)
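The characters in parentheses are the embedded flag expression: when it appears inside the pattern string it switches that mode on. It takes effect from the position where it appears, so it is usually placed at the very start of the pattern. CANON_EQ and LITERAL have no embedded form. For example:
Matcher m = Pattern.compile("(?m)(\\S+)\\s+((\\S+)\\s+(\\S+))$").matcher("I love Java");
is equivalent to
Matcher m = Pattern.compile("(\\S+)\\s+((\\S+)\\s+(\\S+))$", Pattern.MULTILINE).matcher("I love Java");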
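To see what a flag actually changes, the sketch below counts matches of the same expression with and without MULTILINE and shows two flags combined with |. The class name and the countMatches helper are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlagsDemo {
    // Counts how many times regex is found in text under the given flags.
    static int countMatches(String regex, int flags, String text) {
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "I love\nJava regex";
        // Without MULTILINE, $ matches only at the end of the input.
        System.out.println(countMatches("\\w+$", 0, text));                 // 1
        // With MULTILINE, $ also matches just before each line terminator.
        System.out.println(countMatches("\\w+$", Pattern.MULTILINE, text)); // 2
        // The embedded form (?m) has the same effect as the flag.
        System.out.println(countMatches("(?m)\\w+$", 0, text));             // 2
        // Flags are bit masks and can be combined.
        System.out.println(countMatches("LOVE$",
                Pattern.CASE_INSENSITIVE | Pattern.MULTILINE, text));       // 1
    }
}
```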
Creating a Matcher object
public Matcher matcher(CharSequence input)
Matching and splitting
public static boolean matches(String regex, CharSequence input)
public String[] split(CharSequence input, int limit)
public String[] split(CharSequence input)
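Pattern.split behaves like String.split but reuses a compiled pattern. A small sketch, assuming a comma separator with optional surrounding whitespace (the fields helpers are hypothetical):

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PatternSplitDemo {
    // A comma with optional whitespace on either side.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    static String[] fields(CharSequence line) {
        return COMMA.split(line);
    }

    static String[] fields(CharSequence line, int limit) {
        return COMMA.split(line, limit);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fields("a , b,c ,d")));    // [a, b, c, d]
        System.out.println(Arrays.toString(fields("a , b,c ,d", 2))); // [a, b,c ,d]
    }
}
```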
Matcher
Definition of the Matcher class
public final class Matcher
extends Object
implements MatchResult
A Matcher is an engine that performs match operations on a CharSequence by interpreting a Pattern. You create one by calling Pattern.matcher(). Once created, a Matcher can perform three kinds of match operation:
- The matches method attempts to match the entire input sequence against the pattern. (It returns true only when the whole input matches the regular expression.)
- The lookingAt method attempts to match the input sequence, starting at the beginning, against the pattern. (It returns true if the regular expression matches at the start of the string, false otherwise.)
- The find method scans the input sequence looking for the next subsequence that matches the pattern.
Each of these three methods returns a boolean indicating whether the match succeeded. After a successful match you can call the other methods to query or act on it.
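The difference between the three operations is easiest to see by running them against the same inputs. The sketch below uses a hypothetical describe helper and a fresh Matcher for each call so the results do not influence one another.

```java
import java.util.regex.Pattern;

public class ThreeMatchModesDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    static String describe(CharSequence input) {
        boolean whole = DIGITS.matcher(input).matches();     // entire input
        boolean prefix = DIGITS.matcher(input).lookingAt();  // must start at the beginning
        boolean anywhere = DIGITS.matcher(input).find();     // any subsequence
        return "matches=" + whole + " lookingAt=" + prefix + " find=" + anywhere;
    }

    public static void main(String[] args) {
        System.out.println(describe("123"));    // matches=true lookingAt=true find=true
        System.out.println(describe("123abc")); // matches=false lookingAt=true find=true
        System.out.println(describe("abc123")); // matches=false lookingAt=false find=true
    }
}
```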
Some commonly used methods:
Matching operations
/**
*Attempts to match the entire region against the pattern.
*If the match succeeds then more information can be obtained via the start, end, and group methods.
*
*Returns:
*true if, and only if, the entire region sequence matches this matcher's pattern
*/
public boolean matches()
/**
*Attempts to find the next subsequence of the input sequence that matches the pattern.
*This method starts at the beginning of this matcher's region, or, if a previous invocation of the method was successful and the matcher has not since been reset, at the first character not matched by the previous match.
*
*If the match succeeds then more information can be obtained via the start, end, and group methods.
*
*Returns:
*true if, and only if, a subsequence of the input sequence matches this matcher's pattern
*/
public boolean find()
public boolean find(int start)
/**
*Attempts to match the input sequence, starting at the beginning of the region, against the pattern.
*Like the matches method, this method always starts at the beginning of the region; unlike that method, it does not require that the entire region be matched.
*
*If the match succeeds then more information can be obtained via the start, end, and group methods.
*
*Returns:
*true if, and only if, a prefix of the input sequence matches this matcher's pattern
*/
public boolean lookingAt()
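A typical use of find() is a loop that collects every match, since each call resumes after the previous match. find(int start) instead resets the matcher and begins scanning at the given index. The helpers findAll and findFrom below are illustrative names, not part of the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindDemo {
    // Collects every non-overlapping match, left to right.
    static List<String> findAll(String regex, CharSequence input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    // Returns the first match at or after index start, or null if there is none.
    static String findFrom(String regex, CharSequence input, int start) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(start) ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "a1b22c333"));     // [1, 22, 333]
        System.out.println(findFrom("\\d+", "a1b22c333", 2)); // 22
    }
}
```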
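Groups
A group is a part of the regular expression delimited by parentheses, and it can be referred to by its group number. Group 0 is the whole expression, group 1 is the group opened by the first left parenthesis, and so on. For example: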
A(B(C))D
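contains 3 groups: group 0 is ABCD, group 1 is BC, and group 2 is C.
// Returns the number of capturing groups in this matcher's pattern, not counting group 0. Any index from 0 up to and including this number is a valid group argument for group, start and end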
public int groupCount()
// Returns group 0 (the entire match) of the previous match operation (for example find())
public String group()
// Returns the text captured by the given group number; if the match succeeded but that group did not match any part of the input, returns null
public String group(int group)
public String group(String name)
// Returns the absolute offset where the match starts; start() is the same as start(0)
public int start()
// Returns the absolute offset, counted from the start of the target string, where the text captured by the given group begins; returns -1 if the group did not take part in the match
public int start(int group)
public int end()
// Returns the index of the last character matched by the given group in the previous match operation, plus one
public int end(int group)
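The group methods can be exercised on the A(B(C))D example above. This sketch prints each group with its [start, end) offsets, shows a group that did not take part in the match, and reads a named group; the helper names and the date pattern are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupsDemo {
    // Lists every group of the first match as index=text[start,end).
    static String describeGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        StringBuilder sb = new StringBuilder();
        for (int g = 0; g <= m.groupCount(); g++) {
            sb.append(g).append('=').append(m.group(g))
              .append('[').append(m.start(g)).append(',').append(m.end(g)).append(") ");
        }
        return sb.toString().trim();
    }

    // Reads a named group; returns null when the input is not a yyyy-MM string.
    static String yearOf(String date) {
        Matcher m = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})").matcher(date);
        return m.matches() ? m.group("year") : null;
    }

    public static void main(String[] args) {
        System.out.println(describeGroups("A(B(C))D", "xABCDy")); // 0=ABCD[1,5) 1=BC[2,4) 2=C[3,4)
        // Group 1 does not participate, so group(1) is null and its offsets are -1.
        System.out.println(describeGroups("(a)|(b)", "b"));       // 0=b[0,1) 1=null[-1,-1) 2=b[0,1)
        System.out.println(yearOf("2024-05"));                    // 2024
    }
}
```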
Find and replace
public String replaceAll(String replacement)
public String replaceFirst(String replacement)
public static String quoteReplacement(String s)
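In the replacement string, $n refers to the text captured by group n, so a literal dollar sign or backslash has to be escaped; quoteReplacement does that for you. A brief sketch with hypothetical helper names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceDemo {
    // Swaps the words on either side of '@' using group references.
    static String swapAroundAt(String input) {
        return Pattern.compile("(\\w+)@(\\w+)").matcher(input).replaceAll("$2@$1");
    }

    // Inserts the replacement text literally, even if it contains '$' or '\'.
    static String replaceLiteral(String regex, String input, String literal) {
        return Pattern.compile(regex).matcher(input)
                .replaceAll(Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        System.out.println(swapAroundAt("user@host and a@b"));           // host@user and b@a
        // Without quoteReplacement, "$5" would be read as a reference to group 5.
        System.out.println(replaceLiteral("PRICE", "cost: PRICE", "$5")); // cost: $5
        System.out.println(Pattern.compile("\\d+").matcher("a1b22").replaceFirst("#")); // a#b22
    }
}
```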
Methods for setting and modifying state
- The MatchResult object
// Returns a MatchResult snapshot of the current match; later operations on the matcher do not affect it
public MatchResult toMatchResult()
- The Pattern object
// Switches this matcher to a new Pattern object
public Matcher usePattern(Pattern newPattern)
// Returns the Pattern this matcher is currently using
public Pattern pattern()
- The target string (or other CharSequence)
// Resets the matcher: discards the match state and sets the region back to the entire input
public Matcher reset()
// Resets the matcher with a new input sequence
public Matcher reset(CharSequence input)
- The search region within the target string
// Sets the search region to [start, end) and resets the matcher
public Matcher region(int start, int end)
// Returns the start index of the current region
public int regionStart()
// Returns the end index (exclusive) of the current region
public int regionEnd()
- The anchoring bounds flag (whether ^ and $ match at the region boundaries; on by default)
public Matcher useAnchoringBounds(boolean b)
public boolean hasAnchoringBounds()
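- The transparent bounds flag (whether lookahead, lookbehind and boundary matching can see text outside the region; off by default)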
public Matcher useTransparentBounds(boolean b)
public boolean hasTransparentBounds()
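The region and the two bounds flags interact, which a small experiment makes clearer. The sketch below restricts the search to a region and toggles each flag; the class name, helper names and sample strings are chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegionBoundsDemo {
    // True if the region [start, end) as a whole matches regex.
    static boolean matchesRegion(String regex, String input, int start, int end) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.region(start, end);
        return m.matches();
    }

    // Searches inside [start, end) with the given bounds settings.
    static boolean findInRegion(String regex, String input, int start, int end,
                                boolean anchoring, boolean transparent) {
        Matcher m = Pattern.compile(regex).matcher(input)
                .region(start, end)
                .useAnchoringBounds(anchoring)
                .useTransparentBounds(transparent);
        return m.find();
    }

    public static void main(String[] args) {
        String nums = "12 345 6789";
        System.out.println(matchesRegion("\\d+", nums, 3, 6));                   // true: the region is "345"
        // Anchoring bounds on: ^ and $ match at the region edges.
        System.out.println(findInRegion("^\\d+$", nums, 3, 6, true, false));     // true
        // Anchoring bounds off: ^ and $ refer to the real ends of the input.
        System.out.println(findInRegion("^\\d+$", nums, 3, 6, false, false));    // false
        // Opaque bounds: \b cannot see the 's' before the region, so "car" looks like a word.
        System.out.println(findInRegion("\\bcar\\b", "Madagascar", 7, 10, true, false)); // true
        // Transparent bounds: \b sees the 's', so there is no word boundary before "car".
        System.out.println(findInRegion("\\bcar\\b", "Madagascar", 7, 10, true, true));  // false
    }
}
```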
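Read-only properties
- The match pointer (or current pointer) into the target string, which supports the "find the next match" operation.
- The append pointer into the target string, used in find-and-replace operations when copying the unmatched portions of the text.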
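The append pointer is what Matcher.appendReplacement and Matcher.appendTail advance. These two methods are not listed above, but they are part of the Matcher API and are one way to compute each replacement in code. A minimal sketch, with a hypothetical doubleNumbers helper:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendDemo {
    // Doubles every integer found in the input, leaving other text unchanged.
    static String doubleNumbers(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int doubled = Integer.parseInt(m.group()) * 2;
            // Copies the text between the append pointer and this match,
            // then the replacement, and moves the append pointer past the match.
            m.appendReplacement(sb, Integer.toString(doubled));
        }
        // Copies whatever remains after the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("a1b22c")); // a2b44c
    }
}
```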