Python Regular Expressions (Part 3)

Preface

The previous two parts covered Python's regular expression syntax. Next, let's look at how the functions in the re module are used.

Common Functions

1. search

Overview

re.search(pattern, string, flags=0)

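pattern: the regular expression to match
string: the target string
flags: matching flags (for example re.IGNORECASE)

Scans through the whole string looking for the first location where the pattern produces a match, and returns the corresponding match object.

If no position in the string matches, it returns None. Note that this is different from finding a zero-length match.

Example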

import re

ans = re.search('abc', 'abcdd')
if ans:
    print('Search result: ', ans.group())
else:
    print('No match')
# out: Search result:  abc
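The difference between None and a zero-length match mentioned above is easy to miss, so here is a minimal sketch. The patterns and variable names are just illustrations chosen for this post.

```python
import re

# 'x*' may match zero characters, so search() succeeds with an empty match.
empty_match = re.search(r'x*', 'abc')

# 'x+' needs at least one 'x', so search() finds nothing and returns None.
no_match = re.search(r'x+', 'abc')

print(empty_match is not None)     # True
print(repr(empty_match.group()))   # ''
print(no_match)                    # None
```

A match object is always truthy, even when it matched an empty string, so the `if ans:` check used in the example still works in that case.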

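2. match

Overview

re.match(pattern, string, flags=0)

The parameters have the same meaning as above.

If zero or more characters at the beginning of string match the regular expression, it returns the corresponding match object.

If the string does not match, it returns None. Note that this is different from a zero-length match.

Note: even in multiline mode, re.match() only matches at the beginning of the string, not at the beginning of each line.

If you want to find a match anywhere in string, use search() instead.

Example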

ans = re.match('abc', 'abcdd')
if ans:
    print('Match result: ', ans.group())
else:
    print('No match')
# out: Match result:  abc

ans = re.match('abc', 'babcdd')
if ans:
    print('Match result: ', ans.group())
else:
    print('No match')
# out: No match
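To illustrate the note about multiline mode, here is a small sketch comparing match() with search() under re.MULTILINE. The sample text is made up for the illustration.

```python
import re

text = 'first line\nsecond line'

# match() is anchored at the start of the string, even with re.MULTILINE,
# so it cannot find 'second' at the start of the second line.
m = re.match(r'second', text, flags=re.MULTILINE)

# search() with '^' and re.MULTILINE can match at the start of any line.
s = re.search(r'^second', text, flags=re.MULTILINE)

print(m)          # None
print(s.group())  # second
```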

3. fullmatch

Overview

re.fullmatch(pattern, string, flags=0)

The whole string must match the regular expression.

If it does, the corresponding match object is returned; otherwise None is returned.

Example

ans = re.fullmatch('abc.dd', 'abcddd')

if ans:
    print('Match result: ', ans.group())
else:
    print('No match')
# out: Match result:  abcddd
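To see how fullmatch() differs from match(), here is a short comparison on the same pattern. The pattern and strings are illustrative only.

```python
import re

pattern = r'\d+'

# match() only needs the beginning of the string to match.
prefix = re.match(pattern, '123abc')

# fullmatch() fails here because 'abc' is left over.
whole = re.fullmatch(pattern, '123abc')

# fullmatch() succeeds when the pattern covers the entire string.
exact = re.fullmatch(pattern, '123')

print(prefix.group())  # 123
print(whole)           # None
print(exact.group())   # 123
```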

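4. split

Overview

re.split(pattern, string, maxsplit=0, flags=0)

Splits string by the occurrences of pattern.

If pattern contains capturing parentheses, the text of every group is also included in the resulting list.

maxsplit sets the maximum number of splits; the rest of the string is returned as the last element of the list. The default, 0, means no limit.

Example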

# Split on runs of non-word characters (anything other than letters, digits, and underscore)
re.split(r'\W+', 'Words, words, words.')
# out: ['Words', 'words', 'words', '']

# With a capturing group, the separators are kept in the result list too
re.split(r'(\W+)', 'Words, words, words.')
# out: ['Words', ', ', 'words', ', ', 'words', '.', '']

# Split only once
re.split(r'\W+', 'Words, words, words.', maxsplit=1)
# out: ['Words', 'words, words.']

# Split on runs of characters in [a-f], ignoring case (two equivalent spellings)
re.split('(?i)[a-f]+', '0a3aB9')
re.split('[a-f]+', '0a3aB9', flags=re.IGNORECASE)
# out: ['0', '3', '9']

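5. findall

Overview

re.findall(pattern, string, flags=0)

The string is scanned from left to right, and matches are returned in the order found.

If the pattern contains one group, a list of the strings matched by that group is returned; if it contains several groups, a list of tuples is returned, one tuple per match.

Empty matches are included in the result.

The previous two parts used findall throughout, so there is no separate example here.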
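Since the return shape of findall depends on how many groups the pattern has, here is a small sketch of the three cases. The sample text and variable names are invented for the illustration.

```python
import re

text = 'width=10, height=20'

# No group: each item is the whole match.
no_group = re.findall(r'\w+=\d+', text)

# One group: each item is only the text captured by that group.
one_group = re.findall(r'\w+=(\d+)', text)

# Several groups: each item is a tuple with one entry per group.
two_groups = re.findall(r'(\w+)=(\d+)', text)

print(no_group)    # ['width=10', 'height=20']
print(one_group)   # ['10', '20']
print(two_groups)  # [('width', '10'), ('height', '20')]
```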

6. finditer

Overview

re.finditer(pattern, string, flags=0)

Much like findall; the difference is that it returns an iterator that yields match objects.

Example

for ans in re.finditer(r'\w+', 'Words, words, words.'):
    print(ans.group(), end='\t')
# out: Words	words	words
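Because finditer yields match objects rather than plain strings, you can also read the position of each match. A minimal sketch, reusing the same sample text:

```python
import re

text = 'Words, words, words.'

# Collect each matched word together with its start and end index.
positions = [(m.group(), m.start(), m.end())
             for m in re.finditer(r'\w+', text)]

print(positions)
# [('Words', 0, 5), ('words', 7, 12), ('words', 14, 19)]
```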

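7. sub

Overview

re.sub(pattern, repl, string, count=0, flags=0)

Replaces the substrings of string that match pattern with repl, and returns the resulting string.

If the pattern is not found, string is returned unchanged.

repl can be a string or a function.

String: backslash escape sequences in it are processed, so \n is converted to a newline character; unknown escapes such as \& are left as they are (since Python 3.7, an unknown escape made of a backslash and an ASCII letter is an error). Backreferences such as \2 are replaced with the substring matched by group 2 of the pattern.
Function: it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument and returns the replacement string.

The optional argument count is the maximum number of occurrences to replace. It must be a non-negative integer; the default, 0, replaces all occurrences.

Example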

re.sub(r'\w+', '123', 'hello, world, hello python')
# out: '123, 123, 123 123'

re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
       r'static PyObject*\npy_\1(void)\n{',
       'def myfunc():')
# out: 'static PyObject*\npy_myfunc(void)\n{'
# pattern: matches a Python function definition that takes no parameters
# repl: \1 refers to the captured function name, myfunc; everything else is emitted as is

def dashrepl(matchobj):
    # A single '-' becomes a space; a double '--' becomes a single '-'
    if matchobj.group(0) == '-':
        return ' '
    else:
        return '-'

re.sub('-{1,2}', dashrepl, 'pro----gram-files')
# out: 'pro--gram files'
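The count parameter and backreferences were described above but not shown, so here is a small sketch of both. The helper name shout is hypothetical, made up for this illustration.

```python
import re

text = 'hello, world, hello python'

# count=1 replaces only the first occurrence.
first_only = re.sub(r'hello', 'HI', text, count=1)

def shout(matchobj):
    # Hypothetical helper: return the matched text in upper case.
    return matchobj.group(0).upper()

# A function as repl, limited to the first two matches.
upper_two = re.sub(r'\w+', shout, text, count=2)

# \1 and \2 refer to the first and second captured groups.
swapped = re.sub(r'(\w+), (\w+)', r'\2, \1', 'hello, world')

print(first_only)  # HI, world, hello python
print(upper_two)   # HELLO, WORLD, hello python
print(swapped)     # world, hello
```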

8. subn

Overview

re.subn(pattern, repl, string, count=0, flags=0)

Same as sub(), but returns a tuple (new string, number of substitutions made).

Example

re.subn(r'\w+', '123', 'hello, world, hello python')
# out: ('123, 123, 123 123', 4)
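One place where the substitution count can come in handy is checking whether anything was actually replaced. A rough sketch, with made-up input strings:

```python
import re

# Collapse each run of whitespace into a single space and count the replacements.
cleaned, n = re.subn(r'\s+', ' ', 'a  b \t c')

# When nothing matches, the string comes back unchanged with a count of 0.
unchanged, zero = re.subn(r'\d+', '#', 'no digits')

print(cleaned, n)       # a b c 2
print(unchanged, zero)  # no digits 0
```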

Summary

All right, that was a lot of functions in one go, and you probably haven't digested them all yet.

Let's stop here for today.

See you tomorrow.
