Python Cookbook Summary
Matching Strings with Shell Wildcards
You want to match text strings using the wildcards commonly used in Unix shells (such as *.py, Dat[0-9]*.csv, etc.).
>>> from fnmatch import fnmatch, fnmatchcase
>>> fnmatch('foo.txt', '*.txt')
True
>>> fnmatch('foo.txt', '?oo.txt')
True
>>> fnmatch('Dat45.csv', 'Dat[0-9]*')
True
>>> names = ['Dat1.csv', 'Dat2.csv', 'config.ini', 'foo.py']
>>> [name for name in names if fnmatch(name, 'Dat*.csv')]
['Dat1.csv', 'Dat2.csv']
fnmatch() matches patterns using the case-sensitivity rules of the underlying operating system, which differ from one system to another. For example:
>>> # On OS X (Mac)
>>> fnmatch('foo.txt', '*.TXT')
False
>>> # On Windows
>>> fnmatch('foo.txt', '*.TXT')
True
Matching and Searching Strings
If the text you want to match is a simple literal, calling the basic string methods is usually enough.
>>> text = 'yeah, but no, but yeah, but no, but yeah'
>>> # Exact match
>>> text == 'yeah'
False
>>> # Match at start or end
>>> text.startswith('yeah')
True
>>> text.endswith('no')
False
>>> # Search for the location of the first occurrence
>>> text.find('no')
10
For more complicated matching, use the regular expression module re.
>>> text1 = '11/27/2012'
>>> import re
>>> re.match(r'\d+/\d+/\d+', text1)
>>> datepat = re.compile(r'\d+/\d+/\d+')
>>> datepat.match(text1)
>>> text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
>>> datepat.findall(text)
['11/27/2012', '3/13/2013']
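Two details are easy to miss here: match() only tests the beginning of the string, and capture groups give access to the individual fields. The sketch below assumes a grouped variant of the date pattern; the name grouped_datepat is introduced only for this illustration.

```python
import re

# Same date pattern as above, with each field in its own capture group.
grouped_datepat = re.compile(r'(\d+)/(\d+)/(\d+)')

m = grouped_datepat.match('11/27/2012')
month, day, year = m.groups()
print(month, day, year)  # 11 27 2012

text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
# match() anchors at the start of the string, so it finds nothing here.
at_start = grouped_datepat.match(text)
print(at_start)  # None

# With groups in the pattern, findall() returns one tuple per match.
dates = grouped_datepat.findall(text)
print(dates)  # [('11', '27', '2012'), ('3', '13', '2013')]
```

To find a match anywhere in the text and still get a match object, search() or finditer() can be used instead of match().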
Searching and Replacing Strings
For simple literal patterns, just use the str.replace() method. For example:
>>> text = 'yeah, but no, but yeah, but no, but yeah'
>>> text.replace('yeah', 'yep')
'yep, but no, but yep, but no, but yep'
For more complicated patterns, use the sub() function in the re module. To illustrate, suppose you want to rewrite dates of the form 11/27/2012 as 2012-11-27. For example:
>>> text = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
>>> import re
>>> re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', text)
'Today is 2012-11-27. PyCon starts 2013-3-13.'
# If you plan to perform many substitutions with the same pattern, consider compiling it first for better performance
>>> datepat = re.compile(r'(\d+)/(\d+)/(\d+)')
>>> datepat.sub(r'\3-\1-\2', text)
'Today is 2012-11-27. PyCon starts 2013-3-13.'
# For more complicated substitutions, you can pass a callback function instead of a replacement string
>>> from calendar import month_abbr
>>> def change_date(m):
...     mon_name = month_abbr[int(m.group(1))]
...     return '{} {} {}'.format(m.group(2), mon_name, m.group(3))
...
>>> datepat.sub(change_date, text)
'Today is 27 Nov 2012. PyCon starts 13 Mar 2013.'
If you also want to know how many substitutions were made, in addition to getting the resulting text, use re.subn() instead. For example:
>>> newtext, n = datepat.subn(r'\3-\1-\2', text)
>>> newtext
'Today is 2012-11-27. PyCon starts 2013-3-13.'
>>> n
2
You need to search for and replace text in a case-insensitive manner
>>> text = 'UPPER PYTHON, lower python, Mixed Python'
>>> re.findall('python', text, flags=re.IGNORECASE)
['PYTHON', 'python', 'Python']
>>> re.sub('python', 'snake', text, flags=re.IGNORECASE)
'UPPER snake, lower snake, Mixed snake'
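Note that the replacement above always inserts lowercase 'snake', whatever the case of the matched text. One possible way to carry the case over is a callback, as in the following sketch; matchcase is a helper defined here for illustration, not a function of the re module.

```python
import re

def matchcase(word):
    """Return a callback that applies the case style of each match to word."""
    def replace(m):
        text = m.group()
        if text.isupper():
            return word.upper()
        if text.islower():
            return word.lower()
        if text[0].isupper():
            return word.capitalize()
        return word
    return replace

text = 'UPPER PYTHON, lower python, Mixed Python'
result = re.sub('python', matchcase('snake'), text, flags=re.IGNORECASE)
print(result)  # UPPER SNAKE, lower snake, Mixed Snake
```

The helper only distinguishes all-uppercase, all-lowercase and capitalized matches; anything else falls back to the word as given.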
Shortest-Match Patterns
# The * operator is greedy and matches as much text as possible
>>> str_pat = re.compile(r'\"(.*)\"')
>>> text1 = 'Computer says "no."'
>>> str_pat.findall(text1)
['no.']
>>> text2 = 'Computer says "no." Phone says "yes."'
>>> str_pat.findall(text2)
['no." Phone says "yes.']
# Adding ? after * makes the match non-greedy, so it matches as little text as possible
>>> str_pat = re.compile(r'\"(.*?)\"')
>>> str_pat.findall(text2)
['no.', 'yes.']
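A negated character class is another common way to get the shortest match for quoted text, and it behaves differently when the text spans lines. The sketch below is an alternative to the patterns above, not part of the original example; neg_pat and multi are names introduced here.

```python
import re

text2 = 'Computer says "no." Phone says "yes."'

# [^"]* cannot cross a quote, so each match ends at the nearest closing quote.
neg_pat = re.compile(r'"([^"]*)"')
short = neg_pat.findall(text2)
print(short)  # ['no.', 'yes.']

# '.' does not match a newline by default, so the non-greedy pattern
# finds nothing in quoted text that spans two lines.
multi = 'He said "line one\nline two"'
dot_result = re.findall(r'"(.*?)"', multi)
print(dot_result)  # []

# The negated class matches newlines too, without needing re.DOTALL.
neg_result = neg_pat.findall(multi)
print(neg_result)  # ['line one\nline two']
```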
Aligning Strings
# Use the string methods ljust(), rjust() and center()
>>> text = 'Hello World'
>>> text.ljust(20)
'Hello World         '
>>> text.rjust(20)
'         Hello World'
>>> text.center(20)
'    Hello World     '
>>> text.rjust(20,'=')
'=========Hello World'
>>> text.center(20,'*')
'****Hello World*****'
# The format() function can also be used to align strings easily.
# All you need to do is use the <, > or ^ character followed by the desired width.
>>> format(text, '>20')
'         Hello World'
>>> format(text, '<20')
'Hello World         '
>>> format(text, '^20')
'    Hello World     '
>>> format(text, '=>20s')
'=========Hello World'
>>> format(text, '*^20s')
'****Hello World*****'
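The same format specifications also work with str.format() and f-strings, and they apply to values other than strings. A short sketch, assuming Python 3.6 or later for the f-string; the variable names are illustrative.

```python
text = 'Hello World'
x = 1.2345

# Align several fields at once with str.format().
row = '{:>10s} {:>10s}'.format('Hello', 'World')
print(repr(row))       # '     Hello      World'

# The fill character, alignment and width can be used in an f-string too.
centered = f'{text:*^20}'
print(repr(centered))  # '****Hello World*****'

# Alignment combines with numeric formatting codes.
num = format(x, '>10.2f')
print(repr(num))       # '      1.23'
```

Unlike ljust(), rjust() and center(), which exist only on strings, format() handles any value that supports the format specification.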
Reformatting Text to a Fixed Column Width
s = "Look into my eyes, look into my eyes, the eyes, the eyes, \
the eyes, not around the eyes, don't look around the eyes, \
look into my eyes, you're under."
>>> import textwrap
>>> print(textwrap.fill(s, 70))
Look into my eyes, look into my eyes, the eyes, the eyes, the eyes,
not around the eyes, don't look around the eyes, look into my eyes,
you're under.
>>> print(textwrap.fill(s, 40))
Look into my eyes, look into my eyes,
the eyes, the eyes, the eyes, not around
the eyes, don't look around the eyes,
look into my eyes, you're under.
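When the lines are needed individually, textwrap.wrap() returns them as a list, and the width can be taken from the current terminal. The following is a minimal sketch; the fallback size of 80x24 is an assumption made for environments without a terminal.

```python
import shutil
import textwrap

s = ("Look into my eyes, look into my eyes, the eyes, the eyes, "
     "the eyes, not around the eyes, don't look around the eyes, "
     "look into my eyes, you're under.")

# wrap() returns a list of lines; fill() joins the same lines with newlines.
lines = textwrap.wrap(s, 40)
print(len(lines))  # 4

# get_terminal_size() uses the fallback when no terminal is attached.
width = shutil.get_terminal_size(fallback=(80, 24)).columns
print(textwrap.fill(s, width))
```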
>>> print(textwrap.fill(s, 40, initial_indent=' '))