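Goal: extract the ID card number and the phone number from a string such as "我的身份證***,我的電話是******,請不要找我。" ("My ID card number is ***, my phone number is ******, please do not contact me."). I searched both Chinese and English sites and found that nearly every pattern out there is written to validate a value, not to extract it from surrounding text. The answer finally turned up in the regex notes of the official PHP documentation: PHP: preg_match - Manual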
The solution first:
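<?php
// "My ID card number is ***, my phone number is ******, please do not contact me."
$string = "我的身份證***,我的電話是******,請不要找我。";
$string = preg_replace('/\s+/', '', $string); // remove all whitespace, including tabs and line ends
// Pattern for Chinese mobile phone numbers. The (?<!\d) and (?!\d) lookarounds keep it from matching inside a longer digit run such as the ID number.
$patternForPhone = '/(?<!\d)(?:\+?86)?1(?:3\d{3}|5[^4\D]\d{2}|8\d{3}|7(?:[235-8]\d{2}|4(?:0\d|1[0-2]|9\d))|9[0-35-9]\d{2}|66\d{2})\d{6}(?!\d)/';
// Pattern for 18-digit ID card numbers (the last character may be X).
$patternForID = '/(?<!\d)([1-6][1-9]|50)\d{4}(18|19|20)\d{2}((0[1-9])|10|11|12)(([0-2][1-9])|10|20|30|31)\d{3}[0-9Xx](?![0-9Xx])/i';
preg_match($patternForPhone, $string, $phones); // the first phone number, if any, ends up in $phones[0]
preg_match($patternForID, $string, $IDs);       // the first ID number, if any, ends up in $IDs[0]
var_dump($phones, $IDs);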
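As a rough illustration of the snippet above, the sketch below runs the same kind of patterns against a string containing made-up numbers and uses preg_match_all to collect every match instead of only the first. The sample ID and phone numbers are invented for the example, the helper name extractAll is hypothetical, and the digit-boundary lookarounds are an assumption added so that the phone pattern cannot match inside the longer ID number.

```php
<?php
// Hypothetical helper: return every full match of $pattern found in $text.
function extractAll(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $matches);
    return $matches[0]; // index 0 holds the full matches
}

// Invented sample data: one ID number and two phone numbers inside ordinary text.
$text = '我的身份證11010119900307123X,我的電話是13800138000,備用電話+8615912345678,請不要找我。';

// Mobile phone pattern, bounded so it cannot start or end in the middle of a digit run.
$patternForPhone = '/(?<!\d)(?:\+?86)?1(?:3\d{3}|5[^4\D]\d{2}|8\d{3}|7(?:[235-8]\d{2}|4(?:0\d|1[0-2]|9\d))|9[0-35-9]\d{2}|66\d{2})\d{6}(?!\d)/';

// 18-digit ID pattern, bounded the same way; it checks the shape only, not the checksum.
$patternForID = '/(?<!\d)([1-6][1-9]|50)\d{4}(18|19|20)\d{2}((0[1-9])|10|11|12)(([0-2][1-9])|10|20|30|31)\d{3}[0-9Xx](?![0-9Xx])/i';

$phones = extractAll($patternForPhone, $text);
$ids    = extractAll($patternForID, $text);

print_r($phones); // 13800138000 and +8615912345678
print_r($ids);    // 11010119900307123X
```

Neither pattern verifies the ID checksum or whether a number is actually in service, so the results are candidates that may still need validation.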
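The original note is quoted below. The metacharacters ^ and $ are the root cause of patterns that can only validate and cannot extract: they anchor the match to the start and end of the subject, so any surrounding text makes the match fail.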
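The difference can be seen with a deliberately simplified pattern. This is a minimal sketch: the "1 followed by ten digits" rule and the sample number are assumptions made for the demonstration, not the full mobile pattern shown earlier.

```php
<?php
$text = '我的電話是13800138000,請不要找我。';

// Simplified mobile pattern, for illustration only: 1 followed by ten digits.
$validatePattern = '/^1\d{10}$/'; // anchored: the whole subject must be a phone number
$extractPattern  = '/1\d{10}/';   // unanchored: the number may sit anywhere in the subject

// The anchored pattern fails because the subject contains other text.
$validated = preg_match($validatePattern, $text);      // 0

// The unanchored pattern finds the number inside the sentence.
$extracted = preg_match($extractPattern, $text, $m);   // 1, $m[0] is '13800138000'

// The anchored pattern only succeeds when the subject is the bare number.
$bare = preg_match($validatePattern, '13800138000');   // 1

var_dump($validated, $extracted, $m[0], $bare);
```

Dropping the anchors is what turns a validator into an extractor. The trade-off is that an unanchored pattern can also match inside a longer run of digits unless it is bounded in some other way.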
Simple regex
Regex quick reference
[abc] A single character: a, b or c
[^abc] Any single character but a, b, or c
[a-z] Any single character in the range a-z
[a-zA-Z] Any single character in the range a-z or A-Z
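^ Start of line (in PHP: start of the string, unless the m modifier is set)
$ End of line (in PHP: end of the string or just before a final newline, unless the m modifier is set)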
\A Start of string
\z End of string
. Any single character
\s Any whitespace character
\S Any non-whitespace character
\d Any digit
\D Any non-digit
\w Any word character (letter, number, underscore)
\W Any non-word character
\b Any word boundary (a zero-width position, not a character)
(...) Capture everything enclosed
(a|b) a or b
a? Zero or one of a
a* Zero or more of a
a+ One or more of a
a{3} Exactly 3 of a
a{3,} 3 or more of a
a{3,6} Between 3 and 6 of a
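options (the quoted list uses Ruby's letters, where m makes dot match newlines and o performs #{...} substitutions only once; the PHP equivalents are):
i case insensitive
m multiline: ^ and $ also match at line boundaries
s make dot match newlines
x ignore whitespace in regex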
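A few entries from the reference can be tried directly in PHP. The snippet below is a small sketch with invented sample text. It shows a capture group, and how the m and s modifiers behave in PHP, where m changes what ^ and $ match and s lets the dot match newlines.

```php
<?php
$text = "id: 42\nid: 7";

// (...) captures what it encloses; \d+ means one or more digits.
preg_match('/id: (\d+)/', $text, $m);
// $m[0] is 'id: 42' (the full match), $m[1] is '42' (the first group)

// Without m, ^ matches only at the very start of the subject.
$withoutM = preg_match_all('/^id/', $text);    // 1

// With m, ^ also matches right after each newline.
$withM = preg_match_all('/^id/m', $text);      // 2

// \A always means start of the subject, even when m is set.
$withA = preg_match_all('/\Aid/m', $text);     // 1

// By default the dot does not match a newline; the s modifier changes that.
$dotDefault = preg_match('/a.b/', "a\nb");     // 0
$dotAll     = preg_match('/a.b/s', "a\nb");    // 1

var_dump($m, $withoutM, $withM, $withA, $dotDefault, $dotAll);
```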