python 正则表达式

语法
一些函数
工具网站

语法

.	any character except newline
\w \d \s	word, digit, whitespace
\W \D \S	not word, digit, whitespace
[abc]	any of a, b, or c
[^abc]	not a, b, or c
[a-g]	character between a & g
Anchors
^abc$	start / end of the string
\b	word boundary
Escaped characters
\. \* \\	escaped special characters
\t \n \r	tab, linefeed, carriage return
\u00A9	unicode escaped ©
Groups & Lookaround
(abc)	capture group
\1	backreference to group #1
(?:abc)	non-capturing group
(?=abc)	positive lookahead
(?!abc)	negative lookahead
Quantifiers & Alternation
a* a+ a?	0 or more, 1 or more, 0 or 1
a{5} a{2,}	exactly five, two or more
a{1,3}	between one & three
a+? a{2,}?	match as few as possible
ab|cd	match ab or cd

一些函数

re.findall(pattern, str) 返回匹配的字符串列表，pattern中使用( )可以只返回( )中的部分
re.match(pattern, string , flags) 从字符串的开始匹配，返回一个匹配对象，失败返回 None，常用于整句匹配（从头匹配）
re.search(pattern, str) 返回第一个匹配对象，可以使用( ), res.group(n), res.span(n), res.start(n), res.end(n)
re.sub(pattern, repl, str) 替换，可以使用 ( ) + \1 反向引用
re.compile(pattern, flags) 编译正则表达式，保存以便后用，避免重复编写，设置参数 flags 如下，可多选 flags = re.M|re.S
- re.I 忽略大小写
- re.M 多行模式，换行重新开始匹配
- re.S .号可匹配一切字符包括换行符
- re.DOTALL 效果同 re.S 适用于跨行匹配的情况
re.split(pattern, string, maxsplit, flags) 匹配的子串来分割字符串，返回 List，maxsplit 设置分隔次数（分隔）

re.sub 示例

s = "don ' t be , shy ."
s = re.sub(r'\s\'\s', "'", s)
s = re.sub(r'\s([\.\!\?\,])', r"\1", s)
print(s)

don't be, shy.

工具网站

https://rubular.com/r/hinmQKDuRO

python 正则表达式

语法

一些函数

工具网站

Similar Posts

Comments