Regular Expressions (A Complete Guide to Regular Expression Syntax)


Category: Coding articles · Published 2025-08-03 1:05:28

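I. Regular Expression Basics

1. What is a regular expression?

A regular expression (regex for short) is a tool for matching and processing text: a pattern written with special symbols and syntax that describes a set of strings.
Typical use cases:

Validating email addresses and phone numbers
Extracting URLs and IP addresses
Replacing sensitive words
Splitting strings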

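II. Python's Regular Expression Module: re

Python supports regular expressions through the standard-library re module. Its core functions include:

re.match(): matches only at the start of the string
re.search(): scans the whole string and returns the first match
re.findall(): returns a list of all non-overlapping matches
re.sub(): replaces matched text
re.split(): splits a string wherever the pattern matches
re.compile(): precompiles a pattern into a reusable pattern object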
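
The differences between these functions are easiest to see side by side. The following is a minimal sketch; the sample sentence and variable names are made up for illustration.

```python
import re

text = "Order 66 shipped on 2025-07-19, order 67 on 2025-07-20."

# re.match() only succeeds if the pattern matches at position 0.
at_start = re.match(r"\d+", text)               # None: the text starts with "Order"

# re.search() scans forward and stops at the first match.
first_number = re.search(r"\d+", text).group()

# re.findall() collects every non-overlapping match into a list.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# re.sub() replaces every match with the replacement string.
masked = re.sub(r"\d{4}-\d{2}-\d{2}", "<date>", text)

# re.compile() returns a pattern object that offers the same operations as methods.
date_re = re.compile(r"\d{4}-\d{2}-\d{2}")

print(at_start, first_number, dates)
print(masked)
print(date_re.findall(text) == dates)
```

Note that re.match() returns None rather than raising an error when nothing matches, so check the result before calling .group() on it.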

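III. Regular Expression Syntax

1. Basic symbols

Symbol	Meaning	Example	Sample match
.	Matches any single character except a newline	a.c → matches abc	abc
*	Matches the preceding element 0 or more times	ab* → matches a, ab, abb	abb
+	Matches the preceding element 1 or more times	ab+ → matches ab, abb	ab
?	Matches the preceding element 0 or 1 time	colou?r → matches color or colour	color
{n}	Matches the preceding element exactly n times	a{3} → matches aaa	aaa
{n,m}	Matches the preceding element between n and m times	a{2,4} → matches aa, aaa, aaaa	aaa
^	Matches the start of the string	^Hello → matches strings that start with Hello	Hello (in "Hello World")
$	Matches the end of the string	World$ → matches strings that end with World	World (in "Hello World")
[]	Matches any one character in the set	[abc] → matches a, b, or c	a
[^...]	Matches any one character not in the set	[^0-9] → matches any non-digit character	a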
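
The rows of the table can be checked directly in Python. This is a small sketch that uses re.fullmatch() so that each test string must match in its entirety; the helper name check and the extra test strings are illustrative choices, not part of the original table.

```python
import re

# (pattern, strings that should match in full, strings that should not)
cases = [
    (r"a.c",     ["abc", "a-c"],        ["ac", "a\nc"]),
    (r"ab*",     ["a", "ab", "abb"],    ["b"]),
    (r"ab+",     ["ab", "abb"],         ["a"]),
    (r"colou?r", ["color", "colour"],   ["colouur"]),
    (r"a{3}",    ["aaa"],               ["aa", "aaaa"]),
    (r"a{2,4}",  ["aa", "aaa", "aaaa"], ["a", "aaaaa"]),
    (r"[abc]",   ["a", "b", "c"],       ["d", "ab"]),
    (r"[^0-9]",  ["a", "_"],            ["7"]),
]

def check(pattern, should_match, should_fail):
    """Return True if the pattern accepts and rejects exactly as expected."""
    accepted = all(re.fullmatch(pattern, s) for s in should_match)
    rejected = not any(re.fullmatch(pattern, s) for s in should_fail)
    return accepted and rejected

results = {pattern: check(pattern, yes, no) for pattern, yes, no in cases}
print(results)

# Anchors are easier to observe with re.search():
starts = bool(re.search(r"^Hello", "Hello World"))
ends = bool(re.search(r"World$", "Hello World"))
not_start = bool(re.search(r"^World", "Hello World"))
print(starts, ends, not_start)
```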

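2. Escaping special characters

To match a special character literally (such as . or *), escape it with a backslash \. Writing the pattern as a raw string (r"...") keeps Python from interpreting the backslash itself:

pattern = r"\."  # matches a literal dot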
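
When the text to match comes from elsewhere (user input, for instance), escaping each character by hand is error-prone. The sketch below uses re.escape() from the standard library for that; the sample strings are invented for illustration.

```python
import re

# Hypothetical user-supplied text that happens to contain metacharacters.
literal = "1+1=2 (approx.)"

# re.escape() backslash-escapes the characters that are special in a pattern,
# so the resulting pattern matches the text literally.
safe_pattern = re.escape(literal)
found = re.search(safe_pattern, "We know that 1+1=2 (approx.) holds.")
print(found.group())

# The difference an escape makes:
loose = re.fullmatch(r"3.14", "3x14")    # matches: "." means any character
strict = re.fullmatch(r"3\.14", "3x14")  # None: "\." means a literal dot
print(loose is not None, strict is not None)
```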

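3. Grouping and capturing

Grouping: wrap a sub-pattern in () so it can be extracted or repeated as a unit.

import re

pattern = r"(\d{4})-(\d{2})-(\d{2})"  # one group each for year, month, and day
match = re.match(pattern, "2025-07-19")
print(match.group(1))  # Output: 2025

Named groups: give a group a name with (?P<name>...) so it can be referenced by name.

pattern = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
match = re.match(pattern, "2025-07-19")
print(match.group("year"))  # Output: 2025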
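
Beyond group(), a match object can return all groups at once, and captured groups can be reused in a replacement string. A brief sketch, with variable names and the sample sentence chosen only for illustration:

```python
import re

date_re = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
match = date_re.match("2025-07-19")

all_groups = match.groups()   # every captured group, as a tuple
by_name = match.groupdict()   # named groups, as a dict
whole = match.group(0)        # group 0 is the entire match

# In a replacement string, \g<name> (or \1, \2, ...) refers to a captured group.
reordered = date_re.sub(r"\g<day>/\g<month>/\g<year>", "Due 2025-07-19.")

print(all_groups, by_name, whole, reordered)
```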

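4. Greedy and non-greedy matching

Greedy mode (the default): a quantifier matches as many characters as possible.

pattern = r"<.*>"  # in `<html>content</html>`, matches the whole `<html>content</html>`

Non-greedy mode: a quantifier matches as few characters as possible (add ? after the quantifier).

pattern = r"<.*?>"  # in `<html>content</html>`, matches `<html>` and `</html>` separately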
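
Running both patterns against the same input makes the difference concrete. A minimal sketch; the a{2,4} comparison is an added illustration of the same rule on a bounded quantifier.

```python
import re

html = "<html>content</html>"

greedy = re.findall(r"<.*>", html)   # one match spanning the whole string
lazy = re.findall(r"<.*?>", html)    # stops at the first ">" each time

# The ? suffix applies to any quantifier: *?, +?, ?? and {n,m}?
longest = re.search(r"a{2,4}", "aaaa").group()
shortest = re.search(r"a{2,4}?", "aaaa").group()

print(greedy, lazy, longest, shortest)
```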

IV. Practical Examples

Example 1: Validating an email address

import re

email = "test@example.com"
pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+$"

if re.match(pattern, email):
    print("Email format is valid!")
else:
    print("Email format is invalid!")
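
To see what this kind of pattern accepts and rejects, it helps to run it over several inputs. The sketch below wraps the same character classes in a hypothetical helper, is_valid_email, and uses re.fullmatch() instead of ^ and $. It is a simplified format check, not a full implementation of the email address standards.

```python
import re

# Same character classes as the pattern above, without ^ and $:
# re.fullmatch() already requires the whole string to match.
EMAIL_RE = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+")

def is_valid_email(address):  # hypothetical helper name
    return EMAIL_RE.fullmatch(address) is not None

samples = [
    "test@example.com",
    "first.last+tag@mail.example.co.uk",
    "user@localhost",            # no dot after the @
    "no-at-sign.example.com",
    "test@example.com\n",        # trailing newline
]
checked = {s: is_valid_email(s) for s in samples}
for address, ok in checked.items():
    print(repr(address), ok)

# With re.match() and $, the trailing newline slips through,
# because $ also matches just before a final newline.
anchored = re.match(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+$",
                    "test@example.com\n")
print(anchored is not None)
```

Using fullmatch() is a design choice here: it rejects input with a trailing newline, which the $ anchor alone does not.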

Example 2: Extracting URLs

import re

text = "Official site: https://www.example.com or http://blog.example.com"
pattern = r"https?://\S+"  # http or https, then everything up to the next whitespace

urls = re.findall(pattern, text)
print(urls)  # Output: ['https://www.example.com', 'http://blog.example.com']

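Example 3: Replacing sensitive words

import re

text = "这个产品非常棒，但是价格太贵了！"  # "This product is great, but the price is too high!"
pattern = r"贵|棒"  # no \b here: Chinese characters count as \w, so \b fails between them

# A replacement function can read each match, so every word is masked by its own length.
censored_text = re.sub(pattern, lambda m: "*" * len(m.group()), text)
print(censored_text)  # Output: 这个产品非常*，但是价格太*了！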

Example 4: Splitting a string

import re

text = "apple,banana;orange,grape"
pattern = r"[,;]"  # split on a comma or a semicolon

fruits = re.split(pattern, text)
print(fruits)  # Output: ['apple', 'banana', 'orange', 'grape']

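V. Common Questions and Solutions

1. How do I match a newline?

By default, . does not match a newline. Pass the re.DOTALL flag (or re.S) so that . matches newlines as well:

import re

pattern = re.compile(r".*", re.DOTALL)
match = pattern.match("line1\nline2")
print(repr(match.group()))  # Output: 'line1\nline2'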
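
Comparing the result with and without the flag shows exactly what changes. A short sketch; the inline (?s) form and the [\s\S] class are alternative spellings added here for comparison.

```python
import re

text = "line1\nline2"

without_flag = re.match(r".*", text).group()          # stops at the newline
with_flag = re.match(r".*", text, re.DOTALL).group()  # runs through it
inline_flag = re.match(r"(?s).*", text).group()       # same flag, set inside the pattern

# A class such as [\s\S] matches any character regardless of flags.
no_flag_needed = re.match(r"[\s\S]*", text).group()

print(repr(without_flag), repr(with_flag))
```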

2. How do I ignore case?

Use the re.IGNORECASE flag (or its short form re.I):

import re

pattern = re.compile(r"python", re.IGNORECASE)
match = pattern.match("PYTHON")  # matches
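
Flags can also be combined when more than one is needed. The following sketch joins re.IGNORECASE and re.DOTALL with the bitwise OR operator; the sample text is made up.

```python
import re

text = "Hello\nWORLD"

# Flags are combined with the bitwise OR operator.
both = re.search(r"hello.*world", text, re.IGNORECASE | re.DOTALL)

# With only one of the two flags, the same pattern fails on this text.
only_case = re.search(r"hello.*world", text, re.IGNORECASE)  # "." stops at "\n"
only_dot = re.search(r"hello.*world", text, re.DOTALL)       # case differs

# Inline form: (?is) at the start of the pattern sets both flags.
inline = re.search(r"(?is)hello.*world", text)

print(both is not None, only_case, only_dot, inline is not None)
```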

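3. How do I handle Chinese characters?

Chinese characters can be written directly in the pattern. Note that \b treats Chinese characters as word characters, so it only matches where a word meets punctuation, whitespace, or the start or end of the string, as in this example:

import re

pattern = r"\b(你好|谢谢)\b"  # 你好 = "hello", 谢谢 = "thank you"
text = "你好，今天天气不错，谢谢！"  # "Hello, the weather is nice today, thank you!"
matches = re.findall(pattern, text)
print(matches)  # Output: ['你好', '谢谢']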
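
To pull Chinese text out of a mixed string, a Unicode range is a common approach. This sketch assumes that "Chinese characters" means the CJK Unified Ideographs block U+4E00 to U+9FFF; extension blocks and full-width punctuation are not covered. It also shows how \b behaves inside a run of Chinese characters.

```python
import re

# Assumption: only the basic CJK Unified Ideographs block is of interest.
HAN_RE = re.compile(r"[\u4e00-\u9fff]+")

mixed = "Python很强大, regex也很有用!"
runs = HAN_RE.findall(mixed)
print(runs)

# \b does not separate adjacent Chinese characters, so a word inside a
# longer run is only found without the boundary assertions.
with_boundary = re.findall(r"\b谢谢\b", "非常谢谢你")
without_boundary = re.findall(r"谢谢", "非常谢谢你")
print(with_boundary, without_boundary)
```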

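4. Performance tips

Precompile patterns: compile a pattern once when it is used repeatedly. The re module also caches recently used patterns internally, so the gain comes mostly from skipping that lookup in hot loops and from keeping the pattern in one named place.

import re

pattern = re.compile(r"\d{4}-\d{2}-\d{2}")
result = pattern.match("2025-07-19")

Avoid overusing wildcards: an unbounded .* can force the engine to backtrack heavily. A more specific character class usually matches faster and more precisely.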
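
One common way to cut down on .* is to replace it with a negated character class. The sketch below extracts quoted strings both ways; the sample text is invented, and the example shows the change in matching behavior rather than measuring speed.

```python
import re

text = 'say "hi" and "bye"'

# Greedy .* runs to the end of the string and backtracks to the last quote,
# so it spans both quoted strings.
greedy = re.findall(r'"(.*)"', text)

# A negated class can never cross a quote, so each match ends at the
# first closing quote without backtracking.
specific = re.findall(r'"([^"]*)"', text)

print(greedy)
print(specific)
```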

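VI. Summary and Next Steps

Key takeaways:

Understand the basic regular expression syntax (character sets, quantifiers, groups).
Know the core functions of Python's re module (match, search, findall, sub).
Be able to solve practical problems (email validation, URL extraction, sensitive-word replacement).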
