Using Python's re module (regular expressions)

(Python)

import re
re.match('the','the apple') # try to match the pattern 'the' at the start of 'the apple'

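(Linux, interactive interpreter)

>>> import re
>>> re.match('the','the apple') # match only matches at the start of the string
<_sre.SRE_Match object at 0x7ff18a8c9a58> # a match object is returned, so the pattern was found
>>> re.match('the','apple')
>>> # nothing is printed: match returned None because the pattern was not found
>>> x=re.match('the','the apple')
>>> x.group()
'the'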
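
The session above shows that match returns either a match object or nothing. A minimal sketch of how that result might be checked in a script follows; the variable names matched, span and miss are only illustrative.

```python
import re

x = re.match('the', 'the apple')

# A successful match returns a match object; a failed one returns None,
# so test the result before calling methods on it.
if x is not None:
    matched = x.group()   # the text that matched
    span = x.span()       # (start, end) offsets of the match in the input
    print(matched)
    print(span)

miss = re.match('the', 'apple')
print(miss is None)
```

Calling group() on a failed match raises AttributeError, because None has no such method, which is why the check comes first.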

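>>> y=re.search('the','hello the world') # search can match at any position, but returns only the first match
>>> y.group()
'the'

>>> z=re.findall('the','hello the world the') # findall returns every match, at any position
>>> z # the result is a list, which has no group() method, so use it directly
['the', 'the']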
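
To make the difference between the three functions easier to see, here is a small sketch that runs all of them on the same string. The names text, at_start, first and every are chosen for this illustration only.

```python
import re

text = 'hello the world the'

# match only tries position 0, so it finds nothing in this string
at_start = re.match('the', text)
print(at_start)

# search scans forward and stops at the first hit
first = re.search('the', text)
print(first.group())
print(first.start())   # offset of the first hit

# findall collects every non-overlapping hit as a list of strings
every = re.findall('the', text)
print(every)
```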

Counting matches with regular expressions
>>> abc={'firefox':1}          # map the key 'firefox' to the value 1
>>> abc.get('firefox',22)      # look up 'firefox': return its value if the key exists, otherwise the default 22
1
>>> abc.get('uc',22)
22
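
The reason get matters for counting is that a default of 0 lets the first occurrence of a key be handled the same way as later ones. A minimal sketch, using a made-up list named words:

```python
# Hypothetical input, just for illustration
words = ['firefox', 'uc', 'firefox']

counts = {}
for w in words:
    # get returns 0 the first time a key is seen, so no KeyError is raised
    counts[w] = counts.get(w, 0) + 1

print(counts)
```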

Count how many requests came from each browser:
[root@room8pc205 PycharmProjects]# vim access.log
192.168.4.5   /a.html windows   firefox
192.168.4.6   /a.html linux     uc
192.168.4.5   /b.html linux     firefox
192.168.4.7   /a.html windows   ie
192.168.4.5   /b.html windows   firefox
192.168.4.7   /a.html unix      chmore

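import re
abc={}
with open('access.log') as x:              # the with block closes the file when the loop is done
    for i in x:
        m=re.search('firefox|uc|ie',i)     # look for 'firefox', 'uc' or 'ie' in each line; search returns only the first match
        if m:                              # m is None when the line contains no match
            key=m.group()                  # the browser name that matched
            abc[key]=abc.get(key,0)+1      # add 1 to this browser's count, starting from 0
print(abc)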

Result: /usr/bin/python2.7 /root/PycharmProjects/v002/zhengze.py
{'ie': 1, 'firefox': 3, 'uc': 1}
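
The same count could also be written with collections.Counter from the standard library. This is an alternative sketch, not the script above: the sample lines are embedded in a list named lines so that it runs without access.log, and the \b word boundaries are an added assumption to keep 'ie' or 'uc' from matching inside a longer word.

```python
import re
from collections import Counter

# Sample lines copied from the access.log shown above
lines = [
    '192.168.4.5   /a.html windows   firefox',
    '192.168.4.6   /a.html linux     uc',
    '192.168.4.5   /b.html linux     firefox',
    '192.168.4.7   /a.html windows   ie',
    '192.168.4.5   /b.html windows   firefox',
    '192.168.4.7   /a.html unix      chmore',
]

# \b restricts each alternative to a whole word
browser = re.compile(r'\b(?:firefox|uc|ie)\b')

counts = Counter()
for line in lines:
    m = browser.search(line)
    if m:
        counts[m.group()] += 1   # a missing key starts at 0 automatically

print(dict(counts))
```

The last sample line is not counted in either version, because its browser field matches none of the three alternatives.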

>>> import re
>>> patt=re.compile('foo')   # re.compile builds a reusable pattern object, so the pattern is not parsed again each time it is used
>>> m=patt.match('food')
>>> print m.group()
foo
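
A compiled pattern pays off mainly when the same pattern is applied to many strings. The sketch below reuses one pattern object over a made-up list named words; the list and the result names are illustrative.

```python
import re

patt = re.compile('foo')
words = ['food', 'seafood', 'bar']   # hypothetical inputs

# The pattern object has the same methods as the module-level functions
starts_with_foo = [w for w in words if patt.match(w)]
contains_foo = [w for w in words if patt.search(w)]

print(starts_with_foo)
print(contains_foo)
```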

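>>> mylist=re.split(r'\.|-','hello-world.data')     # re.split cuts the string wherever the pattern matches (here '.' or '-') and returns a list; the r prefix keeps the backslash literal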
>>> print mylist
['hello', 'world', 'data']
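
The same split could be written with a character class, and the maxsplit argument limits how many cuts are made. A small sketch, with parts and head as illustrative names:

```python
import re

# Inside [...] the dot is literal and a trailing '-' is literal too
parts = re.split(r'[.-]', 'hello-world.data')
print(parts)

# maxsplit=1 stops after the first separator
head = re.split(r'[.-]', 'hello-world.data', maxsplit=1)
print(head)
```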

>>> m=re.sub('a','xxx','a ni hao a hehe') # re.sub replaces every match of the pattern with the replacement string
>>> print m
xxx ni hxxxo xxx hehe
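
Note that the 'a' inside 'hao' was replaced as well. If only the standalone word should change, or only the first match, a sketch along these lines might help; the variable names are illustrative.

```python
import re

s = 'a ni hao a hehe'

# \b matches at word boundaries, so the 'a' inside 'hao' is left alone
whole_words = re.sub(r'\ba\b', 'xxx', s)
print(whole_words)

# count=1 replaces only the first match
first_only = re.sub('a', 'xxx', s, count=1)
print(first_only)

# subn also reports how many replacements were made
replaced, n = re.subn('a', 'xxx', s)
print(n)
```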
