python3 词法拆分

时间:2023-02-02 17:43:23

1.可以利用translate+string模块

2.可以利用jieba进行分词(结巴分词会分成词,但是我需要断句,所以这里不用)

3.利用python内置函数解决

仅仅只列出第3种方法,其他两种方法更加简单点所以就不列出来了。上代码:

 s = input("搜一搜:")
py = zw = sz = fh = ''
py_list = []
zw_list = []
sz_list = []
fh_list = []
py_num = zw_num = sz_num = fh_num = 0
for i in range(len(s)):
if s[i].encode('UTF-8').isalpha():
if(i == py_num or i-py_num==1 or py == ''):
py += s[i]
py_num = i
else:
py_list.append(py)
py = ''+s[i]
py_num = i
elif (s[i].isdigit()):
if (i == sz_num or i - sz_num == 1 or sz == ''):
sz += s[i]
sz_num = i
else:
sz_list.append(sz)
sz = '' + s[i]
sz_num = i
elif (s[i].isalpha()):
if (i == zw_num or i - zw_num == 1 or zw == ''):
zw += s[i]
zw_num = i
else:
zw_list.append(zw)
zw = '' + s[i]
zw_num = i
else:
if (i == fh_num or i - fh_num == 1 or fh == ''):
fh += s[i]
fh_num = i
else:
fh_list.append(fh)
fh = '' + s[i]
fh_num = i
if py not in py_list:
py_list.append(py)
if sz not in sz_list:
sz_list.append(sz)
if zw_list not in zw_list:
zw_list.append(zw)
if fh not in fh_list:
fh_list.append(fh)
print('数字:{}\n中文:{}\n拼音:{}\n符号:{}\n'.format('、'.join(sz_list),'、'.join(zw_list),'、'.join(py_list),''.join(fh_list)))