Python: Recognizing Text in Images

Date: 2021-04-18 16:20:49

  Baidu AI API (handwriting recognition): https://ai.baidu.com/docs#/OCR-API/9ef46660



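  Step 1: Connect to the API

  Register an account on the site above, then run the code in this post. Once you have an access_token, this step is done. (Baidu's access_token expires every 30 days, so the code includes a date check that renews it; the same code shows how the access_token is obtained.)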
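
  As a rough sketch of this step, the token request can be assembled from your own AK/SK pair instead of pasting the keys into the URL by hand. The helper names build_token_url and fetch_access_token are my own and not part of Baidu's API. The endpoint and query parameters are the ones used in the code later in this post, and the sketch assumes the JSON reply carries an access_token field.

```python
import json
from urllib import parse, request

TOKEN_ENDPOINT = 'https://aip.baidubce.com/oauth/2.0/token'


def build_token_url(api_key, secret_key):
    """Build the token request URL from an AK/SK pair (hypothetical helper)."""
    query = parse.urlencode({
        'grant_type': 'client_credentials',
        'client_id': api_key,         # AK from the Baidu console
        'client_secret': secret_key,  # SK from the Baidu console
    })
    return TOKEN_ENDPOINT + '?' + query


def fetch_access_token(api_key, secret_key):
    """Request a token and return the access_token string.

    Assumption: the reply is JSON with an 'access_token' field.
    """
    with request.urlopen(build_token_url(api_key, secret_key)) as resp:
        reply = json.loads(resp.read().decode('utf-8'))
    return reply['access_token']
```

  Calling fetch_access_token('your AK', 'your SK') performs the network request; build_token_url alone can be used to inspect the URL without contacting the server.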

  

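  Step 2: What it does

  The image path entered by the user can be a URL on the web or a path on the local machine. To keep things simple, the image is always named ValidateCode.jpg. I connected to Baidu AI's handwriting recognition endpoint, so most ordinary CAPTCHAs should be recognized. If you want to use a different feature, the returned JSON will differ; see the API documentation for details. My aim here is to offer a simplified walkthrough of the Baidu docs.

      request.urlretrieve(imagepath, 'ValidateCode.jpg')  # download the image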
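
  A minimal sketch of how both kinds of input could end up in the same file follows. The helpers is_url and fetch_image are hypothetical names of mine, and copying a local file to ValidateCode.jpg is my own addition; the code in this post only downloads URLs and expects a local image to carry that name already.

```python
import os
import shutil
from urllib import parse, request


def is_url(imagepath):
    """Return True when imagepath looks like an http(s) URL."""
    return parse.urlparse(imagepath).scheme in ('http', 'https')


def fetch_image(imagepath, target='ValidateCode.jpg'):
    """Download a URL, or copy a local file, to target; return its absolute path."""
    if is_url(imagepath):
        request.urlretrieve(imagepath, target)
    elif os.path.abspath(imagepath) != os.path.abspath(target):
        # Skip the copy when the local file already is the target
        shutil.copyfile(imagepath, target)
    return os.path.abspath(target)
```

  Checking the URL scheme rather than searching for the substring 'http' avoids misreading a local file such as pics/http_demo.jpg as a web address.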

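    Renewing the access_token: the Baidu API requires the token to be renewed every 30 days, so I renew it a little early. (Please don't abuse my keys; I'm only sharing them for the sake of this post, QAQ.)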

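      def accesjson():
          # Return the token JSON once the hard-coded token is due for renewal, else None
          fromtime = 1546061002  # start time (Unix timestamp)
          nowtime = int(time.time())
          # 2592000 s is exactly 30 days, so renew early, after 2000000 s
          if nowtime - fromtime <= 2000000:
              return None  # still fresh, so skip the network request
          # The default context verifies certificates and negotiates a current
          # TLS version (the original ssl.PROTOCOL_TLSv1 is deprecated)
          gcontext = ssl.create_default_context()
          # client_id is the AK and client_secret is the SK from the Baidu console
          host = 'https://aip.baidubce.com/oauth/2.0/token?grant_' \
                 'type=client_credentials&client_id=Ooj730ZD0Rm7E1dmcPwoZX9s&client_secret=dr5T1icZGqK8ZFyTr4wi2AWbtNKMIsNd'
          req = request.Request(host)
          response = request.urlopen(req, context=gcontext).read().decode('UTF-8')
          result = json.loads(response)
          return result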
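
    Because fromtime is hard-coded, accesjson() requests a new token on every call once the 2000000 seconds have passed. One possible way around that, sketched below, is to remember when the current token was obtained. TokenCache and token_is_stale are hypothetical names of mine, and the fetch callable is assumed to return a (token, expires_in) pair, for example built from the access_token and expires_in fields of the token reply.

```python
import time

# Safety margin in seconds: 2592000 (30 days) minus the 2000000 used in the post
REFRESH_MARGIN = 592000


def token_is_stale(obtained_at, expires_in=2592000, now=None, margin=REFRESH_MARGIN):
    """Return True once the token is within `margin` seconds of expiry."""
    if now is None:
        now = int(time.time())
    return now - obtained_at > expires_in - margin


class TokenCache:
    """Keep one token in memory and fetch a new one only when it is stale."""

    def __init__(self, fetch, clock=time.time):
        self._fetch = fetch    # callable returning (token, expires_in)
        self._clock = clock    # injectable clock, handy for testing
        self._token = None
        self._obtained_at = 0
        self._expires_in = 0

    def get(self):
        now = int(self._clock())
        if self._token is None or token_is_stale(
                self._obtained_at, self._expires_in, now):
            self._token, self._expires_in = self._fetch()
            self._obtained_at = now
        return self._token
```

    The cache lives only as long as the process; persisting the token and its timestamp to a file would be needed to survive restarts.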

    Recognizing the image: note that different endpoints return different JSON. Inspect the returned JSON to see its structure.

    

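      # Return the text recognized in the CAPTCHA image
      def vercode():
          with open('ValidateCode.jpg', 'rb') as f:
              img = base64.b64encode(f.read())
          # Each Baidu endpoint takes different parameters and returns different JSON
          host = 'https://aip.baidubce.com/rest/2.0/ocr/v1/handwriting'
          headers = {
              'Content-Type': 'application/x-www-form-urlencoded'
          }
          # Swap in a fresh token when the hard-coded one is due for renewal
          token_json = accesjson()  # call once; returns the token JSON or None
          if token_json is None:
              access_token = '24.18591b2e4c97956e0f830db9f66e5373.2592000.1548646630.282335-15301065'
          else:
              # The token JSON is a dict; the token itself is its access_token field
              access_token = token_json['access_token']
              print('Switched to a fresh access_token. Carry on!')
          host = host + '?access_token=' + access_token
          data = {}
          data['access_token'] = access_token
          data['image'] = img
          res = requests.post(url=host, headers=headers, data=data)
          req = res.json()
          return req['words_result'][0]['words']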
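
    The function above returns only the first entry of words_result and raises a KeyError if the reply has no such key. A small sketch of a more defensive parser follows; extract_words is a hypothetical helper of mine, and the error_code and error_msg field names are my assumption about how a failed reply looks, so check them against the API documentation.

```python
def extract_words(reply):
    """Join every recognized line in a reply dict; raise on an error reply."""
    if 'error_code' in reply:
        # Assumed shape of a failed reply: {'error_code': ..., 'error_msg': ...}
        raise RuntimeError('OCR failed: %s %s'
                           % (reply['error_code'], reply.get('error_msg', '')))
    lines = [item['words'] for item in reply.get('words_result', [])]
    return '\n'.join(lines)
```

    With this helper, vercode() could end with return extract_words(res.json()), which also covers images that contain more than one line of text.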

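    Complete code: at present it can recognize text in an image from the web or in a local image (similar to the earlier post on loading Douyin images).

    Further reading: https://ai.qq.com/   (Tencent's API examples are all in PHP and I could not work out how to use them, so I gave up on it.)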

      #!/usr/bin/env python
      # -*- coding: utf-8 -*-
      # @Time : 2018/12/29 10:48
      # @Author : Empirefree
      # @File : 17-2-验证码.py
      # @Software: PyCharm Community Edition

      import base64
      import requests
      from urllib import request
      import os
      import ssl
      import json
      import time
      import re


      def IsHttp(imagepath):
          # True when the path starts with http:// or https://
          return re.match(r'https?://', imagepath) is not None


      # Download the CAPTCHA image
      def downloadpic(imagepath):
          # imagepath = "http://210.42.38.26:84/jwc_glxt/ValidateCode.aspx"
          if IsHttp(imagepath):
              request.urlretrieve(imagepath, 'ValidateCode.jpg')  # download the image
          # A local image is expected to be named ValidateCode.jpg already
          print(os.path.abspath('ValidateCode.jpg'))


      # Baidu requires the access_token to be renewed every 30 days
      def accesjson():
          # Return the token JSON once the hard-coded token is due for renewal, else None
          fromtime = 1546061002  # start time (Unix timestamp)
          nowtime = int(time.time())
          # 2592000 s is exactly 30 days, so renew early, after 2000000 s
          if nowtime - fromtime <= 2000000:
              return None  # still fresh, so skip the network request
          # The default context verifies certificates and negotiates a current
          # TLS version (the original ssl.PROTOCOL_TLSv1 is deprecated)
          gcontext = ssl.create_default_context()
          # client_id is the AK and client_secret is the SK from the Baidu console
          host = 'https://aip.baidubce.com/oauth/2.0/token?grant_' \
                 'type=client_credentials&client_id=Ooj730ZD0Rm7E1dmcPwoZX9s&client_secret=dr5T1icZGqK8ZFyTr4wi2AWbtNKMIsNd'
          req = request.Request(host)
          response = request.urlopen(req, context=gcontext).read().decode('UTF-8')
          result = json.loads(response)
          return result


      # Return the text recognized in the CAPTCHA image
      def vercode():
          with open('ValidateCode.jpg', 'rb') as f:
              img = base64.b64encode(f.read())
          # Each Baidu endpoint takes different parameters and returns different JSON
          host = 'https://aip.baidubce.com/rest/2.0/ocr/v1/handwriting'
          headers = {
              'Content-Type': 'application/x-www-form-urlencoded'
          }
          # Swap in a fresh token when the hard-coded one is due for renewal
          token_json = accesjson()  # call once; returns the token JSON or None
          if token_json is None:
              access_token = '24.18591b2e4c97956e0f830db9f66e5373.2592000.1548646630.282335-15301065'
          else:
              # The token JSON is a dict; the token itself is its access_token field
              access_token = token_json['access_token']
              print('Switched to a fresh access_token. Carry on!')
          host = host + '?access_token=' + access_token
          data = {}
          data['access_token'] = access_token
          data['image'] = img
          res = requests.post(url=host, headers=headers, data=data)
          req = res.json()
          return req['words_result'][0]['words']


      def checkcode():
          imagepath = input('Enter the path of your image: ')
          downloadpic(imagepath)
          text = vercode()  # renamed from str to avoid shadowing the built-in
          return text


      if __name__ == '__main__':
          text = checkcode()
          print(text)