Using Python Tesseract to get text from an image, but getting an error

Time: 2021-11-17 01:51:34

I'm attempting to use Python Tesseract to get text from an image on my macOS desktop and am running into an error that I cannot figure out. I'm running macOS High Sierra 10.13.2.

My working directory is set to my desktop (where the image lives), and I have already specified the path to my tesseract executable.
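For reference, specifying the path usually means assigning `pytesseract.pytesseract.tesseract_cmd`. The sketch below is only an illustration of that step, not the exact code from my script: the helper name `resolve_tesseract` is hypothetical, and it assumes Python 3 (for `shutil.which`) and a Tesseract binary that is discoverable on PATH.

```python
import shutil


def resolve_tesseract(command="tesseract"):
    """Return the path of an executable named `command`, or None if absent.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    return shutil.which(command)


tesseract_path = resolve_tesseract()

try:
    import pytesseract  # third-party; may not be installed
except ImportError:
    pytesseract = None

if pytesseract is not None and tesseract_path is not None:
    # Must point at the Tesseract binary itself, not at a Python script
    pytesseract.pytesseract.tesseract_cmd = tesseract_path
```

When `resolve_tesseract()` returns None, nothing is assigned, so a hard-coded path would still have to be checked by hand.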

I'm running:

print(pytesseract.image_to_string(Image.open('test.png')))

and getting the following error:

File "/Users/name/anaconda2/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 140, in run_and_get_output
    run_tesseract(**kwargs)
  File "/Users/name/anaconda2/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 116, in run_tesseract
    raise TesseractError(status_code, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, u'File "/var/folders/cp/dg2snlxn2631h8jx1bwb7jk80000gn/T/tess_cK4lka.PNG", line 1 SyntaxError: Non-ASCII character \'\\x89\' in file /var/folders/cp/dg2snlxn2631h8jx1bwb7jk80000gn/T/tess_cK4lka.PNG on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details')

Any idea what might be causing this and how to get around it? I'd be happy to provide any clarifying details.

Thanks!

2 Answers

#1

It seems you are trying to render a non-ASCII character. Try adding this to the top of your .py file to declare UTF-8 encoding:

# -*- coding: utf-8 -*- 

As the error message states, see PEP 263 (http://python.org/dev/peps/pep-0263/) for more details.
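To illustrate what that declaration does, here is a minimal sketch, assuming Python 3 (where `tokenize.detect_encoding` is available). It uses `latin-1` instead of `utf-8` only so the effect is visible, since Python 3 already defaults to UTF-8; the sample source bytes are made up for the example.

```python
import io
import tokenize

# Two lines of source bytes; the first is a PEP 263 encoding declaration.
# 0xE9 is "é" in latin-1 and would not be valid UTF-8 on its own.
source = b"# -*- coding: latin-1 -*-\ngreeting = 'caf\xe9'\n"

# detect_encoding reads at most two lines looking for the declaration
encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)

# Decode the source with the declared encoding
text = source.decode(encoding)
```

Note that the file named in the traceback is the temporary PNG rather than the script, and `\x89` is the first byte of the PNG signature, so it may also be worth confirming that `tesseract_cmd` points at the Tesseract binary.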

#2

Use the unidecode library:

from PIL import Image
import pytesseract
from unidecode import unidecode

# Transliterate the OCR output to plain ASCII before printing
print(unidecode(pytesseract.image_to_string(Image.open('test.png'))))
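If installing unidecode is not an option, a rough standard-library approximation is sketched below. This is a swapped-in technique (Unicode NFKD normalization followed by dropping non-ASCII bytes), not what unidecode does internally, and the helper name `to_ascii` and the sample string are hypothetical.

```python
import unicodedata


def to_ascii(text):
    """Rough ASCII folding: strip accents, drop anything left that is not ASCII.

    Not equivalent to unidecode, which transliterates many more characters.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


ocr_text = u"Caf\u00e9 na\u00efve"  # stand-in for OCR output
ascii_text = to_ascii(ocr_text)
```

Characters with no ASCII decomposition (for example CJK text) are silently dropped by this approach, which is the main reason to prefer unidecode when it is available.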
