Getting the exact position of text from an image in Tesseract

Time: 2021-11-17 01:51:16

Using the GetHOCRText(0) method in Tesseract, I am able to retrieve the text as HTML, and when I present that HTML in a webview I can see the text. However, the position of the text in the webview differs from its position in the image. Any ideas would be very helpful.


tesseract->SetInputName("word");
tesseract->SetOutputName("xyz");
tesseract->Recognize(NULL);

// GetHOCRText(0) returns the hOCR markup for page 0; free it with delete[] when done.
char *utf8Text = tesseract->GetHOCRText(0);

(Two images were attached here: the input image and the output image.)

2 Solutions

#1


1  

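The GetBoxText() method returns the exact position of each character. The result is a UTF-8 string in box-file format, one line per character: the character followed by its left, bottom, right, and top pixel coordinates and the page number. Note that the y coordinates are measured from the bottom edge of the image, not the top.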


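// GetBoxText() returns a heap-allocated UTF-8 string; the caller frees it with delete[].
char *boxtext = _tesseract->GetBoxText(0);
NSString *aBoxText = [NSString stringWithUTF8String:boxtext];
delete[] boxtext;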
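
Since the box coordinates use a bottom-left origin while most views draw from the top-left, the y values usually need flipping before text can be placed over the image. The following is a minimal sketch of that conversion in plain C++. The names CharBox and parseBoxText are hypothetical helpers, not part of the Tesseract API, and imageHeight is assumed to be the pixel height of the image that was passed to Tesseract.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One character box converted to a top-left origin (hypothetical helper type).
struct CharBox {
    std::string symbol;
    int left, top, right, bottom;
};

// Parse box-format text: one "<symbol> <left> <bottom> <right> <top> <page>"
// entry per line, with y measured upward from the bottom edge of the image.
std::vector<CharBox> parseBoxText(const std::string& boxText, int imageHeight) {
    std::vector<CharBox> boxes;
    std::istringstream lines(boxText);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string symbol;
        int left, bottom, right, top, page;
        // Skip lines that do not have all six fields.
        if (!(fields >> symbol >> left >> bottom >> right >> top >> page)) continue;
        // Flip the y axis: box y grows upward, view y grows downward.
        boxes.push_back({symbol, left, imageHeight - top, right, imageHeight - bottom});
    }
    return boxes;
}
```

Calling parseBoxText(std::string(boxtext), imageHeight) before freeing boxtext would give one top-left-based rectangle per character. If the view scales the image, the rectangles would need the same scale factor applied.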

#2


1  

If you have the hOCR output, you should have a tag for each word. These tags should have class="ocrx_word" and title="bbox x1 y1 x2 y2", where (x1, y1) is the top-left corner and (x2, y2) the bottom-right corner of the bounding box around the word, in pixels. I don't think it's possible to automatically use this information to format a text document; that would require translating pixel differences into a number of tabs/spaces. But you should be able to render the text at the given location.
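
To make the bounding-box lookup concrete, here is a rough sketch that pulls the word boxes out of an hOCR string with plain string searching. WordBox and parseWordBoxes are hypothetical names, and this is not a real HTML parser: it assumes each ocrx_word tag carries its bbox inside the same opening tag, as in the sample markup used for illustration below.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Bounding box of one word in image pixels, top-left origin (hypothetical type).
struct WordBox {
    int x1, y1, x2, y2;
};

// Collect the "bbox x1 y1 x2 y2" values found in ocrx_word opening tags.
std::vector<WordBox> parseWordBoxes(const std::string& hocr) {
    std::vector<WordBox> boxes;
    const std::string marker = "ocrx_word";
    std::size_t pos = 0;
    while ((pos = hocr.find(marker, pos)) != std::string::npos) {
        std::size_t tagEnd = hocr.find('>', pos);    // end of this opening tag
        std::size_t bbox = hocr.find("bbox ", pos);  // next bbox after the class
        pos += marker.size();
        if (bbox == std::string::npos) break;        // no more boxes anywhere
        if (tagEnd != std::string::npos && bbox > tagEnd) continue;  // bbox is in a later tag
        WordBox b;
        if (std::sscanf(hocr.c_str() + bbox, "bbox %d %d %d %d",
                        &b.x1, &b.y1, &b.x2, &b.y2) == 4) {
            boxes.push_back(b);
        }
    }
    return boxes;
}
```

Each WordBox could then be used to position the word absolutely over the image, for example with left = x1 and top = y1. If the webview shows the image at a different size than the one given to Tesseract, the same scale factor would have to be applied to the coordinates, which may be one reason the rendered positions differ.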
