I want to recognize a handwritten digit in a binary image.

I planned on using Tesseract OCR, but I could never get the accuracy above 50%. Here is part of the code I used:

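import matplotlib.pyplot as plt
import pytesseract

# roi (defined earlier) is the binary image containing the digit
plt.imshow(roi, cmap='gray')
plt.axis('off')
plt.show()

# --psm 10 tells Tesseract to treat the image as a single character
text = pytesseract.image_to_string(roi, config='--psm 10')
print(text)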

And the image drawn: digit_4

The text was incorrect most of the time; in the case above, it was '+'. Other incorrect answers included '4.', 'UL', and 'A'. I originally had text = pytesseract.image_to_string(roi, config='--psm 10 digits') but removed the digits setting after about half of the results came out blank.
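As a point of comparison for the digits setting, here is a minimal sketch of restricting the output to digits through Tesseract's tessedit_char_whitelist config variable, then keeping only the first digit of whatever comes back. The helper names build_config, first_digit, and recognize_digit are hypothetical. Whether the whitelist is honored may depend on the Tesseract version and engine mode, so treat this as something to try rather than a confirmed fix.

```python
import re


def build_config(psm=10, whitelist="0123456789"):
    """Build a Tesseract config string for single-character, digits-only OCR."""
    # --psm 10 treats the image as a single character;
    # tessedit_char_whitelist limits which characters may be returned.
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def first_digit(text):
    """Return the first digit found in the OCR output, or None if there is none."""
    match = re.search(r"\d", text)
    return match.group(0) if match else None


def recognize_digit(roi):
    """Run Tesseract on roi (assumed: the binary image of one digit) and return a digit or None."""
    import pytesseract  # imported lazily so the helpers above work without it

    text = pytesseract.image_to_string(roi, config=build_config())
    return first_digit(text)
```

Applied to the outputs reported above, first_digit('4.') keeps the '4', while '+', 'UL', and 'A' contain no digit and give None. This only filters stray characters; it does not correct a misread digit.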

  1. How can I improve the accuracy?
  2. Why does the OCR output multiple characters when it is set to recognize a single character?
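On the first question, one commonly suggested preprocessing step is to give Tesseract dark text on a light background with some empty margin around the glyph. The sketch below assumes the image is a white digit (255) on a black background (0), stored as a list of rows. That polarity and the 10-pixel border are assumptions for illustration, not details of the actual image, and the function name invert_and_pad is hypothetical.

```python
def invert_and_pad(rows, border=10, max_value=255):
    """Invert a grayscale/binary image and surround it with a white border.

    rows is a list of equal-length lists of pixel values in [0, max_value].
    """
    if not rows:
        return []
    # Invert so a white-on-black digit becomes black-on-white.
    inverted = [[max_value - px for px in row] for row in rows]
    width = len(inverted[0]) + 2 * border
    blank = [max_value] * width  # one fully white row
    side = [max_value] * border
    # White margin to the left and right of every row.
    padded = [side + row + side for row in inverted]
    # White margin above and below the image.
    top = [blank[:] for _ in range(border)]
    bottom = [blank[:] for _ in range(border)]
    return top + padded + bottom
```

For a NumPy uint8 array, 255 - roi followed by numpy.pad with constant_values=255 should have the same effect, and the result would then be passed to pytesseract.image_to_string as before.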

(I am using version 4.1.1 of pytesseract.)

Tags: python. Title: Pytesseract OCR not recognizing digits in clean binary image (Stack Overflow)