[sane-devel] Using Find Command in XSane PDF File

Johannes Meixner jsmeix at suse.de
Wed Feb 1 09:00:30 UTC 2017


Hello,

On Jan 31 22:22 Raymond Hanslits wrote (excerpt):
> I have been scanning and creating PDF files in XSane.
> However, I cannot get the Find Command to search for
> words and phrases in PDF files.

A scanner device produces only pixels, not characters
(no scanner "understands" what it scans). Therefore the
scanned data contains no characters (let alone words or
phrases) but only pixels, regardless of which data (container)
format is used, so a PDF made from a scan contains no text
that the Find command could search.
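
A minimal Python sketch may illustrate why a text search finds
nothing in pixel data. The tiny 5x7 bitmap and the names GLYPH_H
and to_pixel_bytes are made up for this illustration; a real scan
is the same idea at a much larger size:

```python
# A 5x7 bitmap of the letter "H": 1 = black pixel, 0 = white pixel.
# (Hypothetical example data, not taken from a real scan.)
GLYPH_H = [
    "10001",
    "10001",
    "10001",
    "11111",
    "10001",
    "10001",
    "10001",
]


def to_pixel_bytes(rows):
    """Turn rows of '0'/'1' into raw 8-bit grayscale pixel values
    (0 = black, 255 = white), one byte per pixel."""
    return bytes(0 if bit == "1" else 255 for row in rows for bit in row)


pixels = to_pixel_bytes(GLYPH_H)

# A human sees an "H" in the picture, but the data holds only the
# pixel values 0 and 255, so searching for the character "H" fails.
found = b"H" in pixels
print(len(pixels), found)
```

Only after OCR software has recognized the shape as the
character "H" is there any text to search for.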

See "XSane - Saving scan to text" at
http://lists.alioth.debian.org/pipermail/sane-devel/2017-January/035005.html

The crucial part is OCR software that converts the pixels
into characters, cf.
https://en.wikipedia.org/wiki/Optical_character_recognition

Personally I do not use OCR software, but as far as I have
noticed, the quality of the OCR result depends on using
scanning parameters that are appropriate specifically for OCR.
For example, black-and-white scanning at a relatively low
resolution could give better OCR results than high-resolution
photo scanning modes.
Perhaps the data format of the scan (e.g. PNG versus JPEG
or even PDF) could also make a difference for OCR.
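
As a minimal sketch of such an OCR step, the following Python
code calls the separate OCRmyPDF command-line tool to add a
searchable text layer to an already scanned PDF. This assumes
that OCRmyPDF (with its Tesseract OCR engine) is installed; it
is not part of XSane and it is only one possible choice. The
file names and the function names build_ocr_command and
add_text_layer are hypothetical:

```python
import shutil
import subprocess


def build_ocr_command(scanned_pdf, searchable_pdf, language="eng"):
    """Build the argument list for: ocrmypdf -l LANG input.pdf output.pdf
    The language is a Tesseract language code such as "eng" or "deu"."""
    return ["ocrmypdf", "-l", language, scanned_pdf, searchable_pdf]


def add_text_layer(scanned_pdf, searchable_pdf, language="eng"):
    """Run OCRmyPDF on a scanned PDF (assumes the tool is installed)."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf was not found in PATH")
    # check=True raises CalledProcessError if OCRmyPDF reports a failure.
    subprocess.run(
        build_ocr_command(scanned_pdf, searchable_pdf, language),
        check=True,
    )


# Only show the command here; call add_text_layer() to actually run it.
print(" ".join(build_ocr_command("scan.pdf", "scan-searchable.pdf")))
```

The resulting PDF still shows the scanned image, but it also
carries the recognized text, so the Find command of a PDF
viewer has something to search. How well this works depends on
the OCR quality and thus on the scanning parameters.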


Kind Regards
Johannes Meixner
-- 
SUSE LINUX GmbH - GF: Felix Imendoerffer, Jane Smithard,
Graham Norton - HRB 21284 (AG Nuernberg)



