[sane-devel] ghostscript so much better with pdf files than xsane

jazz_johnson at verizon.net jazz_johnson at verizon.net
Sat Dec 1 00:33:27 UTC 2007


> The filesize of scans saved as pdf is enormous: one-page scans often
> take more than 14 Mb of filespace.

I routinely use Xsane to scan documents directly to pnm
(Portable aNy Map) format, which I then convert to multi-page pdf
files using netpbm tools. If the ink on a document is faded or
non-uniform, I usually scan at 256 shades of gray [1 byte/pixel] and
then convert to black & white [1 bit/pixel]. A typical scan would
be ~40-60K/page. I wrote a shell script, "scans2pdf", to filter the
pnm to pdf.
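
To put the gray versus black & white difference in numbers, here is a rough sketch of the uncompressed raster sizes for one page. The page geometry (US Letter at 300 dpi) is an assumption for illustration; the post does not state a resolution or paper size.

```shell
#!/bin/sh
# Assumed page geometry (not stated in the post): US Letter at 300 dpi.
dpi=300
width=$((dpi * 85 / 10))    # 8.5 in -> 2550 pixels
height=$((dpi * 11))        # 11 in  -> 3300 pixels

# 256 shades of gray: one byte per pixel.
gray_bytes=$((width * height))

# Black & white: one bit per pixel, with each row padded to a whole
# byte (as in the raw PBM format).
bw_bytes=$(( (width + 7) / 8 * height ))

echo "gray: $gray_bytes bytes"
echo "b&w:  $bw_bytes bytes"
```

These figures cover only the pixel data, before any compression and without file headers, so they are not directly comparable to the per-page size of the finished pdf.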

http://www.acjlaw.net:8080/~jeremy/Ricoh/scripts/scans2pdf

If I have a series of grayscale scans in pnm format named
out.%04d.pnm, I would convert the pnm images to a single multi-page
pdf thus:
#scans2pdf -bw 0.6 out
which would create out.pdf and then remove the out.*.pnm files.
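
For readers who cannot fetch the script, the following is a minimal sketch of how such a conversion could be chained from netpbm and ghostscript tools. It is not the scans2pdf script itself: the function name pnm2pdf_sketch is hypothetical, and treating the 0.6 value as a simple threshold passed to pgmtopbm is an assumption about what "-bw 0.6" means. It relies on pnmtops accepting a multi-image stream and on ps2pdf reading PostScript from standard input.

```shell
#!/bin/sh
# Sketch only: NOT the scans2pdf script referenced in the post.
# Usage: pnm2pdf_sketch BASENAME [THRESHOLD]
# Converts BASENAME.*.pnm (grayscale) into a multi-page BASENAME.pdf.
pnm2pdf_sketch() {
    base=$1
    threshold=${2:-0.6}     # assumed meaning of the 0.6 in the post

    # Bail out early if no scans match the pattern.
    set -- "$base".*.pnm
    [ -e "$1" ] || { echo "no files match $base.*.pnm" >&2; return 1; }

    # Threshold each grayscale page to bilevel PBM, concatenate the
    # pages into one stream, wrap them in PostScript, then make a pdf.
    for f in "$@"; do
        pgmtopbm -threshold -value "$threshold" "$f"
    done | pnmtops | ps2pdf - "$base.pdf"
}
```

Called as "pnm2pdf_sketch out 0.6", this would write out.pdf. Unlike the behavior described above, the sketch leaves the out.*.pnm files in place, so nothing is lost if a step fails.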

There is also a program "tic98" which will compress pnm text documents
using a dictionary of the scanned characters. It reportedly offers the
best compression for text-based images. The compressed pnm can then be
uncompressed (using the same settings used for compression) and then
converted to pdf. Unfortunately, if you are emailing the pnm.tic98 file
to someone, they must install untic98 to decompress the file and pip to
read/write file descriptors -- so it's not as easy as sharing a pdf.



