[sane-devel] [ANN] Unpaper - post-processing scanned book-pages

Jens Gulden mail@jensgulden.de
Wed, 02 Mar 2005 16:03:29 +0100


Hello,

unpaper removes black edges and other photocopy artefacts from scanned 
images. It also deskews book pages (auto-rotates them to a straight 
alignment), and centers them on the sheet.
It can turn old photocopies into easily readable PDFs again.

Available at http://unpaper.berlios.de/.

Hope it's useful. Enjoy,
Jens

From the Readme:
------
unpaper is a post-processing tool for scanned sheets of paper, 
especially for book pages that have been scanned from previously created 
photocopies.
The main purpose is to make scanned book pages easier to read on screen
after conversion to PDF. Additionally, unpaper may be useful for
enhancing the quality of scanned pages before performing optical
character recognition (OCR).

unpaper tries to clean scanned images by removing dark edges that
scanning or copying produced in areas outside the actual page content
(e.g. dark areas between the left-hand side and the right-hand side of
a double-page book scan).
The program also tries to detect misaligned centering and rotation of
pages and will automatically straighten each page by rotating it to the
correct angle. This is called "deskewing".
Note that the automatic processing will sometimes fail. It is always a
good idea to check the results of unpaper manually and adjust the
parameter settings according to the requirements of the input. Each
processing step can also be disabled individually for each sheet.

Input and output files can be in either .pbm or .pgm format, the
formats also used by the Linux scanning tools scanimage and scanadf.
Conversion to PDF can be done with, for example, the Linux tools
pgm2tiff, tiffcp and tiff2pdf.
------
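
To give a rough idea of what "removing dark edges" from a .pgm file can
look like, here is a minimal sketch in Python. It is not unpaper's
actual code or algorithm. The function names read_pgm, write_pgm and
whiten_dark_edges, the threshold of 64 and the "mean gray value of a
border row or column" rule are all assumptions made for illustration.
It only handles binary 8-bit PGM (magic number P5).

```python
def read_pgm(data):
    """Parse binary 8-bit PGM bytes (magic P5) into a list of pixel rows."""
    tokens, pos = [], 0
    # Header: magic, width, height, maxval, separated by whitespace.
    while len(tokens) < 4:
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":  # a comment runs to the end of the line
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    if tokens[0] != b"P5":
        raise ValueError("not a binary PGM (P5) file")
    width, height, maxval = (int(t) for t in tokens[1:])
    if maxval > 255:
        raise ValueError("only 8-bit PGM is handled in this sketch")
    pos += 1  # exactly one whitespace byte separates header and pixels
    pixels = data[pos:pos + width * height]
    return [list(pixels[y * width:(y + 1) * width]) for y in range(height)]


def write_pgm(rows, maxval=255):
    """Serialize a list of pixel rows to binary 8-bit PGM bytes."""
    header = b"P5\n%d %d\n%d\n" % (len(rows[0]), len(rows), maxval)
    return header + b"".join(bytes(row) for row in rows)


def whiten_dark_edges(rows, threshold=64, white=255):
    """Return a copy in which dark border rows and columns are set to white.

    Assumption for this sketch: a border row or column counts as "dark"
    when its mean gray value is below `threshold`.
    """
    height, width = len(rows), len(rows[0])

    def col_is_dark(x):
        return sum(row[x] for row in rows) / height < threshold

    def row_is_dark(y):
        return sum(rows[y]) / width < threshold

    # Walk inward from each side until a non-dark row or column is found.
    left = 0
    while left < width and col_is_dark(left):
        left += 1
    right = width
    while right > left and col_is_dark(right - 1):
        right -= 1
    top = 0
    while top < height and row_is_dark(top):
        top += 1
    bottom = height
    while bottom > top and row_is_dark(bottom - 1):
        bottom -= 1

    out = [row[:] for row in rows]
    for y in range(height):
        for x in range(width):
            if not (left <= x < right and top <= y < bottom):
                out[y][x] = white
    return out
```

A caller would read the bytes of a scanned .pgm file, pass them through
read_pgm and whiten_dark_edges, and write the result of write_pgm back
to disk. A real scan would need something more robust than a single
threshold on whole rows and columns.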

(It's a small program with just a single source file, almost too small
to be an open-source project on its own. If you have ideas for
integrating it into other Linux scanning/graphics projects instead,
please let me know.)