This is a tool for processing scanned documents in TIFF format.
It builds on LibTIFF, works with multipage TIFFs, and offers two functions:
- extracting two text pages (or one text page) from one image,
as shown above,
- noise filtering, so that the documents get smaller
This is a hack (I have modified tiffcp)
with limited functionality and no
guarantee at all that it works! In particular, it requires
CCITT-compressed b/w images (the standard way
in which monochrome images such as scans are stored).
It works on Windows (a precompiled binary is
available) and Unix/Linux. The interface is
command line only.
- Windows binary
- Complete source, based
on LibTIFF-3.7.0 (tiffprocess is in the tools directory; it is not an
official, maintained contribution in the contrib folder)
Compiling on Unix/Linux:
- execute "./configure"
- execute "make" or "make tiffprocess"
- make sure that the library paths are set correctly
Compiling with MSVC/Windows: enter "nmake /f Makefile.vc tiffprocess"
Usage: tiffprocess input-file output-file
(without arguments, help is displayed)
Tiffprocess prompts the user for the following inputs:
- split pages or not
- source height, i.e. the height of the scanned document
(this is mainly for increasing robustness and can be
left blank in many cases)
- target height, i.e. normally the height of the text area on the pages
(allow a little excess since pages might not be straight)
- target width (including excess)
- resolution in DPI
- 180 degree rotation, yes or no (for correct page order)
- noise reduction threshold in number of pixels;
this means that black pixels with fewer black neighbors than the
specified number will be removed
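
To illustrate the noise reduction threshold, here is a minimal sketch in C. It is not the code used in tiffprocess, and it rests on several assumptions: the image has already been decoded into one byte per pixel (non-zero = black), "neighbors" means the 8 surrounding pixels, pixels outside the image count as white, and all decisions are made on a copy of the original image so that removing one pixel does not influence its neighbors. The names black_neighbours and remove_noise are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Count the black pixels among the 8 neighbours of (x, y).
   Assumption: pixels outside the image are treated as white. */
static int black_neighbours(const unsigned char *img, int w, int h,
                            int x, int y)
{
    int count = 0;
    for (int dy = -1; dy <= 1; dy++) {
        for (int dx = -1; dx <= 1; dx++) {
            int nx = x + dx, ny = y + dy;
            if (dx == 0 && dy == 0)
                continue;               /* skip the pixel itself */
            if (nx < 0 || ny < 0 || nx >= w || ny >= h)
                continue;               /* outside the image */
            if (img[ny * w + nx] != 0)
                count++;
        }
    }
    return count;
}

/* Clear every black pixel that has fewer than `threshold` black
   neighbours. `img` holds w*h bytes, one per pixel, non-zero = black.
   Returns the number of removed pixels, or -1 if memory runs out. */
int remove_noise(unsigned char *img, int w, int h, int threshold)
{
    if (w <= 0 || h <= 0)
        return 0;

    size_t n = (size_t)w * (size_t)h;
    unsigned char *src = malloc(n);     /* unmodified copy to decide on */
    if (src == NULL)
        return -1;
    memcpy(src, img, n);

    int removed = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            if (src[y * w + x] != 0 &&
                black_neighbours(src, w, h, x, y) < threshold) {
                img[y * w + x] = 0;     /* turn the pixel white */
                removed++;
            }
        }
    }
    free(src);
    return removed;
}
```

With a threshold of 2, for example, an isolated speck is removed, while the pixels of a solid 2x2 block (3 black neighbors each) are kept. Larger thresholds remove more, but they also start to erode thin strokes of the text.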