Sunday, June 5, 2011

Scan old black and white books

I had to digitize an old book with yellowed pages, and the following bash commands did the job.

convert is part of the ImageMagick package. pdftk is a utility for manipulating PDF files.

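-threshold turns every pixel either black or white: pixels brighter than the given level (40% here) become white and everything else becomes black, which gets rid of the yellow tint. -crop without an offset cuts the scan into tiles of the given size, so a scan slightly larger than 2544x3250 yields the page plus a leftover piece. +delete removes the last image in the list, which is that leftover piece. For the even-numbered files, -flip -flop mirrors the image vertically and horizontally, which amounts to a 180 degree rotation (presumably because those pages were scanned upside down).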
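To make the 40% level more concrete, here is a small sketch in plain bash of the decision that is made for each pixel. It is only an illustration and assumes 8-bit gray values (0 to 255); ImageMagick itself applies the percentage to its internal quantum range. The function name threshold_pixel is made up for this example.

```shell
#!/usr/bin/env bash
# Illustration only: mimic the per-pixel decision of "-threshold 40%".
# Assumption: 8-bit gray values (0-255). threshold_pixel is a hypothetical helper,
# not part of ImageMagick.
threshold_pixel() {
  local value=$1 percent=${2:-40}
  local cutoff=$(( 255 * percent / 100 ))   # 40% of 255 gives 102 (integer math)
  if (( value > cutoff )); then
    echo white    # brighter than the cutoff: paper
  else
    echo black    # at or below the cutoff: ink
  fi
}

# Dark ink, two values around the cutoff, and yellowed paper.
for v in 30 102 103 230; do
  printf '%3d -> %s\n' "$v" "$(threshold_pixel "$v")"
done
```

If the text comes out too thin or the paper shows speckles, trying a different percentage on a single page first is probably the quickest way to tune it.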


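mkdir -p work
# Odd pages: crop, convert to pure black and white, drop the leftover tile.
for i in *[13579].png; do convert "$i" -crop 2544x3250 +repage -threshold 40% +delete "work/$i.pdf"; done
# Even pages: same as above, then rotate by 180 degrees (-flip -flop).
for i in *[02468].png; do convert "$i" -crop 2544x3250 +repage -threshold 40% +delete -flip -flop "work/$i.pdf"; done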
cd work
# Merge all pages, in file-name order, into a single PDF.
pdftk *.pdf cat output ../all.pdf
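The two loops rely on the last digit of the file name to tell odd pages from even ones. The sketch below shows which loop would pick up a given name, so you can check your own naming scheme before converting anything. The helper page_kind and the sample file names are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the two loops would match a file name.
page_kind() {
  case "$1" in
    *[13579].png) echo odd ;;      # matched by the first loop
    *[02468].png) echo even ;;     # matched by the second loop (rotated)
    *)            echo skipped ;;  # no digit before .png: neither loop sees it
  esac
}

# Sample names, assumed for illustration only.
for f in scan001.png scan002.png scan010.png cover.png; do
  printf '%s -> %s\n' "$f" "$(page_kind "$f")"
done
```

A file such as cover.png ends up in neither loop, so it would be missing from the final PDF unless it is renamed.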