Mac Scan

Tonight I finally got around to switching from Windows to the Mac for scanning documents. I tend to scan important documents and store them electronically in a safe place. It took quite some time to find the right settings.

On Windows I used the Office document manager, which you can install as an optional component of Office. I stored the scans as TIFF. The nice thing was that it automatically performed OCR on the documents (even in Dutch), and I could enable an indexer for tif (or tiff) files. This made it possible to search these files and also to copy and paste text directly from the documents.

So the first thing I tried on the Mac was simply scanning documents. OCR does not seem to come out of the box on the Mac. For my HP 4200 printer I downloaded the HP Device Manager/HP Scan Pro.

Then I scanned some documents; you can select PDF or TIFF as the output format. The files I saved were rather large: more than 500 KB each for grayscale (8-bit) at 100 dpi. For the “TIFF to preview” option in HP Scan Pro with grayscale at 150 dpi and ‘medium’ sharpness, the estimated size is 2.6 MB. That was a big contrast with the scans from Windows, where one-page documents were around 40 KB each.

Just now I found the trick. When scanning is done and the TIFF file is open in Preview, you can choose “Save As” and select the compression to use when saving the file. Choosing “LZW” compresses the TIFF file from 2.1 MB to 144 KB. Quite acceptable. This is for 8-bit grayscale at 150 dpi.
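To see where a figure of roughly 2.1 MB comes from, here is a back-of-the-envelope sketch in Python. It assumes an A4 page (210 x 297 mm) and reads "resolution" as dots per inch; neither is confirmed by the scanner software, and the function name raw_scan_bytes is only illustrative.

```python
def raw_scan_bytes(width_mm, height_mm, dpi, bits_per_pixel):
    """Size of the uncompressed pixel data in bytes (no TIFF header)."""
    # Convert the page size to pixels (25.4 mm per inch).
    width_px = round(width_mm / 25.4 * dpi)
    height_px = round(height_mm / 25.4 * dpi)
    # Each row is padded up to a whole number of bytes.
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Assumption: the scanned page is A4.
A4_MM = (210, 297)

gray_150 = raw_scan_bytes(*A4_MM, dpi=150, bits_per_pixel=8)
mono_200 = raw_scan_bytes(*A4_MM, dpi=200, bits_per_pixel=1)

print(f"8-bit grayscale, 150 dpi: {gray_150:,} bytes")  # 2,174,960 bytes
print(f"1-bit, 200 dpi:           {mono_200:,} bytes")  # 484,173 bytes
```

Under these assumptions the uncompressed grayscale page is about 2.17 MB, close to the size I saw before compression, so the 144 KB LZW file is roughly a 15:1 reduction. A 1-bit page at 200 dpi starts out at under 0.5 MB even before any compression.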

Actually, for black-on-white text you get sharper results with 1-bit at 200 dpi. Then save with PackBits compression. This results in a 61 KB file.
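PackBits is a simple run-length scheme, which is why it suits 1-bit text scans with long stretches of identical bytes. Below is a minimal sketch of the idea in Python, not the code Preview actually uses; the function names packbits_encode and packbits_decode are my own. A header byte of 0..127 means "copy the next header+1 bytes literally", and a header byte of 129..255 means "repeat the next byte 257-header times".

```python
def packbits_encode(data: bytes) -> bytes:
    """Run-length encode data using the PackBits packet format."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        # Measure the run of identical bytes starting at i (at most 128).
        run = 1
        while i + run < n and run < 128 and data[i + run] == data[i]:
            run += 1
        if run >= 2:
            # Repeat packet: header 257 - run, followed by the byte itself.
            out.append(257 - run)
            out.append(data[i])
            i += run
        else:
            # Literal packet: gather bytes until a run starts or 128 are held.
            start = i
            i += 1
            while i < n and i - start < 128:
                if i + 1 < n and data[i] == data[i + 1]:
                    break
                i += 1
            out.append(i - start - 1)
            out += data[start:i]
    return bytes(out)


def packbits_decode(packed: bytes) -> bytes:
    """Reverse packbits_encode."""
    out = bytearray()
    i = 0
    while i < len(packed):
        header = packed[i]
        i += 1
        if header < 128:
            # Literal packet: copy header + 1 bytes as they are.
            count = header + 1
            out += packed[i:i + count]
            i += count
        elif header > 128:
            # Repeat packet: the next byte occurs 257 - header times.
            out += packed[i:i + 1] * (257 - header)
            i += 1
        # header == 128 is a no-op and is skipped.
    return bytes(out)


# A row of 207 identical bytes, e.g. a blank line of a 1-bit scan.
blank_row = bytes([0xFF]) * 207
packed_row = packbits_encode(blank_row)
print(len(blank_row), "->", len(packed_row), "bytes")  # 207 -> 4 bytes
```

In a TIFF file each row is packed separately, so a page that is mostly blank rows shrinks dramatically, while rows full of fine detail gain little and can even grow slightly.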

Choosing “OCR to Text” gives me a warning that OCR is not installed. According to the HP site, the OCR software is only provided on the original CD that came with the printer. I have to see if I still have that: “Readiris”, an OCR solution for the Mac, is supposed to be included. That’s for another day…