I’ve covered DEVONthink heavily recently. That’s because it has been an invaluable tool and a huge time saver for me over the last few months. The latest trick I discovered is DEVONthink’s ability to easily replace non-searchable PDFs with searchable (OCR’d) versions of the same documents. It’s pretty simple, actually.
First, you will need DEVONthink Pro Office, which is the version of DEVONthink that can perform OCR. That’s the most expensive version of the app. If you have that version, there are two steps to take, with a third optional step.
- Create a new Smart Group using one of the built-in templates: go to Data > New from Template > Smart Groups > PDFs (not searchable).
- Go to the Smart Group you just created, which should now appear among your other groups. Select all of the documents in that group (select one, then press Command-A), right-click, and pick Convert > to Searchable PDF. DEVONthink will then OCR all of those documents.
- Optionally, set DEVONthink to trash the old, non-OCR’d version automatically. Do that BEFORE you OCR by going to Preferences > OCR and checking the “Move to Trash” box next to “Original Document.” You only need to do this once; the setting sticks until you change it.
The beauty of this process is that the OCR’d version will be located in the proper group – the group in which the non-OCR’d version was located before you trashed it. I had been worried that I’d need to refile everything, which turned out not to be the case.
I’m currently readying a case for trial with several thousand documents. My office scans and OCRs everything that comes in the door, but some documents were sent to us digitally, and some of those hadn’t been OCR’d. I can now easily make sure all documents in my trial database are searchable.