I’ve covered DEVONthink heavily recently. That’s because it has been an invaluable tool and a huge time saver for me over the last few months. The latest trick I’ve discovered is DEVONthink’s ability to replace non-searchable PDFs with searchable (OCR’d) versions of the same documents. It’s pretty simple, actually.
First, you will need DEVONthink Pro Office, which is the version of DEVONthink that can perform OCR. That’s the most expensive version of the app. If you have that version, there are two steps to take, with a third optional step.
- Create a new Smart Group, using one of the built-in templates. Go to Data > New from Template > Smart Groups > PDFs (not searchable).
- Go to the Smart Group you just created, which should now appear among all your other groups. Select all of the documents in that group (select one, and then press Command-A if you want to select them all), right-click, and pick Convert > to Searchable PDF.
DEVONthink will then proceed to OCR all of those documents. As the optional third step, you can also set DEVONthink to automatically trash the old, non-OCR’d version. Do that BEFORE you OCR by going to Preferences > OCR, and checking the “Move to Trash” box next to “Original Document.” You only need to do this once, and the setting sticks until you change it.
The beauty of this process is that the OCR’d version ends up in the proper group, the same group that held the non-OCR’d version before it was trashed. I had been worried that I’d need to refile everything, which turned out not to be the case.
I’m currently readying a case for trial with several thousand documents. My office scans and OCRs everything that comes in the door, but some documents have been sent to us digitally, and some of those hadn’t been OCR’d. I can now easily make sure all documents in my trial database are searchable.
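For a rough cross-check outside DEVONthink, the idea behind the “PDFs (not searchable)” Smart Group can be approximated with a short script. This is only a sketch under stated assumptions, not how DEVONthink itself decides. It treats a PDF as probably searchable if the file references a font anywhere, since a text layer needs one, and it looks in both the raw bytes and any Flate-compressed streams. The function names and the folder name are hypothetical, and encrypted or unusual PDFs can fool it.

```python
import re
import zlib
from pathlib import Path

# Matches the raw bytes between "stream" and "endstream" in a PDF file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def looks_searchable(data: bytes) -> bool:
    """Heuristic (an assumption): a PDF with a text layer references a font."""
    if b"/Font" in data:
        return True
    # Newer PDFs may keep their dictionaries inside compressed streams,
    # so try to inflate each stream and look again.
    for match in STREAM_RE.finditer(data):
        try:
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (for example a JPEG image); skip it
        if b"/Font" in inflated:
            return True
    return False


def unsearchable_pdfs(folder: str) -> list[Path]:
    """Return PDFs under `folder` that appear to have no text layer."""
    return [
        path
        for path in sorted(Path(folder).rglob("*.pdf"))
        if not looks_searchable(path.read_bytes())
    ]


if __name__ == "__main__":
    # Hypothetical folder; point this at an exported copy of your documents.
    for pdf in unsearchable_pdfs("exported_documents"):
        print(pdf)
```

A file with no font reference almost certainly has no text layer, but the reverse is weaker: a scanned page with a stamped header can reference a font and still be mostly unsearchable, so treat the output as a list to spot-check rather than a verdict.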
Heiko says:
That description helped me a lot. Thank you, works fine!
December 21, 2016 — 1:21 am
RRK says:
I am very sorry, but OCR with DEVONthink does not work: no recognition at all. You must use other software that does not use ABBYY FineReader: completely useless software.
January 2, 2017 — 2:04 pm
Evan Kline says:
Not quite sure what you mean? OCR works fine for me. The one complaint I’ve seen is that sometimes file sizes are big, but it’s always worked fine for me.
January 2, 2017 — 3:25 pm
Menno says:
The article correctly stated that you need DEVONthink Pro Office for OCR support.
“First, you will need DEVONthink Pro Office, which is the version of DEVONthink that can perform OCR.”
Sure you have pro office?
Menno
January 5, 2017 — 10:34 am
Kim says:
Thank you VERY MUCH!!!
May 11, 2017 — 7:00 pm
Timothy says:
I don’t see that specific template under smart groups (“PDFs (not searchable)”). I have the Pro Office version; did you create that smart group yourself? I can’t seem to find the rules to make it. Thanks!
April 8, 2018 — 3:42 pm
Evan Kline says:
I think it was a menu option. I went to Data > New from Template > Smart Groups > PDFs (not searchable), and it was there. I could be wrong, but I don’t think I had to add it.
April 9, 2018 — 12:40 pm