The Microsoft Office Document Imaging (MODI) library that ships with Office 2003 and Office 2007 includes Optical Character Recognition (OCR) that can extract text and layout from image files. As you can see from the code below, it doesn't take much to use it. To start, include the COM object ‘Microsoft Office Document [...]

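If you would rather try the library from a script, the same COM object can be driven from Python. What follows is only a minimal sketch, not the code from this post. It assumes a Windows machine with MODI installed and the pywin32 package available, and that the `MODI.Document` ProgID is registered. The names `ocr_image`, `document_factory` and `MI_LANG_ENGLISH` are made up for illustration; the value 9 is assumed to match `miLANG_ENGLISH` in the MODI type library.

```python
from typing import Callable, List, Optional

# Assumed value of MiLANGUAGES.miLANG_ENGLISH in the MODI type library.
MI_LANG_ENGLISH = 9


def _default_document_factory():
    # Imported lazily so this module still loads where pywin32 is missing.
    import win32com.client
    return win32com.client.Dispatch("MODI.Document")


def ocr_image(path: str,
              document_factory: Optional[Callable[[], object]] = None) -> List[str]:
    """Return the recognized text of each page in the image file at `path`.

    `document_factory` is a hypothetical hook that lets you supply your own
    MODI-like document object (handy for testing without Office installed).
    """
    doc = (document_factory or _default_document_factory)()
    doc.Create(path)  # load the image file into the MODI document
    try:
        # Arguments: language id, auto-orient the image, straighten the image.
        doc.OCR(MI_LANG_ENGLISH, True, True)
        images = doc.Images
        # The Images collection is zero-based; each page exposes Layout.Text.
        return [images.Item(i).Layout.Text for i in range(images.Count)]
    finally:
        doc.Close(False)  # close without saving the OCR results into the file
```

Calling `ocr_image(r"C:\scans\page.tif")` (a hypothetical path) would return one string per page. Passing the document in through a factory keeps the OCR logic separate from the COM dependency, so the function can be exercised with a stand-in object on machines that do not have Office.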
