The Microsoft Office Document Imaging (MODI) library, which ships with Office 2003 and Office 2007, includes Optical Character Recognition (OCR) that can extract text and layout information from image files. As the code below shows, it takes very little to use. To start, add a reference to the COM object ‘Microsoft Office Document Imaging Type Library’ through the Add Reference dialog box. The version of the library that is installed depends on your version of Office: version 11 is included with Office 2003, and version 12 with Office 2007. Then create a label called progressLabel, a button called startButton, and a text box called destinationTextBox, and paste in the code below.
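As a rough illustration of the steps just described, here is a minimal sketch rather than a definitive listing. It assumes a Windows Forms project that already has the MODI COM reference added, builds the three controls in code instead of the designer so that it is self-contained, and restricts the file dialog to TIFF and MDI files as a cautious assumption about what `Document.Create` accepts. The class name `OcrForm` and the helper `RecognizeText` are hypothetical. Check the MODI member names used (`Document.Create`, `Document.OCR`, `OnOCRProgress`, `Image.Layout.Text`) against the interop assembly generated for your installed version.

```csharp
using System;
using System.Windows.Forms;

// Assumes a COM reference to "Microsoft Office Document Imaging Type Library"
// (interop namespace MODI) has been added to the project.
public class OcrForm : Form
{
    // Control names follow the text above; layout settings are arbitrary.
    private readonly Label progressLabel = new Label { Dock = DockStyle.Top };
    private readonly Button startButton = new Button { Text = "Start OCR", Dock = DockStyle.Top };
    private readonly TextBox destinationTextBox = new TextBox
    {
        Multiline = true,
        Dock = DockStyle.Fill,
        ScrollBars = ScrollBars.Vertical
    };

    public OcrForm()
    {
        Controls.Add(destinationTextBox);
        Controls.Add(startButton);
        Controls.Add(progressLabel);
        startButton.Click += StartButton_Click;
    }

    private void StartButton_Click(object sender, EventArgs e)
    {
        // Assumption: only TIFF and MDI inputs are offered here.
        using (OpenFileDialog dialog = new OpenFileDialog())
        {
            dialog.Filter = "TIFF or MDI files|*.tif;*.tiff;*.mdi";
            if (dialog.ShowDialog(this) != DialogResult.OK)
            {
                return;
            }
            destinationTextBox.Text = RecognizeText(dialog.FileName);
        }
    }

    // Hypothetical helper: runs OCR on the first page and returns its text.
    private string RecognizeText(string imagePath)
    {
        MODI.Document document = new MODI.Document();
        document.OnOCRProgress += ShowProgress;
        document.Create(imagePath);
        try
        {
            // English recognition, with auto-orientation and straightening enabled.
            document.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
            MODI.Image firstPage = (MODI.Image)document.Images[0];
            return firstPage.Layout.Text;
        }
        finally
        {
            // Close without saving the OCR results back into the file.
            document.Close(false);
        }
    }

    // Handler for the OnOCRProgress event: progress is a percentage,
    // and setting cancel to true would abort recognition.
    private void ShowProgress(int progress, ref bool cancel)
    {
        progressLabel.Text = progress + "% processed";
        // OCR runs synchronously in this sketch, so force a repaint.
        progressLabel.Refresh();
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new OcrForm());
    }
}
```

This sketch reads only the first page (`Images[0]`). For a multi-page TIFF you would loop over `document.Images` and concatenate each page's `Layout.Text`.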




