The Microsoft Office Document Imaging (MODI) library that ships with Office 2003 and Office 2007 includes an Optical Character Recognition (OCR) engine that can extract text and layout information from image files. As you can see from the code below, it doesn't take much to use it.

To start, add a reference to the COM object 'Microsoft Office Document Imaging Type Library' through the Add Reference dialog box. The version of the library installed depends on which version of Office you have: version 11 is included with Office 2003 and version 12 with Office 2007. Then create a label called progressLabel, a button called startButton, and a textbox called destinationTextBox, and paste in the code below.

        //the MODI document that the OCR is run against
        private MODI.Document modiDoc;

        private void Form1_Load(object sender, EventArgs e)
        {
            modiDoc = new MODI.Document();

            //this event reports the progress of the OCR
            modiDoc.OnOCRProgress += TrackProgress;
        }

        private void startButton_Click(object sender, EventArgs e)
        {
            //display the label that will show the progress
            progressLabel.Visible = true;

            //creates the document, giving the image file in the parameter,
            //set the parameter to the image you want to extract the text from
            //(the path below is only a placeholder)
            modiDoc.Create(@"C:\path\to\image.tif");

            //perform the OCR on the entire document, specifying that the OCR
            //engine attempt to determine the orientation of the page, and attempt
            //to fix misalignments
            modiDoc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

            //loop through the pages, displaying the text in each
            for (int i = 0; i < modiDoc.Images.Count; i++)
            {
                MODI.Image image = (MODI.Image)modiDoc.Images[i];
                MODI.Layout layout = (MODI.Layout)image.Layout;

                destinationTextBox.Text += layout.Text;
            }

            //hide the label when finished
            progressLabel.Visible = false;
        }

        //called by MODI while the OCR runs; progress is a percentage
        private void TrackProgress(int progress, ref bool cancel)
        {
            progressLabel.Text = "Progress: " + progress.ToString() + "%";
        }
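
The form code only reads the text of each page as a whole, but the layout object also exposes the individual words it recognised. The following is a minimal sketch of reading them, assuming the same MODI COM reference is present. The class `OcrWordDump` and its methods `DescribeWord` and `PrintWords` are hypothetical names made up for this illustration; only the `MODI.*` types and members come from the library.

```csharp
using System;

public static class OcrWordDump
{
    // Hypothetical helper (not part of MODI): formats one word for display.
    public static string DescribeWord(string text, int confidence)
    {
        return string.Format("{0} (confidence {1})", text, confidence);
    }

    // Hypothetical helper: runs OCR on one image file and prints every
    // recognised word together with the confidence the engine reports.
    // Requires the same MODI COM reference as the form code.
    public static void PrintWords(string imagePath)
    {
        MODI.Document doc = new MODI.Document();
        doc.Create(imagePath);
        doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

        for (int i = 0; i < doc.Images.Count; i++)
        {
            MODI.Image image = (MODI.Image)doc.Images[i];
            MODI.Layout layout = (MODI.Layout)image.Layout;

            for (int j = 0; j < layout.Words.Count; j++)
            {
                MODI.Word word = (MODI.Word)layout.Words[j];
                Console.WriteLine(DescribeWord(word.Text, word.RecognitionConfidence));
            }
        }

        // Close the document without saving changes to the image file.
        doc.Close(false);
    }
}
```

Calling `OcrWordDump.PrintWords` with the path to a scanned image would print one line per word. Words with a low confidence value are good candidates for manual review.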

The MSDN reference page is