Reading Excel 2003 and 2007 Files Using OLEDB


One of the feature of the application I'm working on these days is allowing the user to import data from Excel 2003 and 2007 files. After some research I found out that OLEDB is the only way to read both Excel formats without any problems.

For this example I split the code into two functions. The first is OpenExcelFile which opens the Excel file and sets the string array data field workSheetNames to the names of the sheets in the file. This is done by reading the names of the tables in the schema, because OLEDB opens the whole file as a schema and the sheets in the file as tables in the schema. The function takes one argument called isOpenXMLFormat, that is set to true to open a 2007 file and false to open a 2003 or previous file. The only difference between the two formats is the connection string used.

public void OpenExcelFile(bool isOpenXMLFormat)
{
    //open the excel file using OLEDB
    OleDbConnection con;

    if (isOpenXMLFormat)
        //read a 2007 file
        connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" +
            fileName + ";Extended Properties=\"Excel 8.0;HDR=YES;\"";
    else
        //read a 97-2003 file
        connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
            fileName + ";Extended Properties=Excel 8.0;";

    con = new OleDbConnection(connectionString);
    con.Open();

    //get all the available sheets
    System.Data.DataTable dataSet = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);

    //get the number of sheets in the file
    workSheetNames = new String[dataSet.Rows.Count];
    int i = 0;
    foreach (DataRow row in dataSet.Rows)
    {
        //insert the sheet's name in the current element of the array
        //and remove the $ sign at the end
        workSheetNames[i] = row["TABLE_NAME"].ToString().Trim(new[] { '$' });
        i++;
    }

    if (con != null)
    {
        con.Close();
        con.Dispose();
    }

    if (dataSet != null)
        dataSet.Dispose();
}

The second function is GetWorksheet, and it takes the name of the sheet as an argument. Here the file is opened again using the connection string set by OpenExcelFile. And executes a SQL SELECT on the name of the sheet passed in the argument, and returns the result as a DataTable.

public System.Data.DataTable GetWorksheet(string worksheetName)
{
    OleDbConnection con = new System.Data.OleDb.OleDbConnection(connectionString);
    OleDbDataAdapter cmd = new System.Data.OleDb.OleDbDataAdapter(
        "select * from [" + worksheetName + "$]", con);

    con.Open();
    System.Data.DataSet excelDataSet = new DataSet();
    cmd.Fill(excelDataSet);
    con.Close();

    return excelDataSet.Tables[0];
}