Tuesday, June 17, 2008

Reading Excel content without office using OleDbConnection in .NET

I needed to open an Excel doc and read the content its tables.

First we need to open the Excel file:

string cs = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filename + ";Extended Properties=\"Excel 12.0 Xml;HDR=No\";";
using (OleDbConnection connection = new OleDbConnection(cs))
{
   connection.Open();
   string[] tableNames = GetTableNames(connection);
}

 

For advances connection strings refer to  http://www.connectionstrings.com/

 

Then we use the connection to get the table names:

string[] GetTableNames(OleDbConnection connection)
{
    List<string> tableNames = new List<string>();
    DataTable dt = connection.GetSchema("Tables");
    foreach (DataRow row in dt.Rows)
    {
        string tableName = row["TABLE_NAME"].ToString();

        // Table names ends with a $, or with $'
        if (tableName.EndsWith("$") || tableName.EndsWith("$\'"))
        {
            tableNames.Add(tableName);
        }
    }

    return tableNames.ToArray();
}

The name of the columns in the Excel are named Fn where n is the column index (1 indexed)

The following code iterates over the tables and prints the 3rd column of the first 10 rows of each worksheet.

for (int i = 0; i < tableNames.Length; i++)
{
    OleDbCommand command = connection.CreateCommand();
    command.CommandText = "select top 10 F3 from [" + tableNames[i] + "]";
    command.CommandType = CommandType.Text;

    OleDbDataReader reader = command.ExecuteReader();

    Console.WriteLine("\n\nReading from table: " + tableNames[i]);

    while (reader.Read())
    {
        for (int j = 0; j < reader.FieldCount; j++)
        {
            Console.Write(reader.GetValue(j) + ", ");
        }
        Console.WriteLine("");
    }

    reader.Dispose();
    command.Dispose();
}

As you can see I didn't change the names of the tables although sometimes they are not the same as they appear in the Excel.

A worksheet named A.B will be converted to 'A#B$'

A worksheet named A B C will be converted to 'A B C$'

I didn't have time to check all the conversions, but it works with what you get as a table name.

 

Ami