Here I am explaining how to read excel sheet and import it using OLEDB Data Provider Connection and C# .Net. Excel Workbook is just like database with sheets corresponding to tables. See the mapping below.
Database <—————> Excel Workbook
Sheet <—————-> Table
Connection String for Excel 97-2003 Format (.XLS)
For Excel 97-2003 format Microsoft Jet OLEDB Driver 4.0 is used. A sample connection string as follows.
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Book1.xls;Extended Properties='Excel 8.0;HDR=Yes'"
Connection String for Excel 2007 Format (.XLSX)
For Excel 2007 format the new Microsoft Ace OLEDB Driver 4.0 is used. A sample connection string as follows.
"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Book1.xlsx;Extended Properties='Excel 8.0;HDR=Yes'"
Rest everything is same for both versions. One thing to note Microsoft Ace OLEDB Driver 12.0 works for both Excel 2003 and Excel 2007 formats.
You can specify whether your Excel file has Headers or not using the HDR property.
When HDR is set to Yes the First Row is considered as the Header of the Excel file.
For this tutorial I have used the following Excel Sheet
ID
|
Name
|
1
|
John
|
2
|
Smith
|
3
|
Rob
|
Establish a Connection
String strExcelConn = "Provider=Microsoft.Jet.OLEDB.4.0;"
+ "Data Source=Book1.xls;"
+ "Extended Properties='Excel 8.0;HDR=Yes'";
OleDbConnection connExcel = new OleDbConnection(strExcelConn);
OleDbCommand cmdExcel = new OleDbCommand();
cmdExcel.Connection = connExcel;
Accessing Sheets
connExcel.Open();
DataTable dtExcelSchema;
dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
connExcel.Close();
The dtExcelSchema contains all the Sheets present in your Excel Workbook
You access them in the following way
string sheetName = dtExcelSchema.Rows[0]["TABLE_NAME"];
This will give the name of the first sheet. i.e. Sheet1$
Running a Select Query by Specifying Column Names
DataSet ds = new DataSet();
string SheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
cmdExcel.CommandText = "SELECT ID, Name From [" + SheetName + "]";
da.SelectCommand = cmdExcel;
da.Fill(ds);
Note the above query will only work when Header Row is present in the Excel Sheet.
XML Output of the Query
<?xml version="1.0" standalone="yes"?>
<NewDataSet>
<Table>
<ID>1</ID>
<Name>John</Name>
</Table>
<Table>
<ID>2</ID>
<Name>Smith</Name>
</Table>
<Table>
<ID>3</ID>
<Name>Rob</Name>
</Table>
</NewDataSet>
Running a Query without Specifying Column Names
string SheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
cmdExcel.CommandText = "SELECT * From [" + SheetName + "]";
da.SelectCommand = cmdExcel;
da.Fill(ds);
connExcel.Close();
Above query will work in both cases if Header row is specified and when not specified
Output of the Above Query with Header = Yes
<?xml version="1.0" standalone="yes"?>
<NewDataSet>
<Table>
<ID>1</ID>
<Name>John</Name>
</Table>
<Table>
<ID>2</ID>
<Name>Smith</Name>
</Table>
<Table>
<ID>3</ID>
<Name>Rob</Name>
</Table>
</NewDataSet>
Output of the Above Query with Header = No
<?xml version="1.0" standalone="yes"?>
<NewDataSet>
<Table>
<F2>Name</F2>
</Table>
<Table>
<F1>1</F1>
<F2>John</F2>
</Table>
<Table>
<F1>2</F1>
<F2>Rob</F2>
</Table>
<Table>
<F1>3</F1>
<F2>Smith</F2>
</Table>
</NewDataSet>
Running a Query on a Range of Cells
string SheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
cmdExcel.CommandText = "SELECT * From [" + SheetName + "A3:B5]";
da.SelectCommand = cmdExcel;
da.Fill(ds);
connExcel.Close();
Many times you need to select all data from a range of Cells. The above query selects all the data within the cell range A3:B5
Output of the Above Query
<?xml version="1.0" standalone="yes"?>
<NewDataSet>
<Table>
<F1>2</F1>
<F2>Rob</F2>
</Table>
<Table>
<F1>3</F1>
<F2>Smith</F2>
</Table>
</NewDataSet>
You can download the sample source code here.
Download Code (10.37 kb)