Best /Fastest way to read an Excel Sheet into a DataTable?
I have always used OLEDB
for this, something like...
Dim sSheetName As String
Dim sConnection As String
Dim dtTablesList As DataTable
Dim oleExcelCommand As OleDbCommand
Dim oleExcelReader As OleDbDataReader
Dim oleExcelConnection As OleDbConnection
sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Test.xls;Extended Properties=""Excel 12.0;HDR=No;IMEX=1"""
oleExcelConnection = New OleDbConnection(sConnection)
oleExcelConnection.Open()
dtTablesList = oleExcelConnection.GetSchema("Tables")
If dtTablesList.Rows.Count > 0 Then
sSheetName = dtTablesList.Rows(0)("TABLE_NAME").ToString
End If
dtTablesList.Clear()
dtTablesList.Dispose()
If sSheetName <> "" Then
oleExcelCommand = oleExcelConnection.CreateCommand()
oleExcelCommand.CommandText = "Select * From [" & sSheetName & "]"
oleExcelCommand.CommandType = CommandType.Text
oleExcelReader = oleExcelCommand.ExecuteReader
nOutputRow = 0
While oleExcelReader.Read
End While
oleExcelReader.Close()
End If
oleExcelConnection.Close()
The ACE.OLEDB
provider will read both .xls
and .xlsx
files and I have always found the speed quite good.
Importing Excel into a DataTable Quickly
Caling .Value2
is an expensive operation because it's a COM-interop call. I would instead read the entire range into an array and then loop through the array:
object[,] data = Range.Value2;
// Create new Column in DataTable
for (int cCnt = 1; cCnt <= Range.Columns.Count; cCnt++)
{
textBox3.Text = cCnt.ToString();
var Column = new DataColumn();
Column.DataType = System.Type.GetType("System.String");
Column.ColumnName = cCnt.ToString();
DT.Columns.Add(Column);
// Create row for Data Table
for (int rCnt = 1; rCnt <= Range.Rows.Count; rCnt++)
{
textBox2.Text = rCnt.ToString();
string CellVal = String.Empty;
try
{
cellVal = (string)(data[rCnt, cCnt]);
}
catch (Microsoft.CSharp.RuntimeBinder.RuntimeBinderException)
{
ConvertVal = (double)(data[rCnt, cCnt]);
cellVal = ConvertVal.ToString();
}
DataRow Row;
// Add to the DataTable
if (cCnt == 1)
{
Row = DT.NewRow();
Row[cCnt.ToString()] = cellVal;
DT.Rows.Add(Row);
}
else
{
Row = DT.Rows[rCnt + 1];
Row[cCnt.ToString()] = cellVal;
}
}
}
Get Excel data in DataTable
While the JET/Access driver can read Excel files as if they were databases, it has several problems, especially in the .NET Core era:
- It's Windows-only
- It needs to be installed. It can't be packaged with your application
- The installed version must match the bitness (32/64-bit) of any Office components. This in turn means that your application must match Office's bitness.
There are libraries that can read Excel files directly. One such option is ExcelDataReader, which opens a DbDataReader
over an Excel sheet. It can handle both the obsolete xls
format and the 16 year old xlsx
format (yes, the "new" xlsx format was introduce in 2006, 16 years ago).
The generated data reader can be used to read the data or load a DataTable the same as any other data reader.
using (var stream = File.Open(filePath, FileMode.Open, FileAccess.Read))
{
using (var reader = ExcelReaderFactory.CreateReader(stream))
{
var table=new DataTable();
table.Load(reader);
...
}
}
ExcelDataReader has an extension that allows reading all sheets in a workbook into a DataSet, with one DataTable for each sheet.
using (var stream = File.Open(filePath, FileMode.Open, FileAccess.Read))
{
using (var reader = ExcelReaderFactory.CreateReader(stream))
{
var dataset= reader.AsDataSet();
// The result of each spreadsheet is in dataset.Tables
}
}
Faster way of storing Excel worksheet to system.data.datatable using C#
For others who are encountering the same issue, what I did was:
- save the xml as an xlsx file
- use oledb to read the xlsx file
- store in dataset using OleDbAdapter (Fill() method)
- bulk insert
Here is the code that I used to do this (change the connection string):
Stopwatch s = new Stopwatch();
s.Start();
string sSheetName = null;
string sConnection = null;
System.Data.DataTable sheetData = new System.Data.DataTable();
System.Data.DataTable dtTablesList = default(System.Data.DataTable);
OleDbConnection oleExcelConnection = default(OleDbConnection);
sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + @"C:\Users\YOURUSERNAME\Documents\Visual Studio 2012\Projects\TestXmlParser\TestXmlParser\bin\Debug\ConsolidatedSSMFiles.xlsx" + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1\"";
oleExcelConnection = new OleDbConnection(sConnection);
oleExcelConnection.Open();
dtTablesList = oleExcelConnection.GetSchema("Tables");
if (dtTablesList.Rows.Count > 0)
{
sSheetName = dtTablesList.Rows[0]["TABLE_NAME"].ToString();
}
dtTablesList.Clear();
dtTablesList.Dispose();
if (!string.IsNullOrEmpty(sSheetName))
{
OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [TEST$]", oleExcelConnection);
sheetAdapter.Fill(sheetData);
} s.Stop();
var duration = s.Elapsed;
oleExcelConnection.Close();
dataGridView1.DataSource = sheetData;
MessageBox.Show(sheetData.Rows.Count.ToString()+"rows - "+ duration.ToString());
This reads 25000+ rows of excel data to a datable in approx. 1.9 to 2.0 seconds.
Read excel spreadsheet into a datatable
Here is example how to get all columns and rows from special Sheet from xlsx file. This code takes all data from Sheet2
from xlsx file and fill the DataTable with that values.
Hopefully this will help you.
using System;
using System.Data;
using System.Data.OleDb;
namespace ConsoleApp4
{
class Program
{
static void Main(string[] args)
{
DataTable rs = new DataTable();
using (var odConnection = new OleDbConnection(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\Users\IIG\Desktop\test.xlsx;Extended Properties='Excel 12.0;HDR=YES;IMEX=1;';"))
{
odConnection.Open();
using (OleDbCommand cmd = new OleDbCommand())
{
cmd.Connection = odConnection;
cmd.CommandType = CommandType.Text;
cmd.CommandText = "SELECT * FROM [Sheet2$]";
using (OleDbDataAdapter oleda = new OleDbDataAdapter(cmd))
{
oleda.Fill(rs);
}
}
odConnection.Close();
}
foreach(DataRow row in rs.Rows)
{
foreach(object item in row.ItemArray)
{
Console.Write(item +"\t");
}
Console.WriteLine();
}
}
}
}
How to read all data rows in a specific Excel sheet into an ADO.NET Datatable with out using ODBC in C#
in case ms office is installed on the server, the fastest way would be to export the excel sheet into a CSV file and then using file IO to import it into your database.
you could open excel via COM Interop and export the workbook to CSV. others here on stackoverflow discussed this proceeding: Save an excel file to a csv file in C# code
i've got the experience that getting data out of excel directly via com interop is very slow. excel exports csv really fast instead.
[edit]
a very good office wrapper is NetOffice. It works similar like the COM Interop interface but with more speed and more function: http://netoffice.codeplex.com/
as you prefer stream reader, there's an excel binary reader on codeplex too: http://exceldatareader.codeplex.com/
[/edit]
Related Topics
Binding List<T> to Datagridview in Winform
Validate Image from File in C#
Check for Column Name in a SQLdatareader Object
Generating Permutations of a Set (Most Efficiently)
Populating a Razor Dropdownlist from a List<Object> in MVC
How to Restart My C# Winform Application
Adding Placeholder Text to Textbox
Dynamically Add C# Properties at Runtime
.Net Out of Memory Exception - Used 1.3Gb But Have 16Gb Installed
C# Listview, How to Add Items to Columns 2, 3 and 4 etc
Convert an Image (Selected by Path) to Base64 String
How to Enable Http Put and Delete for ASP.NET MVC in Iis
How to Get the Title of the Current Active Window Using C#
Is There Any Async Equivalent of Process.Start
Rendering Partial View on Button Click in ASP.NET MVC