Open Xml Reading from Excel File

Your approach seemed to work ok for me - in that it did "enter the loop".
Nevertheless you could also try something like the following:

// Requires: System, System.IO, System.Linq,
// DocumentFormat.OpenXml.Packaging, DocumentFormat.OpenXml.Spreadsheet
void Main()
{
    string fileName = @"c:\path\to\my\file.xlsx";

    // FileShare.ReadWrite lets us read the file while Excel has it open
    using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    {
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fs, false))
        {
            WorkbookPart workbookPart = doc.WorkbookPart;
            // First() throws if the workbook has no shared string table
            SharedStringTablePart sstpart = workbookPart.GetPartsOfType<SharedStringTablePart>().First();
            SharedStringTable sst = sstpart.SharedStringTable;

            WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
            Worksheet sheet = worksheetPart.Worksheet;

            var cells = sheet.Descendants<Cell>();
            var rows = sheet.Descendants<Row>();

            Console.WriteLine("Row count = {0}", rows.LongCount());
            Console.WriteLine("Cell count = {0}", cells.LongCount());

            // One way: go through each cell in the sheet
            foreach (Cell cell in cells)
            {
                if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))
                {
                    // The cell value is an index into the shared string table
                    int ssid = int.Parse(cell.CellValue.Text);
                    string str = sst.ChildElements[ssid].InnerText;
                    Console.WriteLine("Shared string {0}: {1}", ssid, str);
                }
                else if (cell.CellValue != null)
                {
                    Console.WriteLine("Cell contents: {0}", cell.CellValue.Text);
                }
            }

            // Or... via each row
            foreach (Row row in rows)
            {
                foreach (Cell c in row.Elements<Cell>())
                {
                    if ((c.DataType != null) && (c.DataType == CellValues.SharedString))
                    {
                        int ssid = int.Parse(c.CellValue.Text);
                        string str = sst.ChildElements[ssid].InnerText;
                        Console.WriteLine("Shared string {0}: {1}", ssid, str);
                    }
                    else if (c.CellValue != null)
                    {
                        Console.WriteLine("Cell contents: {0}", c.CellValue.Text);
                    }
                }
            }
        }
    }
}

I used the FileStream approach to open the workbook because it allows shared access, so you can have the workbook open in Excel at the same time. The SpreadsheetDocument.Open(path, ...) overload won't work if the workbook is open elsewhere.

Perhaps that is why your code didn't work.

Note, also, the use of the SharedStringTable to get the cell text where appropriate.

EDIT 2018-07-11:

Since this post is still getting votes, I should also point out that in many cases it may be a lot easier to use ClosedXML to read, edit, and otherwise manipulate your workbooks. The documentation examples are pretty user friendly and the coding is, in my limited experience, much more straightforward. Just be aware that it does not (yet) implement all the Excel functions (for example INDEX and MATCH), which may or may not be an issue. [Not that I would want to be trying to deal with INDEX and MATCH in OpenXML anyway.]

Reading Excel files using OpenXML

Edit:

Using Open XML SDK for Microsoft Office

Install V2 from: https://www.microsoft.com/en-eg/download/details.aspx?id=5124&wa=wsignin1.0

(or V2.5)

The following class converts an Excel sheet to a CSV file with a configurable delimiter:

// Required references
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public class OpenXmlExcel
{
    public void ExcelToCsv(string source, string target, string delimiter = ";", bool firstRowIsHeader = true)
    {
        var dt = ReadExcelSheet(source, firstRowIsHeader);
        DatatableToCsv(dt, target, delimiter);
    }

    private void DatatableToCsv(DataTable dt, string fname, string delimiter = ";")
    {
        using (StreamWriter writer = new StreamWriter(fname))
        {
            // Values are written as-is (no quoting); each line ends with a trailing delimiter
            foreach (DataRow row in dt.Rows)
            {
                writer.WriteLine(string.Join(delimiter, row.ItemArray.Select(x => x.ToString())) + delimiter);
            }
        }
    }

    List<string> Headers = new List<string>();

    private DataTable ReadExcelSheet(string fname, bool firstRowIsHeader)
    {
        DataTable dt = new DataTable();
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fname, false))
        {
            // Read the first sheet
            Sheet sheet = doc.WorkbookPart.Workbook.Sheets.GetFirstChild<Sheet>();
            Worksheet worksheet = ((WorksheetPart)doc.WorkbookPart.GetPartById(sheet.Id.Value)).Worksheet;
            IEnumerable<Row> rows = worksheet.GetFirstChild<SheetData>().Elements<Row>();

            foreach (Row row in rows)
            {
                // The first row defines the columns
                if (row.RowIndex.Value == 1)
                {
                    var j = 1;
                    foreach (Cell cell in row.Elements<Cell>())
                    {
                        var columnName = firstRowIsHeader ? GetCellValue(doc, cell) : "Field" + j++;
                        Console.WriteLine(columnName);
                        Headers.Add(columnName);
                        dt.Columns.Add(columnName);
                    }

                    // A header row carries no data; otherwise fall through and store row 1 too
                    if (firstRowIsHeader)
                    {
                        continue;
                    }
                }

                dt.Rows.Add();
                int i = 0;
                foreach (Cell cell in row.Elements<Cell>())
                {
                    dt.Rows[dt.Rows.Count - 1][i] = GetCellValue(doc, cell);
                    i++;
                }
            }
        }
        return dt;
    }

    private string GetCellValue(SpreadsheetDocument doc, Cell cell)
    {
        // CellValue is null for cells that have no value
        if (cell.CellValue == null)
        {
            return string.Empty;
        }

        string value = cell.CellValue.InnerText;
        if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
        {
            // For shared strings the value is an index into the shared string table
            return doc.WorkbookPart.SharedStringTablePart.SharedStringTable.ChildElements.GetItem(int.Parse(value)).InnerText;
        }
        return value;
    }
}

How to use:

new OpenXmlExcel().ExcelToCsv("f1.xlsx","f1.csv",";",true);
or
//use default: separator=";" ,first row is header
new OpenXmlExcel().ExcelToCsv("f1.xlsx","f1.csv");
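
The CSV writer above emits cell values verbatim, so a value that contains the delimiter, a double quote, or a line break would produce a malformed line. A minimal sketch of RFC 4180 style quoting follows. The CsvField class and its Escape and JoinLine methods are hypothetical names introduced here for illustration; they are not part of the class above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class CsvField
{
    // Quotes a value only when it contains the delimiter, a double quote,
    // or a line break. Embedded double quotes are doubled.
    public static string Escape(string value, string delimiter = ";")
    {
        if (value == null)
        {
            return string.Empty;
        }

        bool needsQuotes = value.Contains(delimiter)
                           || value.Contains("\"")
                           || value.Contains("\n")
                           || value.Contains("\r");

        return needsQuotes ? "\"" + value.Replace("\"", "\"\"") + "\"" : value;
    }

    // Joins already-extracted cell values into one CSV line.
    public static string JoinLine(IEnumerable<string> values, string delimiter = ";")
    {
        return string.Join(delimiter, values.Select(v => Escape(v, delimiter)));
    }
}
```

Inside DatatableToCsv, the Select call could then pass each value through CsvField.Escape before joining, if quoting is wanted.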

Read Excel file using OpenXML

The actual strings are stored in the SharedStringTable. What you are getting are only the references to the elements in that string table.

Here's your sample modified to retrieve the values from the string table:

using (SpreadsheetDocument doc = SpreadsheetDocument.Open(filePatah + "\\" + fileName, false))
{
    WorkbookPart workbookPart = doc.WorkbookPart;
    WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
    SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();

    ArrayList data = new ArrayList();
    foreach (Row r in sheetData.Elements<Row>())
    {
        foreach (Cell c in r.Elements<Cell>())
        {
            if (c.DataType != null && c.DataType == CellValues.SharedString)
            {
                var stringId = Convert.ToInt32(c.InnerText); // Do some error checking here
                data.Add(workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(stringId).InnerText);
            }
        }
    }
}

Please note that this is just a crude example. For a more complete example you can look here.

Also, depending on what you need, you might find using a library such as EPPlus much easier (you can read and write directly to cells without worrying about the actual document format) than OpenXML SDK.

Read Excel cell values of specific columns using open xml sdk in c#

With the method below you can map your Excel sheet data to a Dictionary<string, List<KeyValuePair<string, string>>>.

public void MapExcelToDictionary()
{
    var fileName = @"C:\XML\Vehicles.xlsx";
    using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
    {
        WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
        SharedStringTable sst = workbookPart.SharedStringTablePart.SharedStringTable;

        // Get the first sheet of the workbook
        Sheet sheet = workbookPart.Workbook.Descendants<Sheet>().First();

        var worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id);
        var rows = worksheetPart.Worksheet.Descendants<Row>().ToList();

        // The first row holds the headers (assumed here to be shared strings)
        Row headerRow = rows.First();
        List<string> lstHeaders = new List<string>();
        foreach (Cell headerCell in headerRow.Elements<Cell>())
        {
            var stringId = Convert.ToInt32(headerCell.InnerText);
            lstHeaders.Add(sst.Elements<SharedStringItem>().ElementAt(stringId).InnerText);
        }

        // Remove the header row
        rows.RemoveAt(0);

        // Maps each product ID to the key/value pairs read from its row
        var dict = new Dictionary<string, List<KeyValuePair<string, string>>>();

        var productID = string.Empty;

        // Iterate over all data rows
        foreach (Row r in rows)
        {
            var keyValuePairs = new List<KeyValuePair<string, string>>();

            // Iterate over all cells in the current row
            foreach (Cell c in r.Elements<Cell>())
            {
                if (c.DataType != null && c.DataType == CellValues.SharedString)
                {
                    var stringId = Convert.ToInt32(c.InnerText);
                    string val = sst.Elements<SharedStringItem>().ElementAt(stringId).InnerText;

                    // Map the cell to a key based on its column index
                    switch (GetColumnIndex(c.CellReference))
                    {
                        case 1:
                            productID = val;
                            break;
                        case 2:
                            keyValuePairs.Add(new KeyValuePair<string, string>("Model", val));
                            break;
                        case 3:
                            keyValuePairs.Add(new KeyValuePair<string, string>("Type", val));
                            break;
                        case 4:
                            keyValuePairs.Add(new KeyValuePair<string, string>("Color", val));
                            break;
                        case 5:
                            keyValuePairs.Add(new KeyValuePair<string, string>("MaSpeed", val));
                            break;
                        case 6:
                            keyValuePairs.Add(new KeyValuePair<string, string>("Manufacturer", val));
                            break;
                    }
                }
                else if (!string.IsNullOrEmpty(c.InnerText))
                {
                    // Handle numeric and other non-shared-string cells here
                }
            }

            // Add the product ID and its respective data to the dictionary
            // (Add throws if the same product ID appears twice)
            dict.Add(productID, keyValuePairs);
        }

        Console.ReadKey();
    }
}

The method below finds the column index from a cell reference in the Excel sheet.

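// Requires: System.Linq, System.Text.RegularExpressions
private static int? GetColumnIndex(string cellReference)
{
    if (string.IsNullOrEmpty(cellReference))
    {
        return null;
    }

    // Strip the row digits, leaving only the column letters ("AB12" -> "AB")
    string columnReference = Regex.Replace(cellReference.ToUpper(), @"[\d]", string.Empty);

    int columnNumber = 0;
    int multiplier = 1;

    // Walk the letters right to left: A=1 ... Z=26, each position worth 26x more
    foreach (char c in columnReference.ToCharArray().Reverse())
    {
        columnNumber += multiplier * ((int)c - 64);

        multiplier = multiplier * 26;
    }

    // 1-based column number
    return columnNumber;
}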
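
The conversion above treats the column letters as a base-26 number without a zero digit (A=1 through Z=26, then AA=27). As a rough illustration, the same idea can be written left to right without a regular expression. The ColumnRef class and ToIndex method below are hypothetical names used only for this sketch, and it assumes the reference starts with its column letters.

```csharp
using System;
using System.Linq;

// Hypothetical helper, shown only to illustrate the letter-to-number mapping.
public static class ColumnRef
{
    // Converts the leading letters of a cell reference ("A1", "AB12")
    // to a 1-based column number. Returns 0 if there are no leading letters.
    public static int ToIndex(string cellReference)
    {
        int column = 0;
        foreach (char ch in cellReference.ToUpperInvariant().TakeWhile(c => char.IsLetter(c)))
        {
            // Shift the digits seen so far by one base-26 position, then add this letter
            column = column * 26 + (ch - 'A' + 1);
        }
        return column;
    }
}
```

For example, ColumnRef.ToIndex("A1") gives 1, "Z9" gives 26, "AA1" gives 27, and "XFD1" (the last column in current Excel versions) gives 16384.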

Read excel by sheet name with OpenXML

I've done it as in the code snippet below. It's basically Workbook -> Sheets -> Sheet, then getting the name attribute of the sheet.

The basic underlying XML looks like this:

<x:workbook>
  <x:sheets>
    <x:sheet name="Sheet1" sheetId="1" r:id="rId1" />
    <x:sheet name="TEST sheet Name" sheetId="2" r:id="rId2" />
  </x:sheets>
</x:workbook>

The r:id value is what the Open XML package uses internally to identify each sheet and link it to the other XML parts. That's why the line of code after the name lookup uses GetPartById to pick up the WorksheetPart.

string txt = string.Empty;
using (SpreadsheetDocument doc = SpreadsheetDocument.Open(path, false))
{
    WorkbookPart bkPart = doc.WorkbookPart;
    DocumentFormat.OpenXml.Spreadsheet.Workbook workbook = bkPart.Workbook;
    // First() throws if no sheet has this name
    DocumentFormat.OpenXml.Spreadsheet.Sheet s = workbook.Descendants<DocumentFormat.OpenXml.Spreadsheet.Sheet>().First(sht => sht.Name == "Sheet1");
    WorksheetPart wsPart = (WorksheetPart)bkPart.GetPartById(s.Id);
    DocumentFormat.OpenXml.Spreadsheet.SheetData sheetdata = wsPart.Worksheet.Elements<DocumentFormat.OpenXml.Spreadsheet.SheetData>().First();

    foreach (DocumentFormat.OpenXml.Spreadsheet.Row r in sheetdata.Elements<DocumentFormat.OpenXml.Spreadsheet.Row>())
    {
        // Take the first cell of each row; skip rows or cells without a value
        DocumentFormat.OpenXml.Spreadsheet.Cell c = r.Elements<DocumentFormat.OpenXml.Spreadsheet.Cell>().FirstOrDefault();
        if (c != null && c.CellValue != null)
        {
            txt += c.CellValue.Text + Environment.NewLine;
        }
    }
    this.txtMessages.Text += txt;
}

Open XML Reading Excel file does not enter loop to read excel sheet

I have been using the following code, which works fine for me.

using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(sFileNameWithPath, false))
{
    WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
    // GetWorksheetPart is a helper (not shown) that finds the part by sheet name
    WorksheetPart worksheetPart = GetWorksheetPart(workbookPart, sSheetName);

    SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();

    bool bHasChildren = sheetData.HasChildren;
    if (bHasChildren)
    {
        // Start at the second row; the first row is skipped
        foreach (Row rDataRow in sheetData.Elements<Row>().Skip(1))
        {
            foreach (Cell oCell in rDataRow.Elements<Cell>())
            {
                // Process oCell here
            }
        }
    }
}

Let me know if this helps.

Or you can use your code with the following change:

using (SpreadsheetDocument doc = SpreadsheetDocument.Open(sFileNameWithPath, false))
{
    WorkbookPart workbookPart = doc.WorkbookPart;

    // Look up the relationship id of the sheet named "Claims"
    string relId = workbookPart.Workbook.Descendants<Sheet>().First(s => "Claims".Equals(s.Name)).Id;
    WorksheetPart worksheetPart = (WorksheetPart)workbookPart.GetPartById(relId);

    SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();

    foreach (Row r in sheetData.Elements<Row>())
    {
        foreach (Cell c in r.Elements<Cell>())
        {
            // CellValue is null for cells without a value
            string text = c.CellValue?.Text;
        }
    }
}

Note that I have used the Excel sheet name "Claims". Check whether it works and, if so, move the lookup into a separate function to make it generic.
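
The first snippet calls a GetWorksheetPart helper that is not shown, and the second hard-codes the sheet name. One possible shape for such a helper is sketched below. This is an assumption about what the helper might look like, not the author's actual implementation; the WorkbookHelpers class name and the case-insensitive name comparison are choices made for this sketch.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper class; only the method name matches the snippet above.
public static class WorkbookHelpers
{
    // Returns the WorksheetPart for the sheet with the given name,
    // or null if the workbook has no such sheet.
    public static WorksheetPart GetWorksheetPart(WorkbookPart workbookPart, string sheetName)
    {
        Sheet sheet = workbookPart.Workbook
            .Descendants<Sheet>()
            .FirstOrDefault(s => string.Equals(s.Name?.Value, sheetName, StringComparison.OrdinalIgnoreCase));

        if (sheet == null || sheet.Id == null)
        {
            return null;
        }

        // The r:id attribute is the relationship id that links the sheet entry to its part
        return (WorksheetPart)workbookPart.GetPartById(sheet.Id.Value);
    }
}
```

Returning null instead of throwing leaves it to the caller to decide how to report a missing sheet.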

Streaming through Excel data with Open XML SDK

A suitable solution here is to use OpenXmlReader. The other key point is to use Elements instead of Descendants, to avoid searching too deep in the XML structure.

using (var reader = OpenXmlReader.Create(worksheetPart))
{
    while (reader.Read())
    {
        if (typeof(Row).IsAssignableFrom(reader.ElementType))
        {
            // Load only the current row into memory
            var row = (Row)reader.LoadCurrentElement();
            foreach (var cell in row.Elements<Cell>())
            {
                var (_, value) = ParseCell(cell);
            }
        }
    }
}

This does indeed "stream" the elements and memory usage is minimal.
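
ParseCell is not defined in the snippet above. A minimal sketch of what it might do is shown below, assuming it returns the cell reference together with the resolved text. The CellParsing class, the tuple element names, and the extra sharedStrings parameter are all assumptions made for this sketch (the original call passes only the cell).

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical implementation; the original ParseCell is not shown in the answer.
public static class CellParsing
{
    // Returns the cell reference (for example "B7") and the cell text.
    // Shared string cells are resolved through the shared string table.
    // Inline strings (t="inlineStr") are not handled in this sketch.
    public static (string Reference, string Value) ParseCell(Cell cell, SharedStringTable sharedStrings)
    {
        string reference = cell.CellReference?.Value ?? string.Empty;
        string raw = cell.CellValue?.Text ?? string.Empty;

        bool isShared = cell.DataType != null && cell.DataType.Value == CellValues.SharedString;
        if (isShared && sharedStrings != null && int.TryParse(raw, out int index))
        {
            // The raw value is an index into the shared string table
            SharedStringItem item = sharedStrings.Elements<SharedStringItem>().ElementAtOrDefault(index);
            return (reference, item?.InnerText ?? string.Empty);
        }

        return (reference, raw);
    }
}
```

When streaming a large sheet, it may be worth copying the shared strings into a list or array once up front, since ElementAtOrDefault walks the table on every call.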

Write hidden information (like software name & version) in Excel file using OpenXML

Set a custom property for excel document via Openxml

Find the SetCustomProperty() method at the bottom of the page. That function is written for Word documents, so change the line that opens the file to the following for an Excel document:

using (var document = SpreadsheetDocument.Open(fileName, true))

And you are good to add any property to your file.

How to Hide properties

The properties are visible in Excel under File -> Info -> Properties -> Advanced Properties, where users can also delete them. If that is not desired, a property is hidden from Excel when, instead of this format ID,

newProp.FormatId = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}";

another one is used:

newProp.FormatId = Guid.NewGuid().ToString("B");

Note: to save a string, use the VTLPWSTR type. Do not use the VTBString type together with the unique ID given above, because Excel automatically deletes the property when you edit it there (this is from experience; I don't know why).
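
Since the referenced SetCustomProperty() method is not reproduced here, the following is a rough sketch of how a string property could be written with the Open XML SDK, combining the points above. It is not the referenced method: the CustomProps class, the SetStringProperty name, and the hidden parameter are assumptions made for this sketch.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.CustomProperties;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.VariantTypes;

// Hypothetical helper, written for illustration only.
public static class CustomProps
{
    // Format id quoted in the text above for properties that Excel displays
    private const string VisibleFormatId = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}";

    public static void SetStringProperty(SpreadsheetDocument document, string name, string value, bool hidden = false)
    {
        CustomFilePropertiesPart part = document.CustomFilePropertiesPart;
        if (part == null)
        {
            part = document.AddCustomFilePropertiesPart();
            part.Properties = new DocumentFormat.OpenXml.CustomProperties.Properties();
        }
        var props = part.Properties;

        // Replace an existing property with the same name
        CustomDocumentProperty existing = props.Elements<CustomDocumentProperty>()
            .FirstOrDefault(p => p.Name != null && p.Name.Value == name);
        existing?.Remove();

        props.AppendChild(new CustomDocumentProperty
        {
            Name = name,
            // A fresh GUID instead of the well-known id keeps the property out of Excel's dialog
            FormatId = hidden ? Guid.NewGuid().ToString("B") : VisibleFormatId,
            VTLPWSTR = new VTLPWSTR(value)
        });

        // Property ids are renumbered sequentially, starting at 2
        int pid = 2;
        foreach (CustomDocumentProperty p in props.Elements<CustomDocumentProperty>())
        {
            p.PropertyId = pid++;
        }

        props.Save();
    }
}
```

The document has to be opened for editing (the second argument to SpreadsheetDocument.Open set to true) for the change to be saved.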

How to read the properties?

You saved your file. Then, open it again and loop over all properties

foreach (CustomDocumentProperty property in document.CustomFilePropertiesPart.Properties)
{
    // VTBString is null when the property was stored with a different type
    if (property.Name.Value == nameof(Product) &&
        property.VTBString?.Text == Product)
        return true;
}

Here Product is a string property that holds the name of the software, and VTBString is the type used to store its value. You can save and read as many properties as you like this way.


