Multiple Approaches to Reading Excel Files in C#: From OLEDB to OpenXML

Nov 14, 2025 · Programming

Keywords: C# | Excel Reading | OLEDB | OpenXML | LINQ Queries

Abstract: This article surveys several ways to read Excel files from C# programs. It focuses on the traditional approach using OLEDB providers, which access Excel files directly through ADO.NET connection strings, load worksheet data into DataSets, and support LINQ queries for data processing. It also introduces the two parsing methods of the OpenXML SDK: the strongly typed DOM approach, suited to small files, and the SAX approach, which reads the file as a stream so that large Excel files can be processed without exhausting memory. Code examples illustrate practical use and the trade-offs of each approach.

Introduction

In modern software development, working with Microsoft Excel files is a common requirement. Many business scenarios call for extracting data from Excel worksheets and processing it automatically. Exporting a worksheet to text and parsing the result is feasible, but it involves many manual steps and is inefficient. This article systematically introduces several ways to read Excel files directly from C#.

Reading Excel Using OLEDB Providers

OLEDB (Object Linking and Embedding, Database) provides a standardized way to access Excel files directly. With the Microsoft.Jet.OLEDB.4.0 provider (legacy .xls workbooks, 32-bit processes only) or the Microsoft.ACE.OLEDB.12.0 provider (.xlsx as well as .xls), a worksheet can be queried as if it were a database table.

The basic implementation steps are as follows: first, construct a connection string specifying the Excel file path and version information; then use OleDbDataAdapter to execute SQL queries; finally, populate the results into a DataSet.

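    // Requires: using System.Data; using System.Data.OleDb; using System.IO;
    var fileName = Path.Combine(Directory.GetCurrentDirectory(), "fileNameHere");

    // Jet 4.0 with "Excel 8.0" reads legacy .xls workbooks.
    var connectionString = string.Format(
        "Provider=Microsoft.Jet.OLEDB.4.0; data source={0}; Extended Properties=Excel 8.0;",
        fileName);

    var ds = new DataSet();

    // The adapter opens and closes the connection itself; dispose it when done.
    using (var adapter = new OleDbDataAdapter("SELECT * FROM [workSheetNameHere$]", connectionString))
    {
        adapter.Fill(ds, "anyNameHere");
    }

    DataTable data = ds.Tables["anyNameHere"];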
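The connection string above targets the Jet provider and the legacy .xls format. As a rough sketch of how the provider could be chosen from the file extension, the helper below builds either a Jet or an ACE connection string. `ExcelConnectionStrings` and `Build` are hypothetical names introduced for illustration, and the `HDR` setting (whether the first row holds column names) is an addition that the snippet above does not use.

```csharp
using System;
using System.IO;

public static class ExcelConnectionStrings
{
    // Builds an OLE DB connection string for an Excel workbook, picking the
    // provider from the file extension. hasHeaderRow maps to the HDR setting.
    public static string Build(string filePath, bool hasHeaderRow = true)
    {
        string extension = Path.GetExtension(filePath).ToLowerInvariant();
        string hdr = hasHeaderRow ? "YES" : "NO";

        switch (extension)
        {
            case ".xls":
                // Legacy binary workbooks: Jet 4.0 with the "Excel 8.0" format.
                return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filePath +
                       ";Extended Properties=\"Excel 8.0;HDR=" + hdr + "\";";
            case ".xlsx":
                // Open XML workbooks: ACE 12.0 with the "Excel 12.0 Xml" format.
                return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath +
                       ";Extended Properties=\"Excel 12.0 Xml;HDR=" + hdr + "\";";
            default:
                throw new NotSupportedException("Unsupported Excel extension: " + extension);
        }
    }
}
```

The returned string can be passed to OleDbDataAdapter in place of a hand-written one. The helper only builds text; whether the chosen provider is installed on the machine still has to be verified at run time.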

LINQ Integration and Data Transformation

Calling AsEnumerable() on the DataTable exposes its rows as a sequence, so LINQ (Language Integrated Query) can be used to filter and transform the data. This approach is particularly suitable when business objects need to be built from Excel data.

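    // Requires: using System.Data; using System.Linq;
    // AsEnumerable() and Field<T>() are the DataSet extension methods.
    var rows = ds.Tables["anyNameHere"].AsEnumerable();

    // Empty cells arrive as DBNull, which Field<string> returns as null,
    // so test for null as well as the empty string.
    var query = rows
        .Where(x => !string.IsNullOrEmpty(x.Field<string>("Phone Number")))
        .Select(x => new MyContact
        {
            firstName = x.Field<string>("First Name"),
            lastName = x.Field<string>("Last Name"),
            phoneNumber = x.Field<string>("Phone Number"),
        });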
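To try the mapping step without an Excel file or an installed provider, the query can be run against a DataTable built in memory. The following is a minimal sketch under that assumption: `Contact`, `ContactMapper`, `FromTable` and `SampleTable` are hypothetical names, and the sample rows are made up purely to show how blank and missing phone numbers are filtered out.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical stand-in for the article's MyContact class.
public class Contact
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public string PhoneNumber { get; set; } = "";
}

public static class ContactMapper
{
    // Maps every row that has a non-blank "Phone Number" to a Contact.
    public static List<Contact> FromTable(DataTable table)
    {
        return table.AsEnumerable()
            .Where(row => !string.IsNullOrWhiteSpace(row.Field<string>("Phone Number")))
            .Select(row => new Contact
            {
                // Field<string> yields null for DBNull cells, hence the fallback.
                FirstName = row.Field<string>("First Name") ?? "",
                LastName = row.Field<string>("Last Name") ?? "",
                PhoneNumber = row.Field<string>("Phone Number") ?? "",
            })
            .ToList();
    }

    // Builds a small table shaped like a filled worksheet (made-up demo data).
    public static DataTable SampleTable()
    {
        var table = new DataTable("Contacts");
        table.Columns.Add("First Name", typeof(string));
        table.Columns.Add("Last Name", typeof(string));
        table.Columns.Add("Phone Number", typeof(string));

        table.Rows.Add("Ada", "Lovelace", "555-0100");
        table.Rows.Add("Alan", "Turing", DBNull.Value); // empty cell
        table.Rows.Add("Grace", "Hopper", "");          // blank text
        return table;
    }
}
```

Calling `ContactMapper.FromTable(ContactMapper.SampleTable())` keeps only the first row. With a real worksheet, the provider infers each column's type from the data, so a phone column that contains only digits may be typed as a number, in which case `Field<string>` throws an InvalidCastException and the column has to be read with its actual type.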

OpenXML SDK Methods

For scenarios requiring finer control or handling large Excel files, the Microsoft OpenXML SDK offers two distinct parsing strategies.

DOM Parsing Method

The DOM (Document Object Model) method loads the entire Excel document into memory, providing access through strongly-typed classes. This method features intuitive code and is suitable for small to medium-sized files.

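    // Requires: using System; using System.Linq;
    //           using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Spreadsheet;
    // Assumes fileName is the path of an .xlsx workbook; false opens it read-only.
    using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
    {
        WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart ?? spreadsheetDocument.AddWorkbookPart();
        WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
        SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();

        foreach (Row r in sheetData.Elements<Row>())
        {
            foreach (Cell c in r.Elements<Cell>())
            {
                // For shared-string cells this text is an index into the
                // workbook's shared string table, not the displayed string.
                string? text = c?.CellValue?.Text;
                Console.Write(text + " ");
            }
        }
    }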
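One detail the loop above does not address: worksheet XML commonly omits empty cells, so `r.Elements<Cell>()` may return fewer cells than the row has columns, and values can end up attributed to the wrong column. Each cell normally carries an A1-style reference (exposed by the SDK as `Cell.CellReference`), from which the real column can be recovered. The helper below is a small sketch of that conversion; `CellReferences` and `ColumnIndex` are hypothetical names, and the code works on plain strings so it does not depend on the SDK.

```csharp
using System;

public static class CellReferences
{
    // Converts the column letters of an A1-style reference ("A1", "AB12")
    // to a zero-based column index: A -> 0, Z -> 25, AA -> 26.
    public static int ColumnIndex(string cellReference)
    {
        if (string.IsNullOrEmpty(cellReference))
            throw new ArgumentException("Cell reference is empty.", nameof(cellReference));

        int index = 0;
        int letters = 0;
        foreach (char ch in cellReference)
        {
            char upper = char.ToUpperInvariant(ch);
            if (upper < 'A' || upper > 'Z')
                break; // the row digits start here

            // Column letters form a bijective base-26 number (A = 1 ... Z = 26).
            index = index * 26 + (upper - 'A' + 1);
            letters++;
        }

        if (letters == 0)
            throw new FormatException("No column letters in '" + cellReference + "'.");

        return index - 1;
    }
}
```

Inside the inner loop, something like `CellReferences.ColumnIndex(c.CellReference.Value)` would give the position at which to store the cell's text, assuming the reference attribute is present on the cell.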

SAX Parsing Method

The SAX (Simple API for XML) method reads the file as a stream and processes the worksheet content element by element. Because only the current element is held in memory, the memory footprint stays low, which makes this approach well suited to large Excel files of hundreds of megabytes.

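    // Requires: using System; using System.Linq; using DocumentFormat.OpenXml;
    //           using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Spreadsheet;
    // Assumes fileName is the path of an .xlsx workbook; false opens it read-only.
    using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
    {
        WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart ?? spreadsheetDocument.AddWorkbookPart();
        WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();

        // The reader walks the worksheet XML forward-only; dispose it when done.
        using (OpenXmlReader reader = OpenXmlReader.Create(worksheetPart))
        {
            while (reader.Read())
            {
                if (reader.ElementType == typeof(CellValue))
                {
                    string text = reader.GetText();
                    Console.Write(text + " ");
                }
            }
        }
    }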
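To make the streaming idea concrete without the SDK, the sketch below swaps in the base class library: an .xlsx file is a zip package, and a worksheet part inside it is plain XML that `XmlReader` can walk forward-only. This is not the OpenXmlReader API, only an illustration of the same principle. `SheetStreamer`, `ReadRawValues` and `BuildSamplePackage` are hypothetical names, and the part path `xl/worksheets/sheet1.xml` is an assumption; real workbooks locate their sheets through package relationships.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Xml;

public static class SheetStreamer
{
    private const string SpreadsheetNs =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Lazily yields the raw text of every <v> (cell value) element in one
    // worksheet part. Only the current XML node is held in memory.
    public static IEnumerable<string> ReadRawValues(Stream xlsx, string sheetEntry)
    {
        using (var archive = new ZipArchive(xlsx, ZipArchiveMode.Read, leaveOpen: true))
        {
            var entry = archive.GetEntry(sheetEntry);
            if (entry == null)
                throw new FileNotFoundException("Worksheet part not found: " + sheetEntry);

            using (Stream part = entry.Open())
            using (XmlReader reader = XmlReader.Create(part))
            {
                while (reader.Read())
                {
                    bool isValueStart = reader.NodeType == XmlNodeType.Element
                        && reader.LocalName == "v"
                        && reader.NamespaceURI == SpreadsheetNs
                        && !reader.IsEmptyElement;

                    // The text node, if any, directly follows the <v> start tag.
                    if (isValueStart && reader.Read() && reader.NodeType == XmlNodeType.Text)
                        yield return reader.Value;
                }
            }
        }
    }

    // Demo only: a zip holding a single worksheet part, not a complete workbook.
    public static MemoryStream BuildSamplePackage()
    {
        var stream = new MemoryStream();
        using (var archive = new ZipArchive(stream, ZipArchiveMode.Create, leaveOpen: true))
        {
            var entry = archive.CreateEntry("xl/worksheets/sheet1.xml");
            using (var writer = new StreamWriter(entry.Open(), Encoding.UTF8))
            {
                writer.Write(
                    "<worksheet xmlns=\"" + SpreadsheetNs + "\"><sheetData>" +
                    "<row r=\"1\"><c r=\"A1\"><v>10</v></c><c r=\"B1\"><v>2.5</v></c></row>" +
                    "<row r=\"2\"><c r=\"A2\" t=\"s\"><v>0</v></c></row>" +
                    "</sheetData></worksheet>");
            }
        }
        stream.Position = 0;
        return stream;
    }
}
```

Enumerating `SheetStreamer.ReadRawValues(SheetStreamer.BuildSamplePackage(), "xl/worksheets/sheet1.xml")` yields the three raw values in document order. The last one comes from a cell marked `t="s"`, so it is an index into the shared string table, not the text itself; a streaming reader has to resolve such indices separately.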

Performance Comparison and Application Scenarios

The OLEDB method is suitable for rapid prototyping and simple data extraction tasks, offering concise code and a gentle learning curve. The OpenXML SDK offers richer functionality and finer control: the DOM method is appropriate when complex operations on an Excel file are required, while the SAX method is the better fit for extremely large files.

In actual projects, the choice of method depends on specific requirements: file size, performance demands, functional complexity, and the development team's familiarity are all important considerations.

Best Practice Recommendations

Regardless of the chosen method, pay attention to error handling, resource release, and performance. For the OLEDB method, make sure the connection string matches the file format and that the process has permission to read the file; for the OpenXML method, choose deliberately between the DOM and SAX strategies to avoid memory problems.

Additionally, it is recommended to perform data validation and type conversion when reading Excel data to ensure the robustness of business logic. For production environment applications, consideration should also be given to logging, monitoring, and exception handling mechanisms.
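As a small sketch of the validation and type conversion step, the helpers below turn raw cell text into typed values and report failure without throwing. `CellParsing` and its method names are hypothetical. The code assumes the raw text uses a period as the decimal separator (as worksheet XML does) and that dates are stored as serial numbers in the default 1900 date system, for which `DateTime.FromOADate` gives matching results for present-day dates.

```csharp
using System;
using System.Globalization;

public static class CellParsing
{
    // Parses raw cell text as a decimal using the invariant culture, so the
    // result does not depend on the regional settings of the machine.
    public static bool TryParseDecimal(string raw, out decimal value)
    {
        return decimal.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out value);
    }

    // Interprets raw cell text as a date serial number (days since 1899-12-30).
    // Returns false for non-numeric text or out-of-range serials.
    public static bool TryParseSerialDate(string raw, out DateTime value)
    {
        value = default(DateTime);

        double serial;
        if (!double.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out serial))
            return false;

        try
        {
            value = DateTime.FromOADate(serial);
            return true;
        }
        catch (ArgumentException)
        {
            // FromOADate rejects values outside its supported range.
            return false;
        }
    }
}
```

A caller can log or skip rows for which either method returns false, which keeps one malformed cell from aborting the whole import. Data read through OLEDB usually arrives already typed, so helpers like these matter mostly for the OpenXML paths, where every value is read as text.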

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.