Efficiently Reading Excel Table Data and Converting to Strongly-Typed Object Collections Using EPPlus

Dec 03, 2025 · Programming · 12 views · 7.8

Keywords: EPPlus | Excel data reading | C# reflection | strongly-typed conversion | data integration

Abstract: This article explores in detail how to use the EPPlus library in C# to read table data from Excel files and convert it into strongly-typed object collections. By analyzing best-practice code, it covers identifying table headers, handling data type conversions (particularly the challenge of numbers stored as double in Excel), and using reflection for dynamic property mapping. The content spans from basic file operations to advanced data transformation, providing reusable extension methods and test examples to help developers efficiently manage Excel data integration tasks.

In modern software development, reading structured data from Excel files is a common requirement, especially in scenarios like data import, report generation, or system integration. EPPlus, as a powerful .NET library, offers rich APIs for manipulating Excel files, supporting reading, writing, and formatting. However, directly mapping Excel table data to strongly-typed object collections is not a built-in feature, requiring developers to implement it using reflection and custom logic. Based on a typical technical Q&A case, this article delves into how to efficiently read Excel tables with EPPlus and store data in List<T>, where T is a user-defined data class.

Problem Context and Core Challenges

The user's goal is to read an Excel table containing a header row (e.g., Id, Name, Gender) and multiple data rows. The data from each column needs to be categorized based on the header and stored in a List<ExcelData>, where ExcelData is a class with corresponding properties (e.g., Id, Name, Gender). This allows direct data access via property names, such as outputting data.Id + data.Name + data.Gender in a loop. The core challenges are: EPPlus does not provide a native method to convert tables directly to objects, and numbers in Excel are typically stored as double, requiring proper type conversion to avoid data loss or errors.

Solution Overview

The best answer provides a generic extension method ConvertTableToObjects<T> that dynamically maps Excel table data to objects of a specified type T using reflection. The method first obtains the table's address range, extracts cell data, groups it by rows, and determines column names and data types based on the header row. Then, it iterates through data rows, assigning values according to property names and types, with special handling for double to other types (e.g., int, DateTime). Finally, it returns an IEnumerable<T> collection ready for further operations.

Detailed Code Implementation

Below is a simplified code example illustrating the logic. For clarity, error handling and edge cases are omitted; refer to the original answer for the complete code.

public static IEnumerable<T> ConvertTableToObjects<T>(this ExcelTable table) where T : new()
{
    // Get the start and end positions of the table
    var start = table.Address.Start;
    var end = table.Address.End;
    var cells = new List<ExcelRangeBase>();

    // Collect all cell data
    for (var r = start.Row; r <= end.Row; r++)
        for (var c = start.Column; c <= end.Column; c++)
            cells.Add(table.WorkSheet.Cells[r, c]);

    // Group by row, assuming the first row is the header
    var groups = cells.GroupBy(cell => cell.Start.Row).ToList();
    var colnames = groups.First()
        .Select((hcell, idx) => new { Name = hcell.Value.ToString(), index = idx })
        .Where(o => typeof(T).GetProperties().Select(p => p.Name).Contains(o.Name))
        .ToList();

    // Skip the header row and process data rows
    var rowvalues = groups.Skip(1).Select(cg => cg.Select(c => c.Value).ToList());
    return rowvalues.Select(row =>
    {
        var obj = new T();
        colnames.ForEach(col =>
        {
            var prop = typeof(T).GetProperty(col.Name);
            var val = row[col.index];
            if (val is double) // Handle Excel numbers
            {
                var unboxedVal = (double)val;
                if (prop.PropertyType == typeof(int))
                    prop.SetValue(obj, (int)unboxedVal);
                else if (prop.PropertyType == typeof(DateTime))
                    prop.SetValue(obj, DateTime.FromOADate(unboxedVal));
                else
                    prop.SetValue(obj, unboxedVal);
            }
            else
                prop.SetValue(obj, val);
        });
        return obj;
    });
}

This method extends the ExcelTable class, allowing any table to be easily converted to an object collection. For example, with an ExcelData class, use it as: var dataList = worksheet.Tables.First().ConvertTableToObjects<ExcelData>().ToList();.

Practical Application and Testing

To validate the method's effectiveness, a simple test can be written. Assuming the Excel file contains a table as described earlier, test code might look like:

[TestMethod]
public void TestTableConversion()
{
    var file = new FileInfo(@"C:\ExcelFile.xlsx");
    using (var package = new ExcelPackage(file))
    {
        var worksheet = package.Workbook.Worksheets[1];
        var table = worksheet.Tables.First();
        var result = table.ConvertTableToObjects<ExcelData>().ToList();

        foreach (var item in result)
            Console.WriteLine($"{item.Id}{item.Name}{item.Gender}");
        // Output: 1JohnMale, 2MariaFemale, 3DanielUnknown
    }
}

This test demonstrates loading an Excel from a file, extracting the table, and converting the data. The output should match expectations, proving the method's correctness.

Supplementary References and Optimization Tips

Beyond the best answer, other responses provide basic methods, such as directly iterating cells (Answer 1), but lack type safety and structured output. In real projects, it's advisable to incorporate error handling (e.g., for null values or invalid types) and performance optimizations (like caching reflection results). Additionally, for large Excel datasets, consider streaming or asynchronous operations to avoid memory issues.

Conclusion

Reading Excel tables and converting them to object collections with EPPlus is an efficient data processing approach, especially suited for scenarios requiring strong typing. The method introduced here leverages reflection and extension methods, offering a flexible and reusable solution. Developers can adapt the code based on specific needs, such as supporting more complex data types or custom mapping rules. Overall, mastering these techniques will significantly enhance efficiency and code quality in handling Excel data.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.