Comprehensive Technical Analysis of Efficient Excel Data Import to Database in PHP

Dec 03, 2025 · Programming

Keywords: PHP | Excel import | database | PHPExcel | spreadsheet-reader | performance optimization

Abstract: This article explores the core technical options for importing Excel files (both xls and xlsx formats) into databases from PHP. Using the PHPExcel library as its main reference, it analyzes the library's features, usage, and performance optimization strategies. By comparing PHPExcel with alternatives such as spreadsheet-reader, the article offers an implementation guide that runs from basic reading to efficient batch processing. Practical code examples and memory management techniques help developers select the Excel import approach that best fits their project.

Technical Background and Requirements Analysis for Excel Data Import

In modern web application development, data migration and batch processing are common requirements, with Excel files serving as a widely used data exchange format. PHP developers frequently face the challenge of efficiently and accurately importing Excel data into databases such as MySQL. Traditional methods like manual copy-paste are not only inefficient but also error-prone, necessitating automated solutions.

PHPExcel: The Comprehensive, Widely Adopted Solution

Based on extensive validation within the technical community, PHPExcel has long been the most widely used PHP library for Excel processing. Developed under the PHPOffice organization, it reads and writes multiple formats, including Excel 2007+ xlsx and legacy xls files. Its core advantages are a thoroughly documented API and a large body of community knowledge, so answers to most common questions are easy to find. Note that the library itself is no longer maintained; its successor is PhpSpreadsheet.

From a technical architecture perspective, PHPExcel employs an object-oriented design, abstracting spreadsheets into a hierarchical structure of Workbook, Worksheet, and Cell. The following example demonstrates the basic data reading process:

<?php
require_once 'PHPExcel/PHPExcel.php';

// Path to the workbook to import (placeholder value)
$excelFile = 'data.xlsx';

// Create a reader that matches the detected file type
$inputFileType = PHPExcel_IOFactory::identify($excelFile);
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$objPHPExcel = $objReader->load($excelFile);

// Get active worksheet and the extent of its used range
$worksheet = $objPHPExcel->getActiveSheet();
$highestRow = $worksheet->getHighestRow();
$highestColumn = $worksheet->getHighestColumn();

// Read each row into a plain array
// (NULL for empty cells, formulas calculated, values left unformatted)
$dataArray = array();
for ($row = 1; $row <= $highestRow; $row++) {
    $rowData = $worksheet->rangeToArray('A' . $row . ':' . $highestColumn . $row, NULL, TRUE, FALSE);
    $dataArray[] = $rowData[0];
}

// Database insertion logic (example using PDO)
$pdo = new PDO('mysql:host=localhost;dbname=test', 'username', 'password');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); // surface SQL errors as exceptions
$stmt = $pdo->prepare("INSERT INTO table_name (col1, col2, col3) VALUES (?, ?, ?)");

foreach ($dataArray as $row) {
    // Bind exactly three values; rows with fewer columns are padded with NULL
    $stmt->execute(array_pad(array_slice($row, 0, 3), 3, NULL));
}
?>

The key to this implementation is the rangeToArray method, which converts specified cell ranges into PHP arrays for subsequent processing. It is important to note that PHPExcel loads the entire file into memory by default, which may cause performance issues with large files.
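
One way to limit this memory cost while staying with PHPExcel is to load the file in row windows through a read filter. The following is a minimal sketch, not a definitive implementation: the class name ChunkReadFilter, the window size of 1000, and the variable $totalRows are assumptions made for illustration, and the stand-in interface at the top exists only so the sketch can be run without the library installed.

```php
<?php
// Stand-in for the library interface so this sketch runs without PHPExcel;
// when PHPExcel is loaded, its real interface is used instead.
if (!interface_exists('PHPExcel_Reader_IReadFilter')) {
    interface PHPExcel_Reader_IReadFilter
    {
        public function readCell($column, $row, $worksheetName = '');
    }
}

// Hypothetical helper class: admits only cells whose row lies in the current window.
class ChunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $startRow = 1;
    private $endRow = 1;

    // Move the window so it covers $chunkSize rows starting at $startRow.
    public function setRows($startRow, $chunkSize)
    {
        $this->startRow = $startRow;
        $this->endRow = $startRow + $chunkSize - 1;
    }

    // Called by the reader for every cell; returning false skips the cell.
    public function readCell($column, $row, $worksheetName = '')
    {
        return $row >= $this->startRow && $row <= $this->endRow;
    }
}

// Assumed usage with PHPExcel loaded (shown as comments, not executed here).
// $totalRows is assumed to be known, for example from the reader's
// listWorksheetInfo() result.
//
// $reader = PHPExcel_IOFactory::createReaderForFile($excelFile);
// $reader->setReadDataOnly(true);              // skip styling information
// $filter = new ChunkReadFilter();
// $reader->setReadFilter($filter);
// for ($start = 1; $start <= $totalRows; $start += 1000) {
//     $filter->setRows($start, 1000);
//     $objPHPExcel = $reader->load($excelFile); // re-parses the file, keeps one window
//     // ... read and insert rows $start to $start + 999 ...
//     $objPHPExcel->disconnectWorksheets();     // release cell references
//     unset($objPHPExcel);
// }
```

The trade-off is that the file is parsed once per window, so this approach exchanges CPU time for a lower peak memory footprint.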

Memory Optimization and Streaming Processing Solutions

For large file processing scenarios, spreadsheet-reader provides a more efficient solution. This library reads row by row, which significantly reduces memory usage: it exposes the spreadsheet as an iterator and hands over only the current row, rather than building a complete workbook object first.

The following code demonstrates its core usage:

<?php
require('SpreadsheetReader.php');

// $db is an existing PDO connection.
// Prepare the statement once and reuse it for every row.
$sql = "INSERT INTO products (name, price, quantity) VALUES (?, ?, ?)";
$stmt = $db->prepare($sql);

$Reader = new SpreadsheetReader('large_file.xlsx');
foreach ($Reader as $rowIndex => $rowData) {
    // Process each row as soon as it is read
    if ($rowIndex == 0) continue; // Skip header row

    $stmt->execute([$rowData[0], $rowData[1], $rowData[2]]);
}
?>

The advantage of this approach is that the entire Excel file is never loaded into memory, making it particularly suitable for files containing tens of thousands of rows. Technically, spreadsheet-reader unpacks the compressed xlsx container and streams the sheet XML through an XML parser instead of building a full document tree, while xls files rely on php-excel-reader for binary parsing. Because php-excel-reader reads an xls file all at once, the memory savings apply mainly to xlsx input.
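
Row-by-row reading combines naturally with batched writes. The sketch below groups rows from any iterable source into fixed-size batches using a generator, so at most one batch is held in memory at a time. The function name batchRows() is hypothetical and not part of spreadsheet-reader; an ArrayIterator stands in for the reader so the example runs on its own.

```php
<?php
/**
 * Groups rows from any iterable source into arrays of at most $size rows.
 * Hypothetical helper for illustration only.
 */
function batchRows(iterable $rows, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1');
    }

    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }
    if ($batch !== []) {
        yield $batch; // final, possibly shorter batch
    }
}

// Stand-in row source; any object usable in foreach could be passed instead.
$source = new ArrayIterator([['a', 1], ['b', 2], ['c', 3], ['d', 4], ['e', 5]]);

foreach (batchRows($source, 2) as $batch) {
    echo count($batch), " row(s)\n"; // prints 2, 2, then 1
}
```

Each yielded batch can then be written with a single multi-row INSERT, which keeps both memory use and query count low.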

Data Type Handling and Error Prevention Mechanisms

During Excel data import, data type conversion is a common challenge. Date, number, and text formats are stored differently in Excel compared to databases. PHPExcel provides methods like getFormattedValue and getCalculatedValue to ensure data accuracy.

The following example demonstrates how to handle special data types:

<?php
$cell = $worksheet->getCell('A1');
$cellValue = $cell->getValue();
$cellType = $cell->getDataType();

// Handle formula cells: getValue() returns the formula text, so ask for the result
if ($cellType === PHPExcel_Cell_DataType::TYPE_FORMULA) {
    $cellValue = $cell->getCalculatedValue();
}

// Handle Excel date format (stored as serial numbers)
if (PHPExcel_Shared_Date::isDateTime($cell)) {
    $timestamp = PHPExcel_Shared_Date::ExcelToPHP($cellValue);
    // gmdate() keeps the value as shown in Excel; date() would shift it
    // by the server's time zone offset
    $dateValue = gmdate('Y-m-d H:i:s', $timestamp);
}
?>
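
To make the serial-number representation concrete, the conversion can also be sketched without any library. This is an illustrative helper under stated assumptions: the function name excelSerialToDateString() is hypothetical, the workbook is assumed to use the 1900 date system, and serials below 61 (affected by Excel's fictitious 29 February 1900) are not handled.

```php
<?php
/**
 * Converts an Excel serial date (1900 date system) to a 'Y-m-d H:i:s' string.
 * Hypothetical helper for illustration; not part of PHPExcel.
 */
function excelSerialToDateString(float $serial): string
{
    // Serial 25569 corresponds to 1970-01-01, the Unix epoch;
    // the fractional part of the serial encodes the time of day.
    $seconds = (int) round(($serial - 25569) * 86400);

    // gmdate() avoids shifting the result by the server's time zone.
    return gmdate('Y-m-d H:i:s', $seconds);
}

echo excelSerialToDateString(44197.5), "\n"; // 2021-01-01 12:00:00
```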

Error handling mechanisms are equally important. Actual deployments should include file validation, format checking, and transaction rollback:

<?php
try {
    // Validate file type before touching the database
    $allowedTypes = ['xlsx', 'xls', 'csv'];
    $fileExt = pathinfo($uploadedFile, PATHINFO_EXTENSION);
    if (!in_array(strtolower($fileExt), $allowedTypes, true)) {
        throw new Exception("Unsupported file format");
    }

    $db->beginTransaction();

    // Execute data import
    importExcelToDatabase($uploadedFile, $db);

    $db->commit();
    echo "Data import successful";
} catch (Exception $e) {
    // Roll back only if a transaction was actually started
    if ($db->inTransaction()) {
        $db->rollBack();
    }
    echo "Import failed: " . $e->getMessage();
}
?>
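
An extension check alone trusts the file name, which the uploader controls. As a complementary check, the file's leading bytes can be inspected. The following is a minimal sketch; the function name sniffSpreadsheetType() is hypothetical, and a ZIP signature only shows that the file is a ZIP container, not that it is a valid xlsx workbook.

```php
<?php
/**
 * Guesses the container format from the file's leading bytes.
 * Returns 'xlsx', 'xls', or null when neither signature matches.
 * Hypothetical helper for illustration only.
 */
function sniffSpreadsheetType(string $path): ?string
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }
    $head = (string) fread($handle, 8);
    fclose($handle);

    if (strncmp($head, "PK\x03\x04", 4) === 0) {
        return 'xlsx'; // ZIP container, as used by Office Open XML files
    }
    if ($head === "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1") {
        return 'xls';  // OLE2 compound document, as used by legacy Excel files
    }
    return null;       // csv and other plain-text files have no signature
}
```

A caller could reject the upload when the sniffed type disagrees with the extension, for example when a file named report.xlsx does not start with the ZIP signature.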

Performance Optimization and Best Practice Recommendations

Selecting appropriate libraries and configurations based on actual application scenarios can significantly improve import performance:

  1. Small File Processing: PHPExcel provides complete functionality, suitable for scenarios requiring complex format handling
  2. Large File Batch Import: spreadsheet-reader's streaming reading is more memory-efficient
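  3. Caching Strategy: PHPExcel's cell caching (configured through PHPExcel_Settings::setCacheStorageMethod) can hold cells in serialized or compressed form, in memory, in php://temp, or on disk, trading some speed for lower memory use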
  4. Batch Insert Optimization: Use MySQL's INSERT ... VALUES (...), (...), ... syntax to reduce query frequency

The following demonstrates optimized batch insertion implementation:

<?php
// Collect multiple rows for batch insertion
$batchSize = 100;
$values = [];
$placeholders = [];

foreach ($dataArray as $index => $row) {
    if ($index == 0) continue; // Skip header row

    // Append exactly three values per row, padding short rows with NULL
    foreach (array_pad(array_slice($row, 0, 3), 3, null) as $value) {
        $values[] = $value;
    }
    $placeholders[] = '(?, ?, ?)';

    // Execute batch insertion once $batchSize rows have been collected
    // (counting collected rows stays correct even if array keys have gaps)
    if (count($placeholders) >= $batchSize) {
        $sql = "INSERT INTO table_name (col1, col2, col3) VALUES " .
               implode(', ', $placeholders);
        $stmt = $db->prepare($sql);
        $stmt->execute($values);

        // Reset parameters
        $values = [];
        $placeholders = [];
    }
}

// Process remaining rows
if (!empty($placeholders)) {
    $sql = "INSERT INTO table_name (col1, col2, col3) VALUES " .
           implode(', ', $placeholders);
    $stmt = $db->prepare($sql);
    $stmt->execute($values);
}
?>
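
The same idea can be packaged as a reusable function and combined with array_chunk(). The sketch below is one possible arrangement, not a definitive implementation: the function name buildBatchInsert() and the products table are illustrative assumptions, and the table and column names must come from trusted code because identifiers cannot be bound as parameters.

```php
<?php
/**
 * Builds one multi-row INSERT statement and its flat parameter list.
 * Hypothetical helper; $table and $columns must never come from the spreadsheet.
 */
function buildBatchInsert(string $table, array $columns, array $rows): array
{
    $width = count($columns);
    $rowPlaceholder = '(' . implode(', ', array_fill(0, $width, '?')) . ')';

    $params = [];
    foreach ($rows as $row) {
        // Trim or pad each row so it supplies exactly $width values.
        $row = array_pad(array_slice(array_values($row), 0, $width), $width, null);
        foreach ($row as $value) {
            $params[] = $value;
        }
    }

    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, count($rows), $rowPlaceholder))
    );

    return [$sql, $params];
}

// Sample rows, including one short row and one with an extra column.
$rows = [['Pen', 1.5, 10], ['Ink', 4.0], ['Pad', 2.25, 7, 'extra']];

foreach (array_chunk($rows, 2) as $chunk) {
    list($sql, $params) = buildBatchInsert('products', ['name', 'price', 'quantity'], $chunk);
    echo $sql, "\n";
    // With an open PDO connection: $pdo->prepare($sql)->execute($params);
}
```

Keeping statement construction separate from execution makes the batching logic testable without a database connection.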

Technology Selection and Future Development Trends

Although PHPExcel is feature-rich, it is no longer maintained, with its successor PhpSpreadsheet offering a more modern API and better performance. For new projects, PhpSpreadsheet is recommended. Additionally, as web application complexity increases, consider using queue systems for asynchronous processing of large import tasks to avoid blocking user requests.
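
For orientation, basic reading with PhpSpreadsheet can be sketched as follows. This assumes the library is installed through Composer and that its autoloader has already been required; the function name readActiveSheet() is hypothetical.

```php
<?php
// Assumes PhpSpreadsheet is installed via Composer and the autoloader has
// been required elsewhere (for example: require 'vendor/autoload.php';).
use PhpOffice\PhpSpreadsheet\IOFactory;

/**
 * Reads the active sheet of a workbook into an array of rows.
 * Hypothetical helper for illustration only.
 */
function readActiveSheet(string $path): array
{
    // Pick a reader based on the file (Xlsx, Xls, Csv, ...).
    $reader = IOFactory::createReaderForFile($path);
    $reader->setReadDataOnly(true); // ignore styling information

    $spreadsheet = $reader->load($path);

    // Arguments: null for empty cells, calculate formulas,
    // leave values unformatted, use numeric row and column keys.
    // Unformatted date cells come back as serial numbers.
    $rows = $spreadsheet->getActiveSheet()->toArray(null, true, false, false);

    $spreadsheet->disconnectWorksheets(); // break circular references
    return $rows;
}
```

The overall flow mirrors the PHPExcel version, with namespaced classes replacing the PHPExcel_ prefixed ones.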

In practical development, trade-offs between functional completeness and performance requirements should be made based on specific needs. For simple data imports, spreadsheet-reader may be a more lightweight choice; for scenarios requiring complex Excel operations (such as style reading or chart processing), the PHPExcel/PhpSpreadsheet series remains the preferred option.
