SQLite Database Corruption and Recovery: In-depth Analysis from 'Disk Full' to 'Malformed Database Image'

Dec 02, 2025 · Programming

Keywords: SQLite database corruption | database recovery | backup strategy

Abstract: This article provides a comprehensive analysis of the 'database or disk is full' and 'database disk image is malformed' errors in SQLite operations. Through examination of real-world cases, it explains the technical principles behind phenomena like unchanged database file size and backup failures. The discussion focuses on SQLite's page allocation mechanism, transaction integrity requirements, and repair methods based on the .dump command. It emphasizes the importance of proper backup strategies to avoid file-level copying during active database operations.

Problem Manifestation and Initial Analysis

In practical SQLite usage, developers may encounter two seemingly contradictory error messages: "database or disk is full" and "database disk image is malformed." In the case that motivated this analysis, the database file was around 25MB, and queries failed even though the disk had ample free space (only 13% utilization).

SQLite File Growth Mechanism

SQLite database files grow according to a page allocation strategy. The file size increases in multiples of the database page size and, unless auto_vacuum is enabled, does not shrink after delete operations: when records are deleted, the freed pages are placed on an internal free list and reused by subsequent inserts. An insert that fits into free pages therefore leaves the file size unchanged. This is normal behavior and not a sign of a problem.

For example, consider the following code snippet that demonstrates the basic mechanism of space reuse in SQLite:

import sqlite3

# Create database and insert data
conn = sqlite3.connect('test.db')
cursor = conn.cursor()
cursor.execute('CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT)')

# Insert initial data
for i in range(1000):
    cursor.execute('INSERT INTO items (name) VALUES (?)', (f'Item {i}',))
conn.commit()

# Delete some data
cursor.execute('DELETE FROM items WHERE id % 2 = 0')
conn.commit()

# Insert new data - file size may remain unchanged
for i in range(500):
    cursor.execute('INSERT INTO items (name) VALUES (?)', (f'New Item {i}',))
conn.commit()

conn.close()
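To make the page-based growth visible, the following sketch reads SQLite's page counters before and after a bulk delete. The file name pages_demo.db, the helper page_stats, and the row sizes are assumptions chosen for illustration, and the sketch assumes the default setting in which auto_vacuum is off.

```python
import os
import sqlite3
import tempfile

def page_stats(conn):
    """Return (page_size, page_count, free_pages) for the open database."""
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]
    return page_size, page_count, free_pages

# Hypothetical scratch database in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "pages_demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [("x" * 200,) for _ in range(5000)],
)
conn.commit()
size_before = os.path.getsize(path)

# Delete every row: the pages move to the free list, the file keeps its size
conn.execute("DELETE FROM items")
conn.commit()
page_size, page_count, free_pages = page_stats(conn)
size_after = os.path.getsize(path)
conn.close()

print(f"before={size_before} after={size_after} "
      f"pages={page_count} free={free_pages} page_size={page_size}")
```

In this setup the file size should equal page_size multiplied by page_count, and the free page count shows how much room later inserts can reuse before the file has to grow again.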

Analysis of Database Corruption Causes

When PRAGMA integrity_check returns numerous error messages, the database file is corrupted. Typical errors include duplicate page references (e.g., "2nd reference to page 26416") and inconsistent child page depths (e.g., "Child page depth differs"). A common cause of this kind of damage is copying the database file with file-level tools while the database is being written, which is why the backup strategy discussed later matters.

Repairing Corrupted Databases

For corrupted SQLite databases, a widely used repair method is to export the readable content as SQL statements with the .dump command and then recreate the database from that export. Below is an example of a repair script:

#!/bin/bash

# Repair corrupted SQLite database file
DB_FILE="corrupted.db"
FIXED_FILE="fixed.db"

# Export database structure and data (skipping corrupted portions)
sqlite3 "$DB_FILE" .dump | grep -v "^ROLLBACK" > dump.sql

# Add COMMIT statement to ensure transaction completion
echo "COMMIT;" >> dump.sql

# Create repaired database
sqlite3 "$FIXED_FILE" < dump.sql

echo "Repair completed. Repaired file: $FIXED_FILE"

The core principle of this method is that the .dump command attempts to export all readable data and skips records it cannot read. When the dump hits errors it ends with ROLLBACK instead of COMMIT, so loading it unchanged would discard everything; replacing ROLLBACK with COMMIT ensures the data that was exported is persisted. Severely corrupted data may still be lost, so this method complements a backup strategy rather than replacing one.
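The ROLLBACK-to-COMMIT step can also be done in Python, which makes it easy to append COMMIT only when a ROLLBACK line was actually removed (a healthy dump already ends with COMMIT). This is a minimal sketch: the function name finalize_dump and the sample dump text are hypothetical, not output captured from a real damaged file.

```python
import sqlite3

def finalize_dump(dump_sql: str) -> str:
    """Drop ROLLBACK lines from a .dump script and close the transaction with COMMIT."""
    lines = dump_sql.splitlines()
    kept = [line for line in lines if not line.startswith("ROLLBACK")]
    # Only add COMMIT if a ROLLBACK was removed; otherwise the dump is already complete
    if len(kept) != len(lines):
        kept.append("COMMIT;")
    return "\n".join(kept) + "\n"

# Hypothetical dump of a damaged file, ending in ROLLBACK
sample = (
    "BEGIN TRANSACTION;\n"
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);\n"
    "INSERT INTO items VALUES (1, 'Item 0');\n"
    "ROLLBACK; -- due to errors\n"
)

# Load the cleaned script into a fresh database and count what survived
conn = sqlite3.connect(":memory:")
conn.executescript(finalize_dump(sample))
recovered = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
conn.close()
print(f"rows recovered: {recovered}")
```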

Proper Backup Strategies

To prevent database corruption, proper backup methods must be employed. SQLite provides a dedicated backup API that ensures database consistency during backup operations. Below is an example of using the backup API via Python:

import sqlite3

def backup_database(src_path, dst_path):
    """Safely copy database using SQLite backup API"""
    src_conn = sqlite3.connect(src_path)
    dst_conn = sqlite3.connect(dst_path)
    
    try:
        # Connection.backup() (Python 3.7+) performs the whole copy itself and
        # returns None; pages=-1 copies all pages in a single step
        src_conn.backup(dst_conn, pages=-1)
        print(f"Database backup successful: {dst_path}")
    except sqlite3.Error as e:
        print(f"Backup failed: {e}")
    finally:
        src_conn.close()
        dst_conn.close()

# Usage example
backup_database('production.db', 'backup.db')

Unlike file copying tools, the SQLite backup API coordinates with SQLite's locking, so it produces a consistent copy even while other connections are using the database. This is crucial for database maintenance in production environments.
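For larger databases, the copy can be split into small steps so the source is not held for the whole duration. The sketch below uses the pages and progress parameters of Python's Connection.backup(); the helper name backup_in_chunks, the chunk size, and the in-memory databases are assumptions made to keep the example self-contained.

```python
import sqlite3

def backup_in_chunks(src_conn, dst_conn, pages_per_step=64):
    """Copy src_conn into dst_conn a few pages at a time, recording progress."""
    progress_log = []

    def on_progress(status, remaining, total):
        # Called after each step with the pages still to copy and the total
        progress_log.append((remaining, total))

    src_conn.backup(dst_conn, pages=pages_per_step, progress=on_progress)
    return progress_log

# Hypothetical source database with some rows to copy
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
src.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"Item {i}",) for i in range(1000)],
)
src.commit()

dst = sqlite3.connect(":memory:")
log = backup_in_chunks(src, dst, pages_per_step=2)
copied = dst.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(f"steps: {len(log)}, rows copied: {copied}")
```

A smaller pages_per_step gives other connections more opportunities to work between steps at the cost of a longer backup; the right value depends on the workload.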

Preventive Measures and Best Practices

To reduce the risk of database corruption, the following measures are recommended:

  1. Regularly execute the VACUUM command: It rebuilds the database file, which defragments it and returns unused pages to the file system.
  2. Implement monitoring mechanisms: Periodically run pragma integrity_check to detect potential issues early.
  3. Use transaction boundaries: Ensure all write operations are executed within explicit transactions to avoid inconsistencies from partial writes.
  4. Verify backup integrity: Immediately check database integrity after restoring backups.
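The monitoring and transaction recommendations above can be sketched in a few lines of Python. The helper names integrity_problems and insert_items and the items table are assumptions for illustration; a real deployment would run the check on a schedule and after every restore.

```python
import sqlite3

def integrity_problems(conn, max_errors=100):
    """Return integrity_check messages, or an empty list when SQLite reports 'ok'."""
    rows = conn.execute(f"PRAGMA integrity_check({int(max_errors)})").fetchall()
    messages = [row[0] for row in rows]
    return [] if messages == ["ok"] else messages

def insert_items(conn, names):
    """Insert all names in one transaction; nothing is kept if any insert fails."""
    with conn:  # commits on success, rolls back on exception
        conn.executemany(
            "INSERT INTO items (name) VALUES (?)",
            [(name,) for name in names],
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

insert_items(conn, ["a", "b"])
try:
    # The None violates NOT NULL, so the whole batch is rolled back, including "c"
    insert_items(conn, ["c", None])
except sqlite3.IntegrityError:
    pass

count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
problems = integrity_problems(conn)
print(f"rows: {count}, integrity problems: {problems}")
```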

By understanding SQLite's internal mechanisms and adopting proper maintenance strategies, the risk of database corruption can be significantly reduced, ensuring data persistence and reliability.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.