Keywords: PostgreSQL | UPSERT | ON CONFLICT | Database Operations | Concurrency Control
Abstract: This technical paper provides an in-depth exploration of UPSERT operations in PostgreSQL, focusing on the ON CONFLICT clause introduced in version 9.5. Through detailed comparisons with MySQL's ON DUPLICATE KEY UPDATE, the article examines PostgreSQL's conflict resolution mechanisms, syntax structures, and practical application scenarios. Complete code examples and performance analysis help developers master efficient conflict handling in PostgreSQL database operations.
Overview of UPSERT Operations
UPSERT (UPDATE or INSERT) is a common database operation requirement where insertion conflicts trigger updates instead of errors. Before PostgreSQL 9.5, developers had to implement UPSERT functionality through alternative methods due to the lack of native support.
Detailed PostgreSQL ON CONFLICT Syntax
PostgreSQL 9.5 introduced the ON CONFLICT clause, providing standardized UPSERT implementation. The basic syntax structure is as follows:
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...)
ON CONFLICT (conflict_target)
DO UPDATE SET column1 = excluded.column1, column2 = excluded.column2;
The conflict_target can be column names, expressions, or constraint names, specifying the basis for conflict detection. The excluded keyword represents a special table alias containing the row data that was excluded from insertion.
Comparison with MySQL Syntax
MySQL implements UPSERT functionality using ON DUPLICATE KEY UPDATE syntax:
INSERT INTO table (id, field, field2)
VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (3, 'C', 'Z')
ON DUPLICATE KEY UPDATE field=VALUES(field), field2=VALUES(field2);
While conceptually similar, PostgreSQL's syntax has important differences:
- Explicit specification of conflict target (column or constraint) is mandatory
- Uses excluded table reference instead of VALUES() function
- Supports more complex conditional updates and partial index conflict detection
Practical Implementation Examples
Consider a user table with id as primary key:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE,
email VARCHAR(100),
last_login TIMESTAMP
);
Batch insert user data, updating email and last login time if username already exists:
INSERT INTO users (username, email, last_login)
VALUES
('alice', 'alice@example.com', NOW()),
('bob', 'bob@example.com', NOW()),
('charlie', 'charlie@example.com', NOW())
ON CONFLICT (username)
DO UPDATE SET
email = excluded.email,
last_login = excluded.last_login;
Advanced Feature Capabilities
The ON CONFLICT clause supports multiple advanced features:
Conditional Updates
Add WHERE conditions in the UPDATE section for finer control:
INSERT INTO products (sku, price, stock)
VALUES ('ITEM001', 29.99, 100)
ON CONFLICT (sku)
DO UPDATE SET
price = excluded.price,
stock = products.stock + excluded.stock
WHERE excluded.price > products.price;
Partial Index Conflicts
Support for conflict detection based on partial unique indexes:
CREATE UNIQUE INDEX active_users_idx ON users (username)
WHERE status = 'active';
INSERT INTO users (username, email, status)
VALUES ('david', 'david@example.com', 'active')
ON CONFLICT (username) WHERE status = 'active'
DO UPDATE SET email = excluded.email;
Historical Implementation Methods
Before PostgreSQL 9.5, developers implemented UPSERT functionality through alternative approaches:
PL/pgSQL Function Approach
Using stored procedures and exception handling for UPSERT implementation:
CREATE OR REPLACE FUNCTION merge_user(
p_username VARCHAR,
p_email VARCHAR
) RETURNS VOID AS $$
BEGIN
LOOP
UPDATE users SET email = p_email
WHERE username = p_username;
IF FOUND THEN
RETURN;
END IF;
BEGIN
INSERT INTO users (username, email)
VALUES (p_username, p_email);
RETURN;
EXCEPTION WHEN unique_violation THEN
-- Loop to retry UPDATE
END;
END LOOP;
END;
$$ LANGUAGE plpgsql;
CTE Approach
Using writable CTEs for UPSERT in PostgreSQL 9.1+:
WITH new_data AS (
SELECT 'alice' AS username, 'new_email@example.com' AS email
),
updated AS (
UPDATE users
SET email = new_data.email
FROM new_data
WHERE users.username = new_data.username
RETURNING users.username
)
INSERT INTO users (username, email)
SELECT username, email
FROM new_data
WHERE NOT EXISTS (
SELECT 1 FROM updated
WHERE updated.username = new_data.username
);
Concurrency Control and Performance Considerations
ON CONFLICT DO UPDATE provides atomic UPSERT operations with stable performance under high concurrency. In comparison, earlier manual implementations suffer from several issues:
- PL/pgSQL approach may encounter retry loops under extreme concurrency
- CTE approach avoids lost updates but may have complex execution plans
- Manual UPDATE+INSERT combinations risk race conditions
Best Practice Recommendations
Based on practical project experience, the following best practices are recommended:
- Prioritize ON CONFLICT syntax for code simplicity and correctness
- Explicitly specify conflict targets to avoid reliance on implicit inference
- Use multi-row VALUES syntax for batch operations to enhance performance
- Appropriately utilize RETURNING clause to retrieve operation results
- Consider combining with triggers for complex business logic
Conclusion
PostgreSQL's ON CONFLICT clause delivers powerful and flexible UPSERT functionality, maintaining SQL's declarative nature while providing sufficient control capabilities. Through proper utilization of this feature, developers can create efficient and reliable database operation code that meets modern application requirements for both data consistency and performance.