Keywords: PostgreSQL | quoted identifiers | case sensitivity
Abstract: This article delves into the issues surrounding quoted identifiers in PostgreSQL, particularly the query errors that arise when table or column names are enclosed in quotes. By analyzing the behavior of the information_schema.tables view, it explains why unquoted names can lead to ERROR: 42P01. Based on the best answer, the article compares the pros and cons of using quotes versus not using quotes, emphasizing the importance of maintaining lowercase and case-insensitive identifiers. Practical code examples illustrate how to avoid common pitfalls. Finally, it summarizes best practices for managing object naming in PostgreSQL to enhance database operation stability and maintainability.
Introduction
In PostgreSQL database development, many developers encounter a perplexing error: ERROR: 42P01: relation "[Table]" does not exist. This error is often related to the use of quotes in table or column names, especially when objects are created with double quotes. This article explores the root cause of this issue through a specific case study and provides solutions based on community best practices.
Problem Description
Suppose we create a table in PostgreSQL 9.3 with both the table name and column name enclosed in double quotes:
CREATE TABLE "TEST" ("Col1" BIGINT);
After successful creation, the quotes are preserved when viewed in pgAdminIII's SQL pane. However, when querying the information_schema.tables view to list all available tables:
SELECT table_schema, table_name FROM information_schema.tables WHERE NOT table_schema='pg_catalog' AND NOT table_schema='information_schema';
The result shows table names without quotes. Since the table was created with quotes, using the unquoted name directly in queries triggers the aforementioned error. Similar issues occur with column names.
Core Analysis
PostgreSQL handles identifiers (e.g., table names, column names) according to SQL standards, but the use of quotes introduces case sensitivity. By default, PostgreSQL converts unquoted identifiers to lowercase, making them case-insensitive. For example, TEST and test are treated as identical. However, once double quotes are used, identifiers become case-sensitive and must match exactly.
The information_schema.tables view returns unquoted identifiers, reflecting PostgreSQL's internal normalized storage. Thus, table names retrieved from this view may not work directly for queries involving quoted tables, leading to relation does not exist errors.
Solution Comparison
Based on the best answer, developers have two choices:
- Do not use quotes: All identifiers are automatically converted to lowercase, maintaining case-insensitivity. For example,
TESTandtestare equivalent. This simplifies querying and maintenance, recommended as the default strategy. - Use quotes: Identifiers are case-sensitive and must match exactly. For example,
"Test"andTest(the latter converted totest) are different. Once quotes are introduced, consistency must be maintained in all related operations to avoid errors.
The following code examples demonstrate these behaviors:
-- Without quotes, case-insensitive
CREATE TABLE test (col1 BIGINT);
SELECT * FROM TEST; -- Succeeds, as TEST is converted to test
-- With quotes, case-sensitive
CREATE TABLE "Test" ("Col1" BIGINT);
SELECT * FROM Test; -- Fails, as Test is converted to test, but table name is "Test"
SELECT * FROM "Test"; -- Succeeds, exact match
Best practices strongly advise against using quotes to keep identifiers simple and consistent. If special characters or case preservation are necessary, ensure proper quoting in all queries.
Practical Recommendations
When retrieving table or column names from information_schema for quoted objects, developers need to manually add quotes. For example, using dynamic SQL:
SELECT 'SELECT * FROM "' || table_name || '";' FROM information_schema.tables WHERE table_name = 'TEST';
However, this approach adds complexity. A better practice is to adhere to the no-quotes convention, ensuring identifiers are always lowercase to avoid such issues.
Conclusion
Quoted identifiers in PostgreSQL offer flexibility but introduce case sensitivity and maintenance overhead. By avoiding quotes and sticking to lowercase, case-insensitive identifiers, developers can significantly reduce errors like ERROR: 42P01 and improve code readability and maintainability. When dealing with existing quoted objects, consistency in quoting is essential. In summary, simplifying naming conventions is a key best practice in PostgreSQL database development.