Keywords: C Programming | Character Arrays | Character Pointers | Memory Management | Type System
Abstract: This article provides an in-depth examination of the core distinctions between character arrays and character pointers in C, focusing on array-to-pointer decay mechanisms, memory allocation strategies, and modification permissions. Through detailed code examples and memory layout diagrams, it clarifies different behaviors in function parameter passing, sizeof operations, and string manipulations, helping developers avoid common undefined behavior pitfalls.
Type System and Implicit Conversion Mechanisms
In the C type system, char[] and char* are two distinct types, yet they behave alike in many contexts. The similarity stems from the "array-to-pointer decay" rule defined by the C standard: in most expressions, an expression of array type is automatically converted to a pointer to the array's first element. The exceptions are when the array is the operand of sizeof or of the unary & operator, or when it is a string literal used to initialize an array.
Consider the following function declaration:
void printSomething(char *p)
{
    printf("p: %s", p);
}
When a character array is passed as the argument:
char s[10] = "hello";
printSomething(s);
The compiler implicitly converts the array to a pointer to its first element, as if the call had been written:
char s[10] = "hello";
printSomething(&s[0]);
Memory Allocation and Storage Characteristics
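Character arrays and character pointers differ fundamentally in memory management. The declaration char q[] = "hello" allocates contiguous space for the array itself, in automatic storage (typically the stack) for a local variable or in static storage for a file-scope or static one, and stores the complete string there, including the null terminator '\0'. The declaration char *p = "hello" allocates only a pointer, which holds the address of a string literal stored elsewhere. Applied to the array, the sizeof operator yields the array's total size in bytes; applied to the pointer, it yields only the size of the pointer: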
#include <stdio.h>
#include <string.h>
int main(void)
{
    char *p = "hello";
    char q[] = "hello";
    printf("%zu\n", sizeof(p)); // Outputs pointer size (typically 4 bytes on 32-bit targets, 8 on 64-bit)
    printf("%zu\n", sizeof(q)); // Outputs total array size (6 bytes, including '\0')
    printf("%zu\n", strlen(p)); // Outputs string length 5
    printf("%zu\n", strlen(q)); // Outputs string length 5
    return 0;
}
Dual Semantics of String Literals
In C, a string literal plays two different roles depending on where it appears. When it initializes a character array:
char c[] = "abc";
This is equivalent to explicit initialization:
char c[] = {'a', 'b', 'c', '\0'};
The array elements can be modified freely. However, when the literal initializes a pointer:
char *c = "abc";
The actual creation process resembles:
static char __unnamed[] = "abc";
char *c = __unnamed;
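Here the pointer refers to an unnamed array with static storage duration. The standard does not make that array const, but any attempt to modify it is undefined behavior, and implementations commonly place it in read-only memory.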
Practical Applications and Considerations
In engineering practice, the elements of a character array can be modified in place:
char s[] = "geeksquiz";
s[0] = 'j'; // Legal operation
printf("%s", s); // Outputs "jeeksquiz"
A string literal reached through a character pointer, by contrast, must not be modified:
char *s = "geeksquiz";
// s[0] = 'j'; // Undefined behavior, typically causing segmentation fault
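In C, a string literal has type char[], so initializing a non-const pointer from it compiles without complaint by default; GCC, for example, warns only when -Wwrite-strings is enabled. Declaring the pointer with the const qualifier lets the compiler reject accidental writes:
const char *s = "geeksquiz"; // Writes through s are now rejected at compile time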
The difference in pointer arithmetic also warrants attention:
char arr[] = "test";
char *ptr = "test";
// ptr++ // Legal, pointer advances to the next character
// arr++ // Illegal, an array is not a modifiable lvalue
Underlying Implementation and Memory Layout
Disassembly analysis reveals specific storage differences. For pointer declaration:
char *s = "abc";
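GCC on ELF targets such as Linux typically places the bytes of the literal in the .rodata read-only data section, while the pointer s itself holds only their address. For array declaration: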
char s[] = "abc";
For a local variable, the compiler initializes the array elements directly in stack space, which is both readable and writable. This difference in memory layout determines which operations are valid on each form.