Keywords: C | string | strcmp | comparison
Abstract: This article explains why using == or != to compare strings in C is incorrect and demonstrates the proper use of the strcmp function for lexicographical string comparison, including examples and best practices.
Introduction
In C programming, a common mistake when comparing strings is using the equality operators == or !=. This often leads to unexpected behavior, as these operators compare the memory addresses of the string arrays rather than their contents. A typical scenario is illustrated in a user's question where a program intended to repeat input until the user types it back fails because of incorrect string comparison.
The Problem with Using == for Strings
In C, strings are represented as arrays of characters, and the name of an array decays to a pointer to its first element. When you use == or != with string arrays, you are comparing these pointer values, not the actual character sequences. For example, in the code snippet:
char input[40];
char check[40];
// ... initialization
while (check != input) {
    // loop body
}
This condition checks if the addresses of check and input are different, which they always are since they are distinct arrays. Thus, the loop never exits even if the strings are identical.
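The difference between comparing addresses and comparing contents can be sketched as follows. The helper names same_address and same_contents are hypothetical, introduced here only for illustration; when an array is passed to either function it decays to a pointer, exactly as it does in the loop condition above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: compares the two pointer values only.
   True only when both arguments refer to the same address. */
bool same_address(const char *a, const char *b) {
    return a == b;
}

/* Hypothetical helper: compares the characters the pointers refer to.
   True when both strings hold the same content. */
bool same_contents(const char *a, const char *b) {
    return strcmp(a, b) == 0;
}
```

Given two separate arrays that both hold "hello", same_address reports false because the arrays occupy different memory, while same_contents reports true because the characters match.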
Introducing the strcmp Function
To compare strings lexicographically, the standard library provides the strcmp function, declared in the <string.h> header. Its syntax is:
int strcmp(const char *s1, const char *s2);
It returns an integer value:
- 0 if the strings are equal.
- A positive value if s1 is lexicographically greater than s2.
- A negative value if s1 is lexicographically less than s2.
How strcmp Works
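The strcmp function compares the two strings character by character, interpreting each character as an unsigned char (its ASCII code on most systems). It starts from the first character and proceeds until it finds a mismatch or reaches the null terminator. If all characters match, it returns 0. Otherwise, the sign of the result shows which of the first mismatching characters is larger. Many implementations return the difference of the two character values, but the C standard guarantees only the sign, so test the result with < 0, == 0, or > 0 rather than against specific numbers.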
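As a rough illustration of the character-by-character walk described above, here is a minimal sketch of a comparison routine. The name my_strcmp is hypothetical, and this is not how any particular C library implements strcmp; real code should call the library function. This version returns exactly -1, 0, or 1, which is one valid choice because only the sign of the result is specified.

```c
/* Hypothetical re-implementation for illustration only. */
int my_strcmp(const char *s1, const char *s2) {
    /* Compare as unsigned char so characters above 127 order consistently. */
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    /* Advance while the characters match and s1 has not ended.
       If s2 ends first, *p1 != *p2 and the loop stops there. */
    while (*p1 != '\0' && *p1 == *p2) {
        p1++;
        p2++;
    }

    /* Yields -1, 0, or 1 depending on the first differing characters. */
    return (*p1 > *p2) - (*p1 < *p2);
}
```

Note that a string that is a prefix of another compares as smaller: the shorter string's null terminator (value 0) is compared against a non-zero character in the longer one.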
Example: Fixing the User's Code
In the user's original code, the string comparison was flawed. Here's a corrected version that uses strcmp and replaces the gets function (removed from the C standard in C11) with fgets for safety. A do-while loop reads check before it is first compared, so strcmp never sees an uninitialized buffer:
#include <stdio.h>
#include <string.h>
int main(void) {
    char input[40];
    char check[40];
    printf("Hello!\nPlease enter a word or character:\n");
    if (fgets(input, sizeof(input), stdin) == NULL) { // Read input safely
        return 1; // No input available
    }
    // Remove newline character if present
    input[strcspn(input, "\n")] = '\0';
    printf("I will now repeat this until you type it back to me.\n");
    do {
        printf("%s\n", input);
        if (fgets(check, sizeof(check), stdin) == NULL) {
            break; // Stop on end of input or read error
        }
        check[strcspn(check, "\n")] = '\0'; // Remove newline
    } while (strcmp(check, input) != 0);
    printf("Good bye!\n");
    return 0;
}
This code now correctly compares the string contents and exits when the user inputs the same string.
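The newline handling matters here because fgets keeps the trailing newline when the line fits in the buffer, so an untrimmed "hello\n" would not compare equal to "hello". The trimming step can be isolated in a small helper; the name strip_newline is hypothetical and used only for this sketch.

```c
#include <string.h>

/* Hypothetical helper: cuts the string at its first newline, if any.
   strcspn returns the index of the first '\n', or the string's length
   when there is none, so the assignment is always within the string. */
void strip_newline(char *s) {
    s[strcspn(s, "\n")] = '\0';
}
```

When the string contains no newline, strcspn returns the index of the existing null terminator, so the helper leaves the string unchanged.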
Additional Examples
The same pattern applies in other contexts. For instance, to check whether two strings are identical:
#include <stdio.h>
#include <string.h>
int main(void) {
    char s1[] = "example";
    char s2[] = "example";
    if (strcmp(s1, s2) == 0) {
        printf("Strings are equal.\n");
    } else {
        printf("Strings are not equal.\n");
    }
    return 0;
}
Another example shows lexicographical comparison:
#include <stdio.h>
#include <string.h>
int main(void) {
    char str1[] = "apple";
    char str2[] = "banana";
    int result = strcmp(str1, str2);
    if (result < 0) {
        printf("'%s' comes before '%s'\n", str1, str2);
    } else if (result > 0) {
        printf("'%s' comes after '%s'\n", str1, str2);
    } else {
        printf("Strings are equal.\n");
    }
    return 0;
}
Best Practices and Common Pitfalls
When using strcmp, ensure that both strings are null-terminated; otherwise the function reads past the end of the buffer, which is undefined behavior. Avoid unsafe functions like gets, which can cause buffer overflows; use fgets instead. Also note that strcmp is case-sensitive. For case-insensitive comparison, consider strcasecmp (a POSIX function declared in <strings.h>) or _stricmp on Windows where available, or implement a custom function.
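Where no case-insensitive library function is available, a custom comparison might look like the following minimal sketch. The name compare_ignore_case is hypothetical, and the function relies on tolower from <ctype.h>, so it follows the current locale's single-byte case rules and does not handle multibyte encodings such as UTF-8.

```c
#include <ctype.h>

/* Hypothetical helper, not a standard function: compares two strings
   while ignoring letter case. Returns <0, 0, or >0 like strcmp. */
int compare_ignore_case(const char *s1, const char *s2) {
    /* tolower requires a value representable as unsigned char (or EOF). */
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    /* Advance while the lowercased characters match and s1 has not ended. */
    while (*p1 != '\0' && tolower(*p1) == tolower(*p2)) {
        p1++;
        p2++;
    }

    int c1 = tolower(*p1);
    int c2 = tolower(*p2);
    return (c1 > c2) - (c1 < c2);
}
```

With this helper, "Hello" and "hELLO" compare as equal, while ordering between different words is decided on their lowercased characters.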
Conclusion
Proper string comparison in C requires the use of strcmp or similar functions to compare content rather than addresses. By understanding how strcmp works and following best practices, developers can avoid common errors and write robust code.