C-Strings
Basics
The C programming language has a set of functions implementing operations on strings (character/byte strings) in its standard library:
#include <stdio.h>  // printf
#include <string.h> // strlen, strcpy, strcat, strtok, strstr
int main(void) {
    char str1[] = "Hello";
    char text[] = "one,two,three";
    // strlen - get length of string (terminator not counted)
    printf("Length of str1: %zu\n", strlen(str1)); // 5
    // strcpy - copy string into buffer (buffer must be large enough)
    char buffer[50];
    strcpy(buffer, str1); // buffer now contains "Hello"
    printf("Copied: %s\n", buffer);
    // strcat - concatenate two strings
    strcat(buffer, " ");     // buffer: "Hello "
    strcat(buffer, "World"); // buffer: "Hello World"
    printf("Concatenated: %s\n", buffer);
    // strtok - tokenize a string by commas (modifies text in place)
    char *token = strtok(text, ",");
    while (token != NULL) {
        printf("Token: %s\n", token);
        token = strtok(NULL, ",");
    }
    // strstr - search for a substring
    char *found = strstr(buffer, "World");
    if (found) {
        // pointer difference has type ptrdiff_t, printed with %td
        printf("Found substring at position: %td\n", found - buffer);
    }
    // ...
}
Length of str1: 5
Copied: Hello
Concatenated: Hello World
Token: one
Token: two
Token: three
Found substring at position: 6
C has no dedicated string data type; strings are stored in character arrays (char[]) instead.
Unlike languages such as Java, which store an array's length alongside its contents, C does not record the length of a string. Instead, by convention every string is null-terminated:
- a string of $n$ characters is represented as an array of $n + 1$ elements
- each element is a numeric value that maps to a character (e.g. an ASCII code)
- the last element is the "null character" ('\0') with numeric value $0$ (no other character has this numeric value!)
This means that the line char str1[] = "Hello" actually stores 6 chars, the last of which is the null character:
#include <stdio.h>  // printf
#include <string.h> // strlen
int main(void) {
    char str1[] = "Hello";
    // Reported length of string (terminator not counted)
    printf("Length of str1: %zu\n", strlen(str1)); // 5
    // Memory layout: + 1 so the loop also visits the terminator
    for (size_t i = 0; i < strlen(str1) + 1; i++) {
        printf("str[%zu] = '%c' (ASCII %d)\n", i, str1[i], (unsigned char)str1[i]);
    }
}
Length of str1: 5
str[0] = 'H' (ASCII 72)
str[1] = 'e' (ASCII 101)
str[2] = 'l' (ASCII 108)
str[3] = 'l' (ASCII 108)
str[4] = 'o' (ASCII 111)
str[5] = '