Classroom practice: Strings, Dynamic Strings
Gábor Horváth / Zsolt Kohári · 2020.10.30.
Strings as zero terminated arrays. Low-level string operations. Dynamic memory allocation.
Preparation for the practice:
- The lecture notes on arrays should be reviewed.
- The lecture notes on pointers and strings should be reviewed.
- The lecture notes on dynamic memory management should be reviewed.
The task is to reverse a string. Let us write two variants of the solution. In the first, the string-reversing function receives two arrays: one holds the input, the other receives the output. The second uses a single array that initially contains the input string, and the function reverses it in place by modifying its content.
Solution
The two approaches lead to very different solutions. The variant that writes to a separate array consists mostly of copy operations: the input array is traversed backwards while the output array is built forwards. The in-place algorithm is a bit trickier. In each step two characters are swapped (first and last, second and second to last, etc.). Take care to stop swapping in the middle, otherwise the already reversed string is reversed back again.
In both cases the end of the output string must be handled with care: we have to make sure that the terminating zero is at the end of the resulting string before the function returns. Note that the functions do not receive the length of the strings; it is unnecessary, because the terminating zero marks the end of the string.
#include <stdio.h>
#include <string.h>

void reverse_to_array(char *in, char *out) {
    /* find the position of the last valid character */
    int i = strlen(in) - 1;
    /* goes backwards and puts all the characters to the output array */
    int j = 0;
    while (i >= 0) {
        out[j] = in[i];
        ++j;
        --i;
    }
    /* closing the output string with a terminating zero value */
    out[j] = '\0';
}

void reverse_inplace(char *str) {
    /* determine the length of the string */
    int len = strlen(str);
    /* going till the middle */
    for (int i = 0; i != len/2; ++i) {
        /* str[i] <-> str[len-1-i] */
        char temp = str[i];
        str[i] = str[len-1-i];
        str[len-1-i] = temp;
    }
    /* the terminating zero value remains where it was originally */
}

int main(void) {
    char first[] = "hello world!";
    char second[20];

    reverse_to_array(first, second);
    printf("[%s] [%s]\n", first, second);

    printf("[%s]\n", second);
    reverse_inplace(second);
    printf("[%s]\n", second);

    return 0;
}
Write a function that receives a string and removes all space characters from it. The function should work on the original string (received as a parameter), without creating a new string!
Solution
Some considerations:
- Is the capacity of the original string enough to fit the result? Yes, since the resulting string can only be shorter.
- Do we need an auxiliary array? No, since the string can only be shorter during the space removal.
- Write down a string with spaces and, below it, the same string with the spaces removed, aligned character by character. The solution should now be straightforward: two variables are needed, one marking the current position in the original string (the next character is copied from here), and another marking the target position for the next character.
#include <stdio.h>

void removespaces(char *t) {
    /* go through all of the characters */
    /* "from" increases in every iteration, "to" increases only if needed */
    int to = 0;
    for (int from = 0; t[from] != '\0'; ++from) {
        /* if not a space, copy it */
        if (t[from] != ' ') {
            t[to] = t[from];
            to++;
        }
    }
    t[to] = '\0';   /* terminating zero */
}

int main(void) {
    char hello[] = "H e l l o, world!";

    removespaces(hello);
    printf("Without spaces: [%s]\n", hello);

    return 0;
}
In the previous solution the spaces were removed in place: the input array was modified, and it held the output when the function finished. Let us now solve the task differently, so that the function creates and returns a brand new string instead.
Write a simple C program to demonstrate the usage of the function.
Solution
Dynamic memory management is useful for this exercise for two reasons. First, it enables us to allocate an array in the function which the function can actually return. (Remember that arrays declared as local variables cannot be returned, since they cease to exist when the function finishes. Dynamically allocated arrays, however, survive the end of the function.) Second, we can set the size of the resulting string at run time as needed, without wasting memory.
The input string is traversed twice: first the number of non-space characters is counted, then, in the second phase, the non-space characters are copied to the target array allocated for this purpose.
The space-removing function can return NULL in case of any error, just like the malloc() function.
#include <stdio.h>
#include <stdlib.h>

char *removespaces(char *inp) {
    /* first pass: count the non-space characters */
    int nonspace = 0;
    for (int i = 0; inp[i] != '\0'; ++i)
        if (inp[i] != ' ')
            nonspace += 1;

    char *str = (char*) malloc(sizeof(char) * (nonspace+1));
    if (str == NULL)
        return NULL;   /* :( */

    /* second pass: copy the non-space characters */
    int j = 0;
    for (int i = 0; inp[i] != '\0'; ++i)
        if (inp[i] != ' ')
            str[j++] = inp[i];
    str[j] = '\0';

    return str;
}

int main(void) {
    char *nospaces = removespaces("hello word apple tree");
    if (nospaces == NULL) {
        printf("there was an error, no space-free string is available");
    } else {
        printf("%s", nospaces);
        free(nospaces);
    }

    return 0;
}
4. Reading in a long line
The gets() function (reading a line of text) is very unsafe: the call needs an array to store the text, but the required size of that array only becomes known after the whole line has been read.
Read in a line of arbitrary length and print it! To handle arbitrarily long lines, the program has to increase the size of the string by one each time a new character is read.
Hint
The dynamic array cannot be allocated by a single malloc() call, since the number of elements is not known in advance. Hence the dynamic array has to be resized every time a new character is read. The steps are:
- Allocate a new (longer) array for the string extended by the new character.
- Copy the existing characters from the old array.
- Release the old array.
- Make your main pointer point to the new array.
- Put the new character at the end.
- Put the terminating zero ('\0') after it.
Remember, a pointer can be set to point to any allocated memory segment! The main rule is to remember the address of each dynamically allocated memory segment. If we lose an address, there is no way to access the stored data, neither can the allocated memory be released.
We must pay special attention to one thing: what is the content of the array when the string is empty? How many elements (characters) are there in the array?
Solution
For convenience, it is worth always putting the terminating zero at the end of the array. It is a string after all, and it will be passed to printf() or another string manipulation function sooner or later.
As a consequence, the array of the empty string has size 1, to accommodate the terminating zero. At each extension the new size is cnt+1+1: we had cnt characters before, +1 is needed for the new character, and +1 for the terminating zero.
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    printf("Enter an arbitrary long line!\n");
    int cnt = 0;
    /* the empty string: a single element, the terminating zero */
    char *line = (char*) malloc(sizeof(char) * 1);
    if (line == NULL)
        return 1;
    line[0] = '\0';

    char newchar;
    while (scanf("%c", &newchar) == 1 && newchar != '\n') {
        /* array extension */
        char *newarr = (char*) malloc(sizeof(char) * (cnt+1+1));
        if (newarr == NULL) {
            free(line);
            return 1;
        }
        for (int i = 0; i < cnt; ++i)
            newarr[i] = line[i];
        newarr[cnt] = newchar;
        newarr[cnt+1] = '\0';
        free(line);
        line = newarr;
        ++cnt;
    }

    printf("[%s]\n", line);
    free(line);

    return 0;
}
5. read_long_line()
Modify the previous program so that you put the reading of the line into a separate function! The return value should be the address of the array containing the text read in.
Test this program.
Hint
Releasing the string (free()) must remain in main. If you put it into the function, your data would be destroyed before the result is returned.
Solution
The free() call that releases the memory should stay in main! It cannot be put into the function, since that would destroy the array before the function returns it.
#include <stdio.h>
#include <stdlib.h>

/* returns the line read in, or NULL if a memory allocation fails */
char *read_long_line(void) {
    int cnt = 0;
    char *line = (char*) malloc(sizeof(char) * 1);
    if (line == NULL)
        return NULL;
    line[0] = '\0';

    char newchar;
    while (scanf("%c", &newchar) == 1 && newchar != '\n') {
        /* array extension */
        char *newarr = (char*) malloc(sizeof(char) * (cnt+1+1));
        if (newarr == NULL) {
            free(line);
            return NULL;
        }
        for (int i = 0; i < cnt; ++i)
            newarr[i] = line[i];
        newarr[cnt] = newchar;
        newarr[cnt+1] = '\0';
        free(line);
        line = newarr;
        ++cnt;
    }

    return line;
}

int main(void) {
    printf("Enter an arbitrary long line!\n");
    char *line = read_long_line();
    if (line == NULL)
        return 1;

    printf("[%s]\n", line);
    free(line);

    return 0;
}
Palindrome
Write a program that receives a string and checks whether it is a palindrome, that is, whether it reads the same backward as forward (such as "racecar"). A well-known English palindrome is "Was it a car or a cat I saw?". Try to solve it both when modifying the string is forbidden and when it is allowed. What is the relation between this task and the previous one ("removing spaces")?
Stammering 2
See our old Stammering problem for a detailed description of the concept. You can start from your solution of that problem and modify the code.
Put the stammering process into a function that receives a string parameter (the text to process) and returns another string (the modified text). The function should first calculate the memory required for the processed text, then allocate dynamic memory of exactly that size. After producing the "stammered" string, the function should return it.
Call the above function from main(), print the result string and release the dynamic memory.
Trimmer 2
Solve the trimmer problem of week 8 in such a way that the function receives just the original string and produces the trimmed text in a new, dynamically allocated string of exactly the required size. The function returns the address of the result.
Your main() should call the function, print the result string and finally release the dynamic memory.
Separate odd and even 2
Look at the odd/even separation problem from the classroom practice on pointers. Review the function that receives an array of n integers and separates the values into two further arrays (also received as parameters), odd values in one, even values in the other: void separate(int *a, int n, int *odd, int *even, int *n_e){...}.
Write a function very similar to that one, but receiving only one array as input parameter. Have the function count the odd and even elements first, allocate the odd and even arrays of the appropriate size, fill the new arrays, and finally return them via the parameter list. Note: one * is needed because it is an array, and another * is needed to make it an output parameter. So the parameter ends up as int **odd.