Classroom practice: Dynamic arrays, part II.
Gábor Horváth / Zsolt Kohári · 2020.11.20.
Dynamic memory allocation. Dynamic arrays. Data type for storing sets.
Preparation for the practice:
- The lecture notes on arrays should be reviewed.
- The lecture notes on pointers and strings should be reviewed.
- The lecture notes on dynamic memory management should be reviewed.
The code snippets contain at least one (or potentially more) errors. Find them! Explain what the problem is! What makes the codes wrong? What is the principal problem with them? How to fix them?

int fun(void)[100] { int array[100]; return array; }
Solution
Nice try, but it does not work. First of all, it is syntactically wrong: arrays can neither be received nor be returned by functions like this. In case we would want to return an array created in the function, we have to allocate it dynamically, and return the pointer pointing to it.

struct DynArray { int n; int array[n]; };
Solution
In the C language the size of a struct must always be fixed, and known at compile time, the compiler must know
its size even before the program is executed. This is one of the reasons that make the above solution wrong: changing
member variable n
in the structure won't resize the array. The other issue is related to the scopes: member
variable n
will never exist in itself as a stand-alone variable in the code, only if we create an instance
from structure DynArray
. Then, something.n
is a valid variable, but without creating an instance
it is not possible to use n
(for instance, to set the array size to it).
Fix: the array should not be put into the structure, it should be created externally. The structure should store a pointer to it only:
struct DynArray { int n; int *array; };
Then, the array must be allocated dynamically:
struct DynArray da1; da1.n = 100; da1.array = (int*) malloc(sizeof(int) * 100);

struct DynArray { int n; int *array = (int*) malloc(sizeof(int) * n); };
Solution
Total nonsense, in C language no executable statements can be put into the structures.
How should it be executed, when variable n
does not even have a value yet?
It can be fixed as shown in the previous example.
Write a program that implements various operations on a set consisting of real numbers. The number of elements in the set can be arbitrary large. The following operations must be implemented (by a function for each):
- It must be possible to check whether an element is in the set or not
- It must be possible to insert a new element into the set (if it is already there, nothing should happen)
- It must be possible to remove an element from the set
Draw a figure about the memory layout of the set data structure!
Theoretically, it is not recommended to compare real numbers using the ==
operator. For simplicity, let us now ignore this issue and focus on the dynamic memory operations instead!
Solution
The requirement "the number of elements in the set can be arbitrary large" indicates that we have to rely on the dynamic memory management. The elements of the set can be stored in a dynamically growing array. A set is characterized by two properties: the pointer pointing to this array and the number of elements in the array. These two data must be managed together all the time, hence it makes sense to enclose them in a structure:
typedef struct Set { int len; double *data; } Set;
When an instance of the set is created, both the len
and the data
member variables contain memory garbage.
Consequently, the set must be properly initialized before the first usage. To avoid memory leaks, the memory allocated for the array
must be released when the set is not used any more. Therefore, apart from the above listed three functions we need to implement two more:
one for initializing and one for destroying a set.
Many functions operating of the set modify the member variables of the structure. For instance, the data array may get reallocated, the
length of the array can grow, etc. For this reason, we will pass the pointer to the set structure to these functions, instead of
passing the structure itself. This is only correct solution in case of the insert
and the remove
functions, but
for convenience (to make the call interface of the functions uniform) we are going to implement all functions like this. In this case we do not
have to remember the exact parameter list of each function, since they are the same. Those functions, that dont modify the set (e.g., the one
that checks if a given element is included) will receive a pointer pointing to a constant structure, so that the compiler can check whether
we wrote a wrong code in the function that modifies the set by mistake.
Inserting a new number to the set or removing one from it is a costly operation. Since arrays can not be resized, the a new array must be created
every time the arrray is modified, and the old elements must be copied from the old place to the new one. This copy operation makes the
implementation inefficient. After the new array is fully filled with the updated content, the old one must be released and the pointer in the
structure pointing to the data array must be updated to point to the new array. (The realloc()
function does basically the same)
#include <stdio.h> #include <stdlib.h> #include <stdbool.h> typedef struct Set { int len; double *data; } Set; /* receives an uninitialized set and makes a valid empty set from it */ void set_init(Set *s) { s->len = 0; s->data = NULL; } /* releases the set when it is not needed anymore */ void set_destroy(Set *s) { free(s->data); } /* returns true is the element is in the set */ bool set_contains(Set const *s, double what) { for (int i = 0; i < s->len; ++i) if (s->data[i] == what) return true; /* found in the set */ return false; /* not found */ } void set_insert(Set *s, double what) { /* do nothing if already in the set */ if (set_contains(s, what)) return; /* copy elements to a bigger by one array */ double *newarr = (double*) malloc((s->len + 1) * sizeof(double)); for (int i = 0; i < s->len; ++i) newarr[i] = s->data[i]; newarr[s->len] = what; /* put the new element to the end */ free(s->data); /* the old array is not needed any more */ s->data = newarr; /* the pointer points to the new array from now on */ s->len++; /* number of elements has increased by one */ } void set_remove(Set *s, double what) { /* do nothing if not in the set */ if (!set_contains(s, what)) return; /* new array, number of elements is less by one */ double *newarr = (double*) malloc((s->len - 1) * sizeof(double)); int j = 0; for (int i = 0; i < s->len; ++i) if (s->data[i] != what) newarr[j++] = s->data[i]; free(s->data); s->data = newarr; s->len--; /* number of elements has decreased by one */ } /* lists the content of the set */ void set_print(Set const *s) { for (int i = 0; i < s->len; ++i) printf("%g ", s->data[i]); printf("\n"); } int main(void) { Set s; set_init(&s); set_insert(&s, 3.14); set_insert(&s, 2.0); set_insert(&s, 3.14); set_print(&s); set_insert(&s, 6.1); set_print(&s); set_remove(&s, 3.14); set_print(&s); set_destroy(&s); return 0; }
In the solution above we have exploited that free(NULL)
is accepted by the standard. For some older C compilers this
can be a problem, though. In this case, writing if (ptr!=NULL) free(ptr);
solves the problem.
After the main()
function has put number 6.1 to the set, the memory layout is as follows:
Set s
is in the stack, consisting of the length and the pointer, and the pointer points to a dynamically
allocated array, given by the malloc()
call.
Let us modify the previous solution such that the initializer function returns a new, dynamically created set, instead of initializing an existing one. After this modification using the set should be as simple as follows:
Set *s; s = set_new(); set_insert(s, 3.14); set_insert(s, 6.1); set_print(s); set_remove(s, 3.14); set_print(s); set_destroy(s);
This way the sets can be passed freely among the functions, they wont be deleted since they are not local variables any more.
Observe how benefitial the modification is to the syntax: since s
is always a pointer, we dont need to use the
address-of operator &
any more.
Solution
This approach leads to the following memory layout
The local variable in the main()
is only a pointer, both the set structure and the data array are located in the heap, since they
were allocated dynamically.
The difference is only in the initializing and the destroying functions. When initializing, the structure has to be created by
malloc()
, too, consequently, when destroying, the structure has to be released, too:
Set *set_init() { Set *ps; ps = (Set*) malloc(sizeof(Set)); ps->data = NULL; ps->len = 0; return ps; } void set_destroy(Set *ps) { free(ps->data); free(ps); }