1 # Array List
2
3 Next to an array list implementation of the list interface,
4 UCX offers several functions to work with plain C arrays equipped with a size and a capacity.
5
6 The high-level [list interface](list.h.md) is documented on a separate page and explains how lists are used
7 that are created by one of the following functions.
8
9 ```C
10 #include <cx/array_list.h>
11
12 CxList *cxArrayListCreate(const CxAllocator *allocator,
13 cx_compare_func comparator, size_t elem_size,
14 size_t initial_capacity);
15
16 CxList *cxArrayListCreateSimple(size_t elem_size,
17 size_t initial_capacity);
18 ```
19
20 The remaining documentation on this page concentrates on dealing with plain C arrays.
21
22 ## Declare Array with Size and Capacity
23
24 ```C
25 #include <cx/array_list.h>
26
27 #define CX_ARRAY_DECLARE_SIZED(type, name, size_type)
28
29 #define CX_ARRAY_DECLARE(type, name)
30
31 #define cx_array_initialize(ARRAY, capacity)
32
33 #define cx_array_initialize_a(allocator, ARRAY, capacity)
34 ```
35
36 The purpose of the UCX array functions is to keep track of the size and capacity of a plain C array.
37
38 The raw functions do not need this information packed into a structure, but working with independent variables is quite cumbersome.
39 Therefore, UCX defines several macros that call the raw functions assuming certain variable names.
40
41 This is what the `CX_ARRAY_DECLARE_SIZED()` and `CX_ARRAY_DECLARE()` macros are for.
42 They take a `type` for the array members, a `name` for the array variable, and a `size_type` for the type of the size and capacity variables (`size_t` by default when you use `CX_ARRAY_DECLARE()`),
43 and generate three variables named `name`, `name_size`, and `name_capacity`.
44
45 For example, you can abbreviate the following code
46 ```C
47 struct my_data {
48 int other_int;
49 float other_float;
50 int *my_array;
51 size_t my_array_size;
52 size_t my_array_capacity;
53 }
54 ```
55 by substituting the three members for the array with `CX_ARRAY_DECLARE()`.
56 ```C
57 struct my_data {
58 int other_int;
59 float other_float;
60 CX_ARRAY_DECLARE(int, my_array);
61 }
62 ```
63
64 > Using `CX_ARRAY_DECLARE_SIZED()` can save some space when you are using a size type that is _half_ as wide as `sizeof(void*)`.
65 > On 64-bit architectures, having the possibility to store more than four billion items is rarely necessary, hence using 32-bit integers for size and capacity
66 > saves eight bytes (assuming proper alignment in your struct).
67
68 Once the array is declared, you can use one of the macros `cx_array_initialize()` or `cx_array_initialize_a()` to allocate memory.
69 The former uses the [default allocator](allocator.h.md#default-allocator) and the latter allows you to use a specific allocator.
70 Important to note is, that the `ARRAY` argument expects the variable's _name_.
71 The macros set `ARRAY_size` to zero, `ARRAY_capacity` to the specified initial capacity, and invoke the allocator's `malloc()` function to allocate the memory.
72
73 Using the example from above, and the UCX [memory pool](mempool.h.md), this could look like this:
74
75 ```C
76 CxMempool *mpool = cxMempoolCreateSimple(128);
77
78 struct my_data data;
79 cx_array_initialize_a(mpool->allocator, data.my_array, 16);
80 ```
81
82 > The aforementioned macros simplify your life, but keep in mind that using them is not mandatory.
83 > All other macros continue to work perfectly if you declare and initialize your array manually, as long as you use
84 > the naming convention to suffix the size variable with `_size` and the capacity variable with `_capacity`.
85 > Also, you can freely decide in which order you want to declare the variables.
86 >
87 > For example, when you have multiple arrays in your struct with 8-bit `size_type` (because in your use case you don't expect more than 255 elements),
88 > it is favorable to group all pointers and then declare the size and capacity variables to avoid padding between the array declarations.
89
90 ## Array Reallocator
91
92 ```C
93 #include <cx/array_list.h>
94
95 typedef struct {
96 void *(*realloc)(void *array,
97 size_t old_capacity, size_t new_capacity,
98 size_t elem_size, CxArrayReallocator *alloc);
99 const CxAllocator *allocator;
100 const void *stack_ptr;
101 } CxArrayReallocator;
102
103 CxArrayReallocator cx_array_reallocator(
104 const struct cx_allocator_s *allocator,
105 const void *stack_ptr
106 );
107
108 extern CxArrayReallocator* cx_array_default_reallocator;
109 ```
110
111 The main purpose of the functions defined in the array list header
112 is to make it easier to write to an array without caring too much about a possibly necessary reallocation.
113
114 This is realized by passing a reallocator to the various functions that specify how an array shall be reallocated when needed.
115
116 The default `cx_array_default_reallocator` simply defers to the [default allocator](allocator.h.md#default-allocator).
117
118 A reallocator created with the `cx_array_reallocator()` function uses a more sophisticated approach.
119 On the one hand, it can use an arbitrary UCX [allocator](allocator.h.md) for the reallocation,
120 and on the other hand, it can optionally keep track of a stack memory pointer.
121 If you pass a non-`NULL` pointer to `stack_ptr`, the reallocator will _always_ allocate _new_ memory for the array,
122 if the current location equals the `stack_ptr` address.
123
124 This allows you to allocate an array on the stack and instruct UCX to automatically move it to heap memory when the capacity is exceeded.
125 Combined with a UCX [memory pool](mempool.h.md) this can be a powerful tool.
126 For example, you can add small arrays to your structs plus a memory pool and then use that memory pool to reallocate the arrays on the heap when needed.
127 When you are done with the arrays, you can then use the memory pool to free the memory for those arrays that needed to be moved to the heap.
128
129 ## Reserve
130
131 ```C
132 #include <cx/array_list.h>
133
134 int cx_array_reserve(
135 void **array, void *size, void *capacity, unsigned width,
136 size_t elem_size, size_t count,
137 CxArrayReallocator *reallocator);
138
139 #define cx_array_simple_reserve(ARRAY, count)
140 #define cx_array_simple_reserve_a(reallocator, ARRAY, count)
141 ```
142
143 The function `cx_array_reserve()` guarantees that at least `count` _additional_ elements
144 can be stored in the array pointed to by `array`.
145
146 The `array` argument is a pointer to the location of the target array pointer.
147 The reason for this additional indirection is that this function writes
148 a new pointer back to that variable if the array needed to be reallocated.
149
150 If reallocation fails, this function returns non-zero, leaving the array as it was.
151 Otherwise, the function returns zero.
152
153 If `*size + elem_count` does not exceed the current `*capacity`, this function does nothing and simply returns zero.
154 Otherwise, the specified `reallocator` is used to reserve the necessary space.
155 If reallocation was necessary but failed, this function returns non-zero.
156
157 The actual data type of `size` and `capacity` can be a pointer to an arbitrary integer whose byte-width is determined by the `width` argument.
158 On 32-bit platforms the widths 1, 2, and 4 are supported, and on 64-bit platforms a width of 8 is also supported.
159 Passing zero as argument makes the function assume the native width for size arguments `sizeof(size_t)`.
160
161 The convenience macros take the _name_ of an array variable and invoke the function by deducing the other arguments
162 (including the width of the size and capacity).
163 The reallocator used by the `cx_array_simple_reserve()` macro is the `cx_array_default_reallocator`.
164
165 > While the actual integer type for `size` and `capacity` can be chosen freely, it must be _the same_ for both variables.
166 > For example, it is not allowed, and it does not make any sense either, to use a 16-bit integer for the size, but a 32-bit integer for the capacity.
167
168 ## Copy
169
170 ```C
171 #include <cx/array_list.h>
172
173 int cx_array_copy(
174 void **target, void *size, void *capacity, unsigned width,
175 size_t index, const void *src,
176 size_t elem_size, size_t elem_count,
177 CxArrayReallocator *reallocator);
178
179 #define cx_array_simple_copy(ARRAY, index, src, count)
180
181 #define cx_array_simple_copy_a(reallocator, ARRAY, index, src, count)
182 ```
183
184 The function `cx_array_copy()` first [reserves](#reserve) enough memory to copy `elem_count` number of elements from the `src` array
185 to the target array at the specified `index`, and then copies the data with one call to `memmove()`.
186
187 A few things to note:
188 * `*target` and `src` can point to the same memory region, since the underlying copy operation is a `memmove()`
189 * `*target` does not need to point to the start of the array, but `size` and `capacity` always start counting from the
190 position `*target` points to - in this scenario, the need for reallocation must be avoided for obvious reasons
191 * `index` does not need to be within size and not even within the capacity of the current array
192 * if `index` equals `*size`, this function effectively appends the data to the target array
193 * the data in the target array is overwritten - if you want to insert data, you first need to copy the existing data to the end of the array and then copy the new data in a separate call
194
195 > If you are using `cx_array_reserve()` already in your program, there is no need to call `cx_array_copy()` or any of the convenience macros anymore.
196 > In other words: if you already guarantee, by any means that no reallocation is necessary when writing to your array,
197 > it is strongly recommended to use standard library functions or direct index-based access instead of the UCX functions,
198 > which only purpose is to make it easier for you to guarantee that your array's capacity is large enough to hold new elements.
199
200 ## Add Elements
201
202 ```C
203 #include <cx/array_list.h>
204
205 #define cx_array_add(target, size, capacity, elem_size, elem,
206 reallocator);
207
208 #define cx_array_simple_add(ARRAY, elem)
209
210 #define cx_array_simple_add_a(reallocator, ARRAY, elem)
211 ```
212
213 The macros for adding an element are all convenience macros that invoke `cx_array_copy()`
214 interpreting the `elem` as an array of size one, copied to the past-by-one index of the target array.
215
216 ## Insertion Sort
217
218 ```C
219 #include <cx/array_list.h>
220
221 int cx_array_insert_sorted(
222 void **target, size_t *size, size_t *capacity,
223 cx_compare_func cmp_func,
224 const void *src, size_t elem_size, size_t elem_count,
225 CxArrayReallocator *reallocator);
226
227 #define cx_array_simple_insert_sorted(ARRAY,
228 src, elem_count, cmp_func)
229
230 #define cx_array_simple_insert_sorted_a(reallocator, ARRAY,
231 src, elem_count, cmp_func)
232
233 #define cx_array_add_sorted(target, size, capacity,
234 elem_size, elem, cx_compare_func cmp_func, reallocator);
235
236 #define cx_array_simple_add_sorted(ARRAY,
237 elem, cmp_func)
238
239 #define cx_array_simple_add_sorted_a(reallocator, ARRAY,
240 elem, cmp_func)
241 ```
242
243 The function `cx_array_insert_sorted()` inserts the `elem_count` number of already sorted elements from the `src` array
244 into the target array, comparing the elements with the given `cmp_func`.
245
246 The arguments of this function are used similar to [`cx_array_copy()`](#copy) with two notable exceptions:
247 first, this function actually inserts the items, moving existing items when necessary, and second,
248 no particular index is required because this function determines the insertion points automatically using [binary search](#binary-search).
249
250 If either the target or the source array is not already sorted with respect to the given compare function, the behavior is undefined.
251
252 The convenience macros are all calling `cx_array_insert_sorted()` by deducing the missing arguments.
253 The `cx_array_add_sorted()` family of macros are interpreting the `elem` as a `src` array with an `elem_count` of one.
254
255 ## Sets of Unique Elements
256
257 ```C
258 #include <cx/array_list.h>
259
260 int cx_array_insert_unique(
261 void **target, size_t *size, size_t *capacity,
262 cx_compare_func cmp_func,
263 const void *src, size_t elem_size, size_t elem_count,
264 CxArrayReallocator *reallocator);
265
266 #define cx_array_simple_insert_unique(ARRAY,
267 src, elem_count, cmp_func)
268
269 #define cx_array_simple_insert_unique_a(reallocator, ARRAY,
270 src, elem_count, cmp_func)
271
272 #define cx_array_add_unique(target, size, capacity,
273 elem_size, elem, cx_compare_func cmp_func, reallocator);
274
275 #define cx_array_simple_add_unique(ARRAY,
276 elem, cmp_func)
277
278 #define cx_array_simple_add_unique_a(reallocator, ARRAY,
279 elem, cmp_func)
280 ```
281
282 The above functions are similar to `cx_array_insert_sorted()` and its [relatives](#insertion-sort),
283 except that they skip insertion of elements that are already present in the target array.
284
285 ## Binary Search
286
287 ```C
288 #include <cx/array_list.h>
289
290 size_t cx_array_binary_search(
291 const void *arr, size_t size, size_t elem_size,
292 const void *elem, cx_compare_func cmp_func);
293
294 size_t cx_array_binary_search_inf(
295 const void *arr, size_t size, size_t elem_size,
296 const void *elem, cx_compare_func cmp_func);
297
298 size_t cx_array_binary_search_sup(
299 const void *arr, size_t size, size_t elem_size,
300 const void *elem, cx_compare_func cmp_func);
301 ```
302
303 The function `cx_array_binary_search()` searches the array `arr` for the data pointed to by `elem` using the compare function `cmp_func`.
304
305 If the element is found (i.e. `cmp_func` returns zero), the function returns the index of the element.
306 Otherwise, the function returns `size`.
307
308 The functions `cx_array_binary_search_inf()` and `cx_array_binary_search_sup()` are equivalent,
309 except that they return the index of the closest element if the searched element is not found.
310 The former function returns the closest element that is less or equal (the greatest lower bound / infimum),
311 and the latter function returns the closest element that is larger or equal (the least upper bound / supremum)
312 than the searched element.
313
314 When the found element appears more than once in the array,
315 the binary search and the infimum report the largest index
316 and the supremum reports the smallest index of the identical items, respectively.
317
318 > Note that all the above functions are only well-defined on arrays which are sorted with respect to the given compare function.
319 >
320 > This can, for example, easily be achieved by calling the standard library's `qsort()` function.
321 >{style="note"}
322
323 > Despite the name, neither C nor POSIX standards require the standard library's `bsearch()` function to be implemented using binary search.
324 > But observations show that it usually is.
325 > This makes `cx_array_binary_search()` likely to be equivalent to `bsearch()`, except that it returns an index rather than a pointer.
326
327 ## Iterators
328
329 ```C
330 #include <cx/iterator.h>
331
332 CxIterator cxIterator(const void *array,
333 size_t elem_size, size_t elem_count,
334 bool remove_keeps_order);
335
336 CxIterator cxIteratorPtr(const void *array,
337 size_t elem_count,
338 bool remove_keeps_order);
339 ```
340
341 Iterators over plain C arrays are defined in [iterator.h](iterator.h.md#creating-an-iterator).
342
343 ## Other
344
345 ```C
346 #include <cx/array_list.h>
347
348 void cx_array_swap(
349 void *arr,
350 size_t elem_size,
351 size_t idx1,
352 size_t idx2
353 );
354 ```
355
356 The function `cx_array_swap()` exchanges two items with a three-way swap.
357
358 No memory is allocated if the element size `elem_size` is not larger than `CX_ARRAY_SWAP_SBO_SIZE` (cf. [](install.md#small-buffer-optimizations)).
359
360 You have to make sure that both indices are within bounds of the array `arr`.
361
362 <seealso>
363 <category ref="apidoc">
364 <a href="https://ucx.sourceforge.io/api/array__list_8h.html">array_list.h</a>
365 </category>
366 </seealso>
367