1 # Linked List
2
3 On top of implementing the list interface, this header also defines several low-level functions that
4 work with arbitrary structures.
5
6 The high-level [list interface](list.h.md) is documented on a separate page and explains how lists are used
7 that are created by one of the following functions.
8
9 ```C
10 #include <cx/linked_list.h>
11
12 typedef struct cx_linked_list_s cx_linked_list;
13
14 CxList *cxLinkedListCreate(const CxAllocator *allocator,
15 size_t elem_size);
16 ```
17
18 If you are using high-level linked lists to implement your own list structure,
19 it might be necessary to store additional data in each node that does not belong to the elements.
20 For this purpose, there exists the following function.
21
22 ```C
23 void cx_linked_list_extra_data(cx_linked_list *list, size_t len);
24 ```
25
26 It instructs the linked list to allocate `len` bytes of extra data when allocating a node.
27 The location offset of the extra data is stored in the `loc_extra` field of the linked list.
28
29 > The [key/value list](kv_list.h.md) uses this technique to store the `CxHashKey` as extra data in each node.
30
31 ## Basic Structure
32
33 This section and the remaining content of this page describe the low-level functions for linked lists.
34
35 Those work with nodes that have the following structure.
36
37 ```C
38 struct node {
39 node *next;
40 node *prev; // optional
41 // ... the element data goes here ...
42 };
43 ```
44
45 Each node must have at least one pointer that contains the address of the subsequent node.
46 Optionally, for doubly linked lists, there may be a second pointer that points to the predecessor.
47
48 The functions usually expect a `loc_prev` and a `loc_next` offset.
49 In the example structure from above you can obtain them with `offsetof(struct node, next)` and `offsetof(struct node, prev)`.
50 In all functions, `loc_prev` is optional in the sense, that when you do not have a `prev` pointer, you can specify a negative value.
51 When a function expects a `loc_advance` offset, you can freely choose if you want to pass the offset of the `next` or the `prev` pointer,
52 depending on what you want to do.
53
54 If a function expects a `void** begin` and a `void** end` pointer, they are usually both optional, unless otherwise specified.
55 If non-`NULL`, they point to the variables where the addresses of the first, or the last, node of a list are stored, respectively.
56 When a list operation results in a new first (or last) node, the addresses are overwritten.
57 In simple scenarios, you usually keep a pointer to the beginning of a list, and hence you would usually pass `NULL` to the `end` argument.
58 If you are designing a stack-like linked-list, it may happen that you only want to store the last node of a list (the top of stack),
59 hence passing `NULL` to the `begin` argument and the address of your top-of-stack pointer to the `end` argument.
60 Either way, most functions allow you a big deal of flexibility — please still read the documentation of each function carefully to learn which combinations are allowed.
61
62 When you are working with a singly linked list, it is still sometimes necessary to access the predecessor of a node.
63 In the absence of a `prev` pointer, the only chance is to start at the beginning of the list and search for that node.
64 This is exactly what `cx_linked_list_prev()` does.
65 ```C
66 #include <cx/linked_list.h>
67
68 void *cx_linked_list_prev(const void *begin,
69 ptrdiff_t loc_next, const void *node);
70 ```
71
72 Starting with the `begin` node, the function checks if the next node is exactly the `node` (by pointer-comparison).
73 If true, the function terminates and returns the current node.
74 Otherwise, it moves on with the search.
75 If `begin` is already the searched `node`, this function immediately returns `NULL` as there is no predecessor.
76 Note carefully that the behavior of this function is not defined when `node` is not contained in the list that starts with `begin`.
77
78 > It is advisable to use the low-level functions inside own custom functions that you define particularly for your lists.
79 > Otherwise you will get a lot of boilerplate code when specifying the offsets to the pointers in your node structure in every call
80 > across your entire program.
81
82 ## Link and Unlink
83
84 ```C
85 #include <cx/linked_list.h>
86
87 void cx_linked_list_link(void *left, void *right,
88 ptrdiff_t loc_prev, ptrdiff_t loc_next);
89
90 void cx_linked_list_unlink(void *left, void *right,
91 ptrdiff_t loc_prev, ptrdiff_t loc_next);
92 ```
93
94 When you have two nodes `left` and `right` you can link or unlink them with the functions shown above.
95
96 When linking `left` and `right` you should make sure that `left` as currently no successor and `right` has no predecessor,
97 because the pointers will be overwritten without unlinking possible existing links, first.
98
99 On the other hand, when unlinking `left` and `right`, they must actually be linked right now.
100 This is even asserted in debug builds.
101
102 Usually you should not need those functions when working with the [insert](#insert) and [remove](#remove) functions below.
103
104 ## Insert
105
106 ```C
107 #include <cx/linked_list.h>
108
109 void cx_linked_list_add(void **begin, void **end,
110 ptrdiff_t loc_prev, ptrdiff_t loc_next,
111 void *new_node);
112
113 void cx_linked_list_prepend(void **begin, void **end,
114 ptrdiff_t loc_prev, ptrdiff_t loc_next,
115 void *new_node);
116
117 void cx_linked_list_insert(void **begin, void **end,
118 ptrdiff_t loc_prev, ptrdiff_t loc_next,
119 void *node, void *new_node);
120
121 void cx_linked_list_insert_chain(void **begin, void **end,
122 ptrdiff_t loc_prev, ptrdiff_t loc_next,
123 void *node, void *insert_begin, void *insert_end);
124
125 void cx_linked_list_insert_sorted(void **begin, void **end,
126 ptrdiff_t loc_prev, ptrdiff_t loc_next,
127 void *new_node, cx_compare_func cmp_func);
128
129 void cx_linked_list_insert_sorted_chain(void **begin, void **end,
130 ptrdiff_t loc_prev, ptrdiff_t loc_next,
131 void *insert_begin, cx_compare_func cmp_func);
132
133 int cx_linked_list_insert_unique(void **begin, void **end,
134 ptrdiff_t loc_prev, ptrdiff_t loc_next,
135 void *new_node, cx_compare_func cmp_func);
136
137 void *cx_linked_list_insert_unique_chain(void **begin, void **end,
138 ptrdiff_t loc_prev, ptrdiff_t loc_next,
139 void *insert_begin, cx_compare_func cmp_func);
140 ```
141
142 The above functions can be used to insert one or more elements into a linked list.
143
144 The functions `cx_linked_list_add()` and `cx_linked_list_prepend()` add the `new_node` to the beginning, or the end, of the list, respectively.
145 Either `begin` or `end` needs to be non-`NULL`.
146 If you pass `NULL` to `end` in `cx_linked_list_add()`, you have to specify `begin` and `loc_next`, instead.
147 On the other hand, if you pass `NULL` to `begin` in `cx_linked_list_prepend()`, you have to specify `end` and `loc_prev`, instead.
148
149 The function `cx_linked_list_insert()` behaves like `cx_linked_list_insert_chain()` where both
150 `insert_begin` and `insert_end` are set to `new_node`.
151
152 The function `cx_linked_list_insert_chain()` inserts the chain of nodes starting with `insert_begin` after the node `node`.
153 If `insert_end` is `NULL`, the end is automatically detected, in which case `loc_next` must be available.
154 If you pass `NULL` to `node`, the entire chain is prepended to the list, in which case either `begin` must be non-`NULL`,
155 or `end` must be non-`NULL` and `loc_prev` must be available to determine the start of the list.
156
157 The functions `cx_linked_list_insert_sorted()` and `cx_linked_list_insert_sorted_chain()` are equivalent,
158 except that `begin` and `loc_next` are always required, and the target list must already be sorted.
159 The order is determined by the `cmp_func` to which the pointers to the nodes are passed.
160
161 The functions `cx_linked_list_insert_unique()` and `cx_linked_list_insert_unique_chain()` are similar to
162 `cx_linked_list_insert_sorted()` and `cx_linked_list_insert_sorted_chain()`, except that they only insert the node
163 if it is not already contained in the list.
164 When a duplicate is found, `cx_linked_list_insert_unique()` returns non-zero and `cx_linked_list_insert_unique_chain()`
165 returns a pointer to a new chain starting with the first node that was not inserted.
166
167 > The `cx_linked_list_insert_sorted_chain()` function does not have an `insert_end` argument, because
168 > it cannot take advantage of simply inserting the entire chain as-is, as the chain might need to be broken
169 > to maintain the sort order.
170
171 ## Access and Find
172
173 ```C
174 #include <cx/linked_list.h>
175
176 void *cx_linked_list_first(const void *node, ptrdiff_t loc_prev);
177
178 void *cx_linked_list_last(const void *node, ptrdiff_t loc_next);
179
180 void *cx_linked_list_at(const void *start,
181 size_t start_index, ptrdiff_t loc_advance, size_t index);
182
183 void *cx_linked_list_find(const void *start,
184 ptrdiff_t loc_advance, ptrdiff_t loc_data,
185 cx_compare_func cmp_func,
186 const void *elem, size_t *found_index);
187
188 size_t cx_linked_list_size(const void *node, ptrdiff_t loc_next);
189 ```
190
191 The functions `cx_linked_list_first()` and `cx_linked_list_last()` can be used to find the first, or last,
192 node in your list in case you are not keeping track of them separately.
193 They can start at an arbitrary node within the list.
194
195 The function `cx_linked_list_at()` starts at an arbitrary node `start` which is _specified_ to have the index `start_index`,
196 and finds the node at `index`.
197 If `index` is larger than `start_index`, you must pass the offset of the `next` pointer to `loc_advanced`.
198 On the other hand, if `index` is smaller than `start_index`, you must pass the offset of the `prev` pointer.
199 If both indices match, the function simply returns `start`.
200 And if `index` is out-of-bounds, the function returns `NULL`.
201
202 The function `cx_linked_list_find()` starts a search at the `start` node and compares each element with `elem`.
203 If `loc_data` is non-zero, the data-type of `elem` must be a pointer to data which is compatible to the data located at the specified offset in the node.
204 If `loc_data` is zero, `elem` is expected to point to a node structure (which is usually artificially created for the sake of comparison and not contained in the list).
205 When the searched element is found, a pointer to the node is returned and the index (assuming `start` has index zero) is written to the optional `found_index`, if non-`NULL`.
206
207 The size of a list, or sub-list, can be determined with `cx_linked_list_size()` which may start at an arbitrary `node´ in the list.
208
209 > A creative way of using `cx_linked_list_size()` in doubly-linked lists is to use the offset of the `prev` pointer
210 > for `loc_next`, in which case the function will return the index of the node within the list plus one.
211
212 ## Remove
213
214 ```C
215 #include <cx/linked_list.h>
216
217 void cx_linked_list_remove(void **begin, void **end,
218 ptrdiff_t loc_prev, ptrdiff_t loc_next, void *node);
219
220 size_t cx_linked_list_remove_chain(void **begin, void **end,
221 ptrdiff_t loc_prev, ptrdiff_t loc_next,
222 void *node, size_t num);
223 ```
224
225 You can either remove a single element or a specified number of subsequent elements from the list.
226
227 The function `cx_linked_list_remove()` is equivalent to `cx_linked_list_remove_chain()` where `num` is set to one.
228
229 The function `cx_linked_list_remove_chain()` removes up to `num` number of nodes from the list where `node` is contained, including `node` itself.
230 It returns the actual number of removed elements, which might be smaller when the list does not contain at least `num` elements.
231
232 The `prev` and `next` pointers of _all_ removed nodes are kept completely intact, allowing traversal within the removed chain,
233 as well as identifying the formerly adjacent nodes with the list from which the chain was removed.
234
235 > Both `begin` and `end` pointers are optional, if you specify both `loc_prev` and `loc_next`.
236 > In case your list does not have a `prev` pointer, specifying `begin` is mandatory (because there would be no other way to determine the predecessor of `node`).
237 >{style="note"}
238
239 > While specifying _only_ `loc_prev` and `end` is technically illegal, you can simply swap roles
240 > and use the offset of your `prev` pointer as `loc_next` and the address of your `end` pointer as `begin`.
241 > The list is then traversed backwards, but otherwise everything works as expected.
242
243 ## Reorder
244
245 ```C
246 #include <cx/linked_list.h>
247
248 void cx_linked_list_reverse(void **begin, void **end,
249 ptrdiff_t loc_prev, ptrdiff_t loc_next);
250
251 void cx_linked_list_sort(void **begin, void **end,
252 ptrdiff_t loc_prev, ptrdiff_t loc_next,
253 ptrdiff_t loc_data, cx_compare_func cmp_func);
254 ```
255
256 The function `cx_linked_list_reverse()` swaps all links of all nodes in the specified list, effectively reversing the order of elements.
257
258 The function `cx_linked_list_sort()` uses a recursive _natural mergesort_ implementation to sort the list with respect to the specified `cmp_fnc`.
259 The non-negative `loc_data` offset is used to calculate the pointers that are passed to the compare function.
260 If you choose `loc_data` to be zero, the pointers to the nodes themselves are passed.
261
262 > The `begin` pointer is required in all of the above functions while the `end` pointer is still optional.
263 > {style="note"}
264
265 > Sorting uses [small buffer optimization](install.md#small-buffer-optimizations) for small list sizes.
266 > If a list contains no more than `CX_LINKED_LIST_SORT_SBO_SIZE` number of elements, no additional heap memory is allocated during the entire operation.
267 > Otherwise, merge operations still do not allocate extra memory as long as the length of the merged sub-list is not larger.
268
269 ## Compare
270
271 ```C
272 #include <cx/linked_list.h>
273
274 int cx_linked_list_compare(
275 const void *begin_left, const void *begin_right,
276 ptrdiff_t loc_advance, ptrdiff_t loc_data,
277 cx_compare_func cmp_func
278 );
279 ```
280
281 For comparing two linked list, you need to specify where to start,
282 and the offsets for the pointer to the next node and the data.
283
284 In the natural case, `begin_left` and `begin_right` point to the first node of either list,
285 and `loc_advance` is the offset of the `next` pointer.
286 But it is also possible to start with the _last_ node of both lists and use the `prev` pointer to compare them backwards.
287
288 The `loc_data` offset is used to calculate the pointer that is passed to the `cmp_fnc`.
289 This can either be the offset of a specific field in the struct or simply zero, in which case the pointers to the nodes themselves are passed to the compare function.
290
291 <seealso>
292 <category ref="apidoc">
293 <a href="https://ucx.sourceforge.io/api/linked__list_8h.html">linked_list.h</a>
294 </category>
295 </seealso>
296