Linked lists are linear data structures, similar to arrays. They consist of a collection of nodes that are linked to each other. We often implement linked lists in classes, with a separate class for a list node.
We’re motivated to extend arrays as linear data structures because they’re difficult to insert and delete elements, and because arrays typically have fixed sizes (this is worse in C, better in Python).
There are two main types of linked lists:
- Singly linked lists have pointers only to the next node.
- Doubly linked lists have pointers to the previous and next node.
Basic operations
As with other data structures, we’re interested in implementing basic operations:
- Searching for a given value — time complexity of , because we have to read through the list sequentially.
- Inserting data, usually sorted
- Deleting data or the list
- Copying data
Copying list
Most applicable to C++. Recall that the copy constructor does a shallow copy instead of a deep copy. So the basic idea is that we need to create a new list by copying one node at a time. For operator=, we have a similar story to the copy constructor. If the left-hand side is not empty, we must empty it first. We should also ensure that the left-hand side is different from the right-hand side (else we do nothing).
Implementation
In object-oriented languages, we try to define linked lists within classes, and bundle any operations into the class. In C, we use structs.
We don’t have a pass by reference in the operator=
function. Syntactically, rhs == *this
is invalid, and &rhs == this
is valid.
We follow the same basic structure in C. Note the use of explicit memory-based pointers to point to following nodes. Note also our need to define the head of the linked list as its own structure — this reduces the complexity of needing to constantly call a double pointer when referring to the head.
Language-specific
In C++
There are two main list types in the Standard Template Library: std::list<T>
(doubly-linked list) and std::forward_list<T>
(singly-linked list). The main trade-off is that forward_list
is more space efficient, but cannot iterate bidirectionally.
Some useful methods:
empty()
: checks whether the container is empty.max_size()
: checks the maximum possible number of elements.
In C
BSD, Unix, and Linux systems provide the queue.h header, which allows us to define a singly-linked list and tail queue. SLIST
implements a singly-linked list, LIST
implements a doubly-linked list, STAILQ
implements a singly-linked tail queue, and TAILQ
implements a doubly-linked tail queue.
Addendums
Linked lists are generally bad for modern caching. Since arrays occupy a single contiguous location in memory, locality algorithms are able to efficiently retrieve the rest of the array if needed. But linked list nodes are all over the place, and can occupy random positions in memory.