Linked lists are linear data structures, similar to arrays. They consist of a collection of nodes that are linked to each other. We often implement linked lists in classes, with a separate class for a list node.
We’re motivated to extend arrays as linear data structures because they’re difficult to insert and delete elements, and because arrays typically have fixed sizes (this is worse in C, better in Python).
There are two main types of linked lists:
- Singly linked lists have pointers only to the next node.
- Doubly linked lists have pointers to the previous and next node.
Basic operations
As with other data structures, we’re interested in implementing basic operations:
- Searching for a given value — time complexity of , because we have to read through the list sequentially.
- Inserting data, usually sorted
- Deleting data or the list
- Copying data
Copying list
Most applicable to C++. Recall that the copy constructor does a shallow copy instead of a deep copy. So the basic idea is that we need to create a new list by copying one node at a time. For operator=, we have a similar story to the copy constructor. If the left-hand side is not empty, we must empty it first. We should also ensure that the left-hand side is different from the right-hand side (else we do nothing).
Implementation
In object-oriented languages, we try to define linked lists within classes, and bundle any operations into the class. In C, we use structs.
class List {
private:
Node *head;
public:
List() { head = NULL; }
~List() { delete head; }
bool dataExists(int d);
void insertData(int d);
bool deleteData(int d);
List (const List& original) { // copy constructor
Node *p = original.head; // to use on original
Node *np = NULL; // to use on list to be created
head = NULL; // first go around, doesn't have a value yet
while (p != NULL) {
Node *n = new Node(p->getData());
if (head == NULL) {
head = n; // if first node is being added
} else {
np->setNext(n); // np always points to last node in list
}
p = p->getNext();
np = n;
}
}
List& operator= (List rhs) {
if (head != NULL) { // first case is to check if list is empty or not
delete head;
head = NULL;
}
Node *p = rhs.head;
Node *np = NULL;
while (p != NULL) {
Node *n = new Node(p->getData());
if (head == NULL) {
head = n; // if first node is being added
} else {
np->setNext(n); // np always points to last node in list
}
p = p->getNext();
np = n;
}
return *this;
}
}
class Node {
public:
int data;
Node *next;
Node() { data = 0; next = nullptr; }
Node(int d) { data = 0; next = nullptr; }
Node(int d, Node *n) {
data = d;
next = n;
}
~Node() {
delete next; // note this for implementing methods
}
}
We don’t have a pass by reference in the operator=
function. Syntactically, rhs == *this
is invalid, and &rhs == this
is valid.
We follow the same basic structure in C. Note the use of explicit memory-based pointers to point to following nodes. Note also our need to define the head of the linked list as its own structure — this reduces the complexity of needing to constantly call a double pointer when referring to the head.
typedef struct node {
int data;
struct node *next;
} Node; // defines the node structure
typedef struct linked_list {
Node* head;
} LinkedList; // defines the head of the linked list
Language-specific
In C++
There are two main list types in the Standard Template Library: std::list<T>
(doubly-linked list) and std::forward_list<T>
(singly-linked list). The main trade-off is that forward_list
is more space efficient, but cannot iterate bidirectionally.
Some useful methods:
empty()
: checks whether the container is empty.max_size()
: checks the maximum possible number of elements.
In C
BSD, Unix, and Linux systems provide the queue.h header, which allows us to define a singly-linked list and tail queue. SLIST
implements a singly-linked list, LIST
implements a doubly-linked list, STAILQ
implements a singly-linked tail queue, and TAILQ
implements a doubly-linked tail queue.
Addendums
Linked lists are generally bad for modern caching. Since arrays occupy a single contiguous location in memory, locality algorithms are able to efficiently retrieve the rest of the array if needed. But linked list nodes are all over the place, and can occupy random positions in memory.