Linked lists are linear data structures, similar to arrays. They consist of a collection of nodes that are linked to each other. We often implement linked lists in classes, with a separate class for a list node.

We’re motivated to extend arrays as linear data structures because they’re difficult to insert and delete elements, and because arrays typically have fixed sizes (this is worse in C, better in Python).

There are two main types of linked lists:

  • Singly linked lists have pointers only to the next node.
  • Doubly linked lists have pointers to the previous and next node.

Basic operations

As with other data structures, we’re interested in implementing basic operations:

  • Searching for a given value — time complexity of , because we have to read through the list sequentially.
  • Inserting data, usually sorted
  • Deleting data or the list
  • Copying data

Copying list

Most applicable to C++. Recall that the copy constructor does a shallow copy instead of a deep copy. So the basic idea is that we need to create a new list by copying one node at a time. For operator=, we have a similar story to the copy constructor. If the left-hand side is not empty, we must empty it first. We should also ensure that the left-hand side is different from the right-hand side (else we do nothing).

Implementation

In object-oriented languages, we try to define linked lists within classes, and bundle any operations into the class. In C, we use structs.

class List {
	private:
		Node *head;
	public:
		List() { head = NULL; }
		~List() { delete head; }
		bool dataExists(int d);
		void insertData(int d);
		bool deleteData(int d);
		List (const List& original) { // copy constructor
			Node *p = original.head; // to use on original
			Node *np = NULL; // to use on list to be created
			head = NULL; // first go around, doesn't have a value yet
 
			while (p != NULL) {
				Node *n = new Node(p->getData());
				if (head == NULL) {
					head = n; // if first node is being added
				} else {
					np->setNext(n); // np always points to last node in list
				}
				p = p->getNext();
				np = n;
			}
		}
		List& operator= (List rhs) {
			if (head != NULL) { // first case is to check if list is empty or not
				delete head;
				head = NULL;
			}
			Node *p = rhs.head;
			Node *np = NULL;
			while (p != NULL) {
				Node *n = new Node(p->getData());
				if (head == NULL) {
					head = n; // if first node is being added
				} else {
					np->setNext(n); // np always points to last node in list
				}
				p = p->getNext();
				np = n;
			}
			return *this;
		}
}
 
class Node {
	public:
		int data;
		Node *next;
 
		Node() { data = 0; next = nullptr; }
		Node(int d) { data = 0; next = nullptr; }
		Node(int d, Node *n) {
			data = d;
			next = n;
		}
		~Node() {
			delete next; // note this for implementing methods
		}
}

We don’t have a pass by reference in the operator= function. Syntactically, rhs == *this is invalid, and &rhs == this is valid.

We follow the same basic structure in C. Note the use of explicit memory-based pointers to point to following nodes. Note also our need to define the head of the linked list as its own structure — this reduces the complexity of needing to constantly call a double pointer when referring to the head.

typedef struct node {
	int data;
	struct node *next;
} Node; // defines the node structure
 
typedef struct linked_list {
	Node* head;
} LinkedList; // defines the head of the linked list

Language-specific

In C++

There are two main list types in the Standard Template Library: std::list<T> (doubly-linked list) and std::forward_list<T> (singly-linked list). The main trade-off is that forward_list is more space efficient, but cannot iterate bidirectionally.

Some useful methods:

  • empty(): checks whether the container is empty.
  • max_size(): checks the maximum possible number of elements.

In C

BSD, Unix, and Linux systems provide the queue.h header, which allows us to define a singly-linked list and tail queue. SLIST implements a singly-linked list, LIST implements a doubly-linked list, STAILQ implements a singly-linked tail queue, and TAILQ implements a doubly-linked tail queue.

Addendums

Linked lists are generally bad for modern caching. Since arrays occupy a single contiguous location in memory, locality algorithms are able to efficiently retrieve the rest of the array if needed. But linked list nodes are all over the place, and can occupy random positions in memory.