In programming, arrays are linear data structures that are built into many languages. Whereas variables can only store one value or one memory address, arrays can store multiple values at once. One-dimensional arrays are akin to column vectors, and two-dimensional arrays matrices.
In general, when we describe arrays we describe data structures that cannot be dynamically resized at runtime. Arrays with this capability are dynamic arrays, and are more commonly supported in higher-level languages. For instance, Python has the “list” class; C++ has the “vector” class.
Basics
In most languages, the first index of an array begins at (the exception being scientific computing languages like MATLAB or R, where they begin at ). When we define arrays (maybe int arr[n]
), we define with elements, where they’re indexed from 0 to .
We cannot access data outside of specified bounds, otherwise we’d be overwriting unrelated data. In Python, negative indices start from the right end of the list.
Arrays give us direct access if we know which element we want to access, in time! But they’re difficult to add or delete elements so it would take time to relocate the array. So we sometimes may want to instead use linked lists.
In languages with manual memory management, arrays are pointers to the first element. What this means is that we can use pointer arithmetic and the two lines below are equivalent:
Because of this fact, when we pass arrays to functions and perform some manipulation, we actually observe changes. We extend this idea to mutability in Python — so in all languages if we do some manipulation we observe changes. When we define the array (in C and C++), the number can’t be a variable because the compiler needs to know how much space to allocate.
Neat tricks
Python
In Python, we can get a sub-list by specifying an interval in the indices (i.e., arr[2:4]
). We can also concatenate and replicate (with *
), delete with del arr[i]
. We can find the first index of a given value with .index()
, append with .append()
, insert with .insert(i, value)
, remove a given value with .remove(value)
, sort with .sort()
Rust
Higher-dimensional arrays
Outside of 1-dimensional arrays we can create 2D arrays. For instance, int a[5][10]
is akin to a matrix. In many languages it is also helpful to think about it like an array of arrays (i.e., the rows form the main array and the columns form the sub-arrays).
In memory, arrays are stored in row-major order (most of the time). What this means is that the rows are stored one-by-one from top to bottom. In numerical computing languages (like MATLAB or Fortran), 2D arrays are stored in column-major order.1
Dynamic allocation
Because arrays are pointers to the first element, we can dynamically allocate them on the heap in languages that support dynamic memory allocation.
So say we try to initialise an array of integers dynamically:
See also
- String, especially in C
- Linked list, another type of linear data structure
- Vector (C++), a variable-sized array in C++
Footnotes
-
But to be honest, if you’re worrying about memory in numerical languages, you have a bigger set of problems on your plate. ↩