In programming, arrays are linear data structures that are built into many languages. Whereas variables can only store one value or one memory address, arrays can store multiple values at once. One-dimensional arrays are akin to column vectors, and two-dimensional arrays matrices.

In general, when we describe arrays we describe data structures that cannot be dynamically resized at runtime. Arrays with this capability are dynamic arrays, and are more commonly supported in higher-level languages. For instance, Python has the “list” class; C++ has the “vector” class.

Basics

In most languages, the first index of an array begins at (the exception being scientific computing languages like MATLAB or R, where they begin at ). When we define arrays (maybe int arr[n]), we define with elements, where they’re indexed from 0 to .

We cannot access data outside of specified bounds, otherwise we’d be overwriting unrelated data. In Python, negative indices start from the right end of the list.

Arrays give us direct access if we know which element we want to access, in time! But they’re difficult to add or delete elements so it would take time to relocate the array. So we sometimes may want to instead use linked lists.

In languages with manual memory management, arrays are pointers to the first element. What this means is that we can use pointer arithmetic and the two lines below are equivalent:

int arr[50];
arr[0] = 1;
*arr = 1;
// these are equivalent
arr[8] == *(arr + 8);
// >> TRUE

Because of this fact, when we pass arrays to functions and perform some manipulation, we actually observe changes. We extend this idea to mutability in Python — so in all languages if we do some manipulation we observe changes. When we define the array (in C and C++), the number can’t be a variable because the compiler needs to know how much space to allocate.

Neat tricks

int a[100] = {0}; // will set the entire array to 0

Python

In Python, we can get a sub-list by specifying an interval in the indices (i.e., arr[2:4]). We can also concatenate and replicate (with *), delete with del arr[i]. We can find the first index of a given value with .index(), append with .append(), insert with .insert(i, value), remove a given value with .remove(value), sort with .sort()

Rust

let a: [i32; 5] = [1, 2, 3, 4, 5] // declares an i32 type arr with 5 elements

Higher-dimensional arrays

Outside of 1-dimensional arrays we can create 2D arrays. For instance, int a[5][10] is akin to a matrix. In many languages it is also helpful to think about it like an array of arrays (i.e., the rows form the main array and the columns form the sub-arrays).

In memory, arrays are stored in row-major order (most of the time). What this means is that the rows are stored one-by-one from top to bottom. In numerical computing languages (like MATLAB or Fortran), 2D arrays are stored in column-major order.1

Dynamic allocation

Because arrays are pointers to the first element, we can dynamically allocate them on the heap in languages that support dynamic memory allocation.

So say we try to initialise an array of integers dynamically:

int number = 50;
int *arr;
arr = (int*) malloc(number * sizeof(int));
int *a;
a = new int[50];
a[1] = 6; // ezpz
delete [] a; // to tell the compiler that we're freeing an array

See also

Footnotes

  1. But to be honest, if you’re worrying about memory in numerical languages, you have a bigger set of problems on your plate.