Parallel computing is a programming paradigm that asks us to use the full capabilities of modern processors, i.e., their many cores and hardware threads. Introductory programming mostly deals with sequential programs, but modern processors can do much more for us in terms of performance and productivity.

In a parallel program, we run two or more independent pieces of work at the same time, on separate cores.

Note that parallelism isn’t the same as concurrency. Concurrency is the practice of switching between multiple tasks (each can be interrupted); the goal is to make progress on all of them, not necessarily to run them simultaneously.

Some key takeaways from implementing parallel programs:

  • Parallelisation isn’t the only solution. While it’s a powerful optimisation technique, it isn’t appropriate for every situation. In particular, it comes with unavoidable overhead, which makes it a poor fit for small chunks of work.
  • Avoid modifying shared state and data. Privatise variables or ensure threads don’t write to the same memory (see the reduction sketch after this list).
  • Parallelise large, expensive chunks of work.
  • It’s hard to parallelise things! This is a whole academic discipline with many books. Don’t worry if things don’t work right away.
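
As a concrete illustration of the privatisation point above, here is a minimal sketch in C with OpenMP (my choice of language and example; these notes don’t prescribe one). A parallel sum that updates a single shared accumulator is a data race; a reduction gives each thread a private copy that is combined at the end:

    /* Summing an array with a per-thread private accumulator.
     * Compile with e.g. `gcc -fopenmp sum.c`. */
    #include <stdio.h>

    double sum_reduction(const double *a, int n) {
        double sum = 0.0;
        /* Each thread accumulates into its own private copy of `sum`;
         * OpenMP combines the copies when the loop finishes. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += a[i];
        return sum;
    }

    int main(void) {
        enum { N = 1000000 };
        static double a[N];
        for (int i = 0; i < N; i++) a[i] = 1.0;
        printf("sum = %f\n", sum_reduction(a, N));  /* 1000000.000000 */
        return 0;
    }

The same idea applies with raw threads: give each thread its own partial sum and merge the results only after all threads have joined.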

Basics

The premise behind multicore processors is that Moore’s law no longer translates into better single-threaded performance. Clock speeds have essentially plateaued because power dissipation becomes unmanageable at higher frequencies, so the extra transistors go into additional cores instead.

Work needs to be partitioned, and figuring out how and when to do this is non-trivial. Moreover, the partitions need to communicate and coordinate with each other, which creates overhead that can degrade performance.

  • Creating too many threads can overflow the caches, resulting in cache misses.
  • Debugging a program whose state is spread across multiple threads can be difficult.
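
To make the partitioning question concrete, here is a small C/OpenMP sketch (the chunking scheme and names are mine) that splits loop iterations into one contiguous chunk per thread. OpenMP’s schedule(static) would do this automatically; spelling it out shows where the partitioning decisions live:

    /* Manually partitioning n iterations into one contiguous chunk per thread. */
    #include <stdio.h>
    #include <omp.h>

    void scale(double *a, int n, double factor) {
        #pragma omp parallel
        {
            int nthreads = omp_get_num_threads();
            int tid = omp_get_thread_num();
            /* Split n iterations into nthreads chunks; the first (n % nthreads)
             * threads each get one extra element. */
            int base = n / nthreads, extra = n % nthreads;
            int start = tid * base + (tid < extra ? tid : extra);
            int len = base + (tid < extra ? 1 : 0);
            for (int i = start; i < start + len; i++)
                a[i] *= factor;
        }
    }

    int main(void) {
        double a[10];
        for (int i = 0; i < 10; i++) a[i] = i;
        scale(a, 10, 2.0);
        for (int i = 0; i < 10; i++) printf("%g ", a[i]);
        printf("\n");
        return 0;
    }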

Partitioning data is also sometimes necessary, especially in numerical applications such as matrix computations. This is again non-trivial, particularly if the data structures in question are complicated.
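
For instance, a matrix–vector product can be partitioned row-wise so that each thread owns a block of rows and writes only to its own slice of the output, needing no synchronisation on the result. A minimal C/OpenMP sketch (my example, not from these notes):

    /* Row-wise partitioning of y = A*x: disjoint writes across threads. */
    #include <stdio.h>

    void matvec(int n, double A[n][n], const double x[n], double y[n]) {
        #pragma omp parallel for schedule(static)
        for (int i = 0; i < n; i++) {       /* each thread gets a block of rows */
            double acc = 0.0;               /* private partial result */
            for (int j = 0; j < n; j++)
                acc += A[i][j] * x[j];
            y[i] = acc;                     /* only this thread writes y[i] */
        }
    }

    int main(void) {
        enum { N = 3 };
        double A[N][N] = {{1, 0, 0}, {0, 2, 0}, {0, 0, 3}};
        double x[N] = {1, 1, 1}, y[N];
        matvec(N, A, x, y);
        for (int i = 0; i < N; i++) printf("%g ", y[i]);  /* 1 2 3 */
        printf("\n");
        return 0;
    }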

Single-threaded programs don’t need to worry about sharing memory between threads (obviously); multithreaded software does. Threads share resources such as CPU caches, I/O devices, accelerators, and files. When we don’t keep this in mind, we can get race conditions, where multiple threads are “racing” to access the same memory at the same time. Access to shared data can be synchronised in different ways, e.g. with atomics, mutexes, or critical sections.
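
A classic example is a shared counter: the load–increment–store sequence from two threads can interleave and lose updates. A minimal C/OpenMP sketch of one possible fix (an atomic update; a mutex or a critical section would also work, at a higher cost):

    /* Incrementing a shared counter from many threads. */
    #include <stdio.h>

    int main(void) {
        long counter = 0;
        #pragma omp parallel for
        for (int i = 0; i < 1000000; i++) {
            /* Without this pragma, concurrent increments can be lost. */
            #pragma omp atomic
            counter++;
        }
        printf("counter = %ld\n", counter);  /* always 1000000 */
        return 0;
    }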

Resources

  • Is Parallel Programming Hard, and, if so, What Can You Do About It?, by Paul E. McKenney
  • Programming Massively Parallel Processors, by David B. Kirk and Wen-mei W. Hwu

See also