Dynamo is a NoSQL1 key-value storage used within Amazon. Its implementation is closed-source, but several technologies have extended the idea:

  • Amazon S3 uses many core features of Dynamo.
  • DynamoDB improves on Dynamo by automatically provisioning hardware, scaling, and re-partitioning data. It is also accessible via AWS.
  • Cassandra is an open-source implementation of many of Dynamo’s techniques.

Dynamo is designed as a decentralised (P2P) replicated distributed hash table. It is designed to provide high availability and consistency under node failures.

Context

  • platform
    • requests may go through as many 150 microservices
    • core services: between 10-100 ms. recall, 10 ms == accessing the disk 1 time
    • 10 000s servers — number from decades ago
  • use cases
    • shopping cart - don’t lose data
    • highly available — even if 1 node active, an op (adding to cart) should succeed
  • RDMS — relational database
    • kv interface is easier — shopping cart ID, product ID, etc
    • triggers - notifies client upon data change
    • scales up — better with faster machines? but faster machines generally expensive
  • reqs
    • tunability — some applications have different needs

Architecture

  • what
    • P2P — no leaders in the network. all nodes are equivalent. much like BitTorrent
    • note: ZooKeeper can largely fulfil the design reqs, but this relies on an external non P2P service
  • context — much like a file descriptor
  • consistent hashing
    • standard hashing — we might mod # number of servers. but with a new node, we now need to mod a diff number partitioning gives us different results
    • virtual nodes — each node gets a few IDs
    • more storage, we can assign more virtual nodes — get some more flexibility
  • optimistic replication
  • gossip
    • problem: info propagation takes a while

Sloppy quorum

Dynamo achieves high availability and perceived consistency (i.e., data should always be writable, and we should be able to avoid anomalies resulting from weak consistency) by: being consistent during normal operation, and sloppy during failures. Under no failures, Dynamo is fairly consistent — it is almost linearisable and we can get a total order of writes, etc.

During normal operation, Dynamo operates with a typical majority quorum (where ). During node failures, writes are forwarded to a new node, i.e., you can always write (assuming there’s at least one functioning node). Reads and writes are performed on healthy nodes. This doesn’t guarantee that reads/writes will overlap, so there can still be conflicting writes. In practice, >95% of reads will still read the latest data.

Resources

Footnotes

  1. “One of the key motivators behind the idea of NoSQL” - Ashvin Goel