Dynamo is a NoSQL1 key-value storage used within Amazon. Its implementation is closed-source, but several technologies have extended the idea:
- Amazon S3 uses many core features of Dynamo.
- DynamoDB improves on Dynamo by automatically provisioning hardware, scaling, and re-partitioning data. It is also accessible via AWS.
- Cassandra is an open-source implementation of many of Dynamo’s techniques.
Dynamo is designed as a decentralised (P2P) replicated distributed hash table. It is designed to provide high availability and consistency under node failures.
Context
- platform
- requests may go through as many 150 microservices
- core services: between 10-100 ms. recall, 10 ms == accessing the disk 1 time
- 10 000s servers — number from decades ago
- use cases
- shopping cart - don’t lose data
- highly available — even if 1 node active, an op (adding to cart) should succeed
- RDMS — relational database
- kv interface is easier — shopping cart ID, product ID, etc
- triggers - notifies client upon data change
- scales up — better with faster machines? but faster machines generally expensive
- reqs
- tunability — some applications have different needs
Architecture
- what
- P2P — no leaders in the network. all nodes are equivalent. much like BitTorrent
- note: ZooKeeper can largely fulfil the design reqs, but this relies on an external non P2P service
- context — much like a file descriptor
- consistent hashing
- standard hashing — we might mod # number of servers. but with a new node, we now need to mod a diff number → partitioning gives us different results
- virtual nodes — each node gets a few IDs
- more storage, we can assign more virtual nodes — get some more flexibility
- optimistic replication
- gossip
- problem: info propagation takes a while
Sloppy quorum
Dynamo achieves high availability and perceived consistency (i.e., data should always be writable, and we should be able to avoid anomalies resulting from weak consistency) by: being consistent during normal operation, and sloppy during failures. Under no failures, Dynamo is fairly consistent — it is almost linearisable and we can get a total order of writes, etc.
During normal operation, Dynamo operates with a typical majority quorum (where ). During node failures, writes are forwarded to a new node, i.e., you can always write (assuming there’s at least one functioning node). Reads and writes are performed on healthy nodes. This doesn’t guarantee that reads/writes will overlap, so there can still be conflicting writes. In practice, >95% of reads will still read the latest data.
Resources
- Dynamo: Amazon’s highly available key-value store, the original paper from 2007
- Amazon’s DynamoDB — 10 years later, an article that includes the original talk from 2007
Footnotes
-
“One of the key motivators behind the idea of NoSQL” - Ashvin Goel ↩