Distributed Systems
- Distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages.
Scalability
- Scalability is the ability of a system, network, or process, to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth.Two particularly relevant aspects
? Performance
? Availability
Permformance aspects
? Short response time/low latency for a given piece of
work
? High throughput (rate of processing work)
? Low utilization of computing resource(s)
Availability
- Availability = uptime / (uptime + downtime)
- Availability % Downtime per year
- 90 % ("one nine") More than one month
- 99.9 % ("three nines") Less than 9 hours
- 99.9999% ("six nines") 31 seconds
Fault tolerance
- Fault tolerance -ability of a system to behave in a well-defined manner once faults occur.
- Failures are norm.
Replication
- Replication is making copies of the same data on
multiple machines.
Consistency
Why strong consistency is hard to achieve
Nodes
? each node executes a program concurrently
? knowledge is local
? global state is potentially out of date
? nodes can fail and recover from failure independently
? messages can be delayed or lost
? clocks are not synchronized across nodes
Links
? Asynchronous system model.
? No timing assumptions.
? No bound on message transmission delay
? Useful clocks do not exist.
ACID
? Atomic
? Consistent
? Isolated
? Durable
BASE
? Basic Availability
? Soft-state
? Eventual consistency
CAP theorem
- It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
? Consistency (all nodes see the same data at the same time)
? Availability (every request received by a non-failing
[database] node in the system must result in a [non-error]
response)
? Partition tolerance (the system continues to operate despite
arbitrary partitioning due to network failures)
CP systems
- Protocols:
? Strict quorum protocols (paxos, raft, zab)
? 2PC - Storages:
? MongoDB
? HBase
? Zookeeper
AP System
AP systems
- Protocols:
? Partial quorum protocols - Storages:
? Couch DB
? Cassandra
? Amazon Dynamo
CAP fifteen years later
●Partitions are rare, there is little reason to forfeit C or
A when the system is not partitioned.
●Choice between C and A can occur many times
within the same system at very fine granularity.
●All three properties are more continuous than binary.
●Most software doesn’t neatly fit CP/AP definition.