CFS Chord

By Zhaoyu Luo

Old times

  • people were forced to use P2P to build large-scale networked systems
  • today, much of that need is met by scalable internet services inside datacenters
  • P2P pushes several dimensions to the extreme:
    • scalability
    • failure (frequent join and leave)
    • security (anyone can join and access data)
    • wide area (geo-distributed)
    • heterogeneity

General problem (distribution): how to spread keys across machines

  • Natural solution: hash(key) mod N
    • what if machines join, leave, or crash?
    • a single join forces almost every key to be redistributed: very costly
  • Consistent hashing (see the sketch below)
    • easy if nodes all know each other, but doesn’t scale
    • use scalable consistent hashing (Chord)
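
A minimal sketch of the contrast (hypothetical helper names, SHA-1 truncated to 32 bits): with hash mod N, adding one node remaps most keys, while with consistent hashing only the keys on the arc claimed by the new node move.

```python
import hashlib
from bisect import bisect_left

def h(s: str, m: int = 32) -> int:
    """Hash a string into an m-bit identifier (SHA-1, truncated)."""
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (1 << m)

def place_mod_n(key: str, n_nodes: int) -> int:
    """Naive placement: key goes to hash(key) mod N."""
    return h(key) % n_nodes

def place_ring(key: str, node_ids: list[int]) -> int:
    """Consistent hashing: key goes to the first node id at or after its own id."""
    ring = sorted(node_ids)
    return ring[bisect_left(ring, h(key)) % len(ring)]

keys = [f"key-{i}" for i in range(1000)]
nodes3 = [h(f"node-{i}") for i in range(3)]
nodes4 = nodes3 + [h("node-3")]

moved_mod = sum(place_mod_n(k, 3) != place_mod_n(k, 4) for k in keys)
moved_ring = sum(place_ring(k, nodes3) != place_ring(k, nodes4) for k in keys)
print(f"mod-N : {moved_mod}/1000 keys moved after one node joins")
print(f"ring  : {moved_ring}/1000 keys moved after one node joins")
```

In expectation, mod-N moves roughly three quarters of the keys here, while the ring moves only roughly the quarter that the new node now owns.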

Chord

  • APIs:
    • lookup(): given a key, return the machine responsible for that key
    • update(): invoked when key responsibilities change
  • idea: hash both nodes and keys into the same m-bit identifier space
    • each key is mapped to its successor: the first node whose id is equal to or follows the key’s id on the ring (see the sketch below)
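
A tiny illustration of the successor rule, using the small 3-bit ring from the Chord paper (nodes 0, 1, 3 and keys 1, 2, 6):

```python
def successor(key_id: int, node_ids: list[int]) -> int:
    """First node id clockwise from key_id on the identifier ring (with wrap-around)."""
    ring = sorted(node_ids)
    for n in ring:
        if n >= key_id:
            return n
    return ring[0]  # wrapped past the end of the ring

nodes = [0, 1, 3]
for k in (1, 2, 6):
    print(f"key {k} -> node {successor(k, nodes)}")
# key 1 -> node 1, key 2 -> node 3, key 6 -> node 0
```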

Load Balance

  • virtual servers: each node pretends to be V virtual servers, one per hash(node_ip, virtual_server_number) (sketched below)
    • but not free: more virtual servers means more points on the ring, which means more hops per lookup
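
A rough sketch of how virtual server ids might be derived (the 32-bit space, IPs, and helper names are assumptions for illustration, not the paper’s exact scheme):

```python
import hashlib

def vid(node_ip: str, v: int, m: int = 32) -> int:
    """Identifier of the v-th virtual server hosted by node_ip."""
    return int(hashlib.sha1(f"{node_ip}#{v}".encode()).hexdigest(), 16) % (1 << m)

V = 8  # virtual servers per physical node (a tunable knob)
ring = {vid("10.0.0.1", v): "10.0.0.1" for v in range(V)}
ring.update({vid("10.0.0.2", v): "10.0.0.2" for v in range(V)})
# Each physical node now owns V arcs of the ring, so key load evens out,
# but the ring has V times as many points, so lookups take more hops.
print(sorted(ring)[:4])
```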

Routing during lookup

  • Successor pointers only: simple, O(1) routing state per node, but lookups take O(N) hops
  • Finger table: fast, O(logN) routing state and O(logN) hops per lookup (see the sketch below)
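
A simplified, single-process sketch of finger tables and greedy routing, using a 6-bit identifier space and a hand-picked ring (real Chord does this with RPCs between nodes, and the key is assumed not to collide with a node id):

```python
M = 6                  # identifier bits for this toy example
SIZE = 1 << M

def between_open(x: int, a: int, b: int) -> bool:
    """x in (a, b) going clockwise around the ring."""
    return 0 < (x - a) % SIZE < (b - a) % SIZE

def between_right_closed(x: int, a: int, b: int) -> bool:
    """x in (a, b] going clockwise around the ring."""
    return 0 < (x - a) % SIZE <= (b - a) % SIZE

def successor_of(x: int, ring: list[int]) -> int:
    for n in ring:
        if n >= x:
            return n
    return ring[0]

def build_fingers(n: int, ring: list[int]) -> list[int]:
    """finger[i] = successor of (n + 2**i) mod 2**M; finger[0] is n's successor."""
    return [successor_of((n + (1 << i)) % SIZE, ring) for i in range(M)]

def lookup(start: int, key_id: int, fingers: dict[int, list[int]]) -> tuple[int, int]:
    """Greedy routing: hop to the highest finger preceding the key, O(logN) hops."""
    n, hops = start, 0
    while not between_right_closed(key_id, n, fingers[n][0]):
        # jump to the closest finger that still precedes key_id
        n = next(f for f in reversed(fingers[n]) if between_open(f, n, key_id))
        hops += 1
    return fingers[n][0], hops + 1

ring = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
fingers = {n: build_fingers(n, ring) for n in ring}
print(lookup(8, 54, fingers))   # -> (56, 3): key 54 is owned by node 56
```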

Example: Node join

  • the new node finds its successor and predecessor; periodic stabilization checks then update the affected nodes’ successor and predecessor pointers, and the keys the new node is now responsible for migrate to it (see the sketch below)
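
A simplified sketch of join plus the periodic stabilize/notify steps, ignoring failures, concurrency, finger repair, and key migration (node ids and the 6-bit space are made up for illustration):

```python
SIZE = 1 << 6  # small identifier space, just for the sketch

def between(x: int, a: int, b: int, inclusive_right: bool = False) -> bool:
    """x in (a, b) or (a, b] clockwise on the ring; a == b denotes the whole ring."""
    if a == b:
        return x != a or inclusive_right
    d_x, d_b = (x - a) % SIZE, (b - a) % SIZE
    return 0 < d_x < d_b or (inclusive_right and d_x == d_b)

class Node:
    def __init__(self, nid: int):
        self.id = nid
        self.successor = self      # a lone node is its own successor
        self.predecessor = None

def find_successor(n: Node, key_id: int) -> Node:
    """Walk successor pointers until key_id falls in (node, successor]."""
    while not between(key_id, n.id, n.successor.id, inclusive_right=True):
        n = n.successor
    return n.successor

def join(new: Node, existing: Node):
    """A joining node only learns its successor; stabilization fixes the rest."""
    new.predecessor = None
    new.successor = find_successor(existing, new.id)

def stabilize(n: Node):
    """Periodic check: adopt successor's predecessor if it now sits between them."""
    x = n.successor.predecessor
    if x is not None and between(x.id, n.id, n.successor.id):
        n.successor = x
    notify(n.successor, n)

def notify(n: Node, candidate: Node):
    """n adopts candidate as predecessor if it is closer than the current one."""
    if n.predecessor is None or between(candidate.id, n.predecessor.id, n.id):
        n.predecessor = candidate

a, b = Node(8), Node(20)
join(b, a)                      # node 20 joins via node 8
for _ in range(2):              # a couple of stabilization rounds settle the pointers
    stabilize(a); stabilize(b)
print(a.successor.id, a.predecessor.id)   # 20 20
print(b.successor.id, b.predecessor.id)   # 8 8
```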

CFS

  • a “read-mostly” file system; the publisher makes a file system available, and clients mount and read it
  • keys: each block is named by hash(block content)
  • block-based storage
    • high overhead to read a big file (one lookup per block). Alleviate this by:
      • prefetching
    • caching: blocks are cached on the nodes along their lookup path
    • good load balance: blocks of a large file spread across many servers
  • replication: k replicas stored at the block’s k immediate successors (see the sketch below)
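
A rough sketch of CFS-style block placement (the 8 KB block size, server names, and helper functions are assumptions for illustration): each block is named by the hash of its content and stored at its successor plus the following k-1 nodes.

```python
import hashlib
from bisect import bisect_left

BLOCK_SIZE = 8 * 1024    # assumed block size, just for this sketch
K = 3                    # replicas per block: the successor plus the next K-1 nodes

def ident(data: bytes) -> int:
    """160-bit identifier: a block is named by the hash of its content."""
    return int(hashlib.sha1(data).hexdigest(), 16)

def split_blocks(content: bytes) -> list[bytes]:
    return [content[i:i + BLOCK_SIZE] for i in range(0, len(content), BLOCK_SIZE)]

def replica_set(block_id: int, ring: list[int], k: int = K) -> list[int]:
    """The block's successor node and the k-1 nodes that follow it on the ring."""
    ring = sorted(ring)
    start = bisect_left(ring, block_id) % len(ring)
    return [ring[(start + j) % len(ring)] for j in range(k)]

# hypothetical 8-node ring whose ids live in the same hash space as block ids
ring = [ident(f"server-{i}".encode()) for i in range(8)]
data = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))   # a four-block file
for blk in split_blocks(data):
    bid = ident(blk)
    print(hex(bid)[:10], "->", [hex(n)[:10] for n in replica_set(bid, ring)])
```

Because block ids are content hashes, a large file’s blocks scatter uniformly over the ring, which is where the load balance above comes from.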


What is the most important idea in the CFS paper that builders of modern distributed systems should know?

  • As a distributed system grows, each node does not need to maintain a global view of the whole cluster
    • Even a central server could not keep up with constant pings from thousands of machines
    • CFS is clever in that each server tracks only a logarithmic number of other machines, since the system imposes an ordering on its identifiers (which acts like an index)