Grapevine, Leases

By Zhaoyu Luo

Grapevine: An Exercise in Distributed Computing

  • Compared with hardware failures, software failures can be harder to deal with: every server runs the same version of the software, so a single bug can crash all of them at once
    • running servers under nohup is a must, to avoid having all of them crash in a short window before anyone notices

Design

  • do not interpret the content of the message (MIME); let the clients do that
  • eventual consistency
    • a server propagates a change to the replicas of the entry on other servers
      • those servers merge the change message into their own copies later
    • two conflicting updates (e.g., an add and a delete) applied in different orders on two different hosts are acceptable
      • the designers do not consider this a problem, since the system does not communicate with outside systems (see the merge sketch under Operations below)
    • the clients should cope with transient inconsistencies
  • independent failure mode
  • separate naming and addressing
  • separate service from interface

Failure

  • Change messages may get lost or destroyed
    • each registration server periodically resynchronizes its data with the others
      • timestamps are used to conduct the merges (as in the resync sketch under Operations below)

Registration Service

implements a distributed database

  • authentication
  • lookups/additions/deletions to database
    • every entity has a name (RName); there are two types of entity:
      • group: a list of RNames
      • individual: a person (with inbox sites) or a machine (with a network address)
  • Grapevine Registry: GV (fully replicated)
    • GV is a group containing the names of all registration servers
  • Bootstrap problem: how does a server find the other servers? The message servers' addresses are entered manually
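
A rough Python sketch of the two entry types; the class and field names (and the example RNames) are illustrative assumptions, not the paper's actual record format:

    # Hypothetical model of Grapevine registration entries.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Individual:
        rname: str                                             # e.g. "Luo.pa" (name.registry)
        inbox_sites: List[str] = field(default_factory=list)   # person: message servers holding the inbox
        network_address: str = ""                              # machine: where to reach it

    @dataclass
    class Group:
        rname: str
        members: List[str] = field(default_factory=list)       # a group is just a list of RNames

    # GV is itself a group, fully replicated, listing every registration server.
    gv = Group(rname="GV.gv", members=["ServerA.gv", "ServerB.gv"])
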
Replication
  • registries are replicated: each registry is stored in full on the particular registration servers that hold it
    • Goal: eventually consistent (can observe inconsistencies, but after time T, they will resolve)
  • internal representation (e.g., of a group):
    • a group keeps two lists, active_list and deleted_list (an entity is on only one list at a time)
      • the deleted_list is garbage-collected periodically, maybe every 2 weeks
Operations
  • update: add(), delete()
    1. remove the entity from both the active and deleted lists
    2. generate a timestamp and a unique ID
    3. put the entity on the active list (add) or the deleted list (delete)
  • merge (see the sketch after this list)
    1. a replica performs the update locally, then generates change messages for the other replicas
      • merging takes two lists and combines them into one, using the timestamps to resolve conflicts
      • change messages may get lost, so the servers also resynchronize with each other after a period of time
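
The update and merge rules above can be sketched in a few lines of Python; this is a toy last-writer-wins structure assuming timestamps totally order updates (the real system also uses unique IDs to break ties), not the paper's actual code:

    import time

    class GroupCopy:
        """One replica's copy of a group: an active list and a deleted list."""
        def __init__(self):
            self.active = {}    # RName -> timestamp of the add
            self.deleted = {}   # RName -> timestamp of the delete

        def update(self, op, rname):
            ts = time.time()                  # generate a timestamp (plus a unique ID in the real system)
            change = (op, rname, ts)
            self.apply(change)                # perform the update locally first
            return change                     # change message to send to the other replicas

        def apply(self, change):
            op, rname, ts = change
            current = self.active.get(rname) or self.deleted.get(rname) or 0.0
            if ts <= current:
                return                        # our copy already reflects a newer update
            self.active.pop(rname, None)      # remove the entity from both lists ...
            self.deleted.pop(rname, None)
            (self.active if op == "add" else self.deleted)[rname] = ts   # ... then put it on exactly one

        def resync(self, other):
            # periodic resynchronization covers lost change messages
            for rname, ts in list(other.active.items()):
                self.apply(("add", rname, ts))
            for rname, ts in list(other.deleted.items()):
                self.apply(("delete", rname, ts))

Because the newest timestamp wins on every replica, the conflicting add/delete case from the Design section converges regardless of the order in which the two change messages arrive.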

Delivery Service

  • implemented by a group of message servers
    • accept(): here is some mail to deliver
    • poll/check(): do I have mail?
    • retrieve(): get my mail
Message Delivery
  1. prepare email: body, list of recipients (inbox sites)
    • Accept: pick any message server and give it the message
    • the message itself is not replicated; only the service is, so the sender can pick any message server, which stores the message locally and replies OK (see the sketch below)
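
A toy Python sketch of the accept/check/retrieve interface; the names, and the omitted step of forwarding mail to each recipient's inbox sites, are illustrative assumptions:

    import random

    class MessageServer:
        def __init__(self, name):
            self.name = name
            self.inboxes = {}                       # RName -> locally stored messages

        def accept(self, recipients, body):
            # "here is some mail to deliver": store locally and acknowledge; the real
            # server then looks up each recipient's inbox sites via the registration
            # service and forwards a copy of the message to one site per recipient.
            for r in recipients:
                self.inboxes.setdefault(r, []).append(body)
            return "OK"

        def check(self, rname):
            return bool(self.inboxes.get(rname))    # "do I have mail?"

        def retrieve(self, rname):
            return self.inboxes.pop(rname, [])      # "get my mail"

    # Only the service is replicated, so the sender can pick any message server.
    servers = [MessageServer("A"), MessageServer("B")]
    random.choice(servers).accept(["Luo.pa"], "hello")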

Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency

  • Leases give great management flexibility.
    • The server controls the term of leases, and on a write it can choose between seeking approval from leaseholders and waiting for their leases to expire
    • A heavily write-shared file might be given a lease term of zero
    • A heavily read-shared file could be optimized by using a smaller number of leases, e.g., one lease per directory
      • to eliminate the need for clients to request lease extensions
    • The client chooses when to request an extension, when to relinquish its lease, and whether to approve a write
  • a lease is a contract: it grants rights for a limited time
    • this allows the system to make sensible progress (a slow client cannot block others for too long, because leases expire)
  • client wants to access file F
    1. request a lease (with a time period) from the server for F
      • if the file is accessed again before the lease expires, use it without any server interaction
      • otherwise, renew the lease before re-using the cached data
  • different clients want to modify F
    1. the server has to contact all lease holders to get permission
      • it is OK if one client doesn't reply: the server can simply wait for that client's lease to expire (see the sketch after this list)
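
A minimal single-server Python sketch of this read/write path; the class names, the synchronous approve() callback, and the fixed lease term are assumptions made for illustration:

    import time

    LEASE_TERM = 10.0                                        # seconds

    class FileServer:
        def __init__(self):
            self.data = {}                                   # file -> contents
            self.leases = {}                                 # file -> {client: expiry time}

        def read(self, client, f):
            # grant (or extend) a read lease along with the data
            self.leases.setdefault(f, {})[client] = time.time() + LEASE_TERM
            return self.data.get(f)

        def write(self, client, f, value):
            now = time.time()
            for holder, expiry in list(self.leases.get(f, {}).items()):
                if holder is client or expiry <= now:
                    continue                                 # expired, or the writer itself
                if not holder.approve(f):
                    time.sleep(expiry - now)                 # no reply: wait out the rest of the lease
            self.leases[f] = {}
            self.data[f] = value

    class CacheClient:
        def approve(self, f):
            # drop any cached copy of f and approve the write
            return True

The client side (reusing its cached copy while the lease is valid, renewing before re-use once it expires) is symmetric and omitted here.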

Design

  • provide strict consistency in spite of non-Byzantine failures, including partitions
  • a key assumption is that clocks are reasonably accurate
    • uses physical, real-time clocks

Leases

  • lease extension overhead: a lease is extended only after it expires and the data is needed again
    • i.e., on-demand extension rather than periodic extension
  • false sharing: a leaseholder keeps its lease without actually accessing the file, which blocks others from writing
  • Scaling Memcache at Facebook: thundering herds
    • without leases, every cache miss turns into a database query
    • with a lease, a client that misses waits a short time for the leaseholder to set the value instead of querying the DB itself (see the sketch below)
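
A toy Python version of that memcache-style lease; the key handling and timing are made up for illustration:

    import threading, time

    cache, leases = {}, {}
    lock = threading.Lock()

    def get(key, query_db):
        with lock:
            if key in cache:
                return cache[key]                # hit: no lease involved
            i_hold_lease = key not in leases
            if i_hold_lease:
                leases[key] = True               # first miss takes the lease
        if not i_hold_lease:
            time.sleep(0.05)                     # another client is already filling this key
            return get(key, query_db)            # retry the cache instead of hitting the DB
        value = query_db(key)                    # only the lease holder queries the database
        with lock:
            cache[key] = value
            leases.pop(key, None)                # release the lease
        return value
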
Lease Length
  • shorter term is good
    • false sharing: reduces “false” conflicts
    • server availability: after a crash, a rebooted server only has to wait a short time for outstanding leases to expire
  • longer term is good
    • reduces overheads for extensions
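
As a back-of-the-envelope illustration of this trade-off (my framing, not the paper's analytical model): for a frequently re-read file, a longer term means fewer extension requests, but a write blocked by an unreachable leaseholder can wait up to a full term.

    # Compare two candidate lease terms for a client that re-reads frequently.
    for term in (1.0, 60.0):                                  # lease term in seconds
        reads_per_sec = 5.0                                   # assumed per-client re-read rate
        extensions_per_sec = min(reads_per_sec, 1.0 / term)   # at most one extension needed per term
        print(f"term={term:5.1f}s  extensions/s≈{extensions_per_sec:.2f}  worst-case write wait={term}s")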

Crash Recovery

  • simple:
    • client: forget all leases upon reboot
    • server: wait until the maximum lease term has passed, so that every lease granted before the crash has expired (see the sketch below)
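
A small Python sketch of the server-side recovery rule, assuming the server knows an upper bound on the lease terms it grants:

    import time

    MAX_LEASE_TERM = 10.0     # assumed server-wide bound on any lease term

    def recover_after_crash():
        # Every lease granted before the crash expires within MAX_LEASE_TERM of
        # the reboot, so waiting that long lets the server start again with an
        # empty, consistent lease table.
        time.sleep(MAX_LEASE_TERM)
        # ... resume granting leases and serving requests ...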