Problem
- I/O is the bottleneck of a system
- Reads are easy to deal with, but writes do not scale out because each write must be replicated multiple times
Solutions
- use memory to store most recent files
- don’t replicate files
- checkpoint all files instead
- needed information is the lineage: input files, binary programs, execution configuration, dependency type
- discard old file versions after checkpoint
- if the files are too big to fit in memory, checkpoint the leaf files first during a burst of operations, then checkpoint the remaining non-leaf files
- once a file has been checkpointed, it can be discarded from memory
- but if the storage goes down permanently, the data cannot be recovered without previous backups or replicas
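
The lineage record and the leaf-first checkpoint order above can be sketched roughly as follows. This is a minimal illustration, not the system's actual implementation: the class `Lineage`, its field names, the helpers `checkpoint_order` and `checkpoint_and_evict`, and the use of plain dicts for memory and storage are all assumptions made for the example. A "leaf" is taken here to mean a file that no other recorded job reads as input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lineage:
    """Information needed to recompute one output file (hypothetical field names)."""
    output_file: str
    input_files: tuple = ()
    binary_program: str = ""
    execution_config: str = ""
    dependency_type: str = ""


def checkpoint_order(records):
    """Return output files with leaves of the lineage graph first, then the rest."""
    # A file is a non-leaf if some other record lists it as an input.
    consumed = {f for r in records for f in r.input_files}
    leaves = [r.output_file for r in records if r.output_file not in consumed]
    non_leaves = [r.output_file for r in records if r.output_file in consumed]
    return leaves + non_leaves


def checkpoint_and_evict(records, memory, storage):
    """Copy each in-memory file to storage in checkpoint order, then evict it."""
    for name in checkpoint_order(records):
        if name in memory:
            # Once the file is checkpointed, its in-memory copy can be dropped.
            storage[name] = memory.pop(name)
```

For a chain of jobs where `a` is computed from `in`, `b` from `a`, and `c` from `b`, only `c` is a leaf, so it is checkpointed before `a` and `b`.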