- Programming
Programming
- Integrating the three views (worldview, values, outlook on life)
Offering a rough idea to invite better ones
- Spanner, TAO, Paxos, Raft
Raft
- CFS Chord
Old time
- people were forced to use P2P to build large-scale networks
- the need for P2P has since been replaced by scalable internet services running inside datacenters
- P2P’s extremes
- scalability
- failure (frequent join and leave)
- security (anyone could access)
- wide area (geo-distributed)
- heterogeneity
- BlinkDB
Approximation is hard
- sampling and sketches
- low space and time complexity
- make strong assumptions about the query workload
- on-line aggregation
- highly variable performance
- make fewer assumptions about the query workload
- OLA may need to read the entire table to compute a result with satisfactory error bounds
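The sampling idea above can be sketched as follows. This is only a minimal illustration of answering an aggregate from a uniform sample with an error bound, not BlinkDB's actual implementation; the function name `approx_mean`, the synthetic column, and the 95% normal-approximation interval are all assumptions made for the example.

```python
import math
import random


def approx_mean(values, sample_size, z=1.96, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, half_width), where half_width is the half-width of an
    approximate 95% confidence interval (normal approximation, z = 1.96).
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sample without replacement
    mean = sum(sample) / sample_size
    # Unbiased sample variance.
    var = sum((x - mean) ** 2 for x in sample) / (sample_size - 1)
    half_width = z * math.sqrt(var / sample_size)
    return mean, half_width


# Hypothetical "table column": synthetic values, uniform in [0, 100].
data_rng = random.Random(42)
column = [data_rng.uniform(0, 100) for _ in range(200_000)]

estimate, err = approx_mean(column, sample_size=10_000)
exact = sum(column) / len(column)  # full scan, for comparison only
print(f"estimate = {estimate:.2f} +/- {err:.2f}, exact = {exact:.2f}")
```

The sample touches 5% of the rows, which is where the low time and space cost comes from. The error bound is only trustworthy if the sample is representative of the rows the query selects, which is why sampling approaches lean on assumptions about the query workload.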
- Project Adam: Building an Efficient and Scalable Deep Learning Training System
Why is deep learning hard?
- extremely hard to construct appropriate features
- Parameter Server
ML challenges
- needs enormous network bandwidth
- the cost of synchronization and machine latency is high, since ML algorithms are sequential
- fault tolerance is critical: machines and jobs can be preempted
- Auth
What does “secure” mean?
- Confidentiality
- Authentication
- Integrity. An intruder must not be able to change a message undetected
- Accountability. No party can deny having sent a message
- Availability (DoS)
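Two of the properties above, authentication and integrity, can be illustrated with a message authentication code. The sketch below uses Python's standard `hmac` module; the helper names `sign` and `verify` and the example key and messages are assumptions made for illustration. Note that a shared-key MAC gives neither confidentiality (the message is sent in the clear) nor accountability (either key holder could have produced the tag).

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over `message` using the shared `key`."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Return True only if `tag` matches `message` under `key`."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical shared secret and message.
key = b"shared-secret-key"
message = b"transfer 10 to alice"
tag = sign(key, message)

print(verify(key, message, tag))                   # original message
print(verify(key, b"transfer 99 to mallory", tag)) # tampered message
```

The first check succeeds and the second fails, because an intruder without the key cannot produce a valid tag for the altered message.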
- Eventual Consistency
Strong Consistency
- One copy, up to date
- Why do we need it? Certain applications are easier to implement on top of it
- Why not? scale, latency, availability
- Pregel-like Graph Processing Systems Comparison
Pregel-like
- Bulk Synchronous Parallel model
- is a vertex state machine (think like a vertex)
- but may encounter the straggler problem
- use vertex-centric approach, graph parallel
- each vertex runs in parallel: gather, sum, apply, scatter
- address in-memory batch processing of large graphs
- terminate when all vertices are inactive and no more messages are in transit
- computation is performed on locally stored data
- Pregel only supports graphs that fit in memory
- master/workers model: the master splits the input graph into partitions and assigns each partition to a worker
- Global state?
- global aggregators
- phases of the algorithm
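The vertex-centric, superstep-based model described above can be sketched in a single process. This is a simplified simulation under stated assumptions, not Pregel's API: the function name `run_supersteps`, the dictionary-based graph, and the choice of maximum-value propagation as the vertex program are all illustrative. There is no partitioning, master, or aggregator here; the reassignment of `inbox` stands in for the global barrier between supersteps.

```python
from collections import defaultdict


def run_supersteps(graph, values):
    """Propagate the maximum value through the graph, Pregel-style.

    graph:  dict mapping each vertex to a list of its out-neighbors
    values: dict mapping each vertex to its initial value
    Returns (final_values, number_of_supersteps).
    """
    values = dict(values)

    # Superstep 0: every vertex is active and sends its value to its neighbors.
    inbox = defaultdict(list)
    for vertex, neighbors in graph.items():
        for n in neighbors:
            inbox[n].append(values[vertex])
    supersteps = 1

    # Run until no messages are in transit. A vertex that receives no
    # message stays inactive (it has voted to halt).
    while inbox:
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                # State changed: update and notify neighbors.
                values[vertex] = best
                for n in graph[vertex]:
                    outbox[n].append(best)
            # Otherwise the vertex votes to halt by sending nothing.
        inbox = outbox  # barrier: messages are delivered in the next superstep
        supersteps += 1

    return values, supersteps


# Hypothetical four-vertex chain: a <-> b <-> c <-> d.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
initial = {"a": 3, "b": 6, "c": 2, "d": 1}
final_values, steps = run_supersteps(graph, initial)
print(final_values, steps)
```

Every vertex ends with the value 6. The loop exits exactly when the termination condition from the notes holds: all vertices are inactive and no messages remain in transit. In a real deployment each superstep is only as fast as its slowest worker, which is the straggler problem noted above.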