Impala

Zhaoyu Luo bio photo By Zhaoyu Luo

Impala

  • Business intelligent analytics
  • An MPP database, strength on multi-user
  • query plan generation
    • single node plan
    • do partial aggregation “combines” in plan tree
  • resource allocation
    • multiple queries
    • admission control
    • it sacrifice resource efficiency for performance
  • shared-nothing
  • I/O operations are always full
  • Parquet
  • loads working set in memory
  • streaming data
  • llama: low latency application master
  • virtual function
    • calls: body of function
  • loop unrolling
    • LLVM: code generation atop LLVM
    • quasi quotes
      1. get AST
    • get optimized code

Design

  • Each node is able to accept and execute queries
    • each node is ready, so no overhead of scheduling a map task
    • read throughput could scale with number of disks
  • avoid remote read, don’t need to go to name node, data node
    • read directly and locally
  • All nodes’ system catalogs are up to date
  • Coordination and synchronization cluster-wide metadata
  • Does not support UPDATE or DELETE, only supports bulk insertions
    • due to HDFS feature
  • avoid synchronous RPCs wherever possible on the critical path of any query
  • pub-sub service: statestore
    • push updates to all interested parties, e.g., metadata changes to all subscribers
  • construct a bloom filter to implement a simple version of a semi-join

Datastore

  • Parquet: row groups, columns stored sequentially on disk (pages, compression at pages)
  • ORCFile: stripes, row groups, columnas storage

Statestore

  • it is cluster metadata: load on machines, liveness of nodes, catalogs for underlying data
    • catalogs: physical plan and well-form-ness
  • topics: arrays of (key, value, version) triplets
    • persistent throught the lifetime of the statestore
      • not persisted across service restarts
  • need registration
    • send delta changes every 2s
    • if ping time-out need re-register
    • failed subscriber would be removed

Resource/Workload management

  • YARN is centralized scheduling
    • decision is made with full knowledge of cluster state
    • latency is too high
  • Impala needs to handle thousands of queries per second
    • New complementary but independent admission control: allow users to control their workloads without centralized decision-making
    • A service between Impala and YARN: resource caching, gang scheduling, incremental allocation changes