Skip to content

Hive

  • Allows fetching data from HDFS using SQL queries
  • Hive can be used with MapReduce or Tez to fetch the data
  • It's a translation layer from SQL to MapReduce/Tez commands
  • Uses HiveSQL

Drawbacks

  • Simple OLAP queries! (read operations). Bad for OLTP (write operations)
  • Stores data de-normalized
  • Limited to what SQL can do (pig, spark allows more complex stuff)
  • No transactions

Using Hive

  • Hive CLI
  • Saved query files (hive -f /home/user/queries.hql)
  • Ambari / Hue
  • JDBC/ODBC server
  • Thrift service (even though hive is not suitable for OLTP)
  • Via Oozie