Spark Context (sc)
- SparkContext is the environment in which the RDDs are available
- The SparkContext object automatically takes care of the resilience of RDDs
- By distributing them across machines
- By handling node failures, as the sketch below illustrates
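A minimal sketch of this distribution, assuming an sc object is already available (for example from the shell described next); the partition count is an illustrative choice:

```python
# Assumes an sc object is already available (e.g. in the Spark shell);
# the partition count of 4 is an illustrative choice.
rdd = sc.parallelize(range(100), numSlices=4)  # split the data into 4 partitions
print(rdd.getNumPartitions())  # -> 4; each partition can sit on a different node
# If a node holding a partition fails, Spark recomputes that partition
# from the RDD's lineage rather than failing the whole job.
print(rdd.sum())  # -> 4950
```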
Shell
- The Spark shell creates an sc (SparkContext) object for you
- With the sc object you can interact with the RDDs, as sketched below
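For instance, a hedged sketch of a typical session at the shell prompt (the data is made up; sc needs no setup because the shell provides it):

```python
# Typed at the pyspark shell prompt; sc is pre-created, so no imports or setup.
rdd = sc.parallelize([1, 2, 3, 4, 5])
squares = rdd.map(lambda x: x * x)  # transformation: lazy, nothing runs yet
print(squares.collect())            # action: runs the job -> [1, 4, 9, 16, 25]
```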
Python (pyspark)
- The Python library for accessing the Spark context
- Under the hood, Python calls are forwarded to the JVM-based Spark engine (via the Py4J bridge), since Spark itself runs as Java bytecode; a standalone sketch follows below
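Outside the shell, a standalone pyspark script must build its own SparkContext. A sketch under those assumptions (the file name, app name, and master URL are hypothetical):

```python
# standalone_demo.py -- a hypothetical standalone pyspark script.
# Every RDD call below is forwarded to the JVM through the Py4J bridge.
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("standalone-demo").setMaster("local[2]")
sc = SparkContext(conf=conf)

words = sc.parallelize(["spark", "context", "spark", "rdd"])
counts = words.map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b)
print(counts.collect())  # e.g. [('spark', 2), ('rdd', 1), ('context', 1)]

sc.stop()
```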
Scala (org.apache.spark)
- The Scala library for accessing the Spark context
- Scala is often preferred over Python because it runs natively on the JVM, avoiding the Python-to-JVM bridge and serialization overhead