
Spark Context (sc)

  • SparkContext is the environment in which RDDs are created and made available
  • The SparkContext object automatically takes care of the resilience of RDDs
  • By distributing their partitions across the machines of the cluster
  • And by recomputing lost partitions when a node fails (see the sketch below)
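
A minimal sketch of creating a SparkContext by hand and distributing a collection over partitions; local mode, the app name and the partition count are assumptions for illustration:

    from pyspark import SparkConf, SparkContext

    # Local mode with 4 worker threads; on a real cluster the master URL goes here
    conf = SparkConf().setMaster("local[4]").setAppName("sc-demo")
    sc = SparkContext(conf=conf)

    # parallelize() splits the data into partitions spread across the workers
    rdd = sc.parallelize(range(1000), numSlices=8)
    print(rdd.getNumPartitions())  # 8

    # On node failure Spark recomputes the lost partitions from the RDD's
    # lineage (the recorded chain of transformations) instead of replicating data
    sc.stop()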

Shell

  • The Spark shell creates an sc (SparkContext) object for you at startup
  • Through the sc object you can interact with RDDs, as in the session below
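
For example, a pyspark session might look as follows (the exact repr of sc varies between Spark versions):

    $ pyspark
    >>> sc
    <SparkContext master=local[*] appName=PySparkShell>
    >>> sc.parallelize([1, 2, 3, 4]).map(lambda x: x * 2).collect()
    [2, 4, 6, 8]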

Python (pyspark)

  • pyspark is the Python library for accessing the SparkContext
  • Under the hood pyspark forwards calls to the JVM-based Spark engine (via the Py4J bridge), so functions and data have to be serialized between Python and the JVM (a sketch follows below)
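
A small sketch of where that Python-to-JVM boundary shows up; the master string and app name are again illustrative:

    from pyspark import SparkContext

    sc = SparkContext("local[2]", "pyspark-demo")

    # The lambda is pickled, shipped to Python worker processes on each
    # executor, and the results are serialized back to the JVM driver
    squares = sc.parallelize(range(10)).map(lambda x: x * x).collect()
    print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

    sc.stop()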

Scala (org.apache.spark)

  • org.apache.spark is the Scala package for accessing the SparkContext
  • Scala is often preferred over Python for performance: Spark itself is written in Scala and runs on the JVM, so Scala code avoids the Python-to-JVM serialization overhead described above