Docker images to:
- Set up a standalone Apache Spark cluster running one Spark master and multiple Spark workers
- Build Spark applications in Java, Scala or Python to run on a Spark cluster
Currently supported versions:
- Spark 3.3.1 for Hadoop 3.3 with OpenJDK 8 and Scala 2.12
To build the images and start a cluster:

```sh
git clone https://github.com/jairoserrano/docker-spark.git docker-spark
cd docker-spark
sh build.sh
docker compose up
```
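Once the cluster is up, you can sanity-check it from another terminal. This is a sketch under the common assumption that the compose file publishes the master's web UI on `localhost:8080`; adjust the port to whatever your compose file actually maps.

```sh
# List the compose services; the master and worker containers
# should all show a "running" status.
docker compose ps

# The Spark master's web UI (assumed mapped to localhost:8080)
# lists the workers that have registered with the cluster.
curl -s http://localhost:8080 | grep -i "workers"
```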
To build and run your own Spark application on top of the cluster, extend one of the template Docker images. Check the template's README for further documentation.
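As a rough sketch of what such an extension might look like — the base image tag, file paths, and master URL below are assumptions for illustration, not names taken from this repository; the template's README has the actual values:

```Dockerfile
# Hypothetical example: extend a Spark template image with a Python app.
# "spark-submit-template:latest" is an assumed tag, not the real one.
FROM spark-submit-template:latest

# Copy the application code into the image.
COPY app.py /opt/app/app.py

# Submit the application to the standalone master on container start.
# "spark://spark-master:7077" assumes the master service is named
# "spark-master" on the compose network and uses the default port.
CMD ["spark-submit", "--master", "spark://spark-master:7077", "/opt/app/app.py"]
```

Building this image and running it on the same Docker network as the cluster would submit `app.py` to the master, which then schedules it across the workers.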