• Version: 1.0.0
  • Released: Feb 2019

Eskapade is a lightweight, Python-based data analysis framework for modularizing data analysis problems into reusable analysis components. For documentation on Eskapade, please go to this link.

Eskapade-Spark is the Spark-based extension of Eskapade. For documentation on Eskapade-Spark, please go here.

Release notes

Version 1.0

Eskapade-Spark v1.0 (February 2019) is in sync with Eskapade-Core v1.0 and Eskapade v1.0, and contains several small upgrades with respect to v0.9:

  • Minor upgrades to the spark_histogrammar_filler link.
  • Include hive_reader and hive_writer links, for working with Hive tables.
  • Include a jdbc module, for opening a connection to a JDBC database, and a jdbc_reader link.

Version 0.9

Eskapade-Spark v0.9 (December 2018) contains only one update compared with v0.8:

  • All code has been updated to Eskapade v0.9, where the core functionality has been split off into the Eskapade-Core package. As such, the code is backwards-incompatible with v0.8.

See release notes for previous versions of Eskapade-Spark.



Eskapade-Spark requires Python 3.5+, Eskapade v0.8+ and Spark v2.1.2. These are pre-installed in the Eskapade Docker image.


To install the package from PyPI, run:

$ pip install Eskapade-Spark


Alternatively, you can check out the repository from GitHub and install it yourself:

$ git clone eskapade-spark

To (re)install the Python code from your local directory, run the following from the top directory:

$ pip install -e eskapade-spark


After installation, you can import the package in Python:

import eskapadespark

Congratulations, you are now ready to use Eskapade-Spark!
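
If you want to confirm the installation without triggering a full import, the standard library can look the module up first. This is a minimal sketch, not part of Eskapade-Spark: the helper name is_installed is hypothetical, and the only thing taken from this README is the import name eskapadespark.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it.

    Hypothetical helper for illustration; it is not provided by Eskapade-Spark.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "eskapadespark" is the import name used in this README.
    name = "eskapadespark"
    status = "found" if is_installed(name) else "NOT found"
    print("{}: {}".format(name, status))
```

If the module is reported as not found, check that pip installed the package into the same Python environment you are running.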

Quick run

To see the available Eskapade-Spark examples, run:

$ export TUTDIR=`pip show Eskapade-Spark | grep Location | awk '{ print $2"/eskapadespark/tutorials" }'`
$ ls -l $TUTDIR/

For example, you can now run:

$ eskapade_run $TUTDIR/

For all available examples, please see the tutorials.
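
The tutorials directory that the shell commands above store in TUTDIR can also be located from Python. The sketch below assumes, as the shell command does, that the tutorials live in a "tutorials" subdirectory of the installed eskapadespark package; the helper name tutorials_dir is hypothetical and not part of Eskapade-Spark.

```python
import importlib.util
import os


def tutorials_dir(package_name, subdir="tutorials"):
    """Return <package directory>/<subdir>, or None if the package is not found.

    Hypothetical helper for illustration; the "tutorials" subdirectory name
    mirrors the path built by the shell command in this README.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        # Not installed, or a plain module rather than a package.
        return None
    package_dir = list(spec.submodule_search_locations)[0]
    return os.path.join(package_dir, subdir)


if __name__ == "__main__":
    path = tutorials_dir("eskapadespark")
    if path is None or not os.path.isdir(path):
        print("tutorials directory not found")
    else:
        # List the available example files, like `ls $TUTDIR/`.
        for entry in sorted(os.listdir(path)):
            print(entry)
```

Unlike the pip show approach, this resolves the path through the import system, so it reports the copy of the package that the current interpreter would actually import.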

Contact and support

Contact us at: kave [at] kpmg [dot] com

Please note that the KPMG Eskapade group provides support only on a best-effort basis.