Developing and Contributing to Ibis¶
For a primer on general open source contributions, see the pandas contribution guide. The project will be run much like pandas has been.
Linux Test Environment Setup¶
Conda Environment Setup¶
Install the latest version of miniconda:
    # Download the miniconda bash installer
    curl -Ls -o $HOME/miniconda.sh \
        https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh

    # Run the installer
    bash $HOME/miniconda.sh -b -p $HOME/miniconda

    # Put the conda command on your PATH
    export PATH="$HOME/miniconda/bin:$PATH"
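If a later `conda` command is not found, the usual cause is that the `export` line was run in a different shell session. The following is a minimal sketch of a check for that, assuming the install prefix `$HOME/miniconda` used above. The helper name `path_contains` is hypothetical and not part of ibis or conda.

```shell
# Hypothetical helper: succeed if directory $1 is an entry in the
# colon-separated list $2 (for example, the value of PATH).
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumes miniconda was installed under $HOME/miniconda, as above.
if path_contains "$HOME/miniconda/bin" "$PATH"; then
    echo "miniconda is on PATH"
else
    echo "miniconda is not on PATH; re-run the export command"
fi
```

Wrapping both the directory and the list in colons means only a whole entry matches, so a longer path that merely starts with the same prefix does not count.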
Install the development environment of your choice (Python 3.6 in this example), activate and install ibis in development mode:
    # Create a conda environment ready for ibis development
    conda env create --name ibis36 --file=ci/requirements-dev-3.6.yml

    # Activate the conda environment
    source activate ibis36

    # Install ibis
    python setup.py develop
The following script performs three steps:
Downloads the test data
Starts each backend via docker-compose
Initializes the backends with the test tables
    cd ci
    bash build.sh
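Because the second step relies on docker-compose, it can help to confirm the required tools are installed before starting the build. The sketch below only checks for `bash` and `docker-compose`, the two commands named in this section. The helper `missing_commands` is a hypothetical name, and `build.sh` may well need other tools that are not listed here.

```shell
# Hypothetical helper: print the name of every argument that is not
# available as a command on PATH, one per line.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# bash runs build.sh; docker-compose starts the backends.
missing=$(missing_commands bash docker-compose)
if [ -n "$missing" ]; then
    echo "Install these before running build.sh:" $missing
else
    echo "All listed prerequisites were found"
fi
```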
To use specific backends, follow the instructions below.
Download Test Dataset¶
Download the test data:
By default this will download and extract the dataset under testing/ibis-testing-data.
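Before loading any backend, it is worth confirming that the download produced files. This is a minimal sketch that assumes the default location `testing/ibis-testing-data` mentioned above, relative to the current directory. The function name `dataset_ready` is hypothetical, and the check does not validate the contents of the dataset.

```shell
# Hypothetical helper: succeed only if $1 is a directory that contains
# at least one entry (hidden files included).
dataset_ready() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Default extraction path from the text above; adjust if you used another.
data_dir="testing/ibis-testing-data"
if dataset_ready "$data_dir"; then
    echo "Test data found in $data_dir"
else
    echo "No test data in $data_dir; run the download step first"
fi
```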
Setting Up Test Databases¶
To start all of the backends:
    cd ci
    docker-compose up
Impala (with UDFs)¶
Start the Impala docker image in another terminal:
    # Keep this running for as long as you want to test ibis
    docker run --tty --rm --hostname impala cpcloud86/impala:java8
Load data and UDFs into impala:
ci/impalamgr.py load --data --data-dir ibis-testing-data
Start the ClickHouse server docker image in another terminal:
    # Keep this running for as long as you want to test ibis
    docker run --rm -p 9000:9000 --tty yandex/clickhouse-server
PostgreSQL can be used from either the installation that resides on the Impala docker image or from your machine directly.
Here’s how to load test data into PostgreSQL:
SQLite comes already installed on many systems. If you used the conda setup instructions above, then SQLite will be available in the conda environment.
You are now ready to run the full ibis test suite:
Here are a few ideas to think about outside of participating in the primary development roadmap:
- Use cases and IPython notebooks
- Other SQL-based backends (Presto, Hive, Spark SQL)
- S3 filesystem support
- Integration with MLlib via PySpark