

PyGDF


PyGDF implements the Python interface to access and manipulate the GPU DataFrame of the GPU Open Analytics Initiative (GoAi). We aim to provide a simple interface that is similar to the Pandas DataFrame and hide the details of GPU programming.

Read more about GoAi and the GDF

Setup

Conda

You can get a minimal conda installation with Miniconda or get the full installation with Anaconda.

You can install and update PyGDF using the conda command:

conda install -c numba -c conda-forge -c gpuopenanalytics/label/dev -c defaults pygdf=0.1.0a3

You can create and activate a development environment using the conda command:

conda env create --name pygdf_dev --file conda_environments/testing_py35.yml
source activate pygdf_dev

Install from Source

To install PyGDF from source, clone the repository and run the python install command:

git clone https://github.com/gpuopenanalytics/pygdf.git
cd pygdf
python setup.py install

Note: This assumes that dependencies, including libgdf, are already installed, so we recommend using the conda environment.

A Dockerfile is provided for building and installing LibGDF and PyGDF from their respective master branches.

Notes:

  • We test with, and recommend installing, nvidia-docker2.
  • The NVIDIA driver installed on the host must support a CUDA version greater than or equal to the one specified (9.2 by default).
  • An alternative CUDA version can be specified via the CUDA_VERSION Docker build-arg.
  • Alternate branches for libgdf and pygdf may be specified as Docker build-args LIBGDF_REPO and PYGDF_REPO. See Dockerfile for example.
  • Ubuntu 16.04 is the default OS for this container. Alternate OSes may be specified as Docker build-arg LINUX_VERSION. See list of available images.
  • Python 3.6 is default, but other versions may be specified via PYTHON_VERSION build-arg
  • GCC & G++ 5.x are default compiler versions, but other versions (which are supplied by the OS package manager) may be specified via CC and CXX build-args respectively
  • numba (0.40.0), numpy (1.14.3), and pandas (0.20.3) versions are also configurable as build-args

From pygdf project root, to build with defaults:

docker build -t pygdf .
...
 ---> ec65aaa3d4b1
 Successfully built ec65aaa3d4b1
 Successfully tagged pygdf:latest

docker run --runtime=nvidia -it pygdf bash
/# source activate gdf
(gdf) root@3f689ba9c842:/# python -c "import pygdf"
(gdf) root@3f689ba9c842:/# 
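
The build-args listed in the notes above can be combined on one command line. The following is a minimal sketch, assuming you want a non-default CUDA and Python version; the values 9.0 and 3.5 are illustrative only and have not been verified against the Dockerfile. The script prints the command so that you can review it before running it.

```shell
# Sketch: override Dockerfile defaults with Docker build-args.
# CUDA_VERSION and PYTHON_VERSION are the build-arg names from the notes above;
# the values assigned here are illustrative assumptions, not tested combinations.
CUDA_VERSION="9.0"
PYTHON_VERSION="3.5"

# Assemble the build command as a string so it can be inspected first.
build_cmd="docker build --build-arg CUDA_VERSION=${CUDA_VERSION} --build-arg PYTHON_VERSION=${PYTHON_VERSION} -t pygdf ."

# Print the command for review.
echo "$build_cmd"

# To run the build from the pygdf project root, uncomment the next line:
# $build_cmd
```

Any build-arg that is not passed keeps its default, so only the settings you want to change need to appear on the command line.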

Pip

We don't support pip install yet. Please use conda for the time being.

Testing

This project uses py.test.

In the source root directory and with the development environment activated, run:

py.test

Getting Started

Please see the Demo Docker Repository for example notebooks on how you can utilize the GPU DataFrame.

GPU Open Analytics Initiative

The GPU Open Analytics Initiative (GoAi) seeks to foster and develop open collaboration between GPU analytics projects and products to enable data scientists to efficiently combine the best tools for their workflows. The first project of GoAi is the GPU DataFrame (GDF), which enables tabular data to be directly exchanged between libraries and applications on the GPU.

GPU DataFrame

The GPU DataFrame is a common API that enables efficient interchange of tabular data between processes running on the GPU. End-to-end computation on the GPU avoids unnecessary copying and converting of data off the GPU, reducing compute time and cost for high-performance analytics common in artificial intelligence workloads. The GPU DataFrame uses the Apache Arrow columnar data format on the GPU. Currently, a subset of the features in Arrow is supported.

