
PySpark Stubs

A collection of Apache Spark stub files. These files were generated by stubgen and manually edited to include accurate type hints.
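To give a feel for what such a stub contains, here is a hypothetical, heavily simplified sketch in the spirit of a `pyspark/rdd.pyi` file (the names `RDD`, `map`, `filter`, and `collect` are real PySpark API, but these exact signatures are illustrative, not copied from the package):

```python
# Hypothetical, simplified sketch of a stub file; the real stubs
# carry many more methods, parameters, and overloads.
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")

class RDD(Generic[T]):
    # In .pyi files, function bodies are written as "..." because only
    # the signatures matter to the type checker.
    def map(self, f: Callable[[T], U]) -> "RDD[U]": ...
    def filter(self, f: Callable[[T], bool]) -> "RDD[T]": ...
    def collect(self) -> List[T]: ...
```

With annotations like these in place, a checker can infer, for example, that `rdd.map(str).collect()` on an `RDD[int]` produces a `List[str]`.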

Tests and configuration files were originally contributed to the Typeshed project. Please refer to its contributors list and license for details.

Motivation

  • Static error detection (see SPARK-20631)

  • Improved completion for chained method calls.

Installation and usage

Please note that the guidelines for distribution of type information are still a work in progress (PEP 561 - Distributing and Packaging Type Information). Currently, the installation script overlays existing Spark installations (the pyi stub files are copied next to their py counterparts in the PySpark installation directory). If this approach is not acceptable, you can add the stub files to the search path manually.

According to PEP 484:

Third-party stub packages can use any location for stub storage. Type checkers should search for them using PYTHONPATH.

Moreover:

A default fallback directory that is always checked is shared/typehints/python3.5/ (or 3.6, etc.)

Please check usage before proceeding.
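If you take the manual route, one possible setup (the directory layout here is purely illustrative) is to copy the stubs into a directory of your own and point mypy at it via the `MYPYPATH` search path:

```shell
# Illustrative layout: the .pyi stub files live under ~/pyspark-stubs/pyspark/
mkdir -p "$HOME/pyspark-stubs/pyspark"
# cp path/to/stubs/pyspark/*.pyi "$HOME/pyspark-stubs/pyspark/"

# Let mypy find the stubs without touching the Spark installation itself
export MYPYPATH="$HOME/pyspark-stubs"
# mypy your_spark_job.py
```

Other checkers have equivalent knobs (e.g. PYTHONPATH, as quoted above from PEP 484).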

The package is available on PyPI:

pip install pyspark-stubs

and conda-forge:

conda install -c conda-forge pyspark-stubs

Depending on your environment, you might also need a type checker, like MyPy or Pytype.

This package is tested against the MyPy development branch and, in rare cases (primarily when it depends on important upstream bugfixes), is not compatible with the preceding MyPy release.

PySpark Version Compatibility

Package versions follow PySpark versions, with the exception of maintenance releases: for example, pyspark-stubs==2.3.0 should be compatible with pyspark>=2.3.0,<2.4.0. Maintenance releases (post1, post2, ..., postN) are reserved for internal annotation updates.
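In practice this means pinning the two packages together; for example (version numbers taken from the rule above):

```shell
# Match the stubs' version to the installed PySpark minor version
pip install "pyspark>=2.3.0,<2.4.0" "pyspark-stubs==2.3.0"
```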

API Coverage

| Module | Notes |
|--------|-------|
| pyspark | |
| pyspark.accumulators | |
| pyspark.broadcast | Mixed |
| pyspark.cloudpickle | Internal |
| pyspark.conf | |
| pyspark.context | |
| pyspark.daemon | Internal |
| pyspark.files | |
| pyspark.find_spark_home | Internal |
| pyspark.heapq3 | Internal |
| pyspark.java_gateway | Internal |
| pyspark.join | |
| pyspark.ml | |
| pyspark.ml.base | |
| pyspark.ml.classification | |
| pyspark.ml.clustering | |
| pyspark.ml.common | Mixed |
| pyspark.ml.evaluation | |
| pyspark.ml.feature | |
| pyspark.ml.fpm | |
| pyspark.ml.image | |
| pyspark.ml.linalg | |
| pyspark.ml.param | |
| pyspark.ml.param._shared_params_code_gen | Internal |
| pyspark.ml.param.shared | |
| pyspark.ml.pipeline | |
| pyspark.ml.recommendation | |
| pyspark.ml.regression | |
| pyspark.ml.stat | |
| pyspark.ml.tests | Tests |
| pyspark.ml.tuning | |
| pyspark.ml.util | |
| pyspark.ml.wrapper | Mixed |
| pyspark.mllib | |
| pyspark.mllib.classification | |
| pyspark.mllib.clustering | |
| pyspark.mllib.common | |
| pyspark.mllib.evaluation | |
| pyspark.mllib.feature | |
| pyspark.mllib.fpm | |
| pyspark.mllib.linalg | |
| pyspark.mllib.linalg.distributed | |
| pyspark.mllib.random | |
| pyspark.mllib.recommendation | |
| pyspark.mllib.regression | |
| pyspark.mllib.stat | |
| pyspark.mllib.stat.KernelDensity | |
| pyspark.mllib.stat._statistics | |
| pyspark.mllib.stat.distribution | |
| pyspark.mllib.stat.test | |
| pyspark.mllib.tests | Tests |
| pyspark.mllib.tree | |
| pyspark.mllib.util | |
| pyspark.profiler | |
| pyspark.resourceinformation | |
| pyspark.rdd | |
| pyspark.rddsampler | |
| pyspark.resultiterable | |
| pyspark.serializers | |
| pyspark.shell | Internal |
| pyspark.shuffle | Internal |
| pyspark.sql | |
| pyspark.sql.catalog | |
| pyspark.sql.cogroup | |
| pyspark.sql.column | |
| pyspark.sql.conf | |
| pyspark.sql.context | |
| pyspark.sql.dataframe | |
| pyspark.sql.functions | |
| pyspark.sql.group | |
| pyspark.sql.readwriter | |
| pyspark.sql.session | |
| pyspark.sql.streaming | |
| pyspark.sql.tests | Tests |
| pyspark.sql.types | |
| pyspark.sql.udf | |
| pyspark.sql.utils | |
| pyspark.sql.window | |
| pyspark.statcounter | |
| pyspark.status | |
| pyspark.storagelevel | |
| pyspark.streaming | |
| pyspark.streaming.context | |
| pyspark.streaming.dstream | |
| pyspark.streaming.kinesis | |
| pyspark.streaming.listener | |
| pyspark.streaming.tests | Tests |
| pyspark.streaming.util | |
| pyspark.taskcontext | |
| pyspark.tests | Tests |
| pyspark.traceback_utils | Internal |
| pyspark.util | |
| pyspark.version | |
| pyspark.worker | Internal |

Disclaimer

Apache Spark, Spark, PySpark, Apache, and the Spark logo are trademarks of The Apache Software Foundation. This project is not owned, endorsed, or sponsored by The Apache Software Foundation.
