
dataenforce's People

Contributors

cedricfr

dataenforce's Issues

Releases?

Hi!

We pin packages to versions. Could you make a tagged release of this repo?
In the meantime, I am going to fork it and tag it from my fork.

Thanks for the awesome package.

Shape information

Is it possible to add shape information?

e.g.

Dataset[(10, 3), ("a": int, "b": int, "c": int)]

This would type a DataFrame with 10 rows and 3 columns.
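
As a rough illustration of what such a check might involve, a declared shape could be compared with the `(rows, columns)` tuple a DataFrame reports through its `shape` attribute. This is only a sketch and not part of dataenforce: the helper name `shape_matches` and the use of `None` as a wildcard are assumptions made for the example.

```python
from typing import Optional, Tuple


# Hypothetical helper, not part of dataenforce: compares a declared
# (rows, columns) shape with an actual one. None means "any size".
def shape_matches(expected: Tuple[Optional[int], Optional[int]],
                  actual: Tuple[int, int]) -> bool:
    if len(expected) != len(actual):
        return False
    return all(e is None or e == a for e, a in zip(expected, actual))


# A 10x3 frame satisfies (10, 3) and (None, 3), but not (5, 3).
print(shape_matches((10, 3), (10, 3)))
print(shape_matches((None, 3), (10, 3)))
print(shape_matches((5, 3), (10, 3)))
```

With pandas, the second argument would typically be `df.shape`.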

[Feature Request] Dataset from a dataclass

Would be nice to have a feature to define a Dataset using a dataclass as a source.

So, instead of

DUser = Dataset["id": int, "name": str]

def process1(data: DUser):
  pass

we can use a dataclass as a source for field names and types, like

@dataclass
class User:
  id: int
  name: str

def process1(data: Dataset[User]):
  pass

This would keep the list of fields in sync with the dataclass automatically, and it could also help with refactoring.
It could be used like this:

users = pd.DataFrame(
  [
    User(id=1, name="Sam"),
    User(id=2, name="Rhett")
  ]
)

process1(users)
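
A minimal sketch of how the field names and types could be read from a dataclass, using only the standard library's `dataclasses.fields`. The helper `columns_from_dataclass` is hypothetical and not part of dataenforce; how the resulting mapping would be turned into a `Dataset` is left open.

```python
from dataclasses import dataclass, fields
from typing import Dict


@dataclass
class User:
    id: int
    name: str


# Hypothetical helper, not part of dataenforce: maps each dataclass
# field name to its annotated type, in declaration order.
def columns_from_dataclass(cls: type) -> Dict[str, type]:
    return {f.name: f.type for f in fields(cls)}


print(columns_from_dataclass(User))
```

Because the mapping is derived from the class, adding or renaming a field on `User` changes the expected columns without touching any annotation.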

Still alive?

Was wondering if this project is still alive. I was considering including it in my projects.

validate fails when non-Dataset function arguments have type annotations

Consider the following code:

import numpy as np

from dataenforce import Dataset, validate

@validate
def myfunc(data: Dataset["a": int, "b": np.float], data2: Dataset['x'], other_param: int):
    return data

When calling myfunc:

t.myfunc(pd.DataFrame([{'a':1, 'b': 1.2}]), pd.DataFrame([{'x': 10}]), 39)

I get:

ValueError: myfunc() requires a code object with 0 free vars, not 3

Expected behaviour: Ignore non-Dataset arguments, both positional and keyword, or optionally delegate type validation for non-Dataset arguments to another type-enforcing library.

Thank you!
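
The expected behaviour described above can be sketched with a decorator that inspects the signature and skips every parameter whose annotation is not a dataset hint. This is not dataenforce's implementation: `RequiredColumns` and `validate_columns` are hypothetical stand-ins, and a `SimpleNamespace` with a `columns` attribute plays the role of a DataFrame so the example has no dependencies.

```python
import functools
import inspect
from types import SimpleNamespace


# Stand-in for a Dataset hint (hypothetical, not dataenforce's class):
# it only records the column names an argument must provide.
class RequiredColumns:
    def __init__(self, *columns: str) -> None:
        self.columns = set(columns)


def validate_columns(func):
    """Check only parameters annotated with RequiredColumns."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # bind() resolves positional and keyword arguments alike.
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            hint = signature.parameters[name].annotation
            if not isinstance(hint, RequiredColumns):
                continue  # e.g. `other_param: int` is left alone
            missing = hint.columns - set(value.columns)
            if missing:
                raise TypeError(f"{name} is missing columns {sorted(missing)}")
        return func(*args, **kwargs)

    return wrapper


@validate_columns
def myfunc(data: RequiredColumns("a", "b"), other_param: int):
    return other_param


# SimpleNamespace stands in for a DataFrame; only `.columns` is used.
frame = SimpleNamespace(columns=["a", "b"])
print(myfunc(frame, 39))
```

Wrapping the original function, rather than rebuilding it, leaves non-dataset annotations such as `int` untouched, so a separate type-enforcing library could still handle them.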

Unable to type-cast a dataframe to Dataset

from typing import cast

df = cast(Dataset["timestamp", "volume"], df)

Gives the error:

Expected class type but received "Unknown | DatasetMeta"
  "DatasetMeta" is not a class
Pylance (reportGeneralTypeIssues)

Is this an issue with Pylance? I would expect to be able to cast a DataFrame to the Dataset type.

Can not interpret np.datetime types

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-f00935f5c253> in <module>

/opt/Anaconda3/envs/basic_ml/lib/python3.8/site-packages/dataenforce/__init__.py in wrapper(*args, **kwargs)
     45                     dtypes = dict(value.dtypes)
     46                     for colname, dt in hint.dtypes.items():
---> 47                         if not np.issubdtype(dtypes[colname], np.dtype(dt)):
     48                             raise TypeError("%s is not a subtype of %s for column %s" % (dtypes[colname], dt, colname))
     49         return f(*args, **kwargs)

/opt/Anaconda3/envs/basic_ml/lib/python3.8/site-packages/numpy/core/numerictypes.py in issubdtype(arg1, arg2)
    417     """
    418     if not issubclass_(arg1, generic):
--> 419         arg1 = dtype(arg1).type
    420     if not issubclass_(arg2, generic):
    421         arg2 = dtype(arg2).type

TypeError: Cannot interpret 'datetime64[ns, UTC]' as a data type

Integration with mypy?

I saw that your package has a validate decorator to check the data frame at run time.
Is there a way to integrate it with mypy for static code analysis?
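
One general-purpose way to carry column metadata in a form that static checkers can read is `typing.Annotated` (Python 3.9+). This is an alternative technique, not something this thread shows dataenforce providing: the `Columns` class is hypothetical, and `dict` stands in for `pandas.DataFrame` to keep the sketch dependency-free. A checker such as mypy would see only the base type, so the column check itself would still happen at run time.

```python
from typing import Annotated, get_type_hints


# Hypothetical metadata object describing the expected columns.
class Columns:
    def __init__(self, **dtypes: type) -> None:
        self.dtypes = dtypes


# `dict` stands in for pandas.DataFrame in this sketch.
UserFrame = Annotated[dict, Columns(id=int, name=str)]


def process(data: UserFrame) -> None:
    pass


# Static checkers treat `data` as the first argument of Annotated;
# the metadata stays available at run time via include_extras=True.
hints = get_type_hints(process, include_extras=True)
print(hints["data"].__metadata__[0].dtypes)
```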

Python 3.9

The library is not compatible with pylance 3.9

PyRe check, Undefined or invalid type [11]: Annotation is not defined as a type.

When using PyRe check, I get the following:
Undefined or invalid type [11]: Annotation DummyDataframe is not defined as a type.

Code example:

from dataenforce import Dataset

DummyDataframe = Dataset["id": int]


def test_annotation(_df: DummyDataframe) -> DummyDataframe:
    pass

Is there something that can be done to avoid or fix this issue?
