Comments (6)
In terms of testing, I think unit tests are the way to go. We don't need to test every combination of APIs, as long as the conversion is working properly. We can add some canary testing on one or two APIs to ensure that everything works end to end. Does this make sense?
from modin.
@devin-petersohn, thanks for the suggestion! It does make sense. We can start with unit tests that verify each API works with a single set of parameters.
It seems to me we could have unit tests even for the first implementation. Just copy the tests from the dataframe folder to another one and leave a single set of parameters for every test. @devin-petersohn, what do you think?
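A minimal sketch of the "single set of parameters" idea, assuming the copied tests draw their arguments from a dictionary of candidate values. The helper name `single_parameter_set` and the example grid are hypothetical and are not taken from Modin's test suite.

```python
from itertools import product


def single_parameter_set(param_grid):
    """Keep only the first candidate value of every parameter."""
    return {name: values[0] for name, values in param_grid.items()}


# Hypothetical parameter grid of the kind a parametrized test might use.
full_grid = {
    "axis": [0, 1],
    "skipna": [True, False],
    "numeric_only": [False, True],
}

# The full grid expands to every combination of values...
all_combinations = list(product(*full_grid.values()))

# ...while the reduced version leaves exactly one combination per test.
reduced = single_parameter_set(full_grid)
```

Here the full grid yields eight combinations, while `reduced` holds one value per parameter, which is enough to check that the API call goes through.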
@arunjose696, thanks for your research. I think we should copy those tests to a different directory (e.g., modin/tests/pandas/native_df_mode) and update them to specifically test interoperability. This way, we would not bloat up existing tests and would make navigation for interoperability tests easier.
After the first implementation of the small query compiler (QC) is done, I will open a PR for interoperability with unit tests that verify each API works with a single set of parameters.
For the first implementation (#7259), would it suffice for now to run the tests in the modin/modin/tests/pandas/dataframe folder with MODIN_NATIVE_DATAFRAME_MODE set, to verify that the query compiler works for dataframes, or should we add unit tests even for the initial implementation?
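As a rough sketch of running the existing dataframe tests with the environment variable set, the snippet below builds a pytest command and an environment for it. The helper name `native_mode_pytest_invocation` is hypothetical, and the value `"Pandas"` is only an assumed placeholder: the thread does not say which values MODIN_NATIVE_DATAFRAME_MODE accepts.

```python
import os
import sys


def native_mode_pytest_invocation(test_dir, mode_value, base_env=None):
    """Return (command, environment) for running pytest on test_dir
    with MODIN_NATIVE_DATAFRAME_MODE set to mode_value."""
    # Copy the environment so the current process is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["MODIN_NATIVE_DATAFRAME_MODE"] = mode_value
    cmd = [sys.executable, "-m", "pytest", test_dir]
    return cmd, env


# "Pandas" is an assumed value, used here for illustration only.
cmd, env = native_mode_pytest_invocation(
    "modin/tests/pandas/dataframe", "Pandas", base_env={}
)
```

The pair could then be passed to `subprocess.run(cmd, env=env, check=True)` from the repository root; building the environment separately keeps the variable from leaking into other test runs.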
That makes sense to me. Thanks @YarShev and @arunjose696!