Comments (7)
So you raise a good point - I don't necessarily care about 1.1 specifically being the minimum version (it's been a few years, we can move it to 1.3?), but I do think it is worth having some minimum version that is compatible with the library. The situation I worry about is that we decide to use some newer pandas feature without really thinking about it, forcing anyone to immediately upgrade to pandas 2.x, which is a pretty big lift.
I agree with you that we don't do this for other libraries (though arguably we should for numpy), but the few other libraries we directly depend on are probably much less painful to upgrade than doing a full pandas version bump. Unfortunately I can't find any pandas equivalent of https://numpy.org/neps/nep-0029-deprecation_policy.html.
Thoughts?
from exchange_calendars.
I agree the ideal is to be able to declare the minimum supported version of major dependencies.
The tests were failing as pandas 1.1 wouldn't build (here).
I bumped the minimum to 1.3 and the same thing happened (here).
Bumped it to 1.4 and it built on all platforms (here), although it took an age - about 17 minutes on macOS! (Worth noting this isn't an issue with the later versions - the latest one builds fine; it seems to be something to do with the later v1 releases.)
@gerrymanoim, if you're happy with pandas>=1.4 I'll leave it there...?
from exchange_calendars.
Yep - fine with that.
Though clicking through, the 1.3 failure seems to be a numpy error? https://github.com/gerrymanoim/exchange_calendars/actions/runs/6017394262/job/16323473168. It occurs while building pandas, but the issue seems to be in actually building numpy. The version of numpy there is also ancient: Downloading numpy-1.19.3.zip (7.3 MB). Looking at 1.1 again, it seems to be the same case: Collecting numpy==1.17.3. I'm not surprised that these numpys don't build with 3.11.
Per the NEP 29 deprecation policy, it looks like we should at minimum be using numpy 1.22. Maybe we just need to specify that so pandas doesn't try with the min version?
from exchange_calendars.
Good spot on the lower numpy version.
It's nothing to do with any min numpy that we're specifying. The tests on Py 3.11 run using the dependencies defined in requirements_minpandas.txt (here), which has numpy pinned to 1.25.2. The workflow logs show that 1.25.2 is the version being collected. However, within the pandas build, pandas then collects its own version of numpy: the minimum supported numpy version for the pandas version being built. So, building pandas 1.3 using Python 3.11 will use numpy 1.19.3, as determined by this pyproject.toml file for pandas 1.3.
So, to work out which minimum pandas version we should be specifying we have to work backwards from numpy...
- The minimum numpy version for Python 3.11 is 1.23.2, as defined here in the oldest-supported-numpy package (the package which pandas itself now uses in its builds to define its numpy dependency).
- However, pandas 1.4 requires numpy <=1.22.4 (as defined on the 1.4 dev branch here). Although 1.4 is building for us, this explains the excessive build time and suggests we should go with a higher minimum pandas version.
- Pandas 1.5 has (currently) no such restrictions, suggesting that our minimum required pandas version should be 1.5.
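The "work backwards from numpy" reasoning above can be sketched in a few lines of Python. This is only an illustration: the version numbers are the ones quoted in this thread, while the table layout and the function name `lowest_compatible_pandas` are hypothetical, and treating the numpy version that each old pandas build collected as an upper bound is a simplification.

```python
# Minimum numpy for Python 3.11, as quoted in the thread (oldest-supported-numpy).
MIN_NUMPY_FOR_PY311 = (1, 23, 2)

# Highest numpy each pandas minor version will build against, as reported in
# the thread. None means no upper bound was found. (Hypothetical table layout.)
NUMPY_UPPER_BOUND = {
    (1, 1): (1, 17, 3),  # build collected numpy==1.17.3
    (1, 3): (1, 19, 3),  # build downloaded numpy-1.19.3
    (1, 4): (1, 22, 4),  # pandas 1.4 requires numpy <=1.22.4
    (1, 5): None,        # no such restriction (currently)
}


def lowest_compatible_pandas(min_numpy, upper_bounds):
    """Return the lowest pandas minor whose numpy cap allows min_numpy."""
    for pandas_minor in sorted(upper_bounds):
        cap = upper_bounds[pandas_minor]
        if cap is None or cap >= min_numpy:
            return pandas_minor
    return None


result = lowest_compatible_pandas(MIN_NUMPY_FOR_PY311, NUMPY_UPPER_BOUND)
print(".".join(str(part) for part in result))  # prints 1.5
```

With the figures above, every pandas version before 1.5 caps numpy below 1.23.2, so 1.5 is the first one that fits.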
@gerrymanoim - let me know if you have any objections to setting it to 1.5.
(The following has ended up as a bit of a 'note to self' on managing dependencies)...
I think this all kind of serves as an example of why there's a reasonable argument for us to not define minimum versions of dependencies. Defining minimum versions is useful to advise clients that 'this won't work on versions <x', although the minimum version is often interpreted as it's defined, i.e. 'this will work on versions >=x'. But there can never be any guarantee that the package will work with any combination of dependencies except those as defined in the requirements file. Keeping track of what works with what is a rabbit-hole, and every minimum version will eventually start failing, often without obvious cause.
Whilst I think the ideal IS to define a minimum version of significant dependencies, in the FOSS world I don't think the value it offers clients is worth the effort required of maintainers to stay on top of it - it's a losing battle. Ideal but not practical.
Where my thinking is on this:
- the limited time we have should be spent on keeping the package up to date with the latest versions of dependencies (rather than considering support for older versions).
- requirements files should always accompany releases and these form the contract. We guarantee it works with this combination of latest versions (at least on the Python versions and the OSs that it's tested on).
- It's reasonable to require clients to maintain their code if they wish to use the latest version.
Interested to hear what anyone else thinks...?
@gerrymanoim, we're still in time to remove the min version for pandas if I've convinced you 😀
from exchange_calendars.
I think I still lean towards supporting some non-2.0 pandas version as an enforced testing minimum (for now), given that 1.5.3 was released in January and 2.0 was released in April. Perfectly happy for that to be 1.5 if that makes our lives easier. I'm happy to drop this guarantee in the future.
I even agree with all your thinking around FOSS. I think where I'm coming from is:
- Pandas is our main dependency
- Pandas 2.x is fairly new
- Pandas 1.x -> 2.x is a fairly big upgrade. Even minor versions can cause annoying-to-fix breaking changes for a large enough quantitative codebase. This means that people upgrade slowly/rarely.
- There's nothing in exchange_calendars that necessitates we use pandas 2.x functionality (or at least I haven't seen it).
- I want to avoid a scenario where we accidentally use a pandas 2.x only function or behavior such that clients are forced to upgrade without warning if they want any more calendar updates (or we then have to release a new minor version). I'm okay forcing this update in the future (similar to how we drop python versions from time to time), but think it is too soon to say "you must use latest pandas".
I'd even be okay where we pin min pandas and min numpy to some compatible version where we don't build from source, but I care much more about supporting all versions of python, so if py311 requires numpy 1.23.2, that kind of forces us to use at least 1.5 (I don't really have interest in maintaining python version specific deps).
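As a rough illustration of what an enforced minimum amounts to, here is a minimal sketch of a version comparison using only plain Python. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of exchange_calendars; a real project would more likely rely on the version specifier in its packaging metadata.

```python
def parse_version(text):
    """Turn '1.5.3' into (1, 5, 3).

    Non-numeric suffixes are dropped, so '1.5.0rc0' parses as (1, 5, 0);
    pre-releases are therefore treated like the final release.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def _pad(version, length):
    """Right-pad a version tuple with zeros so '1.5' compares equal to '1.5.0'."""
    return version + (0,) * (length - len(version))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    return _pad(a, width) >= _pad(b, width)


print(meets_minimum("1.5.3", "1.5"))  # True
print(meets_minimum("1.4.4", "1.5"))  # False
```

The installed version string could be obtained with `importlib.metadata.version("pandas")` from the standard library, assuming pandas is installed in the environment.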
fwiw here are the pandas by version download stats from pypi
num_downloads version
139179616 1.3.5
109924039 1.5.3
83627259 1.1.5
60940600 2.0.3
40190853 1.2.5
29904745 1.0.5
27627850 2.0.1
27423437 2.0.2
26081110 1.3.4
23006841 0.24.2
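To put those numbers in proportion, the sketch below sums the ten rows listed above and splits them into pre-2.0 and 2.x downloads. It covers only the versions shown in this table, not all pandas downloads, and the variable names are illustrative.

```python
# (num_downloads, version) rows copied from the table above.
ROWS = [
    (139179616, "1.3.5"),
    (109924039, "1.5.3"),
    (83627259, "1.1.5"),
    (60940600, "2.0.3"),
    (40190853, "1.2.5"),
    (29904745, "1.0.5"),
    (27627850, "2.0.1"),
    (27423437, "2.0.2"),
    (26081110, "1.3.4"),
    (23006841, "0.24.2"),
]

# Split on the major version number (the part before the first dot).
v2_downloads = sum(n for n, v in ROWS if int(v.split(".")[0]) >= 2)
pre2_downloads = sum(n for n, v in ROWS if int(v.split(".")[0]) < 2)
total = v2_downloads + pre2_downloads
v2_share = v2_downloads / total

print(f"pre-2.0: {pre2_downloads}, 2.x: {v2_downloads}")
print(f"2.x share of listed downloads: {v2_share:.1%}")  # 20.4%
```

Among the listed versions, roughly four in five downloads are still for pre-2.0 releases.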
from exchange_calendars.
👍
I've changed the min pandas to 1.5 in #323. Tests all passed with build times the same for the latest pandas as for 1.5.
(I hope to add a commit next week to move from pytz to zoneinfo, then I'll release it all as 4.3. Cheers.)
from exchange_calendars.
From 4.3, the minimum pandas version is 1.5 (implemented in #323).
from exchange_calendars.