Comments (14)

martindurant commented on August 14, 2024

There are a number of changes required for py2 compatibility. If this is important to you, it can be the highest priority task.

from fastparquet.

mrocklin commented on August 14, 2024

@jbednar do you need Python 2 support? We were hoping to avoid maintaining it by default unless someone showed up willing to pay the bill for it. Maintaining Python 2/3 support for data formats can be somewhat costly because of how Python 2 handles bytes/str.
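To make the bytes/str cost concrete, here is a minimal sketch of the kind of shim a library that reads binary formats tends to need when it supports both versions. The helper names `ensure_bytes` and `ensure_text` are hypothetical and are not taken from fastparquet; the point is only that Python 2's `str` is a byte string while Python 3's `str` is text, so every place a string crosses the file boundary has to be explicit about which one it means.

```python
import sys

PY2 = sys.version_info[0] == 2

# On Python 2 the text type is `unicode`; on Python 3 it is `str`.
# Only the selected branch is evaluated, so `unicode` is never
# looked up on Python 3.
text_type = unicode if PY2 else str  # noqa: F821


def ensure_bytes(value, encoding="utf-8"):
    """Return `value` as a byte string, encoding it if it is text."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value


def ensure_text(value, encoding="utf-8"):
    """Return `value` as text, decoding it if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value
```

Every column name, metadata key, and string value read from or written to a file would have to pass through helpers like these, which is where the maintenance cost comes from.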

martindurant commented on August 14, 2024

I've had a look, and I can get some tests working for py2 in my branch https://github.com/martindurant/fastparquet/tree/py2, but covering all the cases where strings crop up would probably be annoying.

jbednar commented on August 14, 2024

I do have a client project where we are using Python2, but we could conceivably migrate it. At the moment, I just need to benchmark the read times under different conditions, and it's hard to be sure the comparison is fair if some of the tests have to use Python2 and some Python3. Currently my castra tests have to use Python2 because the data file isn't portable to Python3, but again I could convert that by exporting to something portable and recreating it using Python3. So I think for all my own use cases, I can work around it; it's just a pain rather than a showstopper. But the most important point is that it might give people pause if they want to store their data in Parquet format, because they would worry that people using Python2 couldn't use it.

jbednar commented on August 14, 2024

In any case, pip should presumably not allow installation on Python2 if it's not supported, and the name of the wheel currently suggests that Python2 is supported.
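One common way to get this behaviour is setuptools' `python_requires` keyword, which recent versions of pip check before installing. The sketch below is not fastparquet's actual setup.py: the package name, version, and minimum version are placeholders, and the `check_python` guard is a hypothetical helper that gives a clear error when the code is run from a source checkout on an unsupported interpreter.

```python
import sys

# Hypothetical packaging metadata. In a real setup.py these keyword
# arguments would be passed to setuptools.setup(**SETUP_KWARGS).
SETUP_KWARGS = {
    "name": "example_pkg",        # placeholder, not the real package
    "version": "0.0.1",           # placeholder
    "python_requires": ">=3",     # pip refuses to install on Python 2
    "classifiers": [
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3 :: Only",
    ],
}

MIN_PYTHON = (3, 0)  # assumed minimum, for illustration only


def check_python(version_info=None, minimum=MIN_PYTHON):
    """Raise RuntimeError if the interpreter is older than `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    found = tuple(version_info[:2])
    if found < minimum:
        raise RuntimeError(
            "example_pkg requires Python %d.%d or newer, found %d.%d"
            % (minimum + found)
        )
    return found
```

The classifiers are informational only; it is `python_requires` that pip acts on.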

mrocklin commented on August 14, 2024

The performance limiting pieces here are not strongly Python version specific. You'll be much more bound by, say, LLVM version than Python version. I think that you can probably make the comparison relatively faithfully even if crossing versions.

If you find that Parquet is a good solution then perhaps the client project could pay for some of the work to port fastparquet to Python 2?

jbednar commented on August 14, 2024

I'm old, and I don't care to count how many times I've thought (or my students or collaborators have thought) that they had identified performance improvements or reductions, when in fact they were just being confused by seemingly irrelevant differences in setup that later turned out to be relevant. :-) So I've learned not to make any claims about performance if I'm not sure the two cases really are comparable, which is quite difficult to do between Python2 and Python3 (who knows which library might come into play for each one?). So while I too guess that Py2/3 doesn't matter in this case, I will always prefer to control it instead.
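One inexpensive way to keep such comparisons honest is to store a description of the environment next to every timing, so that differences in setup are at least visible afterwards. This is only a sketch using the standard library; `describe_environment` and `timed` are hypothetical helper names, and the module names passed in are whatever libraries the benchmark actually depends on.

```python
import importlib
import platform
import timeit


def describe_environment(module_names=()):
    """Collect interpreter, platform, and library versions as a dict."""
    env = {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
    }
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            env[name] = "not installed"
        else:
            env[name] = getattr(module, "__version__", "unknown")
    return env


def timed(func, repeat=3, number=1):
    """Return the best wall-clock time in seconds over `repeat` runs."""
    return min(timeit.repeat(func, repeat=repeat, number=number))
```

A result record could then be as simple as `{"seconds": timed(read_fn), "env": describe_environment(["numpy", "pandas"])}`, where `read_fn` stands for the read being measured.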

mrocklin commented on August 14, 2024

As you like. I just don't think that Python 2 support is as high a priority as some other things. I would prefer to see us build out a well fleshed out Python 3 solution and then think about Python 2 support, ideally with the financial backing of someone willing to pay for it.

mrocklin commented on August 14, 2024

Just my preference though. I don't have a good cost metric on what it will take to migrate and maintain. My guess is that it is non-trivial.

jbednar commented on August 14, 2024

Sure.

freshforlife commented on August 14, 2024

@martindurant @mrocklin, is fastparquet compatible with Python 2.7.12 at this time, or are you concentrating on Python 3? I need it for reading parquet files using dask dataframes. The error I get when I import fastparquet is ImportError: cannot import name encoding

martindurant commented on August 14, 2024

@freshforlife , PR #66 seems to work well, just needs a little cleanup before merging. Feel free to use and comment.

freshforlife commented on August 14, 2024

@martindurant Thanks! Where do I download it from? pip install seems to work and installs this version: fastparquet==0.0.4.post1, but importing it still gives the above error.

martindurant commented on August 14, 2024

To use that, you would need to clone this repository, check out the relevant branch, and install from source. Pip should also be able to install directly from that git branch if you can get the correct syntax. I will be releasing a new version of fastparquet shortly after the PR is merged.
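For reference, pip's syntax for installing from a git branch is `git+<repository URL>@<branch>`. The sketch below only builds the requirement string and the command; it does not run pip. The helper names are hypothetical, and the example uses the `py2` branch linked earlier in this thread, which is an assumption, since the branch behind PR #66 is not named here.

```python
import sys


def pip_vcs_requirement(repo_url, branch):
    """Build a pip requirement string pointing at a git branch."""
    return "git+{0}@{1}".format(repo_url, branch)


def pip_install_command(requirement):
    """Return the argument list for installing `requirement` with pip."""
    return [sys.executable, "-m", "pip", "install", requirement]


# Assumed repository and branch, taken from the link earlier in the thread.
requirement = pip_vcs_requirement(
    "https://github.com/martindurant/fastparquet.git", "py2"
)
command = pip_install_command(requirement)
```

Passing `command` to `subprocess.check_call` would perform the install with the same interpreter that runs the script, which avoids installing into a different Python than the one being tested.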
