dulwich-py3k's People

Contributors

abderrahim, asabil, bcully, bduncan, brutasse, bz2, danc86, dborowitz, ddborowitz, dirkneumann, durin42, eberle1080, herve76, jameinel, james-w, jc2k, jelmer, mbr, mckern, milki, mwhudson, nickstenning, pquantin, rblasch, rctay, rockstar, ronnypfannschmidt, techtonik, whitlockjc, zombiezen

Forkers

tdc-bob

dulwich-py3k's Issues

Add unit tests for Sha1Sum class

The Sha1Sum class is probably one of the biggest changes introduced in the port. It definitely needs some unit tests to make sure it works, since there are so many areas that rely on it working correctly.

Possible memory leak

Heh, just noticed a small potential memory leak in _objects.c introduced by yours truly:

Look right around Py_DECREF(key_string);

Version checks

There are a number of places in the code (C, Python, and Makefile) that check the python version. I should make sure that the version numbers are correct, and that the checks are actually needed in python 3.
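For the Python-side checks, one option is to compare sys.version_info as a tuple instead of parsing version strings. This is only a minimal sketch: the helper name is_python3 is hypothetical and does not come from the dulwich-py3k source.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version is Python 3.0 or newer.

    Hypothetical helper for illustration; not part of dulwich-py3k.
    """
    # Compare tuples, not strings: "3.10" sorts before "3.9" as text,
    # but (3, 10) is greater than (3, 9) as a tuple.
    return tuple(version_info[:2]) >= (3, 0)
```

Passing the version in as an argument keeps the check easy to unit test against both 2.x and 3.x tuples.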

Use correct encodings

It occurs to me that I'm relying on the default encoding, which is ASCII, when converting from bytes to strings. Upon attempting to view "real-world" log entries, I'm getting all kinds of UnicodeEncodeError exceptions because the characters are out of range.

dulwich, I guess, will be using utf-8 as the default encoding until I see a compelling reason not to.
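A minimal sketch of what an explicit UTF-8 conversion could look like. The function name decode_text and the fallback to replacement characters are assumptions made for illustration, not the behaviour of the actual port.

```python
def decode_text(raw, encoding="utf-8"):
    """Decode bytes into a str, naming the encoding explicitly.

    Hypothetical helper; the fallback policy is an assumption.
    """
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Bytes that are not valid in the chosen encoding become
        # U+FFFD replacement characters instead of raising.
        return raw.decode(encoding, errors="replace")


# "caf\u00e9" contains one non-ASCII character (e with acute accent).
message = "caf\u00e9".encode("utf-8")
text = decode_text(message)
```

Whether invalid bytes should be replaced or should raise is a design choice; replacing keeps a log viewer usable, while raising surfaces bad data earlier.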

Documentation, documentation, documentation

First of all, I need to add comments to my new stuff. That's just polite.

Secondly, there's one major change between python 2 and python 3 that will affect how this is called: Strings vs Bytes. In some places Strings are necessary (e.g. path names). In other places, Bytes objects are preferred. When to use each should be spelled out very clearly in the documentation and comments.

Finally I suppose I should probably also document precisely what changed from 2 to 3, so that any future maintainers can make any necessary changes when porting from upstream.

Port dulwich C extensions to Python 3

The python 3 C API has changed somewhat, mainly related to the strings / bytes calls. The C extensions can be deactivated for now, but eventually they'll need to be ported over too.

Add unit tests to check for unicode abilities

By default it seems that python 3 likes to use the ascii codec when you call decode. This is fine and good for us US users, but try running dulwich log on the dulwich project itself and see what happens (spoiler alert: there are utf8 characters in the commit logs, and no errors are thrown until you actually try to print the string).

There need to be unit tests that check that everything that could potentially be a unicode string, actually supports unicode (i.e. decode is being called correctly).
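Such a test might look roughly like the sketch below. It uses a stand-in decode_message function, because the real dulwich-py3k call sites are not shown here; both that function and the test names are hypothetical.

```python
import unittest


def decode_message(raw, encoding="utf-8"):
    """Stand-in for whatever code turns commit bytes into a str."""
    return raw.decode(encoding)


class UnicodeMessageTests(unittest.TestCase):
    def test_non_ascii_round_trip(self):
        # A message with one non-ASCII character must survive
        # encoding to bytes and decoding back to str.
        original = "caf\u00e9"
        raw = original.encode("utf-8")
        self.assertEqual(decode_message(raw), original)

    def test_ascii_codec_rejects_non_ascii(self):
        # Documents the failure mode: decoding UTF-8 bytes with the
        # ascii codec raises instead of silently succeeding.
        raw = "caf\u00e9".encode("utf-8")
        with self.assertRaises(UnicodeDecodeError):
            decode_message(raw, encoding="ascii")
```

Saved in a module, these can be run with python -m unittest.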

Remove (or minimize) the conversion decorators

The port is finished, but it relies very heavily on decorators to do a lot of the heavy lifting, in particular wrap3kstr and convert3kstr. These are simple wrappers I wrote (the first is a decorator) that transparently convert between String and Bytes objects.

They're very nice, and very necessary, but they slow things down considerably and make the code behave in a mysterious fashion (passing in a string may result in a bytes object being returned, hardly intuitive). I believe strongly in the principle of least astonishment, so I'd like to remove these things slowly, instead preferring explicit conversions and type checking (assertions?).

To support this, I've added a flag to the wrap3kstr decorator: enforcing=True. This does an assert on the type, rather than a conversion. This should allow me to slowly convert the calling code into something explicit, and eventually get rid of the decorators completely.
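To illustrate the idea, here is a rough sketch of a decorator with an enforcing flag. It is not the actual wrap3kstr implementation: the name wrap3kstr_sketch, the str-to-bytes direction, the encoding parameter, and the example functions are all assumptions.

```python
import functools


def wrap3kstr_sketch(enforcing=False, encoding="utf-8"):
    """Hypothetical stand-in for wrap3kstr; not the real decorator.

    enforcing=False: str positional arguments are encoded to bytes.
    enforcing=True:  str positional arguments trigger an assertion.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            converted = []
            for arg in args:
                if isinstance(arg, str):
                    # In enforcing mode the caller must already pass bytes.
                    assert not enforcing, (
                        "%s expected bytes, got str" % func.__name__)
                    arg = arg.encode(encoding)
                converted.append(arg)
            return func(*converted, **kwargs)
        return wrapper
    return decorator


@wrap3kstr_sketch()
def byte_length(data):
    # Lenient: accepts str or bytes, always sees bytes.
    return len(data)


@wrap3kstr_sketch(enforcing=True)
def strict_byte_length(data):
    # Strict: callers must convert explicitly before calling.
    return len(data)
```

Note that assert statements are skipped when Python runs with -O, so an explicit TypeError may be a safer choice if the check has to hold in optimized runs.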
