python-glob2

This is an extended version of Python's built-in glob module (http://docs.python.org/library/glob.html) which adds:

  • The ability to capture the text matched by glob patterns, and return those matches alongside the filenames.
  • A recursive '**' globbing syntax, similar to the globstar option of the bash shell.
  • The ability to replace the filesystem functions used, in order to glob on virtual filesystems.
  • Compatibility with both Python 2 and Python 3 (tested with 3.3).

It's currently based on the glob code from Python 3.3.1.

Examples

Matches being returned:

import glob2

for filename, (version,) in glob2.iglob('./binaries/project-*.zip', with_matches=True):
    print(version)
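
To make the idea of "captured matches" concrete, here is a minimal standard-library sketch of how text matched by wildcards could be collected. It is not glob2's implementation: the helper names glob_to_regex and match_with_groups are hypothetical, and only the * and ? wildcards are handled.

```python
import re


def glob_to_regex(pattern):
    """Translate a simple glob (only * and ?) into a regex with one group per wildcard."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("([^/]*)")   # any run of characters within one path element
        elif ch == "?":
            parts.append("([^/])")    # exactly one character within a path element
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z")


def match_with_groups(pattern, names):
    """Yield (name, captured_groups) for every name that matches the pattern."""
    regex = glob_to_regex(pattern)
    for name in names:
        m = regex.match(name)
        if m:
            yield name, m.groups()


names = ["project-1.0.zip", "project-2.1.zip", "notes.txt"]
results = list(match_with_groups("project-*.zip", names))
for filename, (version,) in results:
    print(filename, version)
```

Each wildcard contributes one element to the tuple, which is why the example above unpacks a one-element tuple as (version,).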

Recursive glob:

>>> import glob2
>>> all_header_files = glob2.glob('src/**/*.h')
>>> all_header_files
['src/fs.h', 'src/media/mp3.h', 'src/media/mp3/frame.h', ...]

Note that ** must appear on its own as a directory element to have its special meaning. **h will not have the desired effect.

** will match ".", so **/*.py returns Python files in the current directory. If this is not wanted, */**/*.py should be used instead.
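
The difference between the two patterns can be tried out with the standard library alone. The sketch below uses glob.glob(..., recursive=True) from Python 3.5+, whose ** also matches zero directories; it is assumed here as a stand-in for glob2, and the helper relative_matches and the file layout are made up for illustration.

```python
import glob
import os
import tempfile


def relative_matches(root, pattern):
    """Glob below root and return the matches as a set of '/'-separated relative paths."""
    found = glob.glob(os.path.join(root, pattern), recursive=True)
    return {os.path.relpath(p, root).replace(os.sep, "/") for p in found}


with tempfile.TemporaryDirectory() as root:
    # Layout: a.py at the top level, b.py one level down, c.py two levels down.
    os.makedirs(os.path.join(root, "pkg", "sub"))
    for rel in ("a.py", "pkg/b.py", "pkg/sub/c.py"):
        open(os.path.join(root, *rel.split("/")), "w").close()

    everything = relative_matches(root, "**/*.py")    # includes the top-level a.py
    below_top = relative_matches(root, "*/**/*.py")   # requires at least one directory

print(sorted(everything))
print(sorted(below_top))
```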

Custom Globber:

from glob2 import Globber

class VirtualStorageGlobber(Globber):
    def __init__(self, storage):
        self.storage = storage
    def listdir(self, path):
        # Must raise os.error if path is not a directory
        return self.storage.listdir(path)
    def exists(self, path):
        return self.storage.exists(path)
    def isdir(self, path):
        # Used only for trailing slash syntax (``foo/``).
        return self.storage.isdir(path)
    def islink(self, path):
        # Used only for recursive glob (``**``).
        return self.storage.islink(path)

globber = VirtualStorageGlobber(sftp_storage)
globber.glob('/var/www/**/*.js')
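
The storage object itself is not shown above. As a rough illustration of what such an object might look like, here is a hypothetical in-memory DictStorage that offers the four methods the globber delegates to. The class name and the directory layout are assumptions for this sketch, not part of glob2.

```python
import posixpath


class DictStorage:
    """Hypothetical in-memory storage mapping directory paths to their child names."""

    def __init__(self, tree):
        self.tree = tree

    @staticmethod
    def _normalize(path):
        # Strip a trailing slash but keep the root as "/".
        return path.rstrip("/") or "/"

    def listdir(self, path):
        try:
            return list(self.tree[self._normalize(path)])
        except KeyError:
            # os.error is an alias of OSError, so this satisfies the contract above.
            raise OSError("not a directory: %r" % path)

    def isdir(self, path):
        return self._normalize(path) in self.tree

    def exists(self, path):
        path = self._normalize(path)
        if path in self.tree:
            return True
        parent, name = posixpath.split(path)
        return name in self.tree.get(parent, [])

    def islink(self, path):
        return False  # this toy storage has no symbolic links


storage = DictStorage({
    "/var/www": ["app.js", "lib"],
    "/var/www/lib": ["util.js"],
})
print(storage.listdir("/var/www"))
```

An instance like this could then be passed where sftp_storage is used in the example above.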

If isdir and/or islink cannot be implemented for a storage, you can make them return a fixed value, with the following consequences:

  • If isdir always returns True, a glob expression ending with a slash will return all items, even non-directories. If it always returns False, the same glob expression will return nothing.
  • If islink always returns True, the recursive globbing syntax ** will follow all links. If it always returns False, it will not work at all.

python-glob2's People

Contributors

0x022b, ankostis, jehamilton, joolswills, miracle2k, petebrowne, pombredanne, xoviat, zalan-axis

python-glob2's Issues

License

Are there any restrictions on using this? Does it have a specific license?

pypi overwrote 0.4.1

Looks like you guys somehow uploaded version 0.4.1 to pypi and overwrote the previous "version" of 0.4.1 with different code.

Might wanna upgrade the version before pushing to pypi, this broke stuff for us.

No option to override the excluding of hidden files?

Hi,

From what I can see of the code, there is no way to override whether it looks at hidden files. I think this feature could be added by overriding the check but before I look at creating a PR I wanted to see if this was something you were interested in?

Thanks!

`iglob` returns a list on Python 2

On Python 2, iglob returns a list rather than an iterator.

Python 2.7.11 (default, Mar  1 2016, 18:40:10) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import glob2
>>> result = glob2.iglob('**/*.py')
>>> type(result)
<type 'list'>

Unicode compatibility

I have had some issues in globbing and matching filenames with unicode characters on different OSes/filesystems.

@remram44 has suggested a solution which I think will be a good addition to glob2. The only issue I can see is that it may be slower than the current code though I haven't benchmarked it.

If you think it would be useful, I can work on a pull request with this feature.

0.4.1 Incompatible with python2.6

I cannot import glob2 in python2.6 (works fine in 2.7). Below is the traceback

Python 2.6.6 (r266:84292, Jan 22 2014, 09:42:36)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import glob2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/glob2/__init__.py", line 2, in <module>
    from .impl import *
  File "/usr/lib/python2.6/site-packages/glob2/impl.py", line 8, in <module>
    from . import fnmatch
  File "/usr/lib/python2.6/site-packages/glob2/fnmatch.py", line 18, in <module>
    from .compat import lru_cache
  File "/usr/lib/python2.6/site-packages/glob2/compat.py", line 22
    fasttypes = {int, str, frozenset, type(None)},
                    ^
SyntaxError: invalid syntax
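
The caret in the traceback points at a set literal. Set literals were added in Python 2.7, so Python 2.6 cannot parse them. A sketch of a spelling that parses on both is shown below; whether the maintainers would want to restore 2.6 support this way is an assumption.

```python
# Python 2.7+ / 3.x spelling (a syntax error on Python 2.6):
#   fasttypes = {int, str, frozenset, type(None)}

# Equivalent spelling that also parses on Python 2.6:
fasttypes = set([int, str, frozenset, type(None)])

print(int in fasttypes)
```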

How can `sep` parameter be used? Is it broken?

I'm using a custom globber that internally uses only forward slashes and I also consistently specify sep='/' to iglob etc., and yet, on Windows I can still get paths joined by '\'.

I see that this sep is only being passed to _join_paths, and inside it all that happens is that / gets replaced with this character. What about replacing \, though?

python-glob2/glob2/impl.py

Lines 211 to 215 in ef4b58f

def _join_paths(paths, sep=None):
    path = join(*paths)
    if sep:
        path = re.sub(r'\/', sep, path) # cached internally
    return path

There's also a suspicious backslash (that has no effect at all) inside the pattern 🤔

It's strange that os.path.join is being used for this library, which strives to fully abstract this operation -- and it currently seems like I have no better option but to map all its outputs through pathlib.Path(p).as_posix(). This particular operation is not overridable.

So, my use case doesn't work, but I also find it difficult to understand how sep can be used at all, as on Windows it will always just be skipped.
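
One possible shape for a fix, sketched under assumptions: the hypothetical join_paths below replaces both '/' and '\' with the requested separator. It takes the join function as a parameter (defaulting to ntpath.join so the Windows behaviour can be reproduced on any platform). This is not the library's code, and it is only safe for storages where a backslash never appears inside a file name.

```python
import ntpath
import re


def join_paths(paths, sep=None, join=ntpath.join):
    """Join path elements, then normalize every separator to sep if one is given."""
    path = join(*paths)
    if sep:
        # A function is used as the replacement so that a backslash in sep
        # is inserted literally instead of being parsed as an escape sequence.
        path = re.sub(r"[\\/]", lambda match: sep, path)
    return path


print(join_paths(["a", "b", "c.txt"], sep="/"))
print(join_paths(["a/b", "c"], sep="/"))
```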

Merging with coala glob

Hi,

We have created a glob library similar to yours for the coala project. Currently I'm looking into the possibility of splitting out general-purpose parts into their own libraries and merging with existing projects where possible.

Would you be open to such a merge? We have quite similar functionality, as far as I can see from skimming your readme.

Our code is at https://github.com/coala-analyzer/coala/blob/master/coalib/parsing/Globbing.py and our tests reside at https://github.com/coala-analyzer/coala/blob/master/coalib/tests/parsing/GlobbingTest.py . That said, our glob module is fully tested with continuous integration against windows, linux and mac with full test branch coverage which is one of the main benefits you'd get from the merge - mature stability. With the merge we can also offer maintenance help, setting up proper CI doing automated releases and that stuff if this is desired as we will continue to use the library.

DeprecationWarning: Flags not at the start of the expression for glob2.glob("/tmp/**/*.json")

Warning

  • DeprecationWarning: Flags not at the start of the expression
  • Python 3.6.8

Used the following method:

glob2.glob("/tmp/**/*.json")

Logs

0: DeprecationWarning: Flags not at the start of the expression '(.*)\\Z(?ms)'
  return re.compile(res, flags).match
/usr/lib/python3.6/site-packages/glob2/fnmatch.py:80: DeprecationWarning: 
    Flags not at the start of the expression '(.*)\\.json\\Z(?ms)'
  return re.compile(res, flags).match
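
The warning refers to the inline flag group (?ms) sitting at the end of the generated pattern. A minimal sketch of the usual remedy is to move the group to the front of the expression. The pattern text is taken from the log above; the sample path is made up.

```python
import re
import warnings

# Same pattern as in the log, with the inline flags moved to the start.
FIXED_PATTERN = r"(?ms)(.*)\.json\Z"

with warnings.catch_warnings():
    # Turn any warning raised during compilation into an exception.
    warnings.simplefilter("error")
    matcher = re.compile(FIXED_PATTERN).match

match = matcher("/tmp/data/config.json")
print(match.group(1))
```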

Fails in Python 3.5.2 - module 'itertools' has no attribute 'imap'

According to this stackoverflow answer, imap was removed from itertools because map now supplies the same functionality.

Full message:

all_header_files = glob2.glob('*/.htm')
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib/python3.5/site-packages/glob2/impl.py", line 54, in glob
    return list(self.iglob(pathname, with_matches, include_hidden))
  File "/usr/lib/python3.5/site-packages/glob2/impl.py", line 76, in iglob
    return itertools.imap(lambda s: s[0], result)
AttributeError: module 'itertools' has no attribute 'imap'
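
A common way to bridge this difference is a small compatibility shim that falls back to the built-in map on Python 3, where map is already lazy. The sketch below assumes such a shim would be acceptable for the library; the sample data is made up.

```python
try:
    from itertools import imap  # Python 2: lazy version of map
except ImportError:
    imap = map                  # Python 3: the built-in map is already lazy

result = [("a.htm", ()), ("b.htm", ())]
filenames = list(imap(lambda s: s[0], result))
print(filenames)
```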

Merge in changes made in Python 3.4+ stdlib.

Hello,

I would appreciate it if you updated your very useful library for the latest Python 3.6.x.

I don't mean that it has a problem or incompatibility right now, but I would appreciate it if you would consider updating the repo.

It is a very, very useful package indeed.

release?

Hello, would you mind releasing sometime soon?
