explosion / catalogue
Super lightweight function registries for your library
License: MIT License
Dear explosion,
In your release v2.0.2 you introduced a maximum version constraint for importlib-metadata, which now needs to be <3.3.0 (Line 1 in 7a39fc7).
This makes catalogue incompatible with twine (a quite useful package for testing the packaging and the upload to PyPI), since twine has required importlib-metadata >= 3.6 since its release v3.4.0.
Was there any specific reason for pinning importlib-metadata to <3.3.0?
Can you please suggest a solution that allows compatibility with twine?
Thank you in advance!
Hi, I had a question regarding the following line:
catalogue/catalogue/__init__.py
Line 111 in 3dc5259
Why is this line different from the corresponding one in the _get_all function? Should it be if len(self.namespace) <= len(keys)?
In spacy.cli.util.py you did this:
app = typer.Typer(name=NAME, help=HELP)

def setup_cli() -> None:
    # Make sure the entry-point for CLI runs, so that they get imported.
    registry.cli.get_all()
    # Ensure that the help messages always display the correct prompt
    command = get_command(app)
    command(prog_name=COMMAND)

Here registry.cli is a catalogue Registry, but I can't work out where the commands are actually registered to it. They live in different Python files, so they would not normally be registered, yet somehow it works. Can you please give me some explanation?
PS: When I use the same structure, commands in other files don't get imported.
I'm seeing an issue running the tests with python3.10. The error is
python3.10-catalogue> ============================= test session starts ==============================
python3.10-catalogue> platform linux -- Python 3.10.1, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
python3.10-catalogue> rootdir: /build/catalogue-2.0.6
python3.10-catalogue> collected 8 items
python3.10-catalogue> catalogue/tests/test_catalogue.py ......F. [100%]
python3.10-catalogue> =================================== FAILURES ===================================
python3.10-catalogue> ______________________________ test_entry_points _______________________________
python3.10-catalogue> def test_entry_points():
python3.10-catalogue> # Create a new EntryPoint object by pretending we have a setup.cfg and
python3.10-catalogue> # use one of catalogue's util functions as the advertised function
python3.10-catalogue> ep_string = "[options.entry_points]test_foo\n bar = catalogue:check_exists"
python3.10-catalogue> > ep = catalogue.importlib_metadata.EntryPoint._from_text(ep_string)
python3.10-catalogue> E AttributeError: type object 'EntryPoint' has no attribute '_from_text'
python3.10-catalogue> catalogue/tests/test_catalogue.py:108: AttributeError
python3.10-catalogue> =========================== short test summary info ============================
python3.10-catalogue> FAILED catalogue/tests/test_catalogue.py::test_entry_points - AttributeError:...
The same route works fine with python3.9. Here's a full log from the NixOS CI system: https://hydra.nixos.org/log/pfyk1v5fl14yf1n31v8ppjknqxwzgrgm-python3.10-catalogue-2.0.6.drv
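The failing call uses a private helper, EntryPoint._from_text, which is not present in every version of importlib.metadata. A minimal sketch of building the same kind of object through the public constructor follows. It assumes Python 3.8+ and uses json:dumps as a stand-in target so that it runs without catalogue installed; whether the test suite should be changed this way is not confirmed by the report.

```python
from importlib.metadata import EntryPoint

# Build an entry point directly instead of parsing setup.cfg-style text.
# "json:dumps" is a stand-in for "catalogue:check_exists" (assumption, so the
# sketch runs without catalogue installed).
ep = EntryPoint(name="bar", value="json:dumps", group="test_foo")

# load() imports the module on the left of the colon and returns the
# attribute named on the right.
func = ep.load()
print(ep.name, ep.group, func.__name__)
```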
Is v1.x supposed to maintain python 2.7 compatibility? It looks like the recent commit from 10 days ago (ef4fd81) creates 1.0.1 but introduces changes that require python 3.
This was discovered when trying to install an older compatible spacy package, but pip grabbed 1.0.1 which ends up producing this error:
.../site-packages/catalogue/_importlib_metadata/__init__.py", line 170
    def __len__(self) -> int:
                      ^
SyntaxError: invalid syntax
Hello,
I just noticed that catalogue version v2.1.0 (currently the most recent version on PyPI) depends on packages that are not declared in the setup-related files.
This is probably due to the temporary inclusion of the config system, which now lives in confection (#33).
The traceback when importing catalogue is the following:
Traceback (most recent call last):
File "...", line 8, in <module>
from functions import FUNCTIONS
File "./functions.py", line 1, in <module>
import catalogue
File ".../lib/python3.10/site-packages/catalogue/__init__.py", line 2, in <module>
from catalogue.config import *
File ".../lib/python3.10/site-packages/catalogue/config/__init__.py", line 1, in <module>
from .config import *
File ".../lib/python3.10/site-packages/catalogue/config/config.py", line 10, in <module>
from pydantic import BaseModel, create_model, ValidationError, Extra
ModuleNotFoundError: No module named 'pydantic'
After installing pydantic to fix the problem, the traceback states that it cannot find srsly.
There's a Stack Overflow post that seems to point at a problem with catalogue:
File "C:\Users\user1\AppData\Local\Continuum\anaconda3\envs\py37\lib\site-packages\catalogue.py", line 8, in
import importlib.metadata as importlib_metadata
ModuleNotFoundError: No module named 'importlib.metadata'
This refers to https://github.com/explosion/catalogue/blob/master/catalogue.py#L8, which makes no sense to me, as ModuleNotFoundError is a subtype of ImportError, and ImportError is properly caught there.
There are a lot of other things going on in those error logs, too, but this caught my attention and I wanted to log it here for future reference.
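The reasoning about the exception hierarchy can be checked with a short sketch. The module name below is hypothetical and chosen so that the import fails. The sketch only shows that an except ImportError clause also catches ModuleNotFoundError; it does not explain why the reported traceback occurred.

```python
# ModuleNotFoundError (Python 3.6+) derives from ImportError.
print(issubclass(ModuleNotFoundError, ImportError))

caught = None
try:
    # Hypothetical module name that is assumed not to exist.
    import module_that_does_not_exist_xyz
except ImportError as err:
    # The concrete exception is ModuleNotFoundError, caught via its base class.
    caught = type(err).__name__

print(caught)
```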
Thanks for building this.
Would it be easy to use the __name__ of the callable being passed in as the default, while allowing a name kwarg to overwrite it? I started going down a decorator rabbit hole on Stack Overflow to try and pitch in a solution, but got lost in decorator hell.
Using loaders = catalogue.create("mypackage", "loaders") as our shared example, here's what things look like at the moment:
# passing the name in explicitly
@loaders.register(name='custom_func')
def custom_func(data):
    pass

vs.

# letting the func name itself
@loaders.register
def custom_func(data):
    pass

vs.

# a cool shorthand version
@loaders
def custom_func(data):
    pass

This was my attempt, but it doesn't accept passing in the name kwarg.
class Registry:
    callables = {}

    def __call__(self, func, name=None):
        if not name: name = func.__name__
        self.callables[name] = func

    def __contains__(self, func):
        return func in self.callables

    def __repr__(self):
        return f"{self.callables}"

loaders = Registry()

# this works fine
@loaders
def custom_func(data):
    pass

# this not so much
@loaders(name='blah')
def custom_func(data):
    pass

What do you think?
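One way to support both the bare and the parameterised form is to make the function argument optional and return an inner decorator when it is missing. The following is a minimal sketch of that pattern on a toy Registry class. It is not catalogue's implementation, and the class and its attributes are illustrative only.

```python
from typing import Callable, Dict, Optional


class Registry:
    """Toy registry for illustration; not catalogue's implementation."""

    def __init__(self) -> None:
        # Per-instance storage, so separate registries do not share entries.
        self.callables: Dict[str, Callable] = {}

    def __call__(self, func: Optional[Callable] = None, *, name: Optional[str] = None):
        def decorator(f: Callable) -> Callable:
            # Fall back to the function's own name when no name is given.
            self.callables[name or f.__name__] = f
            return f  # return the function so the decorated name stays usable

        # Bare use (@loaders): Python passes the function directly.
        if func is not None:
            return decorator(func)
        # Parameterised use (@loaders(name=...)): return the real decorator.
        return decorator

    def __contains__(self, name: str) -> bool:
        return name in self.callables


loaders = Registry()


@loaders
def custom_func(data):
    return data


@loaders(name="blah")
def other_func(data):
    return data


print(sorted(loaders.callables))
```

Making name keyword-only keeps the two call forms unambiguous: a positional argument is always the function being decorated.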
I'm getting this extremely odd error that tells me chain.v1 is not available, while at the same time listing it as an available name.
catalogue.RegistryError: Cant't find 'chain.v1' in registry horizon -> components. Available names: Pipeline.v1, _, chain.v1, fetch.v1, response_to_bytes.v1, response_to_dict.v1, response_to_text.v1
I've tried running my test in debug mode with a breakpoint where the error occurs. If I run Registry.get() there, it fails as in the test, but if I run Registry.get() a second time, it works.
In Registry.get() we only look for the specific name, i.e. from_entry_point = self.get_entry_point(name), or in the global variable REGISTRY. On the first run the name exists in neither. On the second run it does exist in REGISTRY, because Registry.get_entry_points() was called to build the error message, and that method loads all the modules in the entry points and thereby populates REGISTRY.
Load all modules from entry points in Registry.get() if there are entry points but the requested one is not found directly.
I am loading a spaCy model as part of a step in my Dataflow streaming pipeline. To load the pre-downloaded spaCy model for a specific language I am using nlp_model = spacy.load(SPACY_KEYS[lang]) where SPACY_KEYS is a dictionary containing the names of the models for each language (e.g. 'en': 'en_core_web_sm').
This works without any issues for the majority of the jobs run by the pipeline, but for a few iterations I am getting the following error, which seems to be coming from catalogue:
Error message from worker: generic::unknown: Traceback (most recent call last):
File "apache_beam/runners/common.py", line 1232, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 752, in apache_beam.runners.common.PerWindowInvoker.invoke_process
File "apache_beam/runners/common.py", line 870, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
File "apache_beam/runners/common.py", line 1368, in apache_beam.runners.common._OutputProcessor.process_outputs
File "/usr/local/lib/python3.7/site-packages/submodules/entities_and_pii_removal.py", line 259, in entities_and_PII
nlp_model = spacy.load(SPACY_KEYS[lang]) # load spacy model
File "/usr/local/lib/python3.7/site-packages/spacy/__init__.py", line 52, in load
name, vocab=vocab, disable=disable, exclude=exclude, config=config
File "/usr/local/lib/python3.7/site-packages/spacy/util.py", line 420, in load_model
return load_model_from_package(name, **kwargs) # type: ignore[arg-type]
File "/usr/local/lib/python3.7/site-packages/spacy/util.py", line 453, in load_model_from_package
return cls.load(vocab=vocab, disable=disable, exclude=exclude, config=config) # type: ignore[attr-defined]
File "/usr/local/lib/python3.7/site-packages/de_core_news_sm/__init__.py", line 10, in load
return load_model_from_init_py(__file__, **overrides)
File "/usr/local/lib/python3.7/site-packages/spacy/util.py", line 621, in load_model_from_init_py
config=config,
File "/usr/local/lib/python3.7/site-packages/spacy/util.py", line 489, in load_model_from_path
return nlp.from_disk(model_path, exclude=exclude, overrides=overrides)
File "/usr/local/lib/python3.7/site-packages/spacy/language.py", line 2042, in from_disk
util.from_disk(path, deserializers, exclude) # type: ignore[arg-type]
File "/usr/local/lib/python3.7/site-packages/spacy/util.py", line 1299, in from_disk
reader(path / key)
File "/usr/local/lib/python3.7/site-packages/spacy/language.py", line 2037, in <lambda>
p, exclude=["vocab"]
File "spacy/pipeline/trainable_pipe.pyx", line 343, in spacy.pipeline.trainable_pipe.TrainablePipe.from_disk
File "/usr/local/lib/python3.7/site-packages/spacy/util.py", line 1299, in from_disk
reader(path / key)
File "spacy/pipeline/trainable_pipe.pyx", line 333, in spacy.pipeline.trainable_pipe.TrainablePipe.from_disk.load_model
File "spacy/pipeline/trainable_pipe.pyx", line 334, in spacy.pipeline.trainable_pipe.TrainablePipe.from_disk.load_model
File "/usr/local/lib/python3.7/site-packages/thinc/model.py", line 593, in from_bytes
return self.from_dict(msg)
File "/usr/local/lib/python3.7/site-packages/thinc/model.py", line 624, in from_dict
loaded_value = deserialize_attr(default_value, value, attr, node)
File "/usr/local/lib/python3.7/functools.py", line 840, in wrapper
return dispatch(args[0].__class__)(*args, **kw)
File "/usr/local/lib/python3.7/site-packages/thinc/model.py", line 804, in deserialize_attr
return srsly.msgpack_loads(value)
File "/usr/local/lib/python3.7/site-packages/srsly/_msgpack_api.py", line 27, in msgpack_loads
msg = msgpack.loads(data, raw=False, use_list=use_list)
File "/usr/local/lib/python3.7/site-packages/srsly/msgpack/__init__.py", line 76, in unpackb
for decoder in msgpack_decoders.get_all().values():
File "/usr/local/lib/python3.7/site-packages/catalogue/__init__.py", line 110, in get_all
for keys, value in REGISTRY.items():
RuntimeError: dictionary changed size during iteration
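The last frame suggests that REGISTRY gained or lost an entry while get_all was iterating over it, for example from another thread. That cause is an assumption, not something the traceback proves. The sketch below reproduces the error on a plain dict and shows a common mitigation, iterating over a snapshot of the items. The function names and the on_item callback, which stands in for a concurrent writer, are hypothetical.

```python
REGISTRY = {("ns", "a"): 1, ("ns", "b"): 2}


def add_late(registry):
    # Stand-in for a concurrent writer registering a new function.
    registry[("ns", "late")] = 3


def unsafe_get_all(registry, on_item):
    result = {}
    # Iterates over the live dict view: a size change raises RuntimeError.
    for keys, value in registry.items():
        on_item(registry)
        result[keys] = value
    return result


def snapshot_get_all(registry, on_item):
    result = {}
    # list(...) copies the items first, so later insertions are harmless.
    for keys, value in list(registry.items()):
        on_item(registry)
        result[keys] = value
    return result


try:
    unsafe_get_all(dict(REGISTRY), add_late)
    unsafe_raised = False
except RuntimeError:
    unsafe_raised = True

safe_result = snapshot_get_all(dict(REGISTRY), add_late)
print(unsafe_raised, len(safe_result))
```

A snapshot only sees entries present when the copy was taken. If callers must never miss a concurrent registration, guarding reads and writes with a threading.Lock is the stricter alternative.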
Run this with Python 3.10.0 and Spacy 3.2.4.
import en_core_web_md
from warnings import filterwarnings
filterwarnings('error')
pipeline = en_core_web_md.load()
You'll get,
Traceback (most recent call last):
File "/home/username/project/prog.py", line 5, in <module>
pipeline = en_core_web_md.load()
File "/home/username/.local/share/virtualenvs/project-2ZeatEXR/lib/python3.10/site-packages/en_core_web_md/__init__.py", line 10, in load
return load_model_from_init_py(__file__, **overrides)
File "/home/username/.local/share/virtualenvs/project-2ZeatEXR/lib/python3.10/site-packages/spacy/util.py", line 615, in load_model_from_init_py
return load_model_from_path(
File "/home/username/.local/share/virtualenvs/project-2ZeatEXR/lib/python3.10/site-packages/spacy/util.py", line 488, in load_model_from_path
nlp = load_model_from_config(config, vocab=vocab, disable=disable, exclude=exclude)
File "/home/username/.local/share/virtualenvs/project-2ZeatEXR/lib/python3.10/site-packages/spacy/util.py", line 524, in load_model_from_config
lang_cls = get_lang_class(nlp_config["lang"])
File "/home/username/.local/share/virtualenvs/project-2ZeatEXR/lib/python3.10/site-packages/spacy/util.py", line 325, in get_lang_class
if lang in registry.languages:
File "/home/username/.local/share/virtualenvs/project-2ZeatEXR/lib/python3.10/site-packages/catalogue/__init__.py", line 49, in __contains__
has_entry_point = self.entry_points and self.get_entry_point(name)
File "/home/username/.local/share/virtualenvs/project-2ZeatEXR/lib/python3.10/site-packages/catalogue/__init__.py", line 135, in get_entry_point
for entry_point in AVAILABLE_ENTRY_POINTS.get(self.entry_point_namespace, []):
File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 400, in get
self._warn()
DeprecationWarning: SelectableGroups dict interface is deprecated. Use select.
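The warning comes from calling .get() on the object returned by importlib.metadata.entry_points(), whose dict-style interface is deprecated in Python 3.10 in favour of select. A minimal compatibility sketch follows, assuming Python 3.8+. The helper name get_entry_points is hypothetical, and this is not necessarily how catalogue resolved the issue.

```python
from importlib.metadata import entry_points


def get_entry_points(group):
    """Return the entry points in `group` as a list (hypothetical helper)."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: use the selection interface, which does not warn.
        return list(eps.select(group=group))
    # Python 3.8/3.9: entry_points() returns a plain dict of groups.
    return list(eps.get(group, []))


# "spacy_languages" is only an example group name; the list is empty
# when no installed package advertises entry points in that group.
print(len(get_entry_points("spacy_languages")))
```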