As of April 2022, Lore has been deprecated at Instacart and will not be supported further. We advise against using Lore in new code.
The original readme has been moved.
Lore makes machine learning approachable for Software Engineers and maintainable for Machine Learning Researchers
License: MIT License
Hi,
We have some new models written under lore framework and we typically use Airflow to automate most of our existing data and ML pipelines. Thus, we would love to automate the lore training pipeline in Airflow as well. Before implementing this, I am wondering how you guys at Instacart manage the ML training pipelines in production? Thanks for the help!
Hi, I ran lore init like 'lore init my_app --python-version=3.6.4 --keras --xgboost' (CentOS 7)
and caught this exception:
Traceback (most recent call last):
File "/home/zhangxiaxu/py36/lib/python3.6/shutil.py", line 544, in move
os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: 'app' -> 'my_app'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/zhangxiaxu/py36/bin/lore", line 11, in <module>
sys.exit(main())
File "/home/zhangxiaxu/py36/lib/python3.6/site-packages/lore/__main__.py", line 331, in main
known.func(known, unknown)
File "/home/zhangxiaxu/py36/lib/python3.6/site-packages/lore/__main__.py", line 534, in init
shutil.move('app', parsed.name)
File "/home/zhangxiaxu/py36/lib/python3.6/shutil.py", line 558, in move
copy_function(src, real_dst)
File "/home/zhangxiaxu/py36/lib/python3.6/shutil.py", line 257, in copy2
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/home/zhangxiaxu/py36/lib/python3.6/shutil.py", line 120, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'app'
Can you give me some help or suggestion?
I installed lore using: pip install lore
➜ pip --version
pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.5)
➜ python --version
Python 3.5.3
➜ pip freeze | grep lore
lore==0.6.13
➜ lore init my_app --python-version=3.6.4 --keras --xgboost --postgres
zsh: command not found: lore
➜ echo $SHELL
/usr/bin/zsh
➜ which lore
lore not found
This is my system's info (ASCII art logo omitted):
OS: Debian 9.4 stretch
Kernel: x86_64 Linux 4.9.0-6-amd64
Packages: 1854
Shell: zsh 5.3.1
Resolution: 1600x900
DE: Gnome
WM: GNOME Shell
WM Theme: Adwaita
GTK Theme: Adwaita [GTK2/3]
Icon Theme: Adwaita
Font: Cantarell 11
CPU: Intel Core i7-3687U CPU @ 3.3GHz
GPU: Mesa DRI Intel(R) Ivybridge Mobile
RAM: 1804MiB / 15914MiB
Currently, Jinja2 templating is done here:
Lines 413 to 434 in f4789b1
And this is how the templates are called:
Lines 168 to 169 in f4789b1
The problem is that all the args are passed twice: once to Jinja2, and again to the Snowflake connector. It breaks down right there, because nothing can be formatted this way. For example, if we run
lore.io.analysis.execute(filename='somefile', job_type=self.job_type)
we see that processed_params is not empty and it breaks the program here:
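A minimal sketch of one way to avoid the double-pass: route each kwarg either to the template or to the connector, never both. The helper name split_params and the template_fields argument are hypothetical, not lore's API:

```python
def split_params(template_fields, **params):
    """Route kwargs to the Jinja2 template or to the driver's bind
    parameters, so nothing gets passed twice (hypothetical helper)."""
    template = {k: v for k, v in params.items() if k in template_fields}
    binds = {k: v for k, v in params.items() if k not in template_fields}
    return template, binds

# Only job_type is a template variable; user_id stays a bind parameter.
template, binds = split_params({"job_type"}, job_type="daily", user_id=42)
print(template, binds)  # {'job_type': 'daily'} {'user_id': 42}
```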
ImportError: Failed to import test module: unit.test_subscribers
Traceback (most recent call last):
File "/home/betty/.pyenv/versions/3.7.9/lib/python3.7/unittest/loader.py", line 436, in _find_test_path
module = self._get_module_from_name(name)
File "/home/betty/.pyenv/versions/3.7.9/lib/python3.7/unittest/loader.py", line 377, in _get_module_from_name
__import__(name)
File "/home/betty/my_app/tests/unit/test_subscribers.py", line 4, in <module>
from my_app.models.subscribers import DeepName, BoostedName
File "/home/betty/my_app/my_app/models/subscribers.py", line 2, in <module>
import lore.models.keras
File "/home/betty/.pyenv/versions/3.7.9/envs/my_app/lib/python3.7/site-packages/lore/models/__init__.py", line 1, in <module>
from lore.models import base
File "/home/betty/.pyenv/versions/3.7.9/envs/my_app/lib/python3.7/site-packages/lore/models/base.py", line 11, in <module>
import botocore
ModuleNotFoundError: No module named 'botocore'
Ran 1 test in 0.092s
FAILED (errors=1)
I'm mostly posting this so it's searchable by others who might run into the issue, but perhaps a Git version check could be added to provide a more descriptive error code? Current error is:
subprocess.CalledProcessError: Command '('git', '-C', '/home/evan/.pyenv', 'pull')' returned non-zero exit status 129.
The latest Git version currently available from the default CentOS 7 repos is 1.8.3.1, but it doesn't support the -C command-line switch used by lore init, resulting in the error. The -C switch appears to have been added in 1.8.5.
The easiest way to update Git on centos7 without using third-party repos appears to be this -
https://www.softwarecollections.org/en/scls/rhscl/rh-git29/
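For the descriptive-error idea, here is a sketch of the version check lore init could perform before calling git -C. The helper name is made up; only the 1.8.5 cutoff comes from this thread:

```python
import re

def git_supports_dash_c(version_output):
    """Return True if `git --version` output reports >= 1.8.5,
    the release that introduced the -C switch (hypothetical helper)."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    return match is not None and tuple(map(int, match.groups())) >= (1, 8, 5)

print(git_supports_dash_c("git version 1.8.3.1"))  # False (stock CentOS 7)
print(git_supports_dash_c("git version 2.39.2"))   # True
```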
Specifically, lore assumes pyenv is NOT installed through brew, but manually installed by cloning the git repo. If you try to run lore install with pyenv installed through brew, you get
> lore install
WARNING pyenv executable is not present at /Users/thomasfulton/.pyenv/bin/pyenv
Would you like to blow away ~/.pyenv and rebuild from scratch? [Y/n] n
ERROR please fix pyenv before continuing
If you symlink your pyenv executable to that location you eventually get
fatal: not a git repository (or any of the parent directories): .git
Traceback (most recent call last):
File "/Users/thomasfulton/.pyenv/versions/3.7.2/bin/lore", line 11, in <module>
sys.exit(main())
File "/Users/thomasfulton/.pyenv/versions/3.7.2/lib/python3.7/site-packages/lore/__main__.py", line 355, in main
known.func(known, unknown)
File "/Users/thomasfulton/.pyenv/versions/3.7.2/lib/python3.7/site-packages/lore/__main__.py", line 652, in install
install_python_version()
File "/Users/thomasfulton/.pyenv/versions/3.7.2/lib/python3.7/site-packages/lore/__main__.py", line 1098, in install_python_version
subprocess.check_call(('git', '-C', env.PYENV, 'pull'))
File "/Users/thomasfulton/.pyenv/versions/3.7.2/lib/python3.7/subprocess.py", line 347, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '('git', '-C', '/Users/thomasfulton/.pyenv', 'pull')' returned non-zero exit status 128.
Hi there,
When I install Lore and try to create a test app using the code listed in the blog post:
lore init my_app --python-version=3.6.4 --keras
I get the following stack trace:
Traceback (most recent call last):
File "c:\users\brian\anaconda2\envs\lore\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "c:\users\brian\anaconda2\envs\lore\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\brian\Anaconda2\envs\Lore\Scripts\lore.exe\__main__.py", line 5, in <module>
File "c:\users\brian\anaconda2\envs\lore\lib\site-packages\lore\__init__.py", line 7, in <module>
from lore import env, util, ansi
File "c:\users\brian\anaconda2\envs\lore\lib\site-packages\lore\env.py", line 52, in <module>
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
File "c:\users\brian\anaconda2\envs\lore\lib\locale.py", line 598, in setlocale
return _setlocale(category, locale)
locale.Error: unsupported locale setting
I tried Googling around a bit for this issue but couldn't find a good resolution.
Thanks!
Brian
Any plan to add Docker support to the project?
lore test should set LORE_ENV=test. Looks like it is being set too late, so the test config is not being used.
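A toy illustration of the "set too late" failure mode, with a made-up Settings class standing in for lore's config: anything that reads LORE_ENV at import time is frozen before the test command can change it.

```python
import os

class Settings:
    """Hypothetical stand-in for a config object built at import time."""
    def __init__(self):
        # The environment name is captured once, at construction time.
        self.env = os.environ.get("LORE_ENV", "development")

os.environ.pop("LORE_ENV", None)
settings = Settings()              # simulates config built during import
os.environ["LORE_ENV"] = "test"    # set afterwards: too late to matter
print(settings.env)  # development
```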
Exception in thread Thread-3:
Traceback (most recent call last):
File "/app/.heroku/python/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/app/.heroku/python/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/app/.heroku/python/lib/python3.6/site-packages/librato/aggregator.py", line 191, in submit
query_props=self.to_payload())
File "/app/.heroku/python/lib/python3.6/site-packages/librato/__init__.py", line 209, in _mexe
resp_data, success, backoff = self._process_response(resp, backoff)
File "/app/.heroku/python/lib/python3.6/site-packages/librato/__init__.py", line 183, in _process_response
raise exceptions.get(resp.status, resp_data)
librato.exceptions.Unauthorized: [401] Credentials are required to access this resource.
In the xgboost scikit-learn fit API, the argument eval_metric can be a string for a single metric or a list of strings for multiple metrics. However, if we provide a list to eval_metric, it breaks the fit method in lore.estimators.xgboost.base. Specifically, in this and that. Can we support multiple eval metrics? I can submit a PR to fix this. Thanks!
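One hedged sketch of a fix: normalize eval_metric up front so the rest of fit can always assume a list. The helper name is invented for illustration; it is not lore's or xgboost's API:

```python
def normalize_eval_metric(eval_metric):
    """Accept None, a single metric name, or a list of names (hypothetical)."""
    if eval_metric is None:
        return []
    if isinstance(eval_metric, str):
        return [eval_metric]
    return list(eval_metric)

print(normalize_eval_metric("auc"))               # ['auc']
print(normalize_eval_metric(["auc", "logloss"]))  # ['auc', 'logloss']
```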
When I run the test, how do I solve it?
Hi @montanalow. This is really great work. I really like how you abstract away the common pitfalls in machine learning and streamline the process in this project. I see a lot of potential in this project from a data scientist's perspective. If you don't mind, I can provide my feedback from using this tool.
For this particular issue, I encountered an h5py error because of too many Input layers. As shown here, we have to pass one encoder for each column in the dataframe, and each encoder corresponds to one Input layer. I deal with a lot of DNA sequence data, which usually has >5000 columns. I think it makes sense to at least combine the columns using Continuous or Pass encoders into one Input.
It would be nice to add CatBoost support; it is one of the three most popular GBDT libraries (https://catboost.ai/, https://github.com/catboost/catboost).
I create a new project with the command:
lore init test_subscribers --python-version=2.7.13 --keras
I set my python path to ~/.pyenv/versions/test_subscribers/bin/python.
Then I follow the cute little example. But when I run tests with lore test, there is an import error.
ERROR: unit.test_subscribers (unittest.loader.ModuleImportFailure)
----------------------------------------------------------------------
ImportError: Failed to import test module: unit.test_subscribers
Traceback (most recent call last):
File "/home/wang/.pyenv/versions/2.7.13/lib/python2.7/unittest/loader.py", line 254, in _find_tests
module = self._get_module_from_name(name)
File "/home/wang/.pyenv/versions/2.7.13/lib/python2.7/unittest/loader.py", line 232, in _get_module_from_name
__import__(name)
File "/home/wang/code/ml/lore_subscribers/tests/unit/test_subscribers.py", line 15, in <module>
from lore_subscribers.models.subscribers import DeepName, BoostedName
File "/home/wang/code/ml/lore_subscribers/lore_subscribers/models/subscribers.py", line 20, in <module>
class DeepName(lore.models.keras.Base):
AttributeError: 'module' object has no attribute 'keras'
Here is the result of running lore pip list | grep Keras:
Keras (2.1.4)
When I run lore install, cmd shows:
File "d:\anaconda\lib\site-packages\lore\ansi.py", line 85, in enable_vt_mode
raise NotImplementedError
NotImplementedError
There is a typo in setup.py introduced in this commit.
install_requires=[
+ 'flask>=0.12.2, <0.12.99'
'future>=0.15, <0.16.99',
See the missing comma ^.
See this post for more information https://stackoverflow.com/questions/49869827/lore-installation-pip-error/49869919#49869919
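The bug is easy to reproduce: without the comma, Python's implicit concatenation of adjacent string literals fuses the two requirements into one nonsense requirement string, which pip then fails to resolve.

```python
# With the comma missing, Python concatenates the adjacent string literals,
# producing ONE bogus requirement instead of two:
install_requires = [
    'flask>=0.12.2, <0.12.99'      # <- missing comma
    'future>=0.15, <0.16.99',
]
print(install_requires)
# ['flask>=0.12.2, <0.12.99future>=0.15, <0.16.99']
```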
In the Medium post you mentioned that you are going to add a UI for logging info, etc.
What do you think about adding MLflow Tracking for that purpose? What do you think about Lore + MLflow in general?
Lore is great, but there are a few things that made using the lib a bit painful:
Hello,
vi editing mode works in my global ipython (and in the ipython of other virtualenvs). With lore console, the binding falls back to the default binding.
The suggestions from here do not work either, i.e.
lore console --TerminalInteractiveShell.editing_mode=vi
didn't work.
hi. trying:
lore init . --python-version=3.6.6
> INFO converting existing directory to a Lore App
> ERROR "." already exists in this directoy! Add --bare to avoid clobbering existing files.
This is an empty dir - is this the expected behaviour?
lore version: system version: 0.6.14
I know lore has model.hyper_parameter_search() and the lore hyper_fit command, but I have not found any docs or demos for them in the lore project. As we know, we often use from sklearn.model_selection import GridSearchCV and from sklearn.model_selection import RandomizedSearchCV. Could you give a tutorial for this?
By the way, we also need to override the fit and predict methods, and sometimes we need estimators that lore does not provide, for example from sklearn.ensemble import RandomForestRegressor.
Also, we need lore to support pyspark.
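For anyone in the same spot, here is a toy, stdlib-only sketch of what GridSearchCV does under the hood (exhaustive search over a parameter grid). The function names and the scoring function are invented for illustration; they are not lore's or sklearn's API:

```python
from itertools import product

def grid_search(train_and_score, param_grid):
    """Try every combination in param_grid; return the best one.

    train_and_score(params) -> score (higher is better).
    A toy stand-in for sklearn.model_selection.GridSearchCV.
    """
    best_score, best_params = float("-inf"), None
    keys = sorted(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = train_and_score(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Toy objective: pretend the model scores best at max_depth=5, lr=0.1.
def score(params):
    return -abs(params["max_depth"] - 5) - abs(params["learning_rate"] - 0.1)

best, _ = grid_search(score, {"max_depth": [3, 5, 7],
                              "learning_rate": [0.01, 0.1, 0.3]})
print(best)  # {'learning_rate': 0.1, 'max_depth': 5}
```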
Hi @montanalow!
In order to make a PR to contribute to this repo, I'd like to understand a few concepts of Lore.
Where exactly is the proper place to specify what a prediction is? For example, if we have an instance of an XGBoost classifier,
class_estimator = XGBoostClassifier()
then we can make inference by calling class_estimator.predict() if we want the class label, or class_estimator.predict_proba() to get probability estimates of the classes (in a regression problem we would get a real-valued output).
An instance of a Lore Model has its own predict method that can be called in different places (e.g.
Line 53 in bc3646e
). So if we want probability estimates of the classes and use flask as the serving method, then we have to override predict at the lore model level: https://github.com/instacart/lore/blob/master/lore/models/base.py#L133
But if we want to use built-in tools like shap_explainer within the same model (https://github.com/instacart/lore/blob/master/lore/models/base.py#L371), then we have to override the predict method at the lore estimator level: https://github.com/instacart/lore/blob/master/lore/estimators/__init__.py#L27
Finally, my question is: where is the proper place to specify what the output prediction is? I'm confused because at the lore model level there is also a predict_proba method: https://github.com/instacart/lore/blob/master/lore/models/base.py#L143
Thanks!
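To make the question concrete, here is a toy sketch of the model-level override option. These are stub classes invented for illustration, not lore's actual Model/Estimator:

```python
class Estimator:
    """Stub standing in for a fitted XGBoost classifier."""
    def predict(self, data):
        return [0, 1]                      # class labels

    def predict_proba(self, data):
        return [[0.9, 0.1], [0.2, 0.8]]    # class probabilities

class Model:
    """Stub lore-style model that delegates prediction to its estimator."""
    def __init__(self, estimator):
        self.estimator = estimator

    def predict(self, data):
        return self.estimator.predict(data)

class ProbabilityModel(Model):
    # Model-level override: every consumer of predict() (flask serving,
    # shap explainer, ...) now sees probabilities instead of labels.
    def predict(self, data):
        return self.estimator.predict_proba(data)

model = ProbabilityModel(Estimator())
print(model.predict(["x", "y"]))  # [[0.9, 0.1], [0.2, 0.8]]
```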
Line 620 in 44ed811
When I run lore generate scaffold, I get NameError: name 'require' is not defined.
How do I solve this?
I know this is the wrong place for this, but I couldn't find anywhere better, the last time I tried reporting something via the website it confused the hell out of shopper happiness. Please consider creating an issue tracking system that your shoppers can report bugs and problems to and get follow ups on, SH cannot handle these things due to poor training and zealous adherence to their scripts.
Long story short, your algorithm had a delivery-only order shopped at a store miles away from the destination when there is another delivery-capable store in the same chain literally around the corner. Stuff isn't being delivered because the underpaid shoppers don't want to schlep around town due to bad code making bad decisions. It's so aggressively forcing orders to stores with ISS that people aren't getting their deliveries, ignoring much closer stores in favor of bumping the IC bottom line a few cents.
I understand the Instacart platform is a barely functioning house of cards, but come on folks, go have a look at Reddit and Facebook once in a while and witness for yourselves the absolute pain in the butt your software is for everyone. On behalf of shoppers everywhere I beg you, please, add some common sense and sanity to the system. There ain't no way we are driving an hour for 3 dollars, no we can't fit 40 flats of water in our car, most stores don't even have 150 bottles of Diet Coke. Stop shopping orders to stores far from the customer when there are closer locations to their home, group delivery orders into as many orders per batch as possible so that we aren't losing money to make the trip, and get management to okay setting the default tip to ten percent. Please, save us from the insanity. We are human beings too, not just cogs in the machine.
Pylint throws an E1120 error for every use of @timed:
no value for argument 'level' in function call (no-value-for-parameter)
Pylint is our recommended linter for the project. I am aware that pylint users tend to activate only a subset of linter rules to control message verbosity, but error messages are usually retained.
After a bit of digging, it turns out this is caused by the nested decorator usage, which pylint is currently unable to handle.
def parametrized(decorator):
def layer(*args, **kwargs):
@functools.wraps(decorator)
def repl(f):
return decorator(f, *args, **kwargs)
return repl
return layer
@parametrized
def timed(func, level):
@functools.wraps(func)
def wrapper(*args, **kwargs):
with timer('.'.join([func.__module__, func.__name__]), level=level):
return func(*args, **kwargs)
return wrapper
An implementation of decorator with argument like below would fix the issue.
def timed(level):
def decorator(timed_func):
@functools.wraps(timed_func)
def wrapper(*args, **kwargs):
with timer('.'.join([timed_func.__module__, timed_func.__name__]), level=level):
return timed_func(*args, **kwargs)
return wrapper
return decorator
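For reference, a self-contained demonstration that the decorator-with-argument form works and keeps the call site pylint-friendly. The timer context manager here is a stand-in for lore's, not its real implementation:

```python
import functools
import time
from contextlib import contextmanager

@contextmanager
def timer(name, level="INFO"):
    # Stand-in for lore's timer context manager (hypothetical).
    start = time.time()
    yield
    print("%s %s took %.3fs" % (level, name, time.time() - start))

def timed(level):
    def decorator(timed_func):
        @functools.wraps(timed_func)
        def wrapper(*args, **kwargs):
            name = '.'.join([timed_func.__module__, timed_func.__name__])
            with timer(name, level=level):
                return timed_func(*args, **kwargs)
        return wrapper
    return decorator

@timed(level="DEBUG")   # note the call: @timed(...), not bare @timed
def add(a, b):
    return a + b

print(add(2, 3))  # 5
```

Since @timed(level=...) is an ordinary call that returns the real decorator, pylint no longer sees a missing 'level' argument at each decorated function.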
Is there any way we can hack lore install so that it does not require gcc@5 anymore? I already have a newer gcc, from which I can build xgboost successfully (confirmed).
In Python 3.6, the inspect.getargspec function here returns self as the sole argument due to decorations like @timed. I suspect it is the same in 3.5.
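A small demonstration of the general mechanism (not lore's actual code): without functools.wraps, introspection sees only the wrapper's (*args, **kwargs); with wraps, inspect.signature can follow __wrapped__ back to the real signature. The class and method names are invented for illustration:

```python
import functools
import inspect

def plain_decorator(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def wrapped_decorator(func):
    @functools.wraps(func)   # records func as wrapper.__wrapped__
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

class Model:
    @plain_decorator
    def fit_plain(self, x, epochs=1): ...

    @wrapped_decorator
    def fit_wrapped(self, x, epochs=1): ...

print(inspect.signature(Model.fit_plain))    # (*args, **kwargs)
print(inspect.signature(Model.fit_wrapped))  # (self, x, epochs=1)
```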
Everything is OK, but when I try to exec lore server in a CentOS 7 terminal console I get an error.
Console output:
[root@medell lasa]# lore server
Traceback (most recent call last):
File "/usr/local/python3/bin/lore", line 11, in <module>
sys.exit(main())
File "/usr/local/python3/lib/python3.6/site-packages/lore/__main__.py", line 352, in main
known.func(known, unknown)
File "/usr/local/python3/lib/python3.6/site-packages/lore/__main__.py", line 530, in server
os.execv(env.BIN_FLASK, args)
FileNotFoundError: [Errno 2] No such file or directory
Hey, this looks like a great project and very close to what we were looking for.
I understand why you guys ditched other environments, but still: is there any way to use lore with an existing / specific conda environment (or do you think it is relatively simple to PR this feature)?
Hi,
I am a little confused about the stratify implementation in the _split_data method in lore.pipelines.holdout.base. From my understanding and some experiments, this is different from the stratify option in sklearn.model_selection.train_test_split (see documentation). Would you please elaborate on why you implemented stratify this way? Thanks very much!
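My reading is that lore's stratify keeps all rows sharing a stratify value on the same side of the split (a group-consistent split), whereas sklearn's stratify balances class proportions across splits. Here is a toy sketch of the group-consistent behavior; this is hypothetical illustration code, not lore's implementation:

```python
import hashlib

def split_by_key(rows, key, test_fraction=0.2):
    """Assign every row with the same `key` value to the same split
    (group-consistent split, hypothetical sketch)."""
    train, test = [], []
    for row in rows:
        digest = hashlib.md5(str(row[key]).encode()).hexdigest()
        bucket = int(digest, 16) % 100          # deterministic per key value
        (test if bucket < test_fraction * 100 else train).append(row)
    return train, test

rows = [{"user_id": u, "i": i} for u in ("a", "b", "c") for i in range(3)]
train, test = split_by_key(rows, "user_id")
# All rows for a given user_id end up on the same side of the split,
# so one entity's data never leaks across train and test.
```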
I run this command after installing lore:
$ lore init my_app --python-version=3.6.4 --keras --xgboost --postgres
It returns the result below:
BUILD FAILED (Debian 9.4 using python-build 1.2.6)
Inspect or clean up the working tree at /tmp/python-build.20180726150343.2385
Results logged to /tmp/python-build.20180726150343.2385.log
Last 10 log lines:
sys.exit(ensurepip._main())
File "/tmp/python-build.20180726150343.2385/Python-3.6.4/Lib/ensurepip/__init__.py", line 204, in _main
default_pip=args.default_pip,
File "/tmp/python-build.20180726150343.2385/Python-3.6.4/Lib/ensurepip/__init__.py", line 117, in _bootstrap
return _run_pip(args + [p[0] for p in _PROJECTS], additional_paths)
File "/tmp/python-build.20180726150343.2385/Python-3.6.4/Lib/ensurepip/__init__.py", line 27, in _run_pip
import pip
zipimport.ZipImportError: can't decompress data; zlib not available
Makefile:1099: recipe for target 'install' failed
make: *** [install] Error 1
Traceback (most recent call last):
File "/usr/local/bin/lore", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python3.5/dist-packages/lore/__main__.py", line 352, in main
known.func(known, unknown)
File "/usr/local/lib/python3.5/dist-packages/lore/__main__.py", line 608, in init
install(parsed, unknown)
File "/usr/local/lib/python3.5/dist-packages/lore/__main__.py", line 625, in install
install_python_version()
File "/usr/local/lib/python3.5/dist-packages/lore/__main__.py", line 1039, in install_python_version
subprocess.check_call((env.BIN_PYENV, 'install', env.PYTHON_VERSION))
File "/usr/lib/python3.5/subprocess.py", line 271, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '('/home/thuan/.pyenv/bin/pyenv', 'install', '3.6.4')' returned non-zero exit status 1
My system info (see issue #99 for full details):
OS: Debian 9.4 stretch
Kernel: x86_64 Linux 4.9.0-6-amd64