replicate / cog
Containers for machine learning
Home Page: https://cog.run
License: Apache License 2.0
It's not clear that it's just a string attached to your commit message. It looks scary. https://github.com/replicate/cog/blob/main/CONTRIBUTING.md
The double behavior of default is not obvious. This might want to be optional=True instead, so that inputs are required by default. If inputs were optional by default, users normally wouldn't bother marking them as required, and required inputs that aren't marked as required will cause breakage.
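The required-by-default behavior proposed above could look like this minimal sketch. The names (InputSpec, validate) are invented for illustration and are not Cog's actual API: an input is optional only when explicitly marked, rather than whenever a default happens to be present.

```python
class InputSpec:
    """Hypothetical input declaration: required unless optional=True."""

    def __init__(self, name, default=None, optional=False):
        self.name = name
        self.default = default
        self.optional = optional


def validate(specs, values):
    """Resolve provided values, raising if a required input is missing."""
    resolved = {}
    for spec in specs:
        if spec.name in values:
            resolved[spec.name] = values[spec.name]
        elif spec.optional:
            resolved[spec.name] = spec.default
        else:
            raise ValueError(f"missing required input: {spec.name}")
    return resolved
```

With this shape, a default never silently makes an input optional; the two concerns are separate flags.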
From #118, the end to end tests can't connect to the bridge IP on macOS:
This should either be run in a consistent dev environment, or if we actually want to run the end to end tests on macOS (which probably makes sense in CI?), then we might need some kind of OS-based switch in there.
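An OS-based switch could be as simple as the sketch below, assuming the tests reach containers via ports published on localhost under Docker Desktop on macOS, and via the docker0 bridge on Linux. The bridge address shown is the usual Docker default, not something Cog guarantees.

```python
import platform


def e2e_test_host(system=None):
    """Pick the address the end-to-end tests should connect to.

    On macOS the Linux bridge IP is unreachable from the host, so fall
    back to localhost (assuming published ports).
    """
    system = system or platform.system()
    if system == "Darwin":
        return "localhost"
    return "172.17.0.1"  # default docker0 bridge on Linux (illustrative)
```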
cog build
═══╡ Uploading /Users/tekumara/code3/cog-examples/inst-colorization to localhost:8080/examples/inst-colorization
⠋ uploading (925 MB, 269.985 MB/s) ═══╡ Building model...
═══╡ Received model
═══╡ Building cpu image
═══╡ * Installing Python prerequisites
═══╡ * Installing Python 3.8
═══╡ * Installing system packages
═══╡ * Installing Python packages
═══╡ * Installing Cog
═══╡ * Copying code
═══╡ Successfully built 507cf5936fd9
═══╡ Pushing localhost:5000/inst-colorization:507cf5936fd9 to registry
═══╡ Building gpu image
═══╡ * Installing Python prerequisites
═══╡ * Installing Python 3.8
═══╡ ---> Using cache
═══╡ ---> 68aac6e4699f
═══╡ Step 8/20 : RUN curl https://pyenv.run | bash && git clone https://github.com/momo-lab/pyenv-install-latest.git "$(pyenv root)"/plugins/pyenv-install-latest && pyenv
│ install-latest "3.8" && pyenv global $(pyenv install-latest --print "3.8")
═══╡ ---> Running in ae5b74d815ca
═══╡ % Total % Received % Xferd Average Speed Time Time Time Current
═══╡ Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 285 100 285 0 0 198 0 0:00:01 0:00:01 --:--:-- 198 0
═══╡ Cloning into '/root/.pyenv'...
═══╡ Cloning into '/root/.pyenv/plugins/pyenv-doctor'...
═══╡ Cloning into '/root/.pyenv/plugins/pyenv-installer'...
═══╡ Cloning into '/root/.pyenv/plugins/pyenv-update'...
═══╡ fatal: unable to access 'https://github.com/pyenv/pyenv-update.git/': gnutls_handshake() failed: The TLS connection was non-properly terminated.
═══╡ Failed to git clone https://github.com/pyenv/pyenv-update.git
═══╡ Error: Failed to build Docker image: exit status 255
High CPU usage during the build. pip install while building the GPU image throws OOM with the standard 4GB of memory. (Within Docker, not outside.)
═══╡ Building gpu image
═══╡ * Installing Python prerequisites
═══╡ * Installing Python 3.8
═══╡ * Installing system packages
═══╡ * Installing Python packages
═══╡ #11 sha256:6ba92f3047b5dec04235ade8528c87bc142e66bb38015765dc4f9cbb7d185cd8
═══╡ #11 DONE 0.5s
═══╡
═══╡ #12 [ 9/15] RUN pip install -f
│ https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/index.html -f
│ https://download.pytorch.org/whl/cu101/torch_stable.html
│ --extra-index-url=git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI
│ cachetools==4.1.0 chardet==3.0.4 future==0.18.2 fvcore==0.1.dev200506
│ idna==2.9 importlib-metadata==1.6.0 jsonpatch==1.25 jsonpointer==2.0
│ markdown==3.2.2 mock==4.0.2 opencv-python==4.3.0.38 portalocker==1.7.0
│ pyasn1==0.4.8 pyasn1-modules==0.2.8 pydot==1.4.1 requests==2.23.0
│ requests-oauthlib==1.3.0 rsa==4.0 tabulate==0.8.7 termcolor==1.1.0
│ urllib3==1.25.8 visdom==0.1.8.9 websocket-client==0.57.0 werkzeug==1.0.1
│ yacs==0.1.7 zipp==3.1.0 cython==0.29.22 pyyaml==5.1 dominate==2.4.0
│ detectron2==0.1.2 torch==1.5.0 torchvision==0.6.0 pycocotools==2.0.2
│ ipython==7.21.0 scikit-image==0.18.1
═══╡ #12 sha256:84e81bafd53e4595c28450182c770765e490c57fe351dd48fbad3418b7ad1697
═══╡ #12 1.283 Looking in indexes: https://pypi.org/simple,
│ git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI
═══╡ #12 1.283 Looking in links:
│ https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/index.html,
│ https://download.pytorch.org/whl/cu101/torch_stable.html
═══╡ #12 1.370 WARNING: Cannot look at git URL
│ git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI/cachetools/
│ because it does not support lookup as web pages.
═══╡ #12 2.202 Collecting cachetools==4.1.0
═══╡ #12 2.235 Downloading cachetools-4.1.0-py3-none-any.whl (10 kB)
═══╡ #12 2.265 WARNING: Cannot look at git URL
│ git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI/chardet/
│ because it does not support lookup as web pages.
═══╡ #12 2.942 Collecting chardet==3.0.4
═══╡ #12 2.950 Downloading chardet-3.0.4-py2.py3-none-any.whl (133 kB)
......
Console ═══╡ messages are for displaying messages to the user, not for large amounts of debugging output. The build output should be displayed as plain text, clearly delineated from the informational messages.
If I'm running my own server, I should just be able to point at http://10.1.1.1/hotdog-detector
In the cog build log, there is no line saying it is pushing the model, which is a time-consuming process.
If you don't define a setup() function, you get an incomprehensible error:
═══╡ Traceback (most recent call last):
│ File "/usr/bin/cog-http-server", line 8, in <module>
│ cog.HTTPServer(Model()).start_server()
│ TypeError: Can't instantiate abstract class Model with abstract methods setup
│
═══╡ Container exited unexpectedly
My feeling is we should require setup() to encourage users to do the right thing, but there should be a clearer error message.
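A clearer error could come from checking the abstract methods before instantiating the user's class, instead of letting abc raise its raw TypeError. This is a sketch only; the BaseModel/Model names follow the traceback above, and check_model_class is hypothetical.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    @abstractmethod
    def setup(self): ...

    @abstractmethod
    def run(self): ...


def check_model_class(cls):
    """Raise a readable error if abstract methods were left undefined."""
    missing = sorted(getattr(cls, "__abstractmethods__", frozenset()))
    if missing:
        names = ", ".join(f"{m}()" for m in missing)
        raise RuntimeError(
            f"{cls.__name__} must define {names}. "
            "If there is nothing to set up, define setup() with a pass body."
        )


class Model(BaseModel):  # a user model that forgot to define setup()
    def run(self):
        return "ok"
```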
There is preinstall, but the name implies that it comes before installing other things, which it doesn't -- it comes before copying code. We should:
Server should probably delete all local images besides the most recent one. That way caching still works, but we don't infinitely use up disk space.
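The keep-only-the-newest policy could be sketched like this, with (repository, tag, created) tuples standing in for whatever docker images reports; the function name is invented for illustration.

```python
def images_to_delete(images):
    """Given (repository, tag, created) tuples, return every image except
    the most recently created one per repository, so layer caching still
    works while disk usage stays bounded."""
    newest = {}
    for repo, tag, created in images:
        if repo not in newest or created > newest[repo][2]:
            newest[repo] = (repo, tag, created)
    return [img for img in images if img != newest[img[0]]]
```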
Currently invalid keys are silently ignored.
For example, a cog.yaml that contains this:
buildd:
  python_version: "3.8"
  python_packages:
    - "torch==1.8.0"
will silently fail to install torch, instead of complaining that the key buildd doesn't exist.
We may also want to validate the values at some point. For example, ensuring python_version is a string and matches a given pattern. (If you omit the quotes, 3.1 and 3.10 are indistinguishable! Thanks, YAML!)
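Rejecting unknown top-level keys could look like this sketch, which also suggests the closest valid key for typos like buildd. The set of known keys here is illustrative, not Cog's full schema.

```python
import difflib

# Illustrative subset of valid top-level cog.yaml keys, not the real schema.
KNOWN_KEYS = ["build", "image", "predict"]


def validate_config(config):
    """Return a list of error strings for unknown top-level keys."""
    errors = []
    for key in config:
        if key not in KNOWN_KEYS:
            guess = difflib.get_close_matches(key, KNOWN_KEYS, n=1)
            hint = f" (did you mean {guess[0]!r}?)" if guess else ""
            errors.append(f"unknown key {key!r}{hint}")
    return errors
```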
(Edited by @andreasjansson and @bfirsh.)
We ran into this yesterday but haven't reproduced locally. Needs confirmation.
It's run() in Python but cog infer on the CLI. We need to decide on the verb and stick to it.
It could be a fixed filename, but maybe it shouldn't be, so as not to cause a race condition with multiple builds. Perhaps it could be a hash of the content.
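The content-addressed option could be as simple as this sketch: name the file after a hash of its bytes, so concurrent builds producing the same content agree on one name and different content never collides into a fixed filename. The function name and suffix are illustrative.

```python
import hashlib


def content_filename(data: bytes, suffix: str = ".zip") -> str:
    """Derive a filename from a SHA-256 hash of the content."""
    return hashlib.sha256(data).hexdigest()[:16] + suffix
```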
It is unexpected behavior that the server behaves differently in different working directories. Simplest solution here might be that you need to explicitly specify where data is stored.
There is a lot of string concatenation without escaping and things like that. This feels like it needs a DSL.
Perhaps the simplest way would be to put all generated data inside environment variables, which can easily be escaped, then carefully input that data in fixed commands through the rest of the Dockerfile.
We can also do stuff like use the RUN ["foo", ...] form.
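Both ideas above could be sketched as small helpers: route generated values through ENV with shell quoting so they can be referenced safely later, and emit RUN in its exec (JSON) form so no shell parsing happens at all. The helper names are invented for illustration.

```python
import json
import shlex


def env_line(name, value):
    """Emit an ENV instruction with the value shell-quoted."""
    return f"ENV {name}={shlex.quote(value)}"


def run_exec_form(args):
    """Emit a RUN instruction in exec (JSON array) form: no shell involved."""
    return "RUN " + json.dumps(args)
```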
Ideally we wouldn't use Dockerfiles at all, which is a larger piece of work in #21.
Currently depends on Python dependencies installed globally.
Use case: I have a folder of images, I want them all colorized. At the moment you have to wait a minute for the model to boot for each image.
It should also support reading from a file, as requested by @DeNeutoy.
(written by @andreasjansson @bfirsh)
I think we decided on this, but doesn't seem to be the case.
Steps to reproduce:
It now stalls saying ⠙ uploading (3.1 kB, 0.494 kB/s). Perhaps it's calculating some hashes, or something. Whatever it's doing, it should show progress instead of looking broken.
I want to rename /infer to /predict, but we can't change it because all the old models have it, and there is no way for Cog to detect what version it is and what it should call.
Alternatively, perhaps it is the client which should detect what version of Cog the model has been made with, and adjust its API calls as appropriate.
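The client-side alternative could be sketched as probing which endpoint the model's server actually exposes and caching the answer. The probe is injected as a callable so the logic stays testable; a real client would make an HTTP request instead. The function name is hypothetical.

```python
def prediction_path(endpoint_exists):
    """Return the first prediction endpoint this model supports,
    preferring the new /predict and falling back to the old /infer."""
    for path in ("/predict", "/infer"):
        if endpoint_exists(path):
            return path
    raise RuntimeError("model exposes neither /predict nor /infer")
```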
Cog's is relative to /code, Docker's is absolute. This is quite confusing, particularly when cog run gets involved.
As far as I can see, the intention of cog.yaml's workdir is two-fold: to set PYTHONPATH correctly, and as a shortcut to set the directory for post_install.
When I'm iterating, I have to wait for two images to build before I know it's completely broken.
Currently Cog requires you to upload an input file, then it's processed and results are returned. But there are cases when you might want to stream a continuous input to the model. For example, if you have a model that does audio event detection, you might want to display the current event as it happens.
This is clumsy to explain in getting started, and means we have to have extra imports in all the model definition docs. Also a thing users will stumble on.
It should use the same TerminalLogger as cog run.
Would love to see some kind of notebook integration. Could we perhaps expose the environment built in Docker as a Jupyter Notebook kernel?
This will stop models from breaking when we update our default.
Steps to reproduce:
cog push
touch test
cog push
This will produce a version with the same ID. (And will presumably fail once #90 is in.)
Sorting is significant.
If you don't pass -o, it should just print the output to stdout. This is a regression; it was working at some point.
It's common for models to download weights in the setup() function. This isn't reproducible (models might run without a network connection and weights files can disappear from the internet) so we should discourage it. In cases where you actually need network access we can provide a config option to allow the model to hit the network.
The tricky bit is allowing incoming access for the HTTP server while disallowing outgoing connections. On a cursory search, there seems to be no simple way to do this without iptables rules on the host, or connecting the container to a private network and using another container as a proxy. Some creativity might be needed.
As a start, perhaps we could bodge it inside the container. That way we can guarantee it isn't downloading any files for reproducibility reasons, but doesn't have any security guarantees.
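One such bodge, sketched here under the assumption that the model server runs Python: wrap socket connections so anything other than loopback is refused. This is good enough for reproducibility (no silent weight downloads), but as noted, it is not a security boundary.

```python
import socket

# Keep a reference to the real implementation before patching it.
_real_connect = socket.socket.connect


def _guarded_connect(self, address):
    """Refuse any connection that isn't to the loopback interface."""
    host = address[0] if isinstance(address, tuple) else address
    if host not in ("localhost", "127.0.0.1", "::1"):
        raise OSError(f"outgoing network access is blocked (tried {host})")
    return _real_connect(self, address)


socket.socket.connect = _guarded_connect
```

This is the same trick libraries like pytest-socket use to fence off network access in tests.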
There are some cases where network access might be needed. A few ideas: an option in cog.yaml to allow network access. However, untrusted models not having network access is a neat security feature, so it would be a shame to allow model creators to break this.
[This issue has been authored by @andreasjansson and @bfirsh.]
The server should be able to say "this client is not supported, you should upgrade!" elegantly.
Proposal here: https://consoledonottrack.com/
Instead of using Dockerfiles. Concatenating strings is fragile. https://www.docker.com/blog/compiling-containers-dockerfiles-llvm-and-buildkit/
Note that we use buildkit to build Dockerfiles for CPU. This is about calling the buildkit API directly instead of going via a Dockerfile.
We could also call the Docker API to create and commit containers, emulating the build process.
A nice side-effect of using Dockerfiles is we can generate a Dockerfile for users if they want to "eject" from Cog.
Related to #165
Cog should search up the file tree for cog.yaml, like Keepsake does.
For example, if /home/ben/hotdog-detector/cog.yaml exists, then I should be able to run cog predict in /home/ben/hotdog-detector/subdir/ and it should do what I expect.
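The upward search could look like this minimal sketch: walk from the starting directory toward the filesystem root and return the first directory that contains cog.yaml, similar to how git locates its repository. The function name is illustrative.

```python
from pathlib import Path


def find_project_root(start):
    """Return the nearest ancestor directory (including start itself)
    containing cog.yaml, or None if there isn't one."""
    start = Path(start)
    for directory in [start, *start.parents]:
        if (directory / "cog.yaml").exists():
            return directory
    return None
```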
There is some nuance here with cog run. Should the working directory be the relative current directory inside the container?
As part of implementing local mode and on server (#18) we need a sensible way of managing local images.
docker system prune
docker images
For clarity, this is additional work on top of #18. This also includes picking a sensible name when running locally and you aren't pointed at a registry.
As suggested by @zeke.
.npmignore defaults to .gitignore, but there is a dangerous silent failure in that: suppose .gitignore ignores secrets.json. If you then add .npmignore with something new you want to ignore, it stops inheriting from .gitignore, therefore unignoring secrets.json.
There is also an additional consideration for machine learning models: .gitignore will normally ignore your model weights, but you want to include those for Cog, so maybe in this case the default would always be not what you want. In which case, it probably shouldn't be the default.
Maybe we need some sensible defaults that are clear to the user? Maybe there's something clever we can do based on .gitignore?
"Project" is used nowhere else.
Currently hangs.
We don't use the word "arguments" anywhere else. Should this be "input types" or "inputs" or something along those lines?