modal-labs / modal-examples
Examples of programs built using Modal
Home Page: https://modal.com/docs
License: MIT License
As per https://modal.com/docs/guide/ex/vllm_inference, I ran:
git clone https://github.com/modal-labs/modal-examples
cd modal-examples
modal run 06_gpu_and_ml/vllm_inference.py
Here's the error that I got:
Downloading ray-2.7.1-cp38-cp38-manylinux2014_x86_64.whl (62.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.5/62.5 MB 178.5 MB/s eta 0:00:00
Downloading transformers-4.34.0-py3-none-any.whl (7.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.7/7.7 MB 163.4 MB/s eta 0:00:00
Downloading xformers-0.0.22-cp38-cp38-manylinux2014_x86_64.whl (211.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 211.6/211.6 MB 165.8 MB/s eta 0:00:00
Downloading ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl (307 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 307.2/307.2 kB 204.4 MB/s eta 0:00:00
Downloading uvicorn-0.23.2-py3-none-any.whl (59 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 59.5/59.5 kB 177.2 MB/s eta 0:00:00
Downloading safetensors-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 197.8 MB/s eta 0:00:00
Downloading tokenizers-0.14.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.8/3.8 MB 191.2 MB/s eta 0:00:00
Downloading huggingface_hub-0.17.3-py3-none-any.whl (295 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 295.0/295.0 kB 199.8 MB/s eta 0:00:00
Building wheels for collected packages: vllm
Building wheel for vllm (pyproject.toml): started
Building wheel for vllm (pyproject.toml): finished with status 'error'
error: subprocess-exited-with-error
× Building wheel for vllm (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [140 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-38
creating build/lib.linux-x86_64-cpython-38/vllm
copying vllm/sequence.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/block.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/outputs.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/sampling_params.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/logger.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/utils.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm
copying vllm/config.py -> build/lib.linux-x86_64-cpython-38/vllm
creating build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
copying vllm/transformers_utils/tokenizer.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
copying vllm/transformers_utils/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
copying vllm/transformers_utils/config.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
creating build/lib.linux-x86_64-cpython-38/vllm/model_executor
copying vllm/model_executor/input_metadata.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
copying vllm/model_executor/weight_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
copying vllm/model_executor/model_loader.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
copying vllm/model_executor/utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
copying vllm/model_executor/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
creating build/lib.linux-x86_64-cpython-38/vllm/engine
copying vllm/engine/arg_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
copying vllm/engine/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
copying vllm/engine/llm_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
copying vllm/engine/async_llm_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
copying vllm/engine/ray_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
creating build/lib.linux-x86_64-cpython-38/vllm/worker
copying vllm/worker/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
copying vllm/worker/worker.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
copying vllm/worker/cache_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
creating build/lib.linux-x86_64-cpython-38/vllm/core
copying vllm/core/scheduler.py -> build/lib.linux-x86_64-cpython-38/vllm/core
copying vllm/core/policy.py -> build/lib.linux-x86_64-cpython-38/vllm/core
copying vllm/core/block_manager.py -> build/lib.linux-x86_64-cpython-38/vllm/core
copying vllm/core/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/core
creating build/lib.linux-x86_64-cpython-38/vllm/entrypoints
copying vllm/entrypoints/api_server.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
copying vllm/entrypoints/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
copying vllm/entrypoints/llm.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
creating build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/mpt.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/baichuan.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/qwen.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/falcon.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
copying vllm/model_executor/layers/attention.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
copying vllm/model_executor/layers/sampler.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
copying vllm/model_executor/layers/activation.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
copying vllm/model_executor/layers/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
copying vllm/model_executor/layers/layernorm.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
copying vllm/model_executor/parallel_utils/parallel_state.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
copying vllm/model_executor/parallel_utils/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/mpt.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/opt.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/internlm.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/gpt2.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/baichuan.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/gpt_j.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/llama.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/qwen.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/bloom.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/falcon.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
copying vllm/model_executor/models/gpt_bigcode.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
copying vllm/model_executor/parallel_utils/tensor_parallel/random.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
copying vllm/model_executor/parallel_utils/tensor_parallel/mappings.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
copying vllm/model_executor/parallel_utils/tensor_parallel/layers.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
copying vllm/model_executor/parallel_utils/tensor_parallel/utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
copying vllm/model_executor/parallel_utils/tensor_parallel/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
creating build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
copying vllm/entrypoints/openai/protocol.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
copying vllm/entrypoints/openai/api_server.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
copying vllm/entrypoints/openai/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
running build_ext
/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: numpy.core.multiarray failed to import (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
return _build_backend().build_wheel(wheel_directory, config_settings,
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 434, in build_wheel
return self._build_with_temp_dir(
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 419, in _build_with_temp_dir
self.run_setup()
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 341, in run_setup
exec(code, locals())
File "<string>", line 145, in <module>
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/__init__.py", line 103, in setup
return distutils.core.setup(**attrs)
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
super().run_command(command)
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 364, in run
self.run_command("build")
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
super().run_command(command)
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
self.run_command(cmd_name)
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
super().run_command(command)
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 88, in run
_build_ext.run(self)
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
self.build_extensions()
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 525, in build_extensions
_check_cuda_version(compiler_name, compiler_version)
File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 413, in _check_cuda_version
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError:
The detected CUDA version (11.8) mismatches the version that was used to compile
PyTorch (12.1). Please make sure to use the same CUDA versions.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for vllm
Failed to build vllm
ERROR: Could not build wheels for vllm, which is required to install pyproject.toml-based projects
Terminating task due to error: failed to run builder command "python -m pip install typing-extensions==4.5.0 'vllm @ git+https://github.com/vllm-project/vllm.git@805de738f618f8b47ab0d450423d23db1e636fa2' "
Caused by:
container exit status: 1
Runner failed with exception: task exited with failure, status = exit status: 1
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/alexkim/mambaforge/bin/modal:8 in <module> │
│ │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/__main__.py:9 in main │
│ │
│ 8 │ setup_rich_traceback() │
│ ❱ 9 │ entrypoint_cli() │
│ 10 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/core.py:1157 in __call__ │
│ │
│ 1156 │ │ """Alias for :meth:`main`.""" │
│ ❱ 1157 │ │ return self.main(*args, **kwargs) │
│ 1158 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/typer/core.py:778 in main │
│ │
│ 777 │ ) -> Any: │
│ ❱ 778 │ │ return _main( │
│ 779 │ │ │ self, │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/typer/core.py:216 in _main │
│ │
│ 215 │ │ │ with self.make_context(prog_name, args, **extra) as ctx: │
│ ❱ 216 │ │ │ │ rv = self.invoke(ctx) │
│ 217 │ │ │ │ if not standalone_mode: │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/core.py:1688 in invoke │
│ │
│ 1687 │ │ │ │ with sub_ctx: │
│ ❱ 1688 │ │ │ │ │ return _process_result(sub_ctx.command.invoke(sub_ctx)) │
│ 1689 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/core.py:1688 in invoke │
│ │
│ 1687 │ │ │ │ with sub_ctx: │
│ ❱ 1688 │ │ │ │ │ return _process_result(sub_ctx.command.invoke(sub_ctx)) │
│ 1689 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/core.py:1434 in invoke │
│ │
│ 1433 │ │ if self.callback is not None: │
│ ❱ 1434 │ │ │ return ctx.invoke(self.callback, **ctx.params) │
│ 1435 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/core.py:783 in invoke │
│ │
│ 782 │ │ │ with ctx: │
│ ❱ 783 │ │ │ │ return __callback(*args, **kwargs) │
│ 784 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/decorators.py:33 in new_func │
│ │
│ 32 │ def new_func(*args: "P.args", **kwargs: "P.kwargs") -> "R": │
│ ❱ 33 │ │ return f(get_current_context(), *args, **kwargs) │
│ 34 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/cli/run.py:145 in f │
│ │
│ 144 │ │ │
│ ❱ 145 │ │ with run_stub( │
│ 146 │ │ │ stub, │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/synchronicity/synchronizer.py:497 in │
│ proxy_method │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/synchronicity/combined_types.py:26 in │
│ __call__ │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/contextlib.py:199 in __aenter__ │
│ │
│ 198 │ │ try: │
│ ❱ 199 │ │ │ return await anext(self.gen) │
│ 200 │ │ except StopAsyncIteration: │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/runner.py:88 in _run_stub │
│ │
│ 87 │ │ │ # Create all members │
│ ❱ 88 │ │ │ await app._create_all_objects( │
│ 89 │ │ │ │ stub._blueprint, post_init_state, environment_name, shell=shell, output_ │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/app.py:103 in _create_all_objects │
│ │
│ 102 │ │ │ │ existing_object_id = tag_to_object_id.get(tag) │
│ ❱ 103 │ │ │ │ await resolver.load(obj, existing_object_id) │
│ 104 │ │ │ │ self._tag_to_object_id[tag] = obj.object_id │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:126 in load │
│ │
│ 125 │ │ │
│ ❱ 126 │ │ return await cached_future │
│ 127 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:102 in loader │
│ │
│ 101 │ │ │ async def loader(): │
│ ❱ 102 │ │ │ │ await obj._load(obj, self, existing_object_id) │
│ 103 │ │ │ │ if existing_object_id is not None and obj.object_id != existing_object_i │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/image.py:176 in _load │
│ │
│ 175 │ │ │ for image in base_images.values(): │
│ ❱ 176 │ │ │ │ base_image_ids.append((await resolver.load(image)).object_id) │
│ 177 │ │ │ base_images_pb2s = [ │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:126 in load │
│ │
│ 125 │ │ │
│ ❱ 126 │ │ return await cached_future │
│ 127 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:102 in loader │
│ │
│ 101 │ │ │ async def loader(): │
│ ❱ 102 │ │ │ │ await obj._load(obj, self, existing_object_id) │
│ 103 │ │ │ │ if existing_object_id is not None and obj.object_id != existing_object_i │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/image.py:176 in _load │
│ │
│ 175 │ │ │ for image in base_images.values(): │
│ ❱ 176 │ │ │ │ base_image_ids.append((await resolver.load(image)).object_id) │
│ 177 │ │ │ base_images_pb2s = [ │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:126 in load │
│ │
│ 125 │ │ │
│ ❱ 126 │ │ return await cached_future │
│ 127 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:102 in loader │
│ │
│ 101 │ │ │ async def loader(): │
│ ❱ 102 │ │ │ │ await obj._load(obj, self, existing_object_id) │
│ 103 │ │ │ │ if existing_object_id is not None and obj.object_id != existing_object_i │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/image.py:176 in _load │
│ │
│ 175 │ │ │ for image in base_images.values(): │
│ ❱ 176 │ │ │ │ base_image_ids.append((await resolver.load(image)).object_id) │
│ 177 │ │ │ base_images_pb2s = [ │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:126 in load │
│ │
│ 125 │ │ │
│ ❱ 126 │ │ return await cached_future │
│ 127 │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:102 in loader │
│ │
│ 101 │ │ │ async def loader(): │
│ ❱ 102 │ │ │ │ await obj._load(obj, self, existing_object_id) │
│ 103 │ │ │ │ if existing_object_id is not None and obj.object_id != existing_object_i │
│ │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/image.py:296 in _load │
│ │
│ 295 │ │ │ if result.status == api_pb2.GenericResult.GENERIC_STATUS_FAILURE: │
│ ❱ 296 │ │ │ │ raise RemoteError(f"Image build for {image_id} failed with the exception │
│ 297 │ │ │ elif result.status == api_pb2.GenericResult.GENERIC_STATUS_TERMINATED: │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RemoteError: Image build for im-bvJc9XyO2U9rSetK1p4yUT failed with the exception:
task exited with failure, status = exit status: 1
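For what it's worth, a possible workaround, sketched under the assumption that the failure is simply the build image's CUDA toolkit (11.8) not matching PyTorch's CUDA build (12.1): pin a cu118 torch wheel in the image before vllm is built, the same pattern the Phind example later in this thread uses.

from modal import Image

# Sketch only, not the maintainers' fix: keep the image's CUDA toolkit (11.8)
# and PyTorch's CUDA build on the same version so vllm's extension build
# passes torch's _check_cuda_version.
image = (
    Image.from_dockerhub("nvcr.io/nvidia/pytorch:22.12-py3")  # ships CUDA 11.8
    .pip_install("torch==2.0.1", index_url="https://download.pytorch.org/whl/cu118")
    .pip_install(
        "typing-extensions==4.5.0",
        "vllm @ git+https://github.com/vllm-project/vllm.git@805de738f618f8b47ab0d450423d23db1e636fa2",
    )
)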
This a1111 example is broken and gives the issue below.
I think the issue is that the setup is just out of date for a1111.
╰─$ modal run a1111_webui.py 2 ↵
✓ Initialized. View run at https://modal.com/nicholaskao1029/apps/ap-LdFciiKaOuin89Vcq9BGg4
Building image im-8pFRuirFJZ4P0wH3Lin4Fm
=> Step 0: FROM base
=> Step 1: RUN cd /webui && . venv/bin/activate && python -c 'from modules import shared_init, initialize; shared_init.initialize(); initialize.initialize()'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/webui/modules/shared_init.py", line 5, in <module>
from modules import shared
File "/webui/modules/shared.py", line 3, in <module>
import gradio as gr
File "/webui/venv/lib/python3.10/site-packages/gradio/__init__.py", line 3, in <module>
import gradio.components as components
File "/webui/venv/lib/python3.10/site-packages/gradio/components/__init__.py", line 1, in <module>
from gradio.components.annotated_image import AnnotatedImage
File "/webui/venv/lib/python3.10/site-packages/gradio/components/annotated_image.py", line 12, in <module>
from gradio import utils
File "/webui/venv/lib/python3.10/site-packages/gradio/utils.py", line 353, in <module>
class AsyncRequest:
File "/webui/venv/lib/python3.10/site-packages/gradio/utils.py", line 372, in AsyncRequest
client = httpx.AsyncClient()
File "/webui/venv/lib/python3.10/site-packages/httpx/_client.py", line 1397, in __init__
self._transport = self._init_transport(
File "/webui/venv/lib/python3.10/site-packages/httpx/_client.py", line 1445, in _init_transport
return AsyncHTTPTransport(
File "/webui/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 275, in __init__
self._pool = httpcore.AsyncConnectionPool(
TypeError: AsyncConnectionPool.__init__() got an unexpected keyword argument 'socket_options'
Terminating task due to error: failed to run builder command "cd /webui && . venv/bin/activate && python -c 'from modules import shared_init, initialize; shared_init.initialize(); initialize.initialize()'"
Caused by:
container exit status: 1
Runner failed with exception: task exited with failure, status = exit status: 1
Stopping app - uncaught exception raised locally: RemoteError('Image build for im-8pFRuirFJZ4P0wH3Lin4Fm failed with the exception:\ntask exited with failure, status = exit status: 1').
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/bin/modal:8 in <module> │
│ │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/__main__.py:9 │
│ in main │
│ │
│ 8 │ setup_rich_traceback() │
│ ❱ 9 │ entrypoint_cli() │
│ 10 │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/core.py:1157 in │
│ __call__ │
│ │
│ 1156 │ │ """Alias for :meth:`main`.""" │
│ ❱ 1157 │ │ return self.main(*args, **kwargs) │
│ 1158 │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/typer/core.py:778 in │
│ main │
│ │
│ 777 │ ) -> Any: │
│ ❱ 778 │ │ return _main( │
│ 779 │ │ │ self, │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/typer/core.py:216 in │
│ _main │
│ │
│ 215 │ │ │ with self.make_context(prog_name, args, **extra) as ctx: │
│ ❱ 216 │ │ │ │ rv = self.invoke(ctx) │
│ 217 │ │ │ │ if not standalone_mode: │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/core.py:1688 in │
│ invoke │
│ │
│ 1687 │ │ │ │ with sub_ctx: │
│ ❱ 1688 │ │ │ │ │ return _process_result(sub_ctx.command.invoke(sub_ctx)) │
│ 1689 │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/core.py:1688 in │
│ invoke │
│ │
│ 1687 │ │ │ │ with sub_ctx: │
│ ❱ 1688 │ │ │ │ │ return _process_result(sub_ctx.command.invoke(sub_ctx)) │
│ 1689 │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/core.py:1434 in │
│ invoke │
│ │
│ 1433 │ │ if self.callback is not None: │
│ ❱ 1434 │ │ │ return ctx.invoke(self.callback, **ctx.params) │
│ 1435 │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/core.py:783 in │
│ invoke │
│ │
│ 782 │ │ │ with ctx: │
│ ❱ 783 │ │ │ │ return __callback(*args, **kwargs) │
│ 784 │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/decorators.py:3 │
│ 3 in new_func │
│ │
│ 32 │ def new_func(*args: "P.args", **kwargs: "P.kwargs") -> "R": │
│ ❱ 33 │ │ return f(get_current_context(), *args, **kwargs) │
│ 34 │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/cli/run.py:145 │
│ in f │
│ │
│ 144 │ │ │
│ ❱ 145 │ │ with run_stub( │
│ 146 │ │ │ stub, │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/synchronicity/synchro │
│ nizer.py:497 in proxy_method │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/synchronicity/combine │
│ d_types.py:26 in __call__ │
│ │
│ /usr/lib/python3.10/contextlib.py:199 in __aenter__ │
│ │
│ 198 │ │ try: │
│ ❱ 199 │ │ │ return await anext(self.gen) │
│ 200 │ │ except StopAsyncIteration: │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/runner.py:140 │
│ in _run_stub │
│ │
│ 139 │ │ │ exc_info = e │
│ ❱ 140 │ │ │ raise e │
│ 141 │ │ finally: │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/runner.py:94 in │
│ _run_stub │
│ │
│ 93 │ │ │ # Create all members │
│ ❱ 94 │ │ │ await app._create_all_objects( │
│ 95 │ │ │ │ stub._indexed_objects, app_state, environment_name, shell=shell, output_ │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/app.py:106 in │
│ _create_all_objects │
│ │
│ 105 │ │ │ │ existing_object_id = tag_to_object_id.get(tag) │
│ ❱ 106 │ │ │ │ await resolver.load(obj, existing_object_id) │
│ 107 │ │ │ │ self._tag_to_object_id[tag] = obj.object_id │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/_resolver.py:13 │
│ 1 in load │
│ │
│ 130 │ │ │
│ ❱ 131 │ │ return await cached_future │
│ 132 │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/_resolver.py:10 │
│ 4 in loader │
│ │
│ 103 │ │ │ │ # TODO(erikbern): do we need existing_object_id for those? │
│ ❱ 104 │ │ │ │ await asyncio.gather(*[self.load(dep) for dep in obj.deps()]) │
│ 105 │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/_resolver.py:13 │
│ 1 in load │
│ │
│ 130 │ │ │
│ ❱ 131 │ │ return await cached_future │
│ 132 │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/_resolver.py:10 │
│ 7 in loader │
│ │
│ 106 │ │ │ │ # Load the object itself │
│ ❱ 107 │ │ │ │ await obj._load(obj, self, existing_object_id) │
│ 108 │ │ │ │ if existing_object_id is not None and obj.object_id != existing_object_i │
│ │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/image.py:318 in │
│ _load │
│ │
│ 317 │ │ │ if result.status == api_pb2.GenericResult.GENERIC_STATUS_FAILURE: │
│ ❱ 318 │ │ │ │ raise RemoteError(f"Image build for {image_id} failed with the exception │
│ 319 │ │ │ elif result.status == api_pb2.GenericResult.GENERIC_STATUS_TERMINATED: │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RemoteError: Image build for im-8pFRuirFJZ4P0wH3Lin4Fm failed with the exception:
task exited with failure, status = exit status: 1
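A hedged guess at a workaround (the pin below is an assumption, not a verified fix): the TypeError is an httpx/httpcore version mismatch inside the webui venv, so pinning an httpx release that predates the socket_options argument may get the build past the initialize step.

# Hypothetical extra layer for the a1111 image; the exact pin is a guess at
# which httpx/httpcore pair agree on AsyncConnectionPool's signature.
image = image.run_commands(
    "cd /webui && . venv/bin/activate && pip install 'httpx<0.25'"
)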
Hi! I was looking for a way to build a GPT-3 Slack chatbot. I'd like the bot to work like this: in a Slack channel, it responds when @mentioned. I heard good things about Modal and would like to try it. Would you consider adding an example?
Related example: https://autocode.com/openai/threads/build-your-own-chatgpt-discord-bot-using-autocode-and-openai-84403434/
Hi Modal team,
First of all, thank you for such an amazing product. I am wondering if it is possible to retrieve the billing costs for a specific function run. If not, is it planned?
Hands down one of the best things I've used. Endless possibilities and surprisingly user-friendly.
It's so easy to use and distribute, it's amazing. I've been playing with it for a few days and am already integrating cloud storage, local file conversion with CUDA support, transcription, and LLMs.
I'm very spoiled by ChatGPT handling a billion libraries, but it doesn't know this one.
Yesterday's hydration error got me to suggest an improved RAG implementation. It was my very first time dealing with this technology, and the solution was nowhere to be found. It would be great to get a complex, permutable example on demand.
I humbly suggest adding Modal's documentation and GitHub examples to the RAG guide.
Using it in this way could improve the speed of development for newcomers, and lower compute costs during debugging.
Thank you for your consideration. Best of Luck!
While creating a container Image, how do I install Python 3 packages?
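For reference, the usual pattern (the package names below are just examples) is to declare the packages on the Image that the Stub's functions run in:

import modal

# A Debian-slim base image with extra Python packages installed via pip.
image = modal.Image.debian_slim().pip_install("numpy", "pandas")

stub = modal.Stub("example", image=image)

@stub.function()
def compute():
    import numpy as np  # available inside the container image
    return int(np.arange(10).sum())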
I installed Modal in a Jupyter Notebook on a Windows 10 machine and ran the following code:
!pip install modal-client
!modal token new
I was prompted to sign in via the browser and got the confetti.
import modal

stub = modal.Stub("example-get-started")

@stub.function
def square(x):
    print("This code is running on a remote worker!")
    return x**2

@stub.local_entrypoint
def main():
    print("the square is", square.call(42))
I received the error message below:
Any idea what could be causing it? Thanks.
In order to avoid re-initializing the model, I used a Modal lifecycle class:
@stub.cls(....)
class sample_cls():
    def __enter__(self):
        ...  # initialization

    @method()
    def sample_func(self):
        ...  # some work
Now if I want to create a web_endpoint directly out of this class method, how can I do that?
Please help, thanks.
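One pattern that may work here, sketched from the Phind example further down in this thread, which puts @web_endpoint directly on a method of a @stub.cls class (the class and method names below are placeholders):

from modal import Stub, web_endpoint

stub = Stub("sample")

@stub.cls()
class SampleCls:
    def __enter__(self):
        # One-time initialization; runs once per container.
        self.model = "loaded"

    @web_endpoint(method="POST")
    def sample_func(self, params: dict):
        # The initialized state is reused across requests to the endpoint.
        return {"model": self.model}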
I've been struggling to get Llama 2 working with AutoGPTQ, probably because I'm not building the image correctly. It would be very much appreciated if there was an example that showed the correct way to do this with the CUDA dependencies. Thanks!
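Until such an example exists, one rough sketch (the base image and package pins are assumptions, not a tested recipe) is to start from a base that ships the full CUDA toolkit, so nvcc and the headers are present when auto-gptq compiles its kernels:

from modal import Image

# Assumption: the nvcr.io PyTorch images include nvcc, which packages like
# auto-gptq need at install time.
image = (
    Image.from_dockerhub("nvcr.io/nvidia/pytorch:22.12-py3")
    .pip_install("auto-gptq", "transformers", "accelerate")
)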
When executing the first example:
import modal

stub = modal.Stub("example-hello-world")

@stub.function()
def square(x):
    print("This code is running on a remote worker!")
    return x**2

@stub.local_entrypoint()
def main():
    print("the square is", square.remote(42))
Running it with either python example.py or modal run example.py, the following error is raised:
AttributeError: module 'modal' has no attribute 'Stub'
I tried changing versions of the modal package and nothing worked.
I'm using Windows 10 and a brand-new conda environment.
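One possible cause, offered only as a guess since this error is usually environment-specific: a local file or folder named modal shadowing the installed package. Checking which file the import resolved to distinguishes the two cases:

import modal

# If this prints a path inside your project instead of site-packages,
# rename the local modal.py / modal/ directory so the real package loads.
print(modal.__file__)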
I got this error when running modal serve 06_gpu_and_ml/controlnet/controlnet_gradio_demos.py.
My Python version is:
Python 3.11.3 (main, May 15 2023, 18:01:31) [Clang 14.0.6] on darwin
Building wheel for tokenizers (pyproject.toml): started
Building wheel for tokenizers (pyproject.toml): finished with status 'error'
error: subprocess-exited-with-error
× Building wheel for tokenizers (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [51 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-311
creating build/lib.linux-x86_64-cpython-311/tokenizers
copying py_src/tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers
creating build/lib.linux-x86_64-cpython-311/tokenizers/models
copying py_src/tokenizers/models/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/models
creating build/lib.linux-x86_64-cpython-311/tokenizers/decoders
copying py_src/tokenizers/decoders/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/decoders
creating build/lib.linux-x86_64-cpython-311/tokenizers/normalizers
copying py_src/tokenizers/normalizers/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/normalizers
creating build/lib.linux-x86_64-cpython-311/tokenizers/pre_tokenizers
copying py_src/tokenizers/pre_tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/pre_tokenizers
Hi all,
is there any way to remove a shared volume?
modal run dreambooth_app.py
Error no file named scheduler_config.json found in directory /model.
Traceback (most recent call last):
File "/pkg/modal/_container_entrypoint.py", line 346, in handle_user_exception
yield
File "/pkg/modal/_container_entrypoint.py", line 510, in call_function_sync
enter_res = enter_method()
File "/root/dreambooth_app.py", line 281, in __enter__
ddim = DDIMScheduler.from_pretrained(MODEL_DIR, subfolder="scheduler")
File "/root/src/diffusers/schedulers/scheduling_utils.py", line 134, in from_pretrained
config, kwargs = cls.load_config(
File "/root/src/diffusers/configuration_utils.py", line 320, in load_config
raise EnvironmentError(
OSError: Error no file named scheduler_config.json found in directory /model.
Runner failed with exception: OSError('Error no file named scheduler_config.json found in directory /model.')
while running https://modal.com/docs/examples/serve_streamlit#sessions
Is it possible to run an application for photogrammetry (creating 3D models from images)? For example, I use Agisoft Metashape, they have a client for Linux.
import os
from typing import Dict

from modal import Image, Secret, Stub, method, gpu, web_endpoint

def download_model_to_folder():
    from huggingface_hub import snapshot_download

    snapshot_download(
        "Phind/Phind-CodeLlama-34B-v2",
        # "codellama/CodeLlama-13b-Instruct-hf",
        local_dir="/model",
        token=os.environ["HUGGINGFACE_TOKEN"],
    )

MODEL_DIR = "/model"

image = (
    Image.from_dockerhub("nvcr.io/nvidia/pytorch:22.12-py3")
    .pip_install("torch==2.0.1", index_url="https://download.pytorch.org/whl/cu118")
    # Pinned to 08/15/2023
    .pip_install(
        "vllm @ git+https://github.com/vllm-project/vllm.git@main",
        "typing-extensions==4.5.0",  # >=4.6 causes typing issues
    )
    .pip_install("hf-transfer~=0.1")
    .env({"HF_HUB_ENABLE_HF_TRANSFER": "1"})
    .run_function(
        download_model_to_folder,
        secret=Secret.from_name("huggingface"),
        timeout=60 * 20,
    )
)

stub = Stub("phind", image=image)

@stub.cls(gpu=gpu.A100(memory=40, count=4), secret=Secret.from_name("huggingface"))
class Model:
    def __enter__(self):
        from vllm import LLM

        # Load the model. Tip: MPT models may require `trust_remote_code=true`.
        self.llm = LLM(MODEL_DIR, tensor_parallel_size=4)
        self.template = "<<SYS>>\n{system}\n<</SYS>>\n\n[INST]{user}[/INST]"

    @web_endpoint(method="POST")
    def generate(self, params: Dict):
        from vllm import SamplingParams

        print(f"Received: {params}")
        prompts = [self.template.format(system="", user=params['prompt'])]
        sampling_params = SamplingParams(
            temperature=0,
            top_p=1,
            max_tokens=800,
            presence_penalty=1.15,
        )
        result = self.llm.generate(prompts, sampling_params)
        num_tokens = 0
        for output in result:
            num_tokens += len(output.outputs[0].token_ids)
            print(output.prompt, output.outputs[0].text, "\n\n", sep="")
        print(f"Generated {num_tokens} tokens")

@stub.local_entrypoint()
def main():
    model = Model()
Upon running modal serve vllm_modal_phind.py:
✓ Initialized. View app at https://modal.com/apps/ap-EYh6LtFTbbGNwPObPqC0oD
✓ Created objects.
├── 🔨 Created download_model_to_folder.
├── 🔨 Created mount /home/user/src/project/scratch/vllm_modal_phind.py
├── 🔨 Created Model.generate.
└── 🔨 Created mount /home/user/src/project/scratch/vllm_modal_phind.py
⚡️ Serving... hit Ctrl-C to stop!
└── Watching /home/user/src/phind/scratch.
⠴ Running app...
No endpoint?
Issues I read before:
App/Stub URL #346
Converting a class method directly to an endpoint #310
Hi, I am trying to run the Llama-2 example and it runs fine, but then the app stops. My goal is to deploy the Llama-2 API so that I can use it with a front end, so is Modal.com a good choice for me? Currently I can see an "example-tgi-Llama-2-70b-chat-hf" app in my Modal dashboard, but when I try to run the run.py script it says "NotFoundError: App not found".
I don't code in Python, but could someone help me?
run.py:
import modal

f = modal.Function.lookup("example-tgi-Llama-2-70b-chat-hf", "Model.generate")
f.remote("What is the story about the fox and grapes?")
Just trying this out on a podcast I listen to, and the transcription always seems to be missing the last couple of sentences.
e.g.
import sys
import modal
from typing import Dict
from modal import Image, Volume, web_endpoint, enter, method

project_image = (
    Image.debian_slim(python_version="3.10")
    .apt_install("git", "ffmpeg", "libsm6", "libxext6")
    .pip_install_from_requirements("requirements.txt")
    .run_commands(
        "git clone https://github.com/Arslan-Mehmood1/cog_Wav2Lip_COLAB_working.git",
        "cd cog_Wav2Lip_COLAB_working && pip install -r requirements.txt",
    )
)

vol = Volume.persisted("wav2lip_volume")
stub = modal.Stub("wav2lip")

with project_image.imports():
    # import importlib
    # import sys
    # sys.path.append("cog_Wav2Lip_COLAB_working")
    # module_name = "cog_Wav2Lip_COLAB_working"
    # # Import the module dynamically
    # wav2lip = importlib.import_module(module_name)
    # import git
    import gdown
    import zipfile
    import argparse
    import math
    import os
    import platform
    import subprocess
    import cv2
    import numpy as np
    import torch
    from tqdm import tqdm
    import audio
    # # from face_detect import face_rect
    from models import Wav2Lip
    from batch_face import RetinaFace
    from time import time

@stub.cls(gpu="T4", image=project_image, volumes={"/data": vol}, container_idle_timeout=300)
class Model:
    @enter()
    def enter(self, extract_to_directory="/data"):
        # Download the necessary checkpoints
        zip_file_url = "https://drive.google.com/file/d/1_AlLfRcu-82u9Wf0U2y8A4F5MamoNQnu/view?usp=drive_link"
        zip_file_path = os.path.join(extract_to_directory, 'ckpts.zip')
        gdown.download(url=zip_file_url, output=zip_file_path, quiet=False, fuzzy=True)

        # Create the target directory if it doesn't exist
        os.makedirs(extract_to_directory, exist_ok=True)

        # Extract contents to the target directory
        with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
            zip_ref.extractall(extract_to_directory)

        # Remove the downloaded ZIP file
        os.remove(zip_file_path)
        print(f"Successfully downloaded and extracted to '{extract_to_directory}'.")
        print("\n\n\t", os.listdir(extract_to_directory), "\n\n")
        print("\n\n\t", os.listdir(), "\n\n")
        vol.commit()

    @method()
    def predict(self):
        print("...Hi")

@stub.function(image=project_image, volumes={"/data": vol}, gpu="T4", container_idle_timeout=300)
@web_endpoint(method="POST", label="predict")
def main():
    response = Model().predict.remote()
    return "Hi"
(modal) arslan@Folio-1040-G2:~/nixense_vixion/To-Do/cog_wav2lip_modal$ modal serve modal_wav2lip.py
✓ Initialized. View run at https://modal.com/arslan-mehmood1/apps/ap-LqXMIgbHdlauqSsfbir7SW
✓ Created objects.
├── 🔨 Created mount /home/arslan/nixense_vixion/To-Do/cog_wav2lip_modal/modal_wav2lip.py
├── 🔨 Created Model.predict.
└── 🔨 Created main => https://arslan-mehmood1--predict-dev.modal.run
⚡️ Serving... hit Ctrl-C to stop!
└── Watching /home/arslan/nixense_vixion/To-Do/cog_wav2lip_modal.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/pkg/modal/_container_entrypoint.py", line 850, in
main(container_args, client)
File "/pkg/modal/_container_entrypoint.py", line 805, in main
container_app.hydrate_function_deps(imp_fun.function, dep_object_ids)
File "/pkg/synchronicity/synchronizer.py", line 497, in proxy_method
return wrapped_method(instance, *args, **kwargs)
File "/pkg/synchronicity/synchronizer.py", line 398, in f_wrapped
res = f(*args, **kwargs)
File "/pkg/modal/app.py", line 325, in hydrate_function_deps
obj._hydrate(object_id, self._client, metadata)
File "/pkg/modal/object.py", line 107, in _hydrate
self._hydrate_metadata(metadata)
File "/pkg/modal/image.py", line 164, in _hydrate_metadata
raise exc
File "/pkg/modal/image.py", line 1361, in imports
yield
File "/root/modal_wav2lip.py", line 36, in
import audio
ModuleNotFoundError: No module named 'audio'
The audio.py file is present in the repo which I cloned.
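A guess at the cause, based on the run_commands above: the repo is cloned to a directory that is not on Python's import path, so import audio fails even though audio.py exists in the clone. The commented-out sys.path line in the script hints at the fix; the absolute path below is an assumption about where the clone lands in the image:

import sys

# Assumption: the `git clone` in run_commands puts the repo here.
sys.path.insert(0, "/cog_Wav2Lip_COLAB_working")

import audio  # should now resolve to the repo's audio.py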
tl;dr: the dreambooth_app.py demo is not running; I would expect it to run as is. It seems to be an issue with package versions, but I can't figure out what the correct combination should be.
I'm trying to run dreambooth_app.py (using: modal run dreambooth_app.py), but unfortunately I'm getting an error relating to the version of accelerate. I'm using the code exactly as is; I've just swapped in links to my own images.
Traceback (most recent call last):
File "/pkg/modal/_container_entrypoint.py", line 351, in handle_input_exception
yield
File "/pkg/modal/_container_entrypoint.py", line 437, in run_inputs
res = imp_fun.fun(*args, **kwargs)
File "/root/dreambooth_app.py", line 206, in train
from accelerate.utils import write_basic_config
File "/usr/local/lib/python3.10/site-packages/accelerate/__init__.py", line 3, in <module>
from .accelerator import Accelerator
File "/usr/local/lib/python3.10/site-packages/accelerate/accelerator.py", line 33, in <module>
from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
File "/usr/local/lib/python3.10/site-packages/accelerate/checkpointing.py", line 24, in <module>
from .utils import (
File "/usr/local/lib/python3.10/site-packages/accelerate/utils/__init__.py", line 119, in <module>
from .megatron_lm import (
File "/usr/local/lib/python3.10/site-packages/accelerate/utils/megatron_lm.py", line 32, in <module>
from transformers.modeling_outputs import (
File "/usr/local/lib/python3.10/site-packages/transformers/__init__.py", line 26, in <module>
from . import dependency_versions_check
File "/usr/local/lib/python3.10/site-packages/transformers/dependency_versions_check.py", line 57, in <module>
require_version_core(deps[pkg])
File "/usr/local/lib/python3.10/site-packages/transformers/utils/versions.py", line 117, in require_version_core
return require_version(requirement, hint)
File "/usr/local/lib/python3.10/site-packages/transformers/utils/versions.py", line 111, in require_version
_compare_versions(op, got_ver, want_ver, requirement, pkg, hint)
File "/usr/local/lib/python3.10/site-packages/transformers/utils/versions.py", line 44, in _compare_versions
raise ImportError(
ImportError: accelerate>=0.20.3 is required for a normal functioning of this module, but found accelerate==0.19.0.
Try: pip install transformers -U or pip install -e '.[dev]' if you're working with git main
When I specify accelerate>=0.20.3, I get past the pip install part of the code, but I get an error when subprocess is called:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/xformers/ops/fmha/triton.py", line 17, in <module>
from flash_attn.flash_attn_triton import (
ModuleNotFoundError: No module named 'flash_attn'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/root/examples/dreambooth/train_dreambooth.py", line 32, in <module>
import diffusers
File "/root/src/diffusers/__init__.py", line 35, in <module>
from .models import (
File "/root/src/diffusers/models/__init__.py", line 19, in <module>
from .autoencoder_kl import AutoencoderKL
File "/root/src/diffusers/models/autoencoder_kl.py", line 23, in <module>
from .vae import Decoder, DecoderOutput, DiagonalGaussianDistribution, Encoder
File "/root/src/diffusers/models/vae.py", line 22, in <module>
from .unet_2d_blocks import UNetMidBlock2D, get_down_block, get_up_block
File "/root/src/diffusers/models/unet_2d_blocks.py", line 18, in <module>
from .attention import AttentionBlock
File "/root/src/diffusers/models/attention.py", line 22, in <module>
from .cross_attention import CrossAttention
File "/root/src/diffusers/models/cross_attention.py", line 25, in <module>
import xformers.ops
File "/usr/local/lib/python3.10/site-packages/xformers/ops/__init__.py", line 8, in <module>
from .fmha import (
File "/usr/local/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 10, in <module>
from . import cutlass, decoder, flash, small_k, triton
File "/usr/local/lib/python3.10/site-packages/xformers/ops/fmha/triton.py", line 39, in <module>
flash_attn = import_module_from_path(
File "/usr/local/lib/python3.10/site-packages/xformers/ops/fmha/triton.py", line 36, in import_module_from_path
spec.loader.exec_module(module)
File "<frozen importlib._bootstrap_external>", line 879, in exec_module
File "<frozen importlib._bootstrap_external>", line 1016, in get_code
File "<frozen importlib._bootstrap_external>", line 1073, in get_data
FileNotFoundError: [Errno 2] No such file or directory: '/root/third_party/flash-attention/flash_attn/flash_attn_triton.py'
Traceback (most recent call last):
File "/usr/local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/usr/local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 979, in launch_command
simple_launcher(args)
File "/usr/local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 628, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
If I install flash_attn, I then get an error about torch not being installed. So I tried installing flash_attn after torch is installed, and then I get the following error:
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-ji7oryv0/flash-attn_4dc31d3a534f4bcbb9e25d023a0889bd/setup.py", line 108, in <module>
raise_if_cuda_home_none("flash_attn")
File "/tmp/pip-install-ji7oryv0/flash-attn_4dc31d3a534f4bcbb9e25d023a0889bd/setup.py", line 55, in raise_if_cuda_home_none
raise RuntimeError(
RuntimeError: flash_attn was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
Warning: Torch did not find available GPUs on this system.
If your intention is to cross-compile, this is not an error.
By default, Apex will cross-compile for Pascal (compute capabilities 6.0, 6.1, 6.2),
Volta (compute capability 7.0), Turing (compute capability 7.5),
and, if the CUDA version is >= 11.0, Ampere (compute capability 8.0).
If you wish to cross-compile for a single specific architecture,
export TORCH_CUDA_ARCH_LIST="compute capability" before running setup.py.
Anyway, I'm sure the issue is due to some sort of mismatch in versions, but I'm not able to figure out the correct combination. Does anyone happen to know what the correct setup is? Or is there some other underlying issue?
Hello,
Apologies if this is not the correct repo to post such an issue.
When I run modal token new, I get the following error:
Traceback (most recent call last):
File "/home/anton/git/developer/.venv/bin/modal", line 5, in <module>
from modal.__main__ import main
File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/modal/__init__.py", line 6, in <module>
from .dict import Dict
File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/modal/dict.py", line 9, in <module>
from ._serialization import deserialize, serialize
File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/modal/_serialization.py", line 6, in <module>
import cloudpickle
File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/cloudpickle/__init__.py", line 4, in <module>
from cloudpickle.cloudpickle import * # noqa
File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/cloudpickle/cloudpickle.py", line 57, in <module>
from .compat import pickle
File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/cloudpickle/compat.py", line 13, in <module>
from _pickle import Pickler # noqa: F401
ModuleNotFoundError: No module named '_pickle'
I am using the following python version:
Python 3.9.16 (feeb267ead3e6771d3f2f49b83e1894839f64fb7, Dec 29 2022, 14:23:21)
If I am calling a Modal function from a Django backend, is there any way I can set the API token ID and secret in the call of the function lookup? I would be using DigitalOcean Apps for the Django backend, which might not give me access to a console for setting up the tokens.
Or do you have to set up the tokens beforehand instead of supplying them programmatically?
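For what it's worth, the Modal client can read credentials from environment variables instead of ~/.modal.toml (to my knowledge, MODAL_TOKEN_ID and MODAL_TOKEN_SECRET are the names it looks for), which fits hosted platforms where you only control the app's environment settings. The values below are placeholders:

import os

# Placeholders: set these in the DigitalOcean app's environment settings
# rather than hard-coding them in source.
os.environ.setdefault("MODAL_TOKEN_ID", "ak-...")
os.environ.setdefault("MODAL_TOKEN_SECRET", "as-...")

import modal

f = modal.Function.lookup("my-app", "my_function")  # hypothetical names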
Hey, is there any way we can set up an ML web app API using a job queue and a model-loading instance, such that when an API call comes in, a job ID is returned and a model class is initialized that loads an ML model into memory and uses it; and if another API call comes in during this window, it uses the same initialized model rather than loading another one?
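That sounds like the lifecycle-class pattern used elsewhere in this thread: load the model once per container in __enter__ on a @stub.cls class, and keep the container warm so later calls reuse it. A minimal sketch with placeholder names (the job-ID half would come from spawning calls instead of awaiting them, which is an assumption about how you'd wire the API):

from modal import Stub, method

stub = Stub("shared-model-queue")

def load_model():
    # Stand-in for real model loading, e.g. reading weights from a volume.
    return lambda x: x * 2

@stub.cls(gpu="T4", container_idle_timeout=300)
class Model:
    def __enter__(self):
        # Runs once per container; warm containers reuse this loaded model
        # instead of loading a new copy for each request.
        self.model = load_model()

    @method()
    def predict(self, x):
        return self.model(x)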
So I love Modal, but I'm using pipenv and so can't use the pip install/poetry install functions. I've experimented with a few run commands, like copying the Pipfiles and then pipenv-installing them, but they seem to run every time I deploy even if I haven't changed either file. Those do technically work, but it takes forever because it installs every Python dependency every time, and I would prefer to be able to specify stuff in the Dockerfile anyway.
If I just do from_dockerfile with a Dockerfile like this:
FROM python:3.11.5-bookworm
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
&& rm -rf /var/lib/apt/lists/*
RUN pip install pipenv
WORKDIR /app
COPY Pipfile Pipfile.lock /app/
RUN pipenv install --categories='packages cluster' --deploy --system
COPY . /app/
That part works, but I notice that there's a second step that happens after this where pip installs Modal's requirements, and after that all of my FastAPI stuff breaks. So right now I have just literally added a pipenv command to install over them at the end, and that does work:
image = modal.Image.from_dockerfile(
    "modal.Dockerfile", context_mount=assets
).run_commands("pipenv install --categories='packages cluster' --deploy --system")
But this feels so inelegant? Just wondering how you'd recommend I do this, thanks!
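One workaround sketch, offered as an assumption rather than a recommendation: export Pipfile.lock to a requirements file at deploy time (pipenv >= 2022.4 has a `pipenv requirements` subcommand; category filtering support depends on your pipenv version) and feed that to Modal's requirements-based layer, which is cached until the file changes:

import subprocess
import modal

# Hypothetical deploy-time export; the image layer only rebuilds when the
# generated requirements file's contents actually change.
subprocess.run(
    "pipenv requirements > requirements-modal.txt",
    shell=True,
    check=True,
)

image = modal.Image.debian_slim().pip_install_from_requirements("requirements-modal.txt")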
Greetings! I have the example Modal Podcast Transcriber running locally AND have the Podchaser API keys. However, it is unclear WHERE these keys need to go.
Does the Modal Secret code stub need to be added to a config file?
Or does the Modal Secret only work if the project is hosted on Modal.com?
Thank you!
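To my understanding (inferred from how the other examples in this thread wire secrets, so treat the names as assumptions): the keys go into a Modal Secret created in the Modal dashboard rather than a local config file, and the code references the secret by name, so it works both with local modal run and when deployed:

import os

from modal import Secret, Stub

stub = Stub("whisper-pod-transcriber")

# Assumption: a Secret named "podchaser" was created in the dashboard holding
# the Podchaser keys; the env var name is a guess at what the example expects.
@stub.function(secret=Secret.from_name("podchaser"))
def search_podcasts(query: str):
    client_id = os.environ["PODCHASER_CLIENT_ID"]
    ...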
Is it possible to pip_install from a public GitHub repo? It seems like it should be the same syntax as a requirements.txt, but this hasn't worked for me.
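For reference, the form that does appear elsewhere in this thread is a PEP 508 direct reference ("name @ git+URL") rather than a bare URL; the repo below is a placeholder:

import modal

# Same "pkg @ git+https://..." form the vllm examples above use; pinning a
# branch, tag, or commit after the trailing "@" is optional but recommended.
image = modal.Image.debian_slim().pip_install(
    "somepkg @ git+https://github.com/someorg/somepkg.git@main"
)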
(modal) arslan@Folio-1040-G2:~/nixense_vixion/To-Do/cog_wav2lip_modal$ modal volume ls wav2lip_volume results
Directory listing of 'results' in 'wav2lip_volume'
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━┓
┃ filename ┃ type ┃ created/modified ┃ size ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━┩
│ results/result.mp4 │ dir │ 2024-01-22 20:21:49+05:00 │ 6 B │
└────────────────────┴──────┴───────────────────────────┴──────┘
(modal) arslan@Folio-1040-G2:~/nixense_vixion/To-Do/cog_wav2lip_modal$ modal volume get wav2lip_volume results/result.mp4
[20:34:04] Requesting results/result.mp4 volume.py:229
Usage: modal volume get [OPTIONS] VOLUME_NAME REMOTE_PATH [LOCAL_DESTINATION]
Try 'modal volume get --help' for help.
╭─ Error ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Path is a not a regular file: b'results/result.mp4' │
╰──────────────────────────────────────────────────────────────────────────
70B Please? Pretty please?
[error] When I locally served the pod transcriber example, the currently pinned torchaudio version gives an error:
ERROR: Could not find a version that satisfies the requirement torchaudio==0.12.1 (from versions: 2.0.0, 2.0.1, 2.0.2, 2.1.0)
ERROR: No matching distribution found for torchaudio==0.12.1
[fix] Fixed by changing "torchaudio==0.12.1" to "torchaudio==2.1.0" (the latest version) in main.py, line 37.
https://modal-labs--whisper-pod-transcriber-fastapi-app.modal.run/ is also down right now, not sure if that's related to this error or not.
Your site seems to be down when prompting for a new token by running "modal token new" from the terminal; there is also no info on how to get a token on the site.
Not sure if I'm doing something wrong, but I'm just trying to run these two examples as is, using the latest code on main for the examples repo:
modal deploy stable_diffusion_xl.py
and then visiting the website generates "modal-http: internal server error: user function failed with status Failure: NameError("name 'FastAPI' is not defined")"
modal run dreambooth_app.py
generates "ValueError: Could not find the operator torchvision::nms. Please make sure you have already registered the operator and (if registered from C++) loaded it via torch.ops.load_library."
Apologies if I'm doing something wrong. Really excited to play with these cool examples!
Hi! I tried the transcriber (https://modal-labs--whisper-pod-transcriber-fastapi-app.modal.run/) on non-English content, and it didn't work. Perhaps you could state on the site that only English content is supported or change the Whisper model to the one that allows for any language?
Also, a small typo on the Stable Diffusion page (https://modal.com/docs/guide/ex/stable_diffusion_cli) "Stable Ddiffusion"
The 'https://modal-labs--whisper-pod-transcriber-fastapi-app.modal.run/#/' app is wonderful, but it only shows the transcript in the webpage. Is there any plan to add a download function to let users download the transcripts in srt format?
I think the Mixtral TGI example is broken:
2023-12-20T12:33:48.503017Z INFO text_generation_launcher: Files are already present on the host. Skipping download.
2023-12-20T12:33:49.729338Z INFO download: text_generation_launcher: Successfully downloaded weights.
2023-12-20T12:33:49.731995Z INFO shard-manager: text_generation_launcher: Starting shard rank=0
2023-12-20T12:33:49.732105Z INFO shard-manager: text_generation_launcher: Starting shard rank=1
2023-12-20T12:33:49.750997Z INFO shard-manager: text_generation_launcher: Starting shard rank=2
2023-12-20T12:33:49.751712Z INFO shard-manager: text_generation_launcher: Starting shard rank=3
2023-12-20T12:33:59.806288Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=1
2023-12-20T12:33:59.806311Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=2
2023-12-20T12:33:59.806288Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2023-12-20T12:33:59.806325Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=3
2023-12-20T12:34:08.721881Z WARN text_generation_launcher: Disabling exllama v2 and using v1 instead because there are issues when sharding
2023-12-20T12:34:08.721948Z WARN text_generation_launcher: Disabling exllama v2 and using v1 instead because there are issues when sharding
2023-12-20T12:34:08.721884Z WARN text_generation_launcher: Disabling exllama v2 and using v1 instead because there are issues when sharding
2023-12-20T12:34:08.722307Z WARN text_generation_launcher: Disabling exllama v2 and using v1 instead because there are issues when sharding
2023-12-20T12:34:09.822779Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2023-12-20T12:34:09.822815Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=2
2023-12-20T12:34:09.898596Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=1
2023-12-20T12:34:09.898623Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=3
2023-12-20T12:34:18.485392Z ERROR text_generation_launcher: Error when initializing model
Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in call
return get_command(self)(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in call
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
return _main(
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 89, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 215, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
self.run_forever()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
self._run_once()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
handle._run()
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 161, in serve_inner
model = get_model(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/init.py", line 310, in get_model
return FlashMixtral(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mixtral.py", line 21, in init
super(FlashMixtral, self).init(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mistral.py", line 318, in init
SLIDING_WINDOW_BLOCKS = math.ceil(config.sliding_window / BLOCK_SIZE)
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'
[the same "Error when initializing model" traceback repeats for the remaining three shards]
2023-12-20T12:34:19.909381Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=1
2023-12-20T12:34:19.909411Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2023-12-20T12:34:19.909390Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=3
2023-12-20T12:34:19.909394Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=2
2023-12-20T12:34:20.412017Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py:89 in │
│ serve │
│ │
│ 86 │ │ raise RuntimeError( │
│ 87 │ │ │ "Only 1 can be set between `dtype` and `quantize`, as they │
│ 88 │ │ ) │
│ ❱ 89 │ server.serve( │
│ 90 │ │ model_id, │
│ 91 │ │ revision, │
│ 92 │ │ sharded, │
│ │
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/server.py:215 │
│ in serve │
│ │
│ 212 │ │ │ logger.info("Signal received. Shutting down") │
│ 213 │ │ │ await server.stop(0) │
│ 214 │ │
│ ❱ 215 │ asyncio.run( │
│ 216 │ │ serve_inner( │
│ 217 │ │ │ model_id, revision, sharded, quantize, speculate, dtype, t │
│ 218 │ │ ) │
│ │
│ /opt/conda/lib/python3.10/asyncio/runners.py:44 in run │
│ │
│ 41 │ │ events.set_event_loop(loop) │
│ 42 │ │ if debug is not None: │
│ 43 │ │ │ loop.set_debug(debug) │
│ ❱ 44 │ │ return loop.run_until_complete(main) │
│ 45 │ finally: │
│ 46 │ │ try: │
│ 47 │ │ │ _cancel_all_tasks(loop) │
│ │
│ /opt/conda/lib/python3.10/asyncio/base_events.py:649 in run_until_complete │
│ │
│ 646 │ │ if not future.done(): │
│ 647 │ │ │ raise RuntimeError('Event loop stopped before Future comp │
│ 648 │ │ │
│ ❱ 649 │ │ return future.result() │
│ 650 │ │
│ 651 │ def stop(self): │
│ 652 │ │ """Stop running the event loop. │
│ │
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/server.py:161 │
│ in serve_inner │
│ │
│ 158 │ │ │ server_urls = [local_url] │
│ 159 │ │ │
│ 160 │ │ try: │
│ ❱ 161 │ │ │ model = get_model( │
│ 162 │ │ │ │ model_id, │
│ 163 │ │ │ │ revision, │
│ 164 │ │ │ │ sharded, │
│ │
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init │
│ .py:310 in get_model │
│ │
│ 307 │ │
│ 308 │ if model_type == "mixtral": │
│ 309 │ │ if MIXTRAL: │
│ ❱ 310 │ │ │ return FlashMixtral( │
│ 311 │ │ │ │ model_id, │
│ 312 │ │ │ │ revision, │
│ 313 │ │ │ │ quantize=quantize, │
│ │
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash │
│ _mixtral.py:21 in __init__ │
│ │
│ 18 │ │ dtype: Optional[torch.dtype] = None, │
│ 19 │ │ trust_remote_code: bool = False, │
│ 20 │ ): │
│ ❱ 21 │ │ super(FlashMixtral, self).__init__( │
│ 22 │ │ │ config_cls=MixtralConfig, │
│ 23 │ │ │ model_cls=FlashMixtralForCausalLM, │
│ 24 │ │ │ model_id=model_id, │
│ │
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash │
│ _mistral.py:318 in __init__ │
│ │
│ 315 │ │ │
│ 316 │ │ # Set context windows │
│ 317 │ │ SLIDING_WINDOW = config.sliding_window │
│ ❱ 318 │ │ SLIDING_WINDOW_BLOCKS = math.ceil(config.sliding_window / BLOC │
│ 319 │ │ │
│ 320 │ │ torch.distributed.barrier(group=self.process_group) │
│ 321 │
╰──────────────────────────────────────────────────────────────────────────────╯
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int' rank=0
2023-12-20T12:34:20.460680Z ERROR text_generation_launcher: Shard 0 failed to start
2023-12-20T12:34:20.460780Z INFO text_generation_launcher: Shutting down shards
[an identical "Shard complete standard error output" traceback follows for shard rank=3]
```
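If I'm reading the traceback right, the crash comes from config.sliding_window being None for this Mixtral checkpoint, which the TGI code divides by BLOCK_SIZE without a guard. A minimal sketch of the failing logic with an assumed guard (the BLOCK_SIZE value is hypothetical, and newer TGI releases may already handle this):

```python
import math

BLOCK_SIZE = 16  # hypothetical paged-attention block size

def sliding_window_blocks(sliding_window):
    # Mixtral configs can ship sliding_window = null; the unguarded
    # math.ceil(None / BLOCK_SIZE) is exactly what raises the TypeError above.
    if sliding_window is None:
        return None
    return math.ceil(sliding_window / BLOCK_SIZE)

print(sliding_window_blocks(4096))  # 256
print(sliding_window_blocks(None))  # None instead of a TypeError
```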
Created a secret named huggingface-secrets at https://modal.com/secrets, then executed the below commands in the CLI:
export HUGGINGFACE_TOKEN=huggingface-secrets
git clone https://github.com/modal-labs/modal-examples
cd modal-examples
modal run 06_gpu_and_ml/vllm_inference.py
and got the below error:
NotFoundError: No secret named huggingface - you can add secrets to your account at https://modal.com/secrets
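A hedged guess at what went wrong: the example presumably looks the secret up by the exact name "huggingface", and exporting a local HUGGINGFACE_TOKEN environment variable doesn't create a Modal secret at all. Something like this is what the lookup has to match:

```python
from modal import Secret

# The name passed here must match the secret's name in the Modal dashboard,
# so a secret created as "huggingface-secrets" won't satisfy a lookup for
# "huggingface" -- rename the secret (or the lookup) so the two agree.
secret = Secret.from_name("huggingface")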
Is there any documented process by which I can have a Modal container running, for instance, MongoDB, and then connect it to another Modal Python file running in a separate container?
To do this I need to know the URL or IP of the instance running the DB. I imagine this might be done using a Dockerfile, but I was wondering if this is supported through the Modal API?
Hello, I want to try out the Modal system. According to the README, I need a token, so I logged in with my GitHub account, but the settings page always automatically redirects to the waitlist page. Please help me, thanks!
Hey, the archive link used in pod_transcriber is deprecated:
pip_install(
    "https://github.com/openai/whisper/archive/9f70a352f9f8630ab3aa0d06af5cb9532bd8c21d.tar.gz",
)
which gives an error while transcribing audio. Instead of installing from that archive, you can simply use
pip_install(
    "git+https://github.com/openai/whisper.git",
)
which works perfectly.
Due to the issue described here: pytube/pytube#1768
It looks like YouTube has changed their backend API.
Hands down one of the best things I've used. Endless possibilities and surprisingly beginner-friendly.
It's so easy to use and distribute it's amazing. I've been playing with it for a few days and am already integrating cloud storage, local file conversion with CUDA support, transcription, and LLMs.
I'm very spoiled by ChatGPT handling a billion libraries, but not this one.
Yesterday's hydration error got me to suggest an improved RAG implementation. It was my very first time dealing with this technology, and the solution was nowhere to be found. It would be great to get a complex, permutable example on demand.
I humbly suggest adding the Modal documentation and GitHub examples to the RAG guide.
Using documentation RAG this way would improve development speed and lower compute costs during initial learning and debugging.
Thank you for your consideration. Best of Luck!
I'm using a FastAPI app and encountered difficulties while implementing WebSockets. Is WebSocket support available in Modal? If so, are there any specific considerations or limitations to be aware of?
import logging

from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from modal import Stub, Image, Secret, asgi_app
# ConnectionClosedOK lives in the websockets package (installed with
# uvicorn[standard]), not in fastapi as I first tried to import it.
from websockets.exceptions import ConnectionClosedOK

web_app = FastAPI()

agent_image = Image.debian_slim().pip_install(
    "langchain", "openai", "google-search-results", "google-api-python-client"
)

stub = Stub(
    "sellerAgent",
    image=agent_image,
    secrets=[
        Secret.from_name("my-openai-secret"),
        Secret.from_name("my-googlecloud-secret"),
    ],
)

@web_app.get("/")
async def read_main():
    return {"msg": "Hello World"}

@web_app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    while True:
        try:
            # Receive the client message (reply logic still TODO)
            user_msg = await websocket.receive_text()
            print(user_msg)
        except WebSocketDisconnect:
            logging.info("WebSocketDisconnect")
            # TODO: try to reconnect with back-off
            break
        except ConnectionClosedOK:
            logging.info("ConnectionClosedOK")
            # TODO: handle this?
            break
        except Exception as e:
            logging.error(e)

@stub.function()
@asgi_app()
def fastapi_app():
    return web_app
I really like the intent of the modal framework, however I cannot figure out how to determine the url of an app that has been started like:
modal serve modal_start.py
I've looked at the docs, which state that the URL is constructed as https://[user-id]--[stub_name]-[function_name].modal.run,
but no combination seems to work for me. Couldn't the public URL be output on the command line, or via another tool, or shown on the dashboard?
Any help gratefully received.
Thanks.
I have tried running this example a few times but I keep running into the following error:
```
Traceback (most recent call last):
File "/pkg/modal/_container_entrypoint.py", line 346, in handle_user_exception
yield
File "/pkg/modal/_container_entrypoint.py", line 575, in call_function_async
enter_res = enter_method()
File "/root/vllm.py", line 91, in enter
from vllm.engine.arg_utils import AsyncEngineArgs
ModuleNotFoundError: No module named 'vllm.engine'; 'vllm' is not a package
Runner failed with exception: ModuleNotFoundError("No module named 'vllm.engine'; 'vllm' is not a package")
Stopping app - uncaught exception raised locally: ModuleNotFoundError("No module named 'vllm.engine'; 'vllm' is not a package").
ModuleNotFoundError: No module named 'vllm.engine'; 'vllm' is not a package
```
Should vllm be installed locally?
Shouldn't the script be executed within modal cloud's container instances?
Thank you for your time.
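From the traceback, a plausible cause (my assumption, not confirmed): the entry file is mounted at /root/vllm.py, so a script named vllm.py shadows the installed vllm package inside the container, and import vllm resolves to the script itself, a plain module rather than a package. A quick way to check, assuming the same naming collision:

```python
# Run inside a container where a local vllm.py shadows the real package.
import vllm

print(vllm.__file__)              # /root/vllm.py instead of .../site-packages/vllm/__init__.py
print(hasattr(vllm, "__path__"))  # False: a plain module, so vllm.engine can't be imported
```

Renaming the local script (e.g. back to vllm_inference.py, as in the repo) would likely avoid the collision.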
When playing around with the dreambooth example, the documentation said you can kick off a training job with the command modal run dreambooth_app.py::stub.train. When doing so, you may encounter an error like AttributeError: 'str' object has no attribute 'xxx'.
The reproduction code is simple, just a snippet like:
import modal
from dataclasses import dataclass

stub = modal.Stub('dataclass-test')

@dataclass
class SharedConfig:
    a: str = 'a'

@stub.function()
def test(config=SharedConfig()):
    print(f'Type: {type(config)}, Content: {config}')

@stub.local_entrypoint()
def main():
    test.call()
When running the command modal run test.py, you get normal output:
ubuntu:~/projects/modal_test$ modal run test.py
✓ Initialized. View app at xxx
✓ Created objects.
├── 🔨 Created test.
├── 🔨 Created mount /home/ubuntu/projects/modal_test/test.py
└── 🔨 Created fix.
Type: <class 'test.SharedConfig'>, Content: SharedConfig(a='a')
✓ App completed.
When kicking off the stub function directly with modal run test.py::stub.test, you may notice that config became a string.
ubuntu:~/projects/modal_test$ modal run test.py::stub.test
✓ Initialized. View app at xxx
✓ Created objects.
├── 🔨 Created test.
├── 🔨 Created mount /home/ubuntu/projects/modal_test/test.py
└── 🔨 Created fix.
Type: <class 'str'>, Content: SharedConfig(a='a')
✓ App completed.
From my point of view, this behavior is unexpected.
A workaround is rather simple: just initialize the config inside the function rather than passing it as a parameter.
@stub.function()
def fix():
    config = SharedConfig()
    print(f'Type: {type(config)}, Content: {config}')
Now, running the stub function with modal run test.py::stub.fix behaves correctly.
ubuntu:~/projects/modal_test$ modal run test.py::stub.fix
✓ Initialized. View app at xxx
✓ Created objects.
├── 🔨 Created test.
├── 🔨 Created mount /home/ubuntu/projects/modal_test/test.py
└── 🔨 Created fix.
Type: <class 'test.SharedConfig'>, Content: SharedConfig(a='a')
✓ App completed.
However, I thought this might be worth examining; I guessed it might be something related to serialization? (Not sure what really happens under the hood.)
I'm trying to use the text_generation_inference.py example and getting an AttributeError: 'Function' object has no attribute 'remote'.
future: <Task finished name='Task-231' coro=<FastAPI.__call__() done, defined at /opt/conda/lib/python3.9/site-packages/fastapi/applications.py:267> exception=AttributeError("'Function' object has no attribute 'remote'")>
Traceback (most recent call last):
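One hedged possibility: .remote() replaced the older .call() in newer Modal clients, so a mismatch between the example's code and the installed client version could explain the missing attribute. A quick check, assuming nothing else is wrong:

```python
import modal

print(modal.__version__)  # upgrading may help: pip install --upgrade modal

# Newer clients invoke remote functions as:
#   result = my_function.remote(args)
# while older clients used:
#   result = my_function.call(args)
```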
Is there any way we can docker pull from private Docker Hub repos while defining Image.from_registry?
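In case it helps, Image.from_registry accepts credentials for private registries via a Modal secret; a sketch, assuming a secret named "dockerhub-creds" holding REGISTRY_USERNAME and REGISTRY_PASSWORD (check the docs for the exact parameter and key names in your client version):

```python
import modal

image = modal.Image.from_registry(
    "myuser/private-image:latest",  # hypothetical private tag
    secret=modal.Secret.from_name("dockerhub-creds"),
)
```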
This is my first time trying to fine-tune a Stable Diffusion model, so there's a lot I don't understand yet.
Full stack:
Traceback (most recent call last):
File "/root/sd-scripts/sdxl_train.py", line 753, in <module>
train(args)
File "/root/sd-scripts/sdxl_train.py", line 567, in train
accelerator.backward(loss)
File "/usr/local/lib/python3.10/site-packages/accelerate/accelerator.py", line 1983, in backward
self.scaler.scale(loss).backward(**kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/_tensor.py", line 492, in backward
torch.autograd.backward(
File "/usr/local/lib/python3.10/site-packages/torch/autograd/__init__.py", line 251, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/usr/local/lib/python3.10/site-packages/torch/autograd/function.py", line 288, in apply
return user_fn(self, *args)
File "/usr/local/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 271, in backward
outputs = ctx.run_function(*detached_inputs)
File "/root/sd-scripts/library/sdxl_original_unet.py", line 643, in custom_forward
return func(*inputs)
File "/root/sd-scripts/library/sdxl_original_unet.py", line 633, in forward_body
hidden_states = self.ff(self.norm3(hidden_states)) + hidden_states
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/sd-scripts/library/sdxl_original_unet.py", line 577, in forward
hidden_states = module(hidden_states)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/sd-scripts/library/sdxl_original_unet.py", line 556, in forward
return hidden_states * self.gelu(gate)
RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "../c10/cuda/CUDACachingAllocator.cpp":1154, please report a bug to PyTorch.
Running this on a modal.com A10G function with this command:
accelerate launch /root/sd-scripts/sdxl_train.py \
--pretrained_model_name_or_path=/vol/sd_xl_base_1.0.safetensors \
--dataset_config=/vol/d835023a-4ed5-4269-b323-9952930d3786/training_config.toml \
--output_dir=/vol/d835023a-4ed5-4269-b323-9952930d3786 \
--train_batch_size=1 \
--output_name=model.safetensors \
--save_model_as=safetensors \
--learning_rate=4e-7 \
--optimizer_type=adafactor \
--xformers \
--mixed_precision=fp16 \
--cache_latents \
--cache_text_encoder_outputs \
--gradient_checkpointing \
--no_half_vae
Docker image setup:
(
    Image.debian_slim(python_version="3.10")
    .apt_install(["ffmpeg"])
    .pip_install(
        "accelerate==0.23",
        "boto3",
        "opencv-python-headless",
        "pytorch-lightning",
        "tensorboard",
        "safetensors",
        "toml",
        "voluptuous",
        "open-clip-torch",
        "huggingface-hub",
        "datasets~=2.13",
        "diffusers[torch]",
        "einops",
        "ftfy",
        "smart_open",
        "transformers==4.30.2",
        "torch==2.0.1",
        "torchvision",
        "torchaudio",
        "triton",
        "tomli-w",
    )
    .pip_install("xformers", pre=True)
)
The training config is this dict, converted to TOML:
{
    'general': {
        'enable_bucket': True
    },
    'datasets': [
        {
            'resolution': 1024,
            'batch_size': 1,
            'subsets': [
                {
                    'image_dir': "/vol/d835023a-4ed5-4269-b323-9952930d3786/training_images",
                    'class_tokens': 'shs face',
                    'keep_tokens': 2,
                },
            ]
        },
    ]
}
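For reference, converting that dict to TOML is mechanical; since tomli-w is already in the image's pip_install list, something like this reproduces the training config file:

```python
import tomli_w

config = {
    "general": {"enable_bucket": True},
    "datasets": [
        {
            "resolution": 1024,
            "batch_size": 1,
            "subsets": [
                {
                    "image_dir": "/vol/d835023a-4ed5-4269-b323-9952930d3786/training_images",
                    "class_tokens": "shs face",
                    "keep_tokens": 2,
                },
            ],
        },
    ],
}

# tomli-w requires a binary file handle
with open("training_config.toml", "wb") as f:
    tomli_w.dump(config, f)
```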
Hi there!
I was experimenting and trying out the new text-generation-inference example/tutorial. The GPU config states to use two 80 GB A100 GPUs, but Modal seems to only allow 20 GB and 40 GB?
Awesome product though, and I'm really enjoying playing around with it!
Best,
Caesar
I'm having a timeout issue when uploading a large model with modal nfs put vol_name; is there any workaround for this?
Hi Team,
How can we do a modal lookup for a stub function defined within a class?
I am referring to your HuggingFace batch inference example - https://modal.com/docs/guide/ex/batch_inference_using_huggingface
Any help is appreciated.
Thanks!
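For what it's worth, a hedged sketch of how I'd expect the lookup to go (the names are taken from the batch inference example, and the "ClassName.method" label format may differ across client versions, so treat all of this as an assumption):

```python
import modal

# Assumes the app was deployed, e.g.
#   modal deploy batch_inference_using_huggingface.py
# and that deployed class methods are addressable as "ClassName.method".
f = modal.Function.lookup(
    "example-batch-inference-using-huggingface",  # deployed app name
    "SentimentAnalysis.predict",                  # hypothetical class.method label
)
print(f.call("Modal makes batch inference easy!"))  # .remote() in newer clients
```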