modal-labs / modal-examples Goto Github PK

Examples of programs built using Modal

License: MIT License

Python 81.40% JavaScript 3.83% HTML 2.40% TypeScript 2.40% CSS 4.06% Svelte 5.09% Jupyter Notebook 0.57% Shell 0.15% Jinja 0.11%

cloud distributed gpu machine-learning modal python pytorch serverless stable-diffusion web

modal-examples's People

Contributors

Stargazers

Watchers

Forkers

c00renut nduatik trellixvulnteam charlesfrye naifalqahtani usergit nebo333 vegarsti litanlitudan zmzlois octaflop xcke ravi03071991 touristshaun chilidor markhng525 simon-lawyer ssmele techthiyanes djben datalayer-externals dongreenberg chunzhimoe ken2190 lgs kr4t0n stanbiryukov idvorkin ctavolazzi einfachalex110 reedlaw pigstyalleycoders maxscheel aiscot kimishen mazinhozanelato anishpdalal ukituki matthewspear jhines2k7 asghar765 daodedo kumar045 taishikato qqq-tech darkhorse12022023 snowcittysolutions paixai pahdolabs hcchengithub scottrblock anhmike esther119 flexchar rexqin sorokinvld ryanmccauley haiderasad inayet kousun12 smallcloudco hcw087 pmukherj ar2427 abhishekc-bhs camenduru mharris021 c0c1 sumerjoshi michaeleliot 5l1v3r1 petrastore rnori-harv weslien golvellius32 thesocialdiner ego archit15singh rgreer4 yaominzh guydoesit79 suryatmodulus pingren kylechio mjdhasan ltcevil josssiiiah kazuph fenglui nicholaskao1029 oneseco-media modsushi runsascoded tailagency div99 yosun jxnl ayushmore chrislee973 pbadeer

modal-examples's Issues

"Fast inference with vLLM (Llama 2 13B)" example is broken

As per https://modal.com/docs/guide/ex/vllm_inference, I ran:

git clone https://github.com/modal-labs/modal-examples
cd modal-examples
modal run 06_gpu_and_ml/vllm_inference.py

Here's the error that I got:

Downloading ray-2.7.1-cp38-cp38-manylinux2014_x86_64.whl (62.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.5/62.5 MB 178.5 MB/s eta 0:00:00
Downloading transformers-4.34.0-py3-none-any.whl (7.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.7/7.7 MB 163.4 MB/s eta 0:00:00
Downloading xformers-0.0.22-cp38-cp38-manylinux2014_x86_64.whl (211.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 211.6/211.6 MB 165.8 MB/s eta 0:00:00
Downloading ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl (307 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 307.2/307.2 kB 204.4 MB/s eta 0:00:00
Downloading uvicorn-0.23.2-py3-none-any.whl (59 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 59.5/59.5 kB 177.2 MB/s eta 0:00:00
Downloading safetensors-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 197.8 MB/s eta 0:00:00
Downloading tokenizers-0.14.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.8/3.8 MB 191.2 MB/s eta 0:00:00
Downloading huggingface_hub-0.17.3-py3-none-any.whl (295 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 295.0/295.0 kB 199.8 MB/s eta 0:00:00
Building wheels for collected packages: vllm
  Building wheel for vllm (pyproject.toml): started
  Building wheel for vllm (pyproject.toml): finished with status 'error'
  error: subprocess-exited-with-error
  
  × Building wheel for vllm (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [140 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-cpython-38
      creating build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/sequence.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/block.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/outputs.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/sampling_params.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/logger.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/utils.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm
      copying vllm/config.py -> build/lib.linux-x86_64-cpython-38/vllm
      creating build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
      copying vllm/transformers_utils/tokenizer.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
      copying vllm/transformers_utils/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
      copying vllm/transformers_utils/config.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils
      creating build/lib.linux-x86_64-cpython-38/vllm/model_executor
      copying vllm/model_executor/input_metadata.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
      copying vllm/model_executor/weight_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
      copying vllm/model_executor/model_loader.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
      copying vllm/model_executor/utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
      copying vllm/model_executor/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor
      creating build/lib.linux-x86_64-cpython-38/vllm/engine
      copying vllm/engine/arg_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
      copying vllm/engine/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
      copying vllm/engine/llm_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
      copying vllm/engine/async_llm_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
      copying vllm/engine/ray_utils.py -> build/lib.linux-x86_64-cpython-38/vllm/engine
      creating build/lib.linux-x86_64-cpython-38/vllm/worker
      copying vllm/worker/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
      copying vllm/worker/worker.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
      copying vllm/worker/cache_engine.py -> build/lib.linux-x86_64-cpython-38/vllm/worker
      creating build/lib.linux-x86_64-cpython-38/vllm/core
      copying vllm/core/scheduler.py -> build/lib.linux-x86_64-cpython-38/vllm/core
      copying vllm/core/policy.py -> build/lib.linux-x86_64-cpython-38/vllm/core
      copying vllm/core/block_manager.py -> build/lib.linux-x86_64-cpython-38/vllm/core
      copying vllm/core/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/core
      creating build/lib.linux-x86_64-cpython-38/vllm/entrypoints
      copying vllm/entrypoints/api_server.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
      copying vllm/entrypoints/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
      copying vllm/entrypoints/llm.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints
      creating build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/mpt.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/baichuan.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/qwen.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
      copying vllm/transformers_utils/configs/falcon.py -> build/lib.linux-x86_64-cpython-38/vllm/transformers_utils/configs
      creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      copying vllm/model_executor/layers/attention.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      copying vllm/model_executor/layers/sampler.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      copying vllm/model_executor/layers/activation.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      copying vllm/model_executor/layers/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      copying vllm/model_executor/layers/layernorm.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/layers
      creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/parallel_state.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
      copying vllm/model_executor/parallel_utils/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils
      creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/mpt.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/opt.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/internlm.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/gpt2.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/baichuan.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_j.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/llama.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/qwen.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/bloom.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/falcon.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      copying vllm/model_executor/models/gpt_bigcode.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/models
      creating build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      copying vllm/model_executor/parallel_utils/tensor_parallel/random.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      copying vllm/model_executor/parallel_utils/tensor_parallel/mappings.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      copying vllm/model_executor/parallel_utils/tensor_parallel/layers.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      copying vllm/model_executor/parallel_utils/tensor_parallel/utils.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      copying vllm/model_executor/parallel_utils/tensor_parallel/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/model_executor/parallel_utils/tensor_parallel
      creating build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/protocol.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/api_server.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
      copying vllm/entrypoints/openai/__init__.py -> build/lib.linux-x86_64-cpython-38/vllm/entrypoints/openai
      running build_ext
      /tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: numpy.core.multiarray failed to import (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
        device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
      No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
      Traceback (most recent call last):
        File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
          return _build_backend().build_wheel(wheel_directory, config_settings,
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 434, in build_wheel
          return self._build_with_temp_dir(
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 419, in _build_with_temp_dir
          self.run_setup()
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 341, in run_setup
          exec(code, locals())
        File "<string>", line 145, in <module>
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/__init__.py", line 103, in setup
          return distutils.core.setup(**attrs)
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
          return run_commands(dist)
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
          dist.run_commands()
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 364, in run
          self.run_command("build")
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
          self.run_command(cmd_name)
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 88, in run
          _build_ext.run(self)
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
          self.build_extensions()
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 525, in build_extensions
          _check_cuda_version(compiler_name, compiler_version)
        File "/tmp/pip-build-env-u_96mrb3/overlay/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 413, in _check_cuda_version
          raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
      RuntimeError:
      The detected CUDA version (11.8) mismatches the version that was used to compile
      PyTorch (12.1). Please make sure to use the same CUDA versions.
      
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for vllm
Failed to build vllm
ERROR: Could not build wheels for vllm, which is required to install pyproject.toml-based projects
Terminating task due to error: failed to run builder command "python -m pip install typing-extensions==4.5.0 'vllm @ git+https://github.com/vllm-project/vllm.git@805de738f618f8b47ab0d450423d23db1e636fa2' "

Caused by:
    container exit status: 1
Runner failed with exception: task exited with failure, status = exit status: 1
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/alexkim/mambaforge/bin/modal:8 in <module>                                                │
│                                                                                                  │
│   7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 8 │   sys.exit(main())                                                                         │
│   9                                                                                              │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/__main__.py:9 in main               │
│                                                                                                  │
│    8 │   setup_rich_traceback()                                                                  │
│ ❱  9 │   entrypoint_cli()                                                                        │
│   10                                                                                             │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/core.py:1157 in __call__            │
│                                                                                                  │
│   1156 │   │   """Alias for :meth:`main`."""                                                     │
│ ❱ 1157 │   │   return self.main(*args, **kwargs)                                                 │
│   1158                                                                                           │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/typer/core.py:778 in main                 │
│                                                                                                  │
│   777 │   ) -> Any:                                                                              │
│ ❱ 778 │   │   return _main(                                                                      │
│   779 │   │   │   self,                                                                          │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/typer/core.py:216 in _main                │
│                                                                                                  │
│   215 │   │   │   with self.make_context(prog_name, args, **extra) as ctx:                       │
│ ❱ 216 │   │   │   │   rv = self.invoke(ctx)                                                      │
│   217 │   │   │   │   if not standalone_mode:                                                    │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/core.py:1688 in invoke              │
│                                                                                                  │
│   1687 │   │   │   │   with sub_ctx:                                                             │
│ ❱ 1688 │   │   │   │   │   return _process_result(sub_ctx.command.invoke(sub_ctx))               │
│   1689                                                                                           │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/core.py:1688 in invoke              │
│                                                                                                  │
│   1687 │   │   │   │   with sub_ctx:                                                             │
│ ❱ 1688 │   │   │   │   │   return _process_result(sub_ctx.command.invoke(sub_ctx))               │
│   1689                                                                                           │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/core.py:1434 in invoke              │
│                                                                                                  │
│   1433 │   │   if self.callback is not None:                                                     │
│ ❱ 1434 │   │   │   return ctx.invoke(self.callback, **ctx.params)                                │
│   1435                                                                                           │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/core.py:783 in invoke               │
│                                                                                                  │
│    782 │   │   │   with ctx:                                                                     │
│ ❱  783 │   │   │   │   return __callback(*args, **kwargs)                                        │
│    784                                                                                           │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/click/decorators.py:33 in new_func        │
│                                                                                                  │
│    32 │   def new_func(*args: "P.args", **kwargs: "P.kwargs") -> "R":                            │
│ ❱  33 │   │   return f(get_current_context(), *args, **kwargs)                                   │
│    34                                                                                            │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/cli/run.py:145 in f                 │
│                                                                                                  │
│   144 │   │                                                                                      │
│ ❱ 145 │   │   with run_stub(                                                                     │
│   146 │   │   │   stub,                                                                          │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/synchronicity/synchronizer.py:497 in      │
│ proxy_method                                                                                     │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/synchronicity/combined_types.py:26 in     │
│ __call__                                                                                         │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/contextlib.py:199 in __aenter__                         │
│                                                                                                  │
│   198 │   │   try:                                                                               │
│ ❱ 199 │   │   │   return await anext(self.gen)                                                   │
│   200 │   │   except StopAsyncIteration:                                                         │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/runner.py:88 in _run_stub           │
│                                                                                                  │
│    87 │   │   │   # Create all members                                                           │
│ ❱  88 │   │   │   await app._create_all_objects(                                                 │
│    89 │   │   │   │   stub._blueprint, post_init_state, environment_name, shell=shell, output_   │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/app.py:103 in _create_all_objects   │
│                                                                                                  │
│   102 │   │   │   │   existing_object_id = tag_to_object_id.get(tag)                             │
│ ❱ 103 │   │   │   │   await resolver.load(obj, existing_object_id)                               │
│   104 │   │   │   │   self._tag_to_object_id[tag] = obj.object_id                                │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:126 in load            │
│                                                                                                  │
│   125 │   │                                                                                      │
│ ❱ 126 │   │   return await cached_future                                                         │
│   127                                                                                            │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:102 in loader          │
│                                                                                                  │
│   101 │   │   │   async def loader():                                                            │
│ ❱ 102 │   │   │   │   await obj._load(obj, self, existing_object_id)                             │
│   103 │   │   │   │   if existing_object_id is not None and obj.object_id != existing_object_i   │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/image.py:176 in _load               │
│                                                                                                  │
│    175 │   │   │   for image in base_images.values():                                            │
│ ❱  176 │   │   │   │   base_image_ids.append((await resolver.load(image)).object_id)             │
│    177 │   │   │   base_images_pb2s = [                                                          │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:126 in load            │
│                                                                                                  │
│   125 │   │                                                                                      │
│ ❱ 126 │   │   return await cached_future                                                         │
│   127                                                                                            │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:102 in loader          │
│                                                                                                  │
│   101 │   │   │   async def loader():                                                            │
│ ❱ 102 │   │   │   │   await obj._load(obj, self, existing_object_id)                             │
│   103 │   │   │   │   if existing_object_id is not None and obj.object_id != existing_object_i   │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/image.py:176 in _load               │
│                                                                                                  │
│    175 │   │   │   for image in base_images.values():                                            │
│ ❱  176 │   │   │   │   base_image_ids.append((await resolver.load(image)).object_id)             │
│    177 │   │   │   base_images_pb2s = [                                                          │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:126 in load            │
│                                                                                                  │
│   125 │   │                                                                                      │
│ ❱ 126 │   │   return await cached_future                                                         │
│   127                                                                                            │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:102 in loader          │
│                                                                                                  │
│   101 │   │   │   async def loader():                                                            │
│ ❱ 102 │   │   │   │   await obj._load(obj, self, existing_object_id)                             │
│   103 │   │   │   │   if existing_object_id is not None and obj.object_id != existing_object_i   │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/image.py:176 in _load               │
│                                                                                                  │
│    175 │   │   │   for image in base_images.values():                                            │
│ ❱  176 │   │   │   │   base_image_ids.append((await resolver.load(image)).object_id)             │
│    177 │   │   │   base_images_pb2s = [                                                          │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:126 in load            │
│                                                                                                  │
│   125 │   │                                                                                      │
│ ❱ 126 │   │   return await cached_future                                                         │
│   127                                                                                            │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/_resolver.py:102 in loader          │
│                                                                                                  │
│   101 │   │   │   async def loader():                                                            │
│ ❱ 102 │   │   │   │   await obj._load(obj, self, existing_object_id)                             │
│   103 │   │   │   │   if existing_object_id is not None and obj.object_id != existing_object_i   │
│                                                                                                  │
│ /Users/alexkim/mambaforge/lib/python3.10/site-packages/modal/image.py:296 in _load               │
│                                                                                                  │
│    295 │   │   │   if result.status == api_pb2.GenericResult.GENERIC_STATUS_FAILURE:             │
│ ❱  296 │   │   │   │   raise RemoteError(f"Image build for {image_id} failed with the exception  │
│    297 │   │   │   elif result.status == api_pb2.GenericResult.GENERIC_STATUS_TERMINATED:        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RemoteError: Image build for im-bvJc9XyO2U9rSetK1p4yUT failed with the exception:
task exited with failure, status = exit status: 1

a1111 example

this a1111 example is broken and gives the issue below.

I think the issue is that the setup is just out of date for a1111

╰─$ modal run a1111_webui.py                                                                                                            2 ↵
✓ Initialized. View run at https://modal.com/nicholaskao1029/apps/ap-LdFciiKaOuin89Vcq9BGg4
Building image im-8pFRuirFJZ4P0wH3Lin4Fm

=> Step 0: FROM base

=> Step 1: RUN cd /webui && . venv/bin/activate && python -c 'from modules import shared_init, initialize; shared_init.initialize(); initialize.initialize()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/webui/modules/shared_init.py", line 5, in <module>
    from modules import shared
  File "/webui/modules/shared.py", line 3, in <module>
    import gradio as gr
  File "/webui/venv/lib/python3.10/site-packages/gradio/__init__.py", line 3, in <module>
    import gradio.components as components
  File "/webui/venv/lib/python3.10/site-packages/gradio/components/__init__.py", line 1, in <module>
    from gradio.components.annotated_image import AnnotatedImage
  File "/webui/venv/lib/python3.10/site-packages/gradio/components/annotated_image.py", line 12, in <module>
    from gradio import utils
  File "/webui/venv/lib/python3.10/site-packages/gradio/utils.py", line 353, in <module>
    class AsyncRequest:
  File "/webui/venv/lib/python3.10/site-packages/gradio/utils.py", line 372, in AsyncRequest
    client = httpx.AsyncClient()
  File "/webui/venv/lib/python3.10/site-packages/httpx/_client.py", line 1397, in __init__
    self._transport = self._init_transport(
  File "/webui/venv/lib/python3.10/site-packages/httpx/_client.py", line 1445, in _init_transport
    return AsyncHTTPTransport(
  File "/webui/venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 275, in __init__
    self._pool = httpcore.AsyncConnectionPool(
TypeError: AsyncConnectionPool.__init__() got an unexpected keyword argument 'socket_options'
Terminating task due to error: failed to run builder command "cd /webui && . venv/bin/activate && python -c 'from modules import shared_init, initialize; shared_init.initialize(); initialize.initialize()'"

Caused by:
    container exit status: 1
Runner failed with exception: task exited with failure, status = exit status: 1
Stopping app - uncaught exception raised locally: RemoteError('Image build for im-8pFRuirFJZ4P0wH3Lin4Fm failed with the exception:\ntask exited with failure, status = exit status: 1').
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/bin/modal:8 in <module>                            │
│                                                                                                  │
│   7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 8 │   sys.exit(main())                                                                         │
│   9                                                                                              │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/__main__.py:9   │
│ in main                                                                                          │
│                                                                                                  │
│    8 │   setup_rich_traceback()                                                                  │
│ ❱  9 │   entrypoint_cli()                                                                        │
│   10                                                                                             │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/core.py:1157 in │
│ __call__                                                                                         │
│                                                                                                  │
│   1156 │   │   """Alias for :meth:`main`."""                                                     │
│ ❱ 1157 │   │   return self.main(*args, **kwargs)                                                 │
│   1158                                                                                           │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/typer/core.py:778 in  │
│ main                                                                                             │
│                                                                                                  │
│   777 │   ) -> Any:                                                                              │
│ ❱ 778 │   │   return _main(                                                                      │
│   779 │   │   │   self,                                                                          │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/typer/core.py:216 in  │
│ _main                                                                                            │
│                                                                                                  │
│   215 │   │   │   with self.make_context(prog_name, args, **extra) as ctx:                       │
│ ❱ 216 │   │   │   │   rv = self.invoke(ctx)                                                      │
│   217 │   │   │   │   if not standalone_mode:                                                    │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/core.py:1688 in │
│ invoke                                                                                           │
│                                                                                                  │
│   1687 │   │   │   │   with sub_ctx:                                                             │
│ ❱ 1688 │   │   │   │   │   return _process_result(sub_ctx.command.invoke(sub_ctx))               │
│   1689                                                                                           │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/core.py:1688 in │
│ invoke                                                                                           │
│                                                                                                  │
│   1687 │   │   │   │   with sub_ctx:                                                             │
│ ❱ 1688 │   │   │   │   │   return _process_result(sub_ctx.command.invoke(sub_ctx))               │
│   1689                                                                                           │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/core.py:1434 in │
│ invoke                                                                                           │
│                                                                                                  │
│   1433 │   │   if self.callback is not None:                                                     │
│ ❱ 1434 │   │   │   return ctx.invoke(self.callback, **ctx.params)                                │
│   1435                                                                                           │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/core.py:783 in  │
│ invoke                                                                                           │
│                                                                                                  │
│    782 │   │   │   with ctx:                                                                     │
│ ❱  783 │   │   │   │   return __callback(*args, **kwargs)                                        │
│    784                                                                                           │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/click/decorators.py:3 │
│ 3 in new_func                                                                                    │
│                                                                                                  │
│    32 │   def new_func(*args: "P.args", **kwargs: "P.kwargs") -> "R":                            │
│ ❱  33 │   │   return f(get_current_context(), *args, **kwargs)                                   │
│    34                                                                                            │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/cli/run.py:145  │
│ in f                                                                                             │
│                                                                                                  │
│   144 │   │                                                                                      │
│ ❱ 145 │   │   with run_stub(                                                                     │
│   146 │   │   │   stub,                                                                          │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/synchronicity/synchro │
│ nizer.py:497 in proxy_method                                                                     │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/synchronicity/combine │
│ d_types.py:26 in __call__                                                                        │
│                                                                                                  │
│ /usr/lib/python3.10/contextlib.py:199 in __aenter__                                              │
│                                                                                                  │
│   198 │   │   try:                                                                               │
│ ❱ 199 │   │   │   return await anext(self.gen)                                                   │
│   200 │   │   except StopAsyncIteration:                                                         │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/runner.py:140   │
│ in _run_stub                                                                                     │
│                                                                                                  │
│   139 │   │   │   exc_info = e                                                                   │
│ ❱ 140 │   │   │   raise e                                                                        │
│   141 │   │   finally:                                                                           │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/runner.py:94 in │
│ _run_stub                                                                                        │
│                                                                                                  │
│    93 │   │   │   # Create all members                                                           │
│ ❱  94 │   │   │   await app._create_all_objects(                                                 │
│    95 │   │   │   │   stub._indexed_objects, app_state, environment_name, shell=shell, output_   │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/app.py:106 in   │
│ _create_all_objects                                                                              │
│                                                                                                  │
│   105 │   │   │   │   existing_object_id = tag_to_object_id.get(tag)                             │
│ ❱ 106 │   │   │   │   await resolver.load(obj, existing_object_id)                               │
│   107 │   │   │   │   self._tag_to_object_id[tag] = obj.object_id                                │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/_resolver.py:13 │
│ 1 in load                                                                                        │
│                                                                                                  │
│   130 │   │                                                                                      │
│ ❱ 131 │   │   return await cached_future                                                         │
│   132                                                                                            │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/_resolver.py:10 │
│ 4 in loader                                                                                      │
│                                                                                                  │
│   103 │   │   │   │   # TODO(erikbern): do we need existing_object_id for those?                 │
│ ❱ 104 │   │   │   │   await asyncio.gather(*[self.load(dep) for dep in obj.deps()])              │
│   105                                                                                            │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/_resolver.py:13 │
│ 1 in load                                                                                        │
│                                                                                                  │
│   130 │   │                                                                                      │
│ ❱ 131 │   │   return await cached_future                                                         │
│   132                                                                                            │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/_resolver.py:10 │
│ 7 in loader                                                                                      │
│                                                                                                  │
│   106 │   │   │   │   # Load the object itself                                                   │
│ ❱ 107 │   │   │   │   await obj._load(obj, self, existing_object_id)                             │
│   108 │   │   │   │   if existing_object_id is not None and obj.object_id != existing_object_i   │
│                                                                                                  │
│ /mnt/nvme_drive/Work/sd/modal/a1111modal/venv/lib/python3.10/site-packages/modal/image.py:318 in │
│ _load                                                                                            │
│                                                                                                  │
│    317 │   │   │   if result.status == api_pb2.GenericResult.GENERIC_STATUS_FAILURE:             │
│ ❱  318 │   │   │   │   raise RemoteError(f"Image build for {image_id} failed with the exception  │
│    319 │   │   │   elif result.status == api_pb2.GenericResult.GENERIC_STATUS_TERMINATED:        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RemoteError: Image build for im-8pFRuirFJZ4P0wH3Lin4Fm failed with the exception:
task exited with failure, status = exit status: 1

Example for GPT-3 Slack chatbot

Hi! I was looking for a way to build a GPT-3 Slack chatbot. I'd like the bot to work like this: in a Slack channel, respond to @mention. I heard good things about Modal and would like to try. Would you consider adding an example?

Function run, get billing costs

Hi modal team,

First of all thank you for such an amazing product, I am wondering if it is possible to retrieve the billing costs for a specific function run. If not, is it planned?

Enhancement: Improve the RAG example using Modal's docs

Hands down one of the best things I've used. Endless possibilities and surprisingly user-friendly.

It's so easy to use and distribute it's amazing. I've been playing with it for a few days and am already integrating cloud storage, and local file conversion with CUDA support, transcription, and LLMs.

I'm very spoiled due to ChatGPT handling a billion libraries but not this one.

Yesterday's hydration error got me to suggest an improved RAG implementation. It was my very first time dealing with this technology and the solution was nowhere to be found. It would be great to get a complex, permutable example on demand.

I humbly suggest adding Modal's documentation and GitHub examples to the RAG guide.

Using it in this way could improve the speed of development for newcomers, and lower compute costs during debugging.

Thank you for your consideration. Best of Luck!

pip3 installation in image

while creating a container Image, how to install python3 packages?

Error with Getting started example - "Can't serialize object <modal.functions._FunctionHandle"

I installed Modal in Jupyter Notebook, Win 10 machine and ran the following code:

!pip install modal-client
!modal token new

I was prompted to sign in a new browser, got confetti.

`
import modal

stub = modal.Stub("example-get-started")

@stub.function
def square(x):
print("This code is running on a remote worker!")
return x**2

@stub.local_entrypoint
def main():
print("the square is", square.call(42))
`
Received below error message:

Any idea what could be causing it ? Thanks.

Converting a class method directly to an endpoint

In order to avoid the model initialization, I used modal lifecyle.

@stub.cls(....)
class sample_cls() :
     def __enter__(self) : 
           ## initilization

     @method()
     def sample_func() :
          ## some work

Now if I directly want to create a web_endpoint out of this class method, how can I do that?

I tried putting @web_endpoint() decorator on the method as well as the class, both did not work.
I tried to create a definition out of the class's scope and put the @web_endpoint decorator, but it is asking me to first convert this definition into a @stub.function first (which i want to avoid for increased latency).

Please help, thanks.

AutoGPTQ Llama2 Example?

I've been struggling to get a Llama 2 with AutoGPTQ working, problem because I'm not building the image correctly. It would be very much appreciated if there was an example that showed the correct way to do this with the CUDA dependencies. Thanks!

AttributeError: module 'modal' has no attribute 'Stub'

When executing the first example:

import modal

stub = modal.Stub("example-hello-world")

@stub.function()
def square(x):
    print("This code is running on a remote worker!")
    return x**2


@stub.local_entrypoint()
def main():
    print("the square is", square.remote(42))

Running it with either python example.py or with modal run example.py, the following error raises:

AttributeError: module 'modal' has no attribute 'Stub'

Tried changing versions of the modal package and nothing worked.

I'm using windows 10 and a brand new conda environment.

controlnet_gradio_demos.py failed

I got this error when run modal serve 06_gpu_and_ml/controlnet/controlnet_gradio_demos.py

My python version is
Python 3.11.3 (main, May 15 2023, 18:01:31) [Clang 14.0.6 ] on darwin

  Building wheel for tokenizers (pyproject.toml): started
  Building wheel for tokenizers (pyproject.toml): finished with status 'error'
  error: subprocess-exited-with-error

  × Building wheel for tokenizers (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [51 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-cpython-311
      creating build/lib.linux-x86_64-cpython-311/tokenizers
      copying py_src/tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers
      creating build/lib.linux-x86_64-cpython-311/tokenizers/models
      copying py_src/tokenizers/models/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/models
      creating build/lib.linux-x86_64-cpython-311/tokenizers/decoders
      copying py_src/tokenizers/decoders/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/decoders
      creating build/lib.linux-x86_64-cpython-311/tokenizers/normalizers
      copying py_src/tokenizers/normalizers/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/normalizers
      creating build/lib.linux-x86_64-cpython-311/tokenizers/pre_tokenizers
      copying py_src/tokenizers/pre_tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/pre_tokenizers

removal of a shared volume

Hi all,

is there any way to remove a shared volume?

Dreambooth example is missing scheduler_config.json?

modal run dreambooth_app.py
gradio url live
prompt sent...
image returned

Error no file named scheduler_config.json found in directory /model.

Traceback (most recent call last):
  File "/pkg/modal/_container_entrypoint.py", line 346, in handle_user_exception
    yield
  File "/pkg/modal/_container_entrypoint.py", line 510, in call_function_sync
    enter_res = enter_method()
  File "/root/dreambooth_app.py", line 281, in __enter__
    ddim = DDIMScheduler.from_pretrained(MODEL_DIR, subfolder="scheduler")
  File "/root/src/diffusers/schedulers/scheduling_utils.py", line 134, in from_pretrained
    config, kwargs = cls.load_config(
  File "/root/src/diffusers/configuration_utils.py", line 320, in load_config
    raise EnvironmentError(
OSError: Error no file named scheduler_config.json found in directory /model.
Runner failed with exception: OSError('Error no file named scheduler_config.json found in directory /model.')

AttributeError: module 'modal' has no attribute 'forward'

while running https://modal.com/docs/examples/serve_streamlit#sessions

Is it possible to run an application for photogrammetry (creating 3D models from images)?

Is it possible to run an application for photogrammetry (creating 3D models from images)? For example, I use Agisoft Metashape, they have a client for Linux.

@web_endpoint decorator on class method not producing endpoint

import os
from typing import Dict

from modal import Image, Secret, Stub, method, gpu, web_endpoint


def download_model_to_folder():
    from huggingface_hub import snapshot_download

    snapshot_download(
        "Phind/Phind-CodeLlama-34B-v2",
        # "codellama/CodeLlama-13b-Instruct-hf",
        local_dir="/model",
        token=os.environ["HUGGINGFACE_TOKEN"],
    )


MODEL_DIR = "/model"

image = (
    Image.from_dockerhub("nvcr.io/nvidia/pytorch:22.12-py3")
    .pip_install(
        "torch==2.0.1", index_url="https://download.pytorch.org/whl/cu118"
    )
    # Pinned to 08/15/2023
    .pip_install(
        "vllm @ git+https://github.com/vllm-project/vllm.git@main",
        "typing-extensions==4.5.0",  # >=4.6 causes typing issues
    )
    .pip_install("hf-transfer~=0.1")
    .env({"HF_HUB_ENABLE_HF_TRANSFER": "1"})
    .run_function(
        download_model_to_folder,
        secret=Secret.from_name("huggingface"),
        timeout=60 * 20,
    )
)

stub = Stub("phind", image=image)


@stub.cls(gpu=gpu.A100(memory=40, count=4), secret=Secret.from_name("huggingface"))
class Model:
    def __enter__(self):
        from vllm import LLM

        # Load the model. Tip: MPT models may require `trust_remote_code=true`.
        self.llm = LLM(MODEL_DIR, tensor_parallel_size=4)
        self.template = "<<SYS>>\n{system}\n<</SYS>>\n\n[INST]{user}[/INST]"

    @web_endpoint(method="POST")
    def generate(self, params: Dict):
        from vllm import SamplingParams

        print(f"Received: {params}")
        prompts = [
            self.template.format(system="", user=params['prompt'])
        ]
        sampling_params = SamplingParams(
            temperature=0,
            top_p=1,
            max_tokens=800,
            presence_penalty=1.15,
        )
        result = self.llm.generate(prompts, sampling_params)
        num_tokens = 0
        for output in result:
            num_tokens += len(output.outputs[0].token_ids)
            print(output.prompt, output.outputs[0].text, "\n\n", sep="")
        print(f"Generated {num_tokens} tokens")


@stub.local_entrypoint()
def main():
    model = Model()

Upon running modal serve vllm_modal_phind.py:

✓ Initialized. View app at https://modal.com/apps/ap-EYh6LtFTbbGNwPObPqC0oD
✓ Created objects.
├── 🔨 Created download_model_to_folder.
├── 🔨 Created mount /home/user/src/project/scratch/vllm_modal_phind.py
├── 🔨 Created Model.generate.
└── 🔨 Created mount /home/user/src/project/scratch/vllm_modal_phind.py
️️⚡️ Serving... hit Ctrl-C to stop!
└── Watching /home/user/src/phind/scratch.
⠴ Running app...

No endpoint?

Issues I read before:
App/Stub URL #346
Converting a class method directly to an endpoint #310

NotFoundError: App not found

Hi, I am trying to run the eg of Llama-2 and it runs fine but the app stops my goal is to deploy the llama-2 API that I can use with the front end. So is Modal.com a good choice for me? currently, I can see an "example-tgi-Llama-2-70b-chat-hf" app in my Modal dashboard, But when I try to run the run.py script it says "NotFoundError: App not found"

I don't code in Python but could someone help me?

run.py:

import modal

f = modal.Function.lookup("example-tgi-Llama-2-70b-chat-hf", "Model.generate")
f.remote("What is the story about the fox and grapes?")

Whisper example - Missing last few sentences.

Just trying out this on a podcast I listen to and the transcription seems to be always missing the last couple of sentences.

e.g.

https://modal-labs--whisper-pod-transcriber-fastapi-app.modal.run/#/episode/4861529/d3aa80319a5ee09634115b2d97472a9b

how to imports packages from a cloned repo in modal?

import sys
import modal
from typing import Dict
from modal import Image,Volume, web_endpoint, enter, method

project_image = (Image.debian_slim(python_version="3.10")
                .apt_install("git","ffmpeg","libsm6","libxext6")
				.pip_install_from_requirements("requirements.txt")
    			.run_commands("git clone https://github.com/Arslan-Mehmood1/cog_Wav2Lip_COLAB_working.git",
                              "cd cog_Wav2Lip_COLAB_working && pip install -r requirements.txt"))


vol = Volume.persisted("wav2lip_volume")
stub =  modal.Stub("wav2lip")

with project_image.imports():
    # import importlib
    # import sys
    # sys.path.append("cog_Wav2Lip_COLAB_working")    
    # module_name = "cog_Wav2Lip_COLAB_working"
    
    # # Import the module dynamically
    # wav2lip = importlib.import_module(module_name)
    # import git
    import gdown
    import zipfile
    import argparse
    import math
    import os
    import platform
    import subprocess

    import cv2
    import numpy as np
    import torch
    from tqdm import tqdm

    import audio
    # # from face_detect import face_rect
    from models import Wav2Lip

    from batch_face import RetinaFace
    from time import time


@stub.cls(gpu="T4", image=project_image, volumes={"/data": vol}, container_idle_timeout=300)
class Model:
    @enter()
    def enter(self, extract_to_directory="/data"):
        # Download the necessary checkpoints
        zip_file_url = "https://drive.google.com/file/d/1_AlLfRcu-82u9Wf0U2y8A4F5MamoNQnu/view?usp=drive_link"
        zip_file_path = os.path.join(extract_to_directory, 'ckpts.zip')
        gdown.download(url=zip_file_url, output=zip_file_path, quiet=False, fuzzy=True)
        # Create the target directory if it doesn't exist
        os.makedirs(extract_to_directory, exist_ok=True)
        # Extract contents to the target directory
        with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
            zip_ref.extractall(extract_to_directory)
        # Remove the downloaded ZIP file
        os.remove(zip_file_path)
        print(f"Successfully downloaded and extracted to '{extract_to_directory}'.")
        print("\n\n\t",os.listdir(extract_to_directory),"\n\n")
        print("\n\n\t",os.listdir(),"\n\n")

        vol.commit()

    @method()
    def predict():
        print("...Hi")

@stub.function(image=project_image, volumes={"/data": vol}, gpu="T4", container_idle_timeout=300)
@web_endpoint(method="POST",label="predict")
def main():

    response = Model().predict.remote()
    return "Hi"

(modal) arslan@Folio-1040-G2:~/nixense_vixion/To-Do/cog_wav2lip_modal$ modal serve modal_wav2lip.py
✓ Initialized. View run at https://modal.com/arslan-mehmood1/apps/ap-LqXMIgbHdlauqSsfbir7SW
✓ Created objects.
├── 🔨 Created mount /home/arslan/nixense_vixion/To-Do/cog_wav2lip_modal/modal_wav2lip.py
├── 🔨 Created Model.predict.
└── 🔨 Created main => https://arslan-mehmood1--predict-dev.modal.run
⚡️ Serving... hit Ctrl-C to stop!
└── Watching /home/arslan/nixense_vixion/To-Do/cog_wav2lip_modal.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/pkg/modal/_container_entrypoint.py", line 850, in
main(container_args, client)
File "/pkg/modal/_container_entrypoint.py", line 805, in main
container_app.hydrate_function_deps(imp_fun.function, dep_object_ids)
File "/pkg/synchronicity/synchronizer.py", line 497, in proxy_method
return wrapped_method(instance, *args, **kwargs)
File "/pkg/synchronicity/synchronizer.py", line 398, in f_wrapped
res = f(*args, **kwargs)
File "/pkg/modal/app.py", line 325, in hydrate_function_deps
obj._hydrate(object_id, self._client, metadata)
File "/pkg/modal/object.py", line 107, in _hydrate
self._hydrate_metadata(metadata)
File "/pkg/modal/image.py", line 164, in _hydrate_metadata
raise exc
File "/pkg/modal/image.py", line 1361, in imports
yield
File "/root/modal_wav2lip.py", line 36, in
import audio
ModuleNotFoundError: No module named 'audio'

The audio.py is present in the repo which I cloned.

Dreambooth example failing, probably due to dependency issue

tl;dr: dreambooth_app.py demo not running, would expect to run as is. Seems to be an issue with package versions but I can't figure out what the correct combination should be.

I'm trying to run the dreambooth_app.py (using: modal run dreambooth_app.py) but unfortunately I'm getting an error relating to the version of accelerate. I'm using the code exactly as is, I've just swapped in links to my own images.

Traceback (most recent call last):
  File "/pkg/modal/_container_entrypoint.py", line 351, in handle_input_exception
    yield
  File "/pkg/modal/_container_entrypoint.py", line 437, in run_inputs
    res = imp_fun.fun(*args, **kwargs)
  File "/root/dreambooth_app.py", line 206, in train
    from accelerate.utils import write_basic_config
  File "/usr/local/lib/python3.10/site-packages/accelerate/__init__.py", line 3, in <module>
    from .accelerator import Accelerator
  File "/usr/local/lib/python3.10/site-packages/accelerate/accelerator.py", line 33, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/usr/local/lib/python3.10/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/usr/local/lib/python3.10/site-packages/accelerate/utils/__init__.py", line 119, in <module>
    from .megatron_lm import (
  File "/usr/local/lib/python3.10/site-packages/accelerate/utils/megatron_lm.py", line 32, in <module>
    from transformers.modeling_outputs import (
  File "/usr/local/lib/python3.10/site-packages/transformers/__init__.py", line 26, in <module>
    from . import dependency_versions_check
  File "/usr/local/lib/python3.10/site-packages/transformers/dependency_versions_check.py", line 57, in <module>
    require_version_core(deps[pkg])
  File "/usr/local/lib/python3.10/site-packages/transformers/utils/versions.py", line 117, in require_version_core
    return require_version(requirement, hint)
  File "/usr/local/lib/python3.10/site-packages/transformers/utils/versions.py", line 111, in require_version
    _compare_versions(op, got_ver, want_ver, requirement, pkg, hint)
  File "/usr/local/lib/python3.10/site-packages/transformers/utils/versions.py", line 44, in _compare_versions
    raise ImportError(
ImportError: accelerate>=0.20.3 is required for a normal functioning of this module, but found accelerate==0.19.0.
Try: pip install transformers -U or pip install -e '.[dev]' if you're working with git main

When I specify accelerate>=0.20.3 I get past the pip install part of the code but I get an error when subprocess is called:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/xformers/ops/fmha/triton.py", line 17, in <module>
    from flash_attn.flash_attn_triton import (
ModuleNotFoundError: No module named 'flash_attn

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/examples/dreambooth/train_dreambooth.py", line 32, in <module>
    import diffusers
  File "/root/src/diffusers/__init__.py", line 35, in <module>
    from .models import (
  File "/root/src/diffusers/models/__init__.py", line 19, in <module>
    from .autoencoder_kl import AutoencoderKL
  File "/root/src/diffusers/models/autoencoder_kl.py", line 23, in <module>
    from .vae import Decoder, DecoderOutput, DiagonalGaussianDistribution, Encoder
  File "/root/src/diffusers/models/vae.py", line 22, in <module>
    from .unet_2d_blocks import UNetMidBlock2D, get_down_block, get_up_block
  File "/root/src/diffusers/models/unet_2d_blocks.py", line 18, in <module>
    from .attention import AttentionBlock
  File "/root/src/diffusers/models/attention.py", line 22, in <module>
    from .cross_attention import CrossAttention
  File "/root/src/diffusers/models/cross_attention.py", line 25, in <module>
    import xformers.ops
  File "/usr/local/lib/python3.10/site-packages/xformers/ops/__init__.py", line 8, in <module>
    from .fmha import (
  File "/usr/local/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 10, in <module>
    from . import cutlass, decoder, flash, small_k, triton
  File "/usr/local/lib/python3.10/site-packages/xformers/ops/fmha/triton.py", line 39, in <module>
    flash_attn = import_module_from_path(
  File "/usr/local/lib/python3.10/site-packages/xformers/ops/fmha/triton.py", line 36, in import_module_from_path
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 879, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1016, in get_code
  File "<frozen importlib._bootstrap_external>", line 1073, in get_data
FileNotFoundError: [Errno 2] No such file or directory: '/root/third_party/flash-attention/flash_attn/flash_attn_triton.py'
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/usr/local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 979, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/site-packages/accelerate/commands/launch.py", line 628, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)

Installing flash_attn then I get an error about torch not being installed. So I then tried to install flash_attn after torch is installed and then I get the following error:

     Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-ji7oryv0/flash-attn_4dc31d3a534f4bcbb9e25d023a0889bd/setup.py", line 108, in <module>
          raise_if_cuda_home_none("flash_attn")
        File "/tmp/pip-install-ji7oryv0/flash-attn_4dc31d3a534f4bcbb9e25d023a0889bd/setup.py", line 55, in raise_if_cuda_home_none
          raise RuntimeError(
      RuntimeError: flash_attn was requested, but nvcc was not found.  Are you sure your environment has nvcc available?  If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.

      Warning: Torch did not find available GPUs on this system.
       If your intention is to cross-compile, this is not an error.
      By default, Apex will cross-compile for Pascal (compute capabilities 6.0, 6.1, 6.2),
      Volta (compute capability 7.0), Turing (compute capability 7.5),
      and, if the CUDA version is >= 11.0, Ampere (compute capability 8.0).
      If you wish to cross-compile for a single specific architecture,
      export TORCH_CUDA_ARCH_LIST="compute capability" before running setup.py.`

Anyway, I'm sure the issue is due to some sort of mismatch in versions but I'm not able to figure out the correct combinaton. Does anyone happen to know what the correct setup is? Or is there some other underlying issue?

modal token new - error

Hello

Apologies if this is not the correct repo to post such an issue.

When I run modal token new I get the following error:

Traceback (most recent call last):
  File "/home/anton/git/developer/.venv/bin/modal", line 5, in <module>
    from modal.__main__ import main
  File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/modal/__init__.py", line 6, in <module>
    from .dict import Dict
  File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/modal/dict.py", line 9, in <module>
    from ._serialization import deserialize, serialize
  File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/modal/_serialization.py", line 6, in <module>
    import cloudpickle
  File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/cloudpickle/__init__.py", line 4, in <module>
    from cloudpickle.cloudpickle import *  # noqa
  File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/cloudpickle/cloudpickle.py", line 57, in <module>
    from .compat import pickle
  File "/home/anton/git/developer/.venv/lib/pypy3.9/site-packages/cloudpickle/compat.py", line 13, in <module>
    from _pickle import Pickler  # noqa: F401
ModuleNotFoundError: No module named '_pickle'

I am using the following python version:

Python 3.9.16 (feeb267ead3e6771d3f2f49b83e1894839f64fb7, Dec 29 2022, 14:23:21)

How can I pass the tokens in the modal function lookup?

If I am calling a modal function from a Django backend, is there any I can set the API tokens ID and Secret in the call of the function lookup? Because I would be using digital ocean apps for the Djano backend, which might not give me access to the console letting me set up the tokens.

Or do you have to set up the tokens beforehand instead of calling them programmatically?

combining injob queue and model loading class

hey is there any way we can set up a ML web app API utilizing the injob queue and model loading instance such that that when a API call comes a job id is returned , this job id initializes a model class in which a ML model is loaded into memory and utilized, now if another API call comes within this moment it utilizes the same initiated model rather than loading another model.

How are you meant to use from_dockerfile?

So i love modal, but i'm using pipenv and so can't use the pip install/poetry install functions, i've experimented with using a few run commands like, copying the pipfiles, then pipenv installing them, but they seem to run everytime i deploy even if i haven't changed either file, but anyway those do technically work but it just takes forever bc it installs every python dependency every time, also i would prefer to be able to specify stuff in the dockerfile anyway.

If i just do from_dockerfile with a dockerfile like such

FROM python:3.11.5-bookworm

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

RUN pip install pipenv

WORKDIR /app
COPY Pipfile Pipfile.lock /app/

RUN pipenv install --categories='packages cluster' --deploy --system

COPY . /app/

That part works but i notice that there's a second step that happens after this where pip installs modal requirements, and after that all of my fastapi stuff breaks, so right now i have just literally added a pipenv comand to install over them at the end and that does work

image = modal.Image.from_dockerfile(
    "modal.Dockerfile", context_mount=assets
).run_commands("pipenv install --categories='packages cluster' --deploy --system")

But this feels so inelegant? Just wondering how you'd recommend i do this, thanks!

Where do the Modal Podcast Transcriber Podchaser API keys go?

Greetings! I have the example Modal Podcast Transcriber example running locally AND have the Podchaser API keys. However, it is unclear WHERE these keys need to go.

Does the Modal Secret code stub need to be added to a config file?
Or does the Modal Secret only work if the project is hosted on Modal.com?

Thank you!

`pip_install` from public github repo

Is it possible to pip_install from a public github repo? It seems like it should be the same syntax as a requirements.txt but this hasn't worked for me.

Path is a not a regular file - modal volume get wav2lip_volume results/result.mp4

(modal) arslan@Folio-1040-G2:~/nixense_vixion/To-Do/cog_wav2lip_modal$ modal volume ls wav2lip_volume results
Directory listing of 'results' in 'wav2lip_volume'
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━┓
┃ filename           ┃ type ┃ created/modified          ┃ size ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━┩
│ results/result.mp4 │ dir  │ 2024-01-22 20:21:49+05:00 │ 6 B  │
└────────────────────┴──────┴───────────────────────────┴──────┘
(modal) arslan@Folio-1040-G2:~/nixense_vixion/To-Do/cog_wav2lip_modal$ modal volume get wav2lip_volume results/result.mp4
[20:34:04] Requesting results/result.mp4                                                                                                                 volume.py:229
Usage: modal volume get [OPTIONS] VOLUME_NAME REMOTE_PATH [LOCAL_DESTINATION]
Try 'modal volume get --help' for help.
╭─ Error ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Path is a not a regular file: b'results/result.mp4'                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────

Llama 2 example?

70B Please? Pretty please?

pod transcriber torchaudio version

[error] When I locally served the pod transcriber example, the current pip install torchaudio version gives error

ERROR: Could not find a version that satisfies the requirement torchaudio==0.12.1 (from versions: 2.0.0, 2.0.1, 2.0.2, 2.1.0)
ERROR: No matching distribution found for torchaudio==0.12.1

[fix] Fixed by changing "torchaudio==0.12.1" to "torchaudio==2.1.0" (latest version) in main.py, line 37

https://modal-labs--whisper-pod-transcriber-fastapi-app.modal.run/ is also down right now, not sure if that's related to this error or not.

token for new account

your site seems to be down when prompting for new token by running "modal token new" from terminal ,also no info on how to get token on site

Possible bugs on diffusion example?

Not sure if I'm doing something wrong, but just trying to run these two examples as is using the latest code on main for the examples repo:

https://modal.com/docs/examples/stable_diffusion_xl

modal deploy stable_diffusion_xl.py and then visiting the website generates "modal-http: internal server error: user function failed with status Failure: NameError("name 'FastAPI' is not defined")"

https://modal.com/docs/examples/dreambooth_app

modal run dreambooth_app.py generates "ValueError: Could not find the operator torchvision::nms. Please make sure you have already registered the operator and (if registered from C++) loaded it via torch.ops.load_library."

Apologies if I'm doing something wrong. Really excited to play with these cool examples!

Whisper example - Non-English Audio

Hi! I tried the transcriber (https://modal-labs--whisper-pod-transcriber-fastapi-app.modal.run/) on non-English content, and it didn't work. Perhaps you could state on the site that only English content is supported or change the Whisper model to the one that allows for any language?

Also, a small typo on the Stable Diffusion page (https://modal.com/docs/guide/ex/stable_diffusion_cli) "Stable Ddiffusion"

Download/Export function for the 'whisper_pod_transcriber' instance?

The 'https://modal-labs--whisper-pod-transcriber-fastapi-app.modal.run/#/' is wonderful, but it only shows the transcript in webpage. Is there any plan to add the download function to it to let users download the srt format transcripts?

mixtral example

i think the mixtral tgi example is broken
`
2023-12-20T12:33:48.503017Z INFO text_generation_launcher: Files are already present on the host. Skipping download.

2023-12-20T12:33:49.729338Z INFO download: text_generation_launcher: Successfully downloaded weights.
2023-12-20T12:33:49.731995Z INFO shard-manager: text_generation_launcher: Starting shard rank=0
2023-12-20T12:33:49.732105Z INFO shard-manager: text_generation_launcher: Starting shard rank=1
2023-12-20T12:33:49.750997Z INFO shard-manager: text_generation_launcher: Starting shard rank=2
2023-12-20T12:33:49.751712Z INFO shard-manager: text_generation_launcher: Starting shard rank=3
2023-12-20T12:33:59.806288Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=1
2023-12-20T12:33:59.806311Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=2
2023-12-20T12:33:59.806288Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2023-12-20T12:33:59.806325Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=3
2023-12-20T12:34:08.721881Z WARN text_generation_launcher: Disabling exllama v2 and using v1 instead because there are issues when sharding

2023-12-20T12:34:08.721948Z WARN text_generation_launcher: Disabling exllama v2 and using v1 instead because there are issues when sharding

2023-12-20T12:34:08.721884Z WARN text_generation_launcher: Disabling exllama v2 and using v1 instead because there are issues when sharding

2023-12-20T12:34:08.722307Z WARN text_generation_launcher: Disabling exllama v2 and using v1 instead because there are issues when sharding

2023-12-20T12:34:09.822779Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2023-12-20T12:34:09.822815Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=2
2023-12-20T12:34:09.898596Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=1
2023-12-20T12:34:09.898623Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=3
2023-12-20T12:34:18.485392Z ERROR text_generation_launcher: Error when initializing model
Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in call
return get_command(self)(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in call
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
return _main(
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 89, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 215, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
self.run_forever()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
self._run_once()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
handle._run()
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)

File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 161, in serve_inner
model = get_model(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/init.py", line 310, in get_model
return FlashMixtral(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mixtral.py", line 21, in init
super(FlashMixtral, self).init(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mistral.py", line 318, in init
SLIDING_WINDOW_BLOCKS = math.ceil(config.sliding_window / BLOCK_SIZE)
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'

2023-12-20T12:34:18.498142Z ERROR text_generation_launcher: Error when initializing model
Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in call
return get_command(self)(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in call
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
return _main(
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 89, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 215, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
self.run_forever()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
self._run_once()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
handle._run()
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)

File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 161, in serve_inner
model = get_model(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/init.py", line 310, in get_model
return FlashMixtral(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mixtral.py", line 21, in init
super(FlashMixtral, self).init(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mistral.py", line 318, in init
SLIDING_WINDOW_BLOCKS = math.ceil(config.sliding_window / BLOCK_SIZE)
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'

2023-12-20T12:34:18.510689Z ERROR text_generation_launcher: Error when initializing model
Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in call
return get_command(self)(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in call
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
return _main(
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 89, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 215, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
self.run_forever()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
self._run_once()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
handle._run()
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)

File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 161, in serve_inner
model = get_model(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/init.py", line 310, in get_model
return FlashMixtral(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mixtral.py", line 21, in init
super(FlashMixtral, self).init(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mistral.py", line 318, in init
SLIDING_WINDOW_BLOCKS = math.ceil(config.sliding_window / BLOCK_SIZE)
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'

2023-12-20T12:34:18.542525Z ERROR text_generation_launcher: Error when initializing model
Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in call
return get_command(self)(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in call
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
return _main(
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 89, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 215, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
self.run_forever()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
self._run_once()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
handle._run()
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)

File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 161, in serve_inner
model = get_model(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/init.py", line 310, in get_model
return FlashMixtral(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mixtral.py", line 21, in init
super(FlashMixtral, self).init(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mistral.py", line 318, in init
SLIDING_WINDOW_BLOCKS = math.ceil(config.sliding_window / BLOCK_SIZE)
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'

2023-12-20T12:34:19.909381Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=1
2023-12-20T12:34:19.909411Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=0
2023-12-20T12:34:19.909390Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=3
2023-12-20T12:34:19.909394Z INFO shard-manager: text_generation_launcher: Waiting for shard to be ready... rank=2
2023-12-20T12:34:20.412017Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:

╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py:89 in │
│ serve │
│ │
│ 86 │ │ raise RuntimeError( │
│ 87 │ │ │ "Only 1 can be set between dtype and quantize, as they │
│ 88 │ │ ) │
│ ❱ 89 │ server.serve( │
│ 90 │ │ model_id, │
│ 91 │ │ revision, │
│ 92 │ │ sharded, │
│ │
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/server.py:215 │
│ in serve │
│ │
│ 212 │ │ │ logger.info("Signal received. Shutting down") │
│ 213 │ │ │ await server.stop(0) │
│ 214 │ │
│ ❱ 215 │ asyncio.run( │
│ 216 │ │ serve_inner( │
│ 217 │ │ │ model_id, revision, sharded, quantize, speculate, dtype, t │
│ 218 │ │ ) │
│ │
│ /opt/conda/lib/python3.10/asyncio/runners.py:44 in run │
│ │
│ 41 │ │ events.set_event_loop(loop) │
│ 42 │ │ if debug is not None: │
│ 43 │ │ │ loop.set_debug(debug) │
│ ❱ 44 │ │ return loop.run_until_complete(main) │
│ 45 │ finally: │
│ 46 │ │ try: │
│ 47 │ │ │ _cancel_all_tasks(loop) │
│ │
│ /opt/conda/lib/python3.10/asyncio/base_events.py:649 in run_until_complete │
│ │
│ 646 │ │ if not future.done(): │
│ 647 │ │ │ raise RuntimeError('Event loop stopped before Future comp │
│ 648 │ │ │
│ ❱ 649 │ │ return future.result() │
│ 650 │ │
│ 651 │ def stop(self): │
│ 652 │ │ """Stop running the event loop. │
│ │
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/server.py:161 │
│ in serve_inner │
│ │
│ 158 │ │ │ server_urls = [local_url] │
│ 159 │ │ │
│ 160 │ │ try: │
│ ❱ 161 │ │ │ model = get_model( │
│ 162 │ │ │ │ model_id, │
│ 163 │ │ │ │ revision, │
│ 164 │ │ │ │ sharded, │
│ │
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init │
│ .py:310 in get_model │
│ │
│ 307 │ │
│ 308 │ if model_type == "mixtral": │
│ 309 │ │ if MIXTRAL: │
│ ❱ 310 │ │ │ return FlashMixtral( │
│ 311 │ │ │ │ model_id, │
│ 312 │ │ │ │ revision, │
│ 313 │ │ │ │ quantize=quantize, │
│ │
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash │
│ mixtral.py:21 in init │
│ │
│ 18 │ │ dtype: Optional[torch.dtype] = None, │
│ 19 │ │ trust_remote_code: bool = False, │
│ 20 │ ): │
│ ❱ 21 │ │ super(FlashMixtral, self).init( │
│ 22 │ │ │ config_cls=MixtralConfig, │
│ 23 │ │ │ model_cls=FlashMixtralForCausalLM, │
│ 24 │ │ │ model_id=model_id, │
│ │
│ /opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash │
│ mistral.py:318 in init │
│ │
│ 315 │ │ │
│ 316 │ │ # Set context windows │
│ 317 │ │ SLIDING_WINDOW = config.sliding_window │
│ ❱ 318 │ │ SLIDING_WINDOW_BLOCKS = math.ceil(config.sliding_window / BLOC │
│ 319 │ │ │
│ 320 │ │ torch.distributed.barrier(group=self.process_group) │
│ 321 │
╰──────────────────────────────────────────────────────────────────────────────╯
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int' rank=0
2023-12-20T12:34:20.460680Z ERROR text_generation_launcher: Shard 0 failed to start
2023-12-20T12:34:20.460780Z INFO text_generation_launcher: Shutting down shards
2023-12-20T12:34:20.513445Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:

Huggingface Token

Created huggingface-secrets at: https://modal.com/secrets

Executed the below commands in the CLI:

export HUGGINGFACE_TOKEN=huggingface-secrets
git clone https://github.com/modal-labs/modal-examples
cd modal-examples
modal run 06_gpu_and_ml/vllm_inference.py

got the below error:

NotFoundError: No secret named huggingface - you can add secrets to your account at https://modal.com/secrets

MongoDB

Is there any documented process by which I can have a modal container running, for instance, MongoDB and then connect this to another modal python file running in a separate container.

To do this I need to know the url or IP of the instance running the DB. I imagine this might be done using a Dockerfile but I was wondering if this is supported through the modal api?

can't not get experience token int web

hello, I want to experience modal system. according to the readme, i need a token, so I log in with github account, but the settings page is always automatically jump to waitlist page. please help me, thanks!

depreciated whisper link in pod transciber

hey the link used in pod_transcriber
has a depreciated

 pip_install(
        "https://github.com/openai/whisper/archive/9f70a352f9f8630ab3aa0d06af5cb9532bd8c21d.tar.gz",

which gives error while transcribing audios ,so instead to installing from this you can simply use

pip_install(
        "git+https://github.com/openai/whisper.git",

which is working perfectly

youtube_face_detection.py is failing

Due to the issue described here: pytube/pytube#1768

It looks like YouTube has changed their backend API.

Improve RAG example using Modal docs

Hands down one of the best things I've used. Endless possibilities and surprisingly beginner-friendly.

It's so easy to use and distribute it's amazing. I've been playing with it for a few days and am already integrating cloud storage, local file conversion with CUDA support, transcription, and LLMs.

I'm very spoiled due to ChatGPT handling a billion libraries but not this one.

I humbly suggest adding Modal documentation, and GitHub examples to the RAG guide.

Using documentation RAG this way would improve the development speed and lower compute costs during initial learning and debugging.

Thank you for your consideration. Best of Luck!

WebSocket Implementation does not work

I'm using a FastAPI app and encountered difficulties while implementing WebSockets. Is WebSocket support available in Modal? If so, are there any specific considerations or limitations to be aware of?

import logging
from typing import Dict
from fastapi import FastAPI, WebSocketDisconnect, websockets
from fastapi.websockets import WebSocket
from modal import Stub, Image, Secret, asgi_app

web_app = FastAPI()
agent_image = Image.debian_slim().pip_install(
    "langchain", "openai", "google-search-results", "google-api-python-client")

stub = Stub("sellerAgent", image=agent_image, secrets=[Secret.from_name(
    "my-openai-secret"), Secret.from_name("my-googlecloud-secret")])

@web_app.get("/")
async def read_main():
    return {"msg": "Hello World"}


@web_app.websocket('/ws')
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    while True:
        try:
            # Receive and send back the client message
            user_msg = await websocket.receive_text()

            print(user_msg)

        except WebSocketDisconnect:
            logging.info("WebSocketDisconnect")
            # TODO try to reconnect with back-off
            break
        except websockets.ConnectionClosedOK:
            logging.info("ConnectionClosedOK")
            # TODO handle this?
            break

        except Exception as e:
            logging.error(e)
            
@stub.function()
@asgi_app()
def fastapi_app():
    return web_app

App/Stub URL

I really like the intent of the modal framework, however I cannot figure out how to determine the url of an app that has been started like:

modal serve modal_start.py

I've looked at the docs that state about the url construction include the https://[user-id]--[stub_name]-[function_name].modal.run

But no combination seems to work for me. Couldn't the public url be output on the command line - or via another tool or on the dashboard.

Any help gratefully received.

Thanks.

Fast inference with vLLM (Mixtral 8x7B) example

I have tried running this example a few times but I keep running into the following error:

`
Traceback (most recent call last):
File "/pkg/modal/_container_entrypoint.py", line 346, in handle_user_exception
yield
File "/pkg/modal/_container_entrypoint.py", line 575, in call_function_async
enter_res = enter_method()
File "/root/vllm.py", line 91, in enter
from vllm.engine.arg_utils import AsyncEngineArgs
ModuleNotFoundError: No module named 'vllm.engine'; 'vllm' is not a package
Runner failed with exception: ModuleNotFoundError("No module named 'vllm.engine'; 'vllm' is not a package")
Stopping app - uncaught exception raised locally: ModuleNotFoundError("No module named 'vllm.engine'; 'vllm' is not a package").

ModuleNotFoundError: No module named 'vllm.engine'; 'vllm' is not a package
`

Should vllm be installed locally?

Shouldn't the script be executed within modal cloud's container instances?

Thank you for your time.

Dreambooth example - Unexpected behavior when calling a stub function directly

Versions

OS: ubuntu
modal-client: 0.48.1946

Intro

When playing around with dreambooth example, the documentation said you can kick off a training job with the command modal run dreambooth_app.py::stub.train. When doing so, you may encounter an error as AttributeError: 'str' object has no attribute 'xxx'.

Reproduction

The reproduction code is simple, just have a snippet as

import modal

from dataclasses import dataclass

stub = modal.Stub('dataclass-test')


@dataclass
class SharedConfig:
    a: str = 'a'


@stub.function()
def test(config=SharedConfig()):
    print(f'Type: {type(config)}, Content: {config}')


@stub.local_entrypoint()
def main():
    test.call()

when running command modal run test.py, you may have normal output as

ubuntu:~/projects/modal_test$ modal run test.py 
✓ Initialized. View app at xxx
✓ Created objects.
├── 🔨 Created test.
├── 🔨 Created mount /home/ubuntu/projects/modal_test/test.py
└── 🔨 Created fix.
Type: <class 'test.SharedConfig'>, Content: SharedConfig(a='a')
✓ App completed.

when kicking off the stub function directly modal run test.py::stub.test, you may notice that config became a string.

ubuntu:~/projects/modal_test$ modal run test.py::stub.test
✓ Initialized. View app at xxx
✓ Created objects.
├── 🔨 Created test.
├── 🔨 Created mount /home/ubuntu/projects/modal_test/test.py
└── 🔨 Created fix.
Type: <class 'str'>, Content: SharedConfig(a='a')
✓ App completed.

From my point of view, I think this behavior is unexpected.

Workaround

A fix workaround is rather simple, just initialize the config inside the function rather sent as a parameter.

@stub.function()
def fix():
    config = SharedConfig()
    print(f'Type: {type(config)}, Content: {config}')

Now, running stub function modal run test.py::stub.fix will behave correctly.

ubuntu:~/projects/modal_test$ modal run test.py::stub.fix
✓ Initialized. View app at xxx
✓ Created objects.
├── 🔨 Created test.
├── 🔨 Created mount /home/ubuntu/projects/modal_test/test.py
└── 🔨 Created fix.
Type: <class 'test.SharedConfig'>, Content: SharedConfig(a='a')
✓ App completed.

However, I thought this might worth examining and I guessed it might be something related to serialization ? (not sure what really happened under the hood)

text_generation_inference.py - Function' object has no attribute 'remote'

I'm trying to use the text_generation_inference.py example and getting an AttributeError: 'Function' object has no attribute 'remote'

future: <Task finished name='Task-231' coro=<FastAPI.call() done, defined at /opt/conda/lib/python3.9/site-packages/fastapi/applications.py:267> exception=AttributeError("'Function' object has no attribute 'remote'")>
Traceback (most recent call last):

support for private dockerhub

is there any way we can docker pull from private dockerhub repos while defining Image.from_registry?

`sdxl_train.py` Dreambooth - `NVML_SUCCESS == r INTERNAL ASSERT FAILED at "../c10/cuda/CUDACachingAllocator.cpp":1154, please report a bug to PyTorch.`

This is my first time trying to fine tune a stable diffusion model so there's a lot I don't understand yet.

Full stack:

Traceback (most recent call last):
  File "/root/sd-scripts/sdxl_train.py", line 753, in <module>
    train(args)
  File "/root/sd-scripts/sdxl_train.py", line 567, in train
    accelerator.backward(loss)
  File "/usr/local/lib/python3.10/site-packages/accelerate/accelerator.py", line 1983, in backward
    self.scaler.scale(loss).backward(**kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/_tensor.py", line 492, in backward
    torch.autograd.backward(
  File "/usr/local/lib/python3.10/site-packages/torch/autograd/__init__.py", line 251, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/usr/local/lib/python3.10/site-packages/torch/autograd/function.py", line 288, in apply
    return user_fn(self, *args)
  File "/usr/local/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 271, in backward
    outputs = ctx.run_function(*detached_inputs)
  File "/root/sd-scripts/library/sdxl_original_unet.py", line 643, in custom_forward
    return func(*inputs)
  File "/root/sd-scripts/library/sdxl_original_unet.py", line 633, in forward_body
    hidden_states = self.ff(self.norm3(hidden_states)) + hidden_states
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/sd-scripts/library/sdxl_original_unet.py", line 577, in forward
    hidden_states = module(hidden_states)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/sd-scripts/library/sdxl_original_unet.py", line 556, in forward
    return hidden_states * self.gelu(gate)
RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "../c10/cuda/CUDACachingAllocator.cpp":1154, please report a bug to PyTorch.

Running this on modal.com A10G function with this command:

accelerate launch /root/sd-scripts/sdxl_train.py \
--pretrained_model_name_or_path=/vol/sd_xl_base_1.0.safetensors \
--dataset_config=/vol/d835023a-4ed5-4269-b323-9952930d3786/training_config.toml \
--output_dir=/vol/d835023a-4ed5-4269-b323-9952930d3786 \
--train_batch_size=1 \
--output_name=model.safetensors \
--save_model_as=safetensors \
--learning_rate=4e-7 \
--optimizer_type=adafactor \
--xformers \
--mixed_precision=fp16 \
--cache_latents \
--cache_text_encoder_outputs \
--gradient_checkpointing \
--no_half_vae

Docker image setup:

     Image.debian_slim(python_version="3.10")
    .apt_install(["ffmpeg"])
    .pip_install(
        "accelerate==0.23",
        "boto3",
        "opencv-python-headless",
        "pytorch-lightning",
        "tensorboard",
        "safetensors",
        "toml",
        "voluptuous",
        "open-clip-torch",
        "huggingface-hub",
        "datasets~=2.13",
        "diffusers[torch]",
        "einops",
        "ftfy",
        "smart_open",
        "transformers==4.30.2",
        "torch==2.0.1",
        "torchvision",
        "torchaudio",
        "triton",
        "tomli-w",
    )
    .pip_install("xformers", pre=True)

training config is this converted to TOML:

{
        'general': {
            'enable_bucket': True
        },
        'datasets': [
            {
                'resolution': 1024,
                'batch_size': 1,
                'subsets': [
                    {
                        'image_dir': f"/vol/d835023a-4ed5-4269-b323-9952930d3786/training_images",
                        'class_tokens': 'shs face',
                        'keep_tokens': 2,
                    },
                ]
            },
        ]
    }

TGI_inference only allowed 40ram for A100

Hi there!

I was experimenting and trying out the new text-generation-inference example/tutorial. The GPU config states to use 2 80ram A100 gpu's but Modal seems to only allow 20 and 40?

Awesome product though and really enjoying playing around!
Best,
Caesar

nfs model upload timeout

having timeout issue when uploading large model on modal nfs put vol_name, any workaround on this?

Modal lookup for class function

Hi Team,

How can we do a modal lookup for a stub function defined within a class?

I am referring to your HuggingFace batch inference example - https://modal.com/docs/guide/ex/batch_inference_using_huggingface

Any help is appreciated.

Thanks!