
audiosep's Introduction

Separate Anything You Describe


This repository contains the official implementation of "Separate Anything You Describe".

We introduce AudioSep, a foundation model for open-domain sound separation with natural language queries. AudioSep demonstrates strong separation performance and impressive zero-shot generalization ability on numerous tasks, such as audio event separation, musical instrument separation, and speech enhancement. Check out the separated audio examples on the Demo Page!


Setup

Clone the repository and set up the conda environment:

git clone https://github.com/Audio-AGI/AudioSep.git && \
cd AudioSep && \
conda env create -f environment.yml && \
conda activate AudioSep

Download the model weights and place them under checkpoint/.

If you're using this checkpoint to participate in the DCASE 2024 Task 9 challenge, please note that it was trained on 32 kHz audio, with a window size of 2048 points and a hop size of 320 points in the STFT operation, which differs from the provided challenge baseline system (16 kHz, window size 1024, hop size 160).
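To make the mismatch concrete, the two STFT configurations can be compared side by side. The sketch below uses torch.stft with a Hann window purely for illustration; the window type is an assumption, and only the sample rates, window sizes, and hop sizes are taken from the note above.

import torch

# Dummy one-second signals; in practice these would be loaded audio clips.
waveform_32k = torch.randn(1, 32000)   # 32 kHz audio, as used by this checkpoint
waveform_16k = torch.randn(1, 16000)   # 16 kHz audio, as used by the challenge baseline

# This checkpoint: 32 kHz audio, window size 2048, hop size 320
spec_audiosep = torch.stft(
    waveform_32k, n_fft=2048, hop_length=320,
    window=torch.hann_window(2048), return_complex=True)

# Challenge baseline: 16 kHz audio, window size 1024, hop size 160
spec_baseline = torch.stft(
    waveform_16k, n_fft=1024, hop_length=160,
    window=torch.hann_window(1024), return_complex=True)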


Inference

from pipeline import build_audiosep, inference
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = build_audiosep(
      config_yaml='config/audiosep_base.yaml', 
      checkpoint_path='checkpoint/audiosep_base_4M_steps.ckpt', 
      device=device)

audio_file = 'path_to_audio_file'
text = 'textual_description'
output_file = 'separated_audio.wav'

# AudioSep processes the audio at 32 kHz sampling rate  
inference(model, audio_file, text, output_file, device)

To load directly from Hugging Face, you can do the following:

from models.audiosep import AudioSep
from utils import get_ss_model
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

ss_model = get_ss_model('config/audiosep_base.yaml')

model = AudioSep.from_pretrained("nielsr/audiosep-demo", ss_model=ss_model)

audio_file = 'path_to_audio_file'
text = 'textual_description'
output_file = 'separated_audio.wav'

# AudioSep processes the audio at 32 kHz sampling rate  
inference(model, audio_file, text, output_file, device)

Use chunk-based inference to save memory:

inference(model, audio_file, text, output_file, device, use_chunk=True)

Training

To use your own audio-text paired dataset:

  1. Format your dataset to match our JSON structure. Refer to the provided template at datafiles/template.json (an illustrative, filled-in datafile example follows the config snippet below).

  2. Update the config/audiosep_base.yaml file by listing your formatted JSON data files under datafiles. For example:

data:
    datafiles:
        - 'datafiles/your_datafile_1.json'
        - 'datafiles/your_datafile_2.json'
        ...
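For illustration, a hypothetical datafile following the template structure might look as follows; the paths and captions are placeholders, not files shipped with the repository:

{
    "data": [
        {
            "wav": "/data/audio/flute_solo_001.wav",
            "caption": "a flute playing a gentle melody"
        },
        {
            "wav": "/data/audio/dog_bark_003.wav",
            "caption": "a dog barking in the distance"
        }
    ]
}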

Train AudioSep from scratch:

python train.py --workspace workspace/AudioSep --config_yaml config/audiosep_base.yaml --resume_checkpoint_path checkpoint/ ''

Finetune AudioSep from a pretrained checkpoint:

python train.py --workspace workspace/AudioSep --config_yaml config/audiosep_base.yaml --resume_checkpoint_path path_to_checkpoint

Benchmark Evaluation

Download the evaluation data under the evaluation/data folder. The data should be organized as follows:

evaluation:
    data:
        - audioset/
        - audiocaps/
        - vggsound/
        - music/
        - clotho/
        - esc50/

Run the benchmark inference script; the results will be saved at eval_logs/:

python benchmark.py --checkpoint_path audiosep_base_4M_steps.ckpt

"""
Evaluation Results:

VGGSound Avg SDRi: 9.144, SISDR: 9.043
MUSIC Avg SDRi: 10.508, SISDR: 9.425
ESC-50 Avg SDRi: 10.040, SISDR: 8.810
AudioSet Avg SDRi: 7.739, SISDR: 6.903
AudioCaps Avg SDRi: 8.220, SISDR: 7.189
Clotho Avg SDRi: 6.850, SISDR: 5.242
"""

Cite this work

If you find this tool useful, please consider citing:

@article{liu2023separate,
  title={Separate Anything You Describe},
  author={Liu, Xubo and Kong, Qiuqiang and Zhao, Yan and Liu, Haohe and Yuan, Yi and Liu, Yuzhuo and Xia, Rui and Wang, Yuxuan and Plumbley, Mark D and Wang, Wenwu},
  journal={arXiv preprint arXiv:2308.05037},
  year={2023}
}
@inproceedings{liu22w_interspeech,
  title={Separate What You Describe: Language-Queried Audio Source Separation},
  author={Liu, Xubo and Liu, Haohe and Kong, Qiuqiang and Mei, Xinhao and Zhao, Jinzheng and Huang, Qiushi and Plumbley, Mark D and Wang, Wenwu},
  year=2022,
  booktitle={Proc. Interspeech},
  pages={1801--1805},
}

Contributors:

0armaan025, alienishi, badayvedat, bhargavshirin, chenxwh, eltociear, farookhnitap, kalyani2003, liuxubo717, nielsrogge, rs-labhub, shivam250702, shresthasurav


audiosep's Issues

Unable to load music_speech_audioset model

I tried using the Colab notebook. The first model checkpoint loads without any issue, however, the second model checkpoint leads to an error during the model initialization. Below is the snippet of the code that downloads the model checkpoints and attempts to initialize the model:

model = build_audiosep(
    config_yaml='config/audiosep_base.yaml',
    checkpoint_path=str(models[1][1]),
)

Upon executing the model initialization, a KeyError related to pytorch-lightning_version is encountered, as shown below:

KeyError: 'pytorch-lightning_version'

Additionally, a warning concerning the initialization of RobertaModel with some weights not being used is thrown, although it's unclear if this warning is related to the KeyError.

Some weights of the model checkpoint at roberta-base were not used when initializing RobertaModel: ['lm_head.dense.bias', 'lm_head.decoder.weight', 'lm_head.dense.weight', 'lm_head.layer_norm.weight', 'lm_head.bias', 'lm_head.layer_norm.bias']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

The issue seems to arise specifically with the second model checkpoint music_speech_audioset_epoch_15_esc_89.98.pt. I would appreciate any guidance or suggestions on how to resolve this KeyError and successfully load the second model checkpoint for further use.

Thank you.

What is the scope of "Anything"?

This is interesting work, and the task it aims at is as exciting to me as SAM.
But I am not familiar with audio research, and I have some questions related to this work.

First, I checked the dataset, and it does not seem very complete for "sound separation" or "separate anything in audio".
I actually tried some samples for separating vocals from songs: no matter whether I used "Human Sounds" or "Vocal" as the query, the model could not separate the vocals, even from a very slow and simple "guitar playing and singing" sample. Conversely, when I queried "acoustic guitar", the output clearly contained some vocals.
Am I misunderstanding the scope, i.e. do "songs" fall outside the music domain and the scope of this work?

Second, I would like to ask why this is called a foundation model. It seems like "multimodal or multiple input types = foundation model", and I do not know what it provides for "downstream tasks". Can someone offer some insight?

multi-gpu support

Thank you for sharing this project. I am wondering if there is, or will be, support for multi-GPU inference. Currently I am unable to run inference on a 3090:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.50 GiB (GPU 0; 23.69 GiB total capacity; 15.94 GiB already allocated; 2.85 GiB free; 19.24 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
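For single-GPU memory pressure like this, the chunk-based inference option documented in the README above may help; a minimal sketch, assuming the model is built as in the README inference example:

from pipeline import build_audiosep, inference
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = build_audiosep(
    config_yaml='config/audiosep_base.yaml',
    checkpoint_path='checkpoint/audiosep_base_4M_steps.ckpt',
    device=device)

# use_chunk=True runs chunk-based inference, which is intended to reduce peak memory
inference(model, 'path_to_audio_file', 'textual_description',
          'separated_audio.wav', device, use_chunk=True)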

how to construct training sample pairs?

Nice work!

How do you construct the training pairs? It looks like you construct the training pairs in the SegmentMixer class. Do you use the same minibatch sources to construct the "mixture" and "target audio" pairs?

There might be one issue:

  • source 1: male speech, s1
  • source 2: another male speech, s2

If the mixture = s1 + s2 and both captions are "male speech", will this confuse the model during training?
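For context, an in-batch mixing scheme of the kind described here can be sketched roughly as follows. This is an illustration only, not the repository's SegmentMixer implementation; the function and argument names are hypothetical.

import torch

def mix_within_batch(sources: torch.Tensor, gain_db: float = 0.0):
    """Illustrative in-batch mixing: each clip serves as the target, and a
    rolled copy of the batch provides the interfering source."""
    # sources: (batch, samples) single-source waveforms from the dataset
    interference = torch.roll(sources, shifts=1, dims=0)  # pair each clip with its neighbour
    gain = 10.0 ** (gain_db / 20.0)                       # simple linear gain on the interference
    mixtures = sources + gain * interference
    return mixtures, sources                              # (mixture, target) training pairs

Under such a scheme, two clips that both carry the caption "male speech" can indeed end up in the same mixture, which is exactly the ambiguity the question above raises.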

Is something broken? This is really bad.

I tried it locally and didn't get much, then tried the HF Space, and the model didn't really separate anything. Was it overfit on the demo data? There is barely any difference between input and output on a random song from my library with organ, drums, and vocals.

How to do speech separation

Hi,
I have an audio clip in which two people are talking and there is noise at the same time. How can I separate the audio of the two speakers? Thanks!
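One starting point is to run the README inference with speaker-describing queries. The file names and query strings below are placeholders, and separating two same-class sources (two speakers) with a text query may give limited results:

from pipeline import build_audiosep, inference
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = build_audiosep(
    config_yaml='config/audiosep_base.yaml',
    checkpoint_path='checkpoint/audiosep_base_4M_steps.ckpt',
    device=device)

# Hypothetical queries describing each speaker; any natural-language description can be used
inference(model, 'two_speakers_with_noise.wav', 'a man speaking', 'speaker1.wav', device)
inference(model, 'two_speakers_with_noise.wav', 'a woman speaking', 'speaker2.wav', device)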

Conda failed to create environment

Machine: ASUS TUF Dash F15 FX517ZE_FX517ZE
OS: Windows 11 Education 23H2, 64 bit
Version: 10.0.22631 Build 22631
GPU: Nvidia RTX 3050

D:\AudioSep>conda env create -f environment.yml
Retrieving notices: ...working... done
Collecting package metadata (repodata.json): done
Solving environment: failed
Error:
ResolvePackageNotFound:

  • urllib3==1.26.14=py310h06a4308_0
  • jupyter_core==5.3.0=py310h06a4308_0
  • libcusolver==11.3.4.124=h33c3c4e_0
  • cuda-nvrtc==11.6.124=h020bade_0
  • ld_impl_linux-64==2.38=h1181459_1
  • libdeflate==1.17=h5eee18b_0
  • tornado==6.2=py310h5eee18b_0
  • cuda-nvml-dev==11.6.55=haa9ef22_0
  • gmp==6.2.1=h295c915_3
  • openh264==2.1.1=h4ff587b_0
  • pytorch==1.13.1=py3.10_cuda11.6_cudnn8.3.2_0
  • cuda-cuxxfilt==11.6.124=hecbf4f6_0
  • libcusparse-dev==11.7.2.124=hbbe9722_0
  • libcufft-dev==10.7.1.112=ha5ce4c0_0
  • zstd==1.5.4=hc292b87_0
  • freetype==2.12.1=h4a9f257_0
  • gnutls==3.6.15=he1e5248_0
  • tk==8.6.12=h1ccaba5_0
  • lz4-c==1.9.4=h6a678d5_0
  • cuda-nsight==12.1.55=0
  • libcusparse==11.7.2.124=h7538f96_0
  • cuda-nvrtc-dev==11.6.124=h249d397_0
  • brotlipy==0.7.0=py310h7f8727e_1002
  • psutil==5.9.0=py310h5eee18b_0
  • ruamel.yaml.clib==0.2.6=py310h5eee18b_1
  • _openmp_mutex==5.1=1_gnu
  • cryptography==38.0.4=py310h9ce1e76_0
  • libcublas-dev==11.9.2.110=h5c901ab_0
  • libcufile-dev==1.6.0.25=0
  • cuda-nvprune==11.6.124=he22ec0a_0
  • ca-certificates==2023.01.10=h06a4308_0
  • libiconv==1.16=h7f8727e_2
  • libcublas==11.9.2.110=h5e84587_0
  • libwebp-base==1.2.4=h5eee18b_1
  • tqdm==4.64.1=py310h06a4308_0
  • libsodium==1.0.18=h7b6447c_0
  • libunistring==0.9.10=h27cfd23_0
  • gds-tools==1.6.0.25=0
  • libtiff==4.5.0=h6a678d5_2
  • readline==8.2=h5eee18b_0
  • cuda-gdb==12.1.55=0
  • libidn2==2.3.2=h7f8727e_0
  • libtasn1==4.19.0=h5eee18b_0
  • lame==3.100=h7b6447c_0
  • conda-content-trust==0.1.3=py310h06a4308_0
  • bzip2==1.0.8=h7b6447c_0
  • setuptools==65.6.3=py310h06a4308_0
  • libffi==3.4.2=h6a678d5_6
  • requests==2.28.1=py310h06a4308_0
  • libnvjpeg-dev==11.6.2.124=hb5906b9_0
  • nest-asyncio==1.5.6=py310h06a4308_0
  • conda-package-handling==2.0.2=py310h06a4308_0
  • debugpy==1.5.1=py310h295c915_0
  • sqlite==3.40.1=h5082296_0
  • mkl-service==2.4.0=py310h7f8727e_0
  • numpy-base==1.23.5=py310h8e6c178_0
  • conda==23.3.1=py310h06a4308_0
  • libgcc-ng==11.2.0=h1234567_1
  • pip==22.3.1=py310h06a4308_0
  • intel-openmp==2021.4.0=h06a4308_3561
  • libnpp-dev==11.6.3.124=h3c42840_0
  • boltons==23.0.0=py310h06a4308_0
  • pycosat==0.6.4=py310h5eee18b_0
  • pyzmq==23.2.0=py310h6a678d5_0
  • cuda-nvcc==11.6.124=hbba6d2d_0
  • ipython==8.12.0=py310h06a4308_0
  • ncurses==6.4=h6a678d5_0
  • nettle==3.7.3=hbbd107a_1
  • libwebp==1.2.4=h11a3e52_1
  • ruamel.yaml==0.17.21=py310h5eee18b_0
  • zstandard==0.18.0=py310h5eee18b_0
  • cffi==1.15.1=py310h5eee18b_3
  • jpeg==9e=h5eee18b_1
  • xz==5.2.10=h5eee18b_1
  • libuuid==1.41.5=h5eee18b_0
  • certifi==2022.12.7=py310h06a4308_0
  • mkl_random==1.2.2=py310h00e6091_0
  • flit-core==3.8.0=py310h06a4308_0
  • libcufile==1.6.0.25=0
  • libgomp==11.2.0=h1234567_1
  • giflib==5.2.1=h5eee18b_3
  • libpng==1.6.39=h5eee18b_0
  • lerc==3.0=h295c915_0
  • typing_extensions==4.4.0=py310h06a4308_0
  • cuda-cupti==11.6.124=h86345e5_0
  • idna==3.4=py310h06a4308_0
  • libstdcxx-ng==11.2.0=h1234567_1
  • platformdirs==2.5.2=py310h06a4308_0
  • ipykernel==6.19.2=py310h2f386ee_0
  • matplotlib-inline==0.1.6=py310h06a4308_0
  • pluggy==1.0.0=py310h06a4308_1
  • zlib==1.2.13=h5eee18b_0
  • lcms2==2.12=h3be6417_0
  • numpy==1.23.5=py310hd5efca6_0
  • cuda-cccl==11.6.55=hf6102b2_0
  • mkl_fft==1.3.1=py310hd6ae3a3_0
  • libnpp==11.6.3.124=hd2722f0_0
  • libcufft==10.7.1.112=hf425ae0_0
  • comm==0.1.2=py310h06a4308_0
  • packaging==23.0=py310h06a4308_0
  • pysocks==1.7.1=py310h06a4308_0
  • cuda-driver-dev==11.6.55=0
  • cuda-nvtx==11.6.124=h0630a44_0
  • cuda-cudart==11.6.55=he381448_0
  • libnvjpeg==11.6.2.124=hd473ad6_0
  • conda-package-streaming==0.7.0=py310h06a4308_0
  • mkl==2021.4.0=h06a4308_640
  • openssl==1.1.1t=h7f8727e_0
  • cuda-cuobjdump==11.6.124=h2eeebcb_0
  • jupyter_client==8.1.0=py310h06a4308_0
  • cuda-cudart-dev==11.6.55=h42ad0f4_0
  • python==3.10.9=h7a1cb2a_0
  • cuda-samples==11.6.101=h8efea70_0
  • zeromq==4.3.4=h2531618_0
  • toolz==0.12.0=py310h06a4308_0

Feature: Adding contributors section to the README.md file.

There is no Contributors section in the README file.
As we know, contributions are what make the open-source community such an amazing place to learn, inspire, and create.
The Contributors section in a README.md file is important as it acknowledges and gives credit to those who have contributed to a project, fosters community and collaboration, adds transparency and accountability, and helps document the project's history for current and future maintainers. It also serves as a form of recognition, motivating contributors to continue their efforts.

Query About Template.json

{
    "data": [
        {
            "wav": "path_to_audio_file",
            "caption": "textual_descriptions"
        }
    ]
}

Do we put mixtures in here, or individual items like flute audio?
Can you provide an example with actual values filled into the JSON object? @liuxubo717

Error when using music_speech..._89.98.pt: pytorch-lightning_version

From your paper, I wasn't sure of the role/purpose of music_speech_audioset_epoch_15_esc_89.98.pt.

Are these the saved model weights one should use if one wants to focus on separation of musical instruments from one another, say? Or is audiosep_base_4M_steps.ckpt still applicable in such use cases?

When I edited your example inference code from the readme to use music_speech_audioset_epoch_15_esc_89.98.pt on a Linux machine running Ubuntu, I got the following error.

Please clarify the purpose/use of this checkpoint, and if it is meant to be used, whether I need to modify the example inference code further.

Thanks!

Traceback (most recent call last):
  File "/home/blah/repos/AudioSep/sayd_infer_example.py", line 6, in <module>
    model = build_audiosep(
  File "/home/blah/repos/AudioSep/pipeline.py", line 17, in build_audiosep
    model = load_ss_model(
  File "/home/blah/repos/AudioSep/utils.py", line 387, in load_ss_model
    pl_model = AudioSep.load_from_checkpoint(
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/core/module.py", line 1532, in load_from_checkpoint
    loaded = _load_from_checkpoint(
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/core/saving.py", line 65, in _load_from_checkpoint
    checkpoint = _pl_migrate_checkpoint(
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/utilities/migration/utils.py", line 113, in _pl_migrate_checkpoint
    old_version = _get_version(checkpoint)
  File "/home/blah/anaconda3/envs/AudioSep/lib/python3.10/site-packages/lightning/pytorch/utilities/migration/utils.py", line 136, in _get_version
    return checkpoint["pytorch-lightning_version"]
KeyError: 'pytorch-lightning_version'

Adding Contributors section to the readme.md

Why a Contributors section: A "Contributors" section in a repo gives credit to and acknowledges the people who have helped with the project, fosters a sense of community, and helps others know who to contact for questions or issues related to the project.

Issue type

  • [✅] Docs

Demo Image:

@liuxubo717 kindly assign this issue to me! I would love to work on it! Thank you!

ImportError: cannot import name 'inference' from 'pipeline'

This is probably just me being really bad at coding; I'm trying to run the example inference code in the README and am getting this error:
ImportError: cannot import name 'inference' from 'pipeline' (/home/jordancruz/Tools/AudioSep/pipeline.py)

Am I doing something wrong?

Adding code-of-conduct & Contributors.md File to the repo!

Code of Conduct: We propose adding a comprehensive Code of Conduct to our repository to ensure
a safe, respectful, and inclusive environment for all contributors and users. This code will
serve as a guideline for behavior, promoting diversity, reducing conflicts, and attracting a
wider range of perspectives.

Contributing.md: A "Contributing.md" file is added to a repository to provide guidelines and
instructions for potential contributors on how to collaborate effectively with the project.
It typically includes information on coding standards, how to submit changes, reporting issues,
and other important details to streamline the contribution process and maintain a healthy
open-source community.

Issue type

  • [✅] Docs

@liuxubo717 kindly assign this issue to me! I would love to work on it! Thank you!

Solving environment: failed

I get stuck here when trying the setup instructions:

(base) PS C:\Users\flush\AaudioSep> conda env create -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:

  • libnpp-dev==11.6.3.124=h3c42840_0
  • cryptography==38.0.4=py310h9ce1e76_0
  • conda==23.3.1=py310h06a4308_0
  • conda-package-streaming==0.7.0=py310h06a4308_0
  • cuda-nvrtc-dev==11.6.124=h249d397_0
  • lcms2==2.12=h3be6417_0
  • ---and so on...

RuntimeError: could not create a primitive

I wanted to share the following error I got after trying to run the inference script from the README, updating the query, input, and output file.

$ python3 inference_script.py 
/$USER/miniconda3/envs/AudioSep/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1670525552843/work/aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Some weights of the model checkpoint at roberta-base were not used when initializing RobertaModel: ['lm_head.layer_norm.bias', 'lm_head.decoder.weight', 'lm_head.bias', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of the model checkpoint at roberta-base were not used when initializing RobertaModel: ['lm_head.layer_norm.bias', 'lm_head.decoder.weight', 'lm_head.bias', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Load AudioSep model from [checkpoint/audiosep_base_4M_steps.ckpt]
Separate audio from [/my/file/path/file.wav] with textual query [my_textual_query_to_separate]
Traceback (most recent call last):
  File "/file/to/local/audio-agi/AudioSep/inference_script.py", line 16, in <module>
    inference(model, audio_file, text, output_file, device)
  File "/file/to/local/audio-agi/AudioSep/pipeline.py", line 47, in inference
    sep_segment = model.ss_model(input_dict)["waveform"]
  File "/$USER/miniconda3/envs/AudioSep/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/path/to/local/AudioSep/models/resunet.py", line 648, in forward
    output_dict = self.base(
  File "/$USER/miniconda3/envs/AudioSep/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/file/to/local/AudioSep/models/resunet.py", line 555, in forward
    x = self.pre_conv(x)
  File "/$USER/miniconda3/envs/AudioSep/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/$USER/miniconda3/envs/AudioSep/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/$USER/miniconda3/envs/AudioSep/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: could not create a primitive

This error occurred on the latest commit: 2150ca8.
In the last snippet I changed the paths for readability.

On another note, I previously had this issue:

File "/$USER/miniconda3/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libcudart.so.12: cannot open shared object file: No such file or directory

That got solved by adding the following to my bashrc:

export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
