Comments (3)
Protip: when you get one of those walls of compiler output, copy everything into a text editor and search for the string ": error:". In this case the errors are:
/home/n/text-generation-webui/repositories/exllamav2/exllamav2/exllamav2_ext/cpp/safetensors_hip.cpp: At global scope:
/home/n/text-generation-webui/repositories/exllamav2/exllamav2/exllamav2_ext/cpp/safetensors_hip.cpp:267:16: error: expected initializer before ‘dec_lock’
267 | void CUDART_CB dec_lock(hipStream_t stream, hipError_t status, void *user_data)
| ^~~~~~~~
/home/n/text-generation-webui/repositories/exllamav2/exllamav2/exllamav2_ext/cpp/safetensors_hip.cpp: In member function ‘void STFile::load(at::Tensor, size_t, size_t, bool)’:
/home/n/text-generation-webui/repositories/exllamav2/exllamav2/exllamav2_ext/cpp/safetensors_hip.cpp:328:27: warning: ignoring returned value of type ‘hipError_t’, declared with attribute ‘nodiscard’ [-Wunused-result]
328 | hipMemcpyAsync(dst, src, copy_len, hipMemcpyHostToDevice);
| ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/rocm-5.7.3/include/hip/hip_runtime_api.h:3883:12: note: in call to ‘hipError_t hipMemcpyAsync(void*, const void*, size_t, hipMemcpyKind, hipStream_t)’, declared here
3883 | hipError_t hipMemcpyAsync(void* dst, const void* src, size_t sizeBytes, hipMemcpyKind kind,
| ^~~~~~~~~~~~~~
/opt/rocm-5.7.3/include/hip/hip_runtime_api.h:332:3: note: ‘hipError_t’ declared here
332 | } hipError_t;
| ^~~~~~~~~~
/home/n/text-generation-webui/repositories/exllamav2/exllamav2/exllamav2_ext/cpp/safetensors_hip.cpp:329:40: error: ‘dec_lock’ was not declared in this scope; did you mean ‘clock’?
329 | hipStreamAddCallback(NULL, dec_lock, (void*) page, 0);
| ^~~~~~~~
| clock
I have a 7900XTX on order so I can actually start running and debugging ROCm/HIP stuff myself soon.
But in the meantime I have to assume stream callbacks don't work exactly the same in HIPified CUDA code. GPT4 suggests that the CUDART_CB macro may not be needed.
So if you wouldn't mind, could you try removing that word to see if it compiles?
So around line 266 in exllamav2/exllamav2_ext/cpp/safetensors.cpp you should have:
void dec_lock(cudaStream_t stream, cudaError_t status, void *user_data)
{
#ifdef __linux__
STPage* p = (STPage*) user_data;
p->locks--;
#endif
}
I'd love to hear if it works.
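An alternative to deleting the word, sketched here as an untested assumption rather than what the code base actually does, would be to define CUDART_CB as empty whenever the toolchain hasn't defined it. As far as I know, in CUDA's headers the macro is only a calling-convention annotation that expands to nothing on Linux. The stand-in types below (fake_stream_t, fake_error_t) and the one-field STPage are hypothetical, there only so the sketch compiles without CUDA or HIP installed, and the #ifdef __linux__ guard is left out to keep it short.

```cpp
// Stand-ins for cudaStream_t / cudaError_t (hypothetical, for illustration only).
using fake_stream_t = void*;
using fake_error_t = int;

// Assumption: if the HIPified build never defines CUDART_CB, an empty
// definition lets the original function signature compile unchanged.
#ifndef CUDART_CB
#define CUDART_CB
#endif

// Hypothetical minimal STPage; the real struct has more members.
struct STPage
{
    int locks;
};

// Same shape as the callback above: release one lock on the page
// passed through user_data.
void CUDART_CB dec_lock(fake_stream_t stream, fake_error_t status, void* user_data)
{
    (void) stream;   // unused in this sketch
    (void) status;   // unused in this sketch
    STPage* p = static_cast<STPage*>(user_data);
    p->locks--;
}
```

Calling dec_lock with a pointer to a page whose locks field is 2 leaves it at 1, which is all the real callback needs to do once the stream reaches it.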
It appears that you added this to the code base - and it does work now!
And thanks for the tip, I'll try that next time I have such an issue.
[ I sent the whole thing because it did something odd at the beginning -
there were some warnings about ignored packages ... that looks resolved now. ]
This test was using ROCm 6.0 on Ubuntu 23.04 with torch 2.3.0.20240118+rocm6.0 and flash attention 2. The model loads and answers questions. :)