
Comments (6)

JianxinMa commented on June 12, 2024

I believe so. We are working on streaming LLM support, though it may take some time. Please stay tuned.

from qwen-agent.

jmanhype commented on June 12, 2024

Also, this error seems to pop up:

ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/starlette/routing.py", line 69, in app
    await response(scope, receive, send)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/sse_starlette/sse.py", line 233, in __call__
    async with anyio.create_task_group() as task_group:
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 597, in __aexit__
    raise exceptions[0]
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/sse_starlette/sse.py", line 236, in wrap
    await func()
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/sse_starlette/sse.py", line 221, in stream_response
    async for data in self.body_iterator:
  File "/home/batman/dev/test1/Qwen/openai_api.py", line 432, in predict
    for new_response in response_generator:
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 1273, in stream_generator
    for token in self.generate_stream(
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/transformers_stream_generator/main.py", line 931, in sample_stream
    outputs = self(
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 1108, in forward
    transformer_outputs = self.transformer(
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 938, in forward
    outputs = block(
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 639, in forward
    attn_outputs = self.attn(
  File "/home/batman/dev/test1/qwen_agent_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 564, in forward
    attn_output, attn_weight = self._attn(
  File "/home/batman/.cache/huggingface/modules/transformers_modules/QWen/QWen-7B-Chat-Int4/b725fe596dce755fe717c5b15e5c8243d5474f66/modeling_qwen.py", line 326, in _attn
    attn_weights = attn_weights / torch.full(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.11 GiB (GPU 0; 11.73 GiB total capacity; 9.42 GiB already allocated; 819.75 MiB free; 10.68 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF


JianxinMa commented on June 12, 2024

AttributeError: 'ChatCompletionResponse' object has no attribute 'model_dump_json'

Regarding the first error, please check if pip install "pydantic>=2.3.0" helps. Remember to include the double quotes.

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.11 GiB (GPU 0; 11.73 GiB total capacity; 9.42 GiB already allocated; 819.75 MiB free; 10.68 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

As for the second issue, Qwen-7B-Chat can consume around 14 GB of VRAM when handling a sequence of length 8192. Try reducing the sequence length by specifying python run_server.py --max_ref_token 1000.
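
To see why long sequences blow up here: the allocation that fails in the traceback is the attention-score matrix, whose size grows quadratically with sequence length. A back-of-the-envelope estimate, assuming 32 attention heads and fp16 activations (illustrative numbers, not measured from this server):

```python
# Rough size of one attention-score tensor of shape
# (batch, n_heads, seq_len, seq_len), assuming 2-byte fp16 elements.
def attn_scores_bytes(seq_len, n_heads=32, bytes_per_elem=2, batch=1):
    return batch * n_heads * seq_len * seq_len * bytes_per_elem

for seq in (1000, 4096, 8192):
    print(f"seq_len={seq}: ~{attn_scores_bytes(seq) / 2**30:.2f} GiB per layer")
# seq_len=8192 works out to ~4 GiB per layer, versus well under 0.1 GiB at
# seq_len=1000 -- which is why --max_ref_token 1000 relieves the pressure.
```

The OOM message itself also suggests tuning the allocator against fragmentation, e.g. export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 (the value is a starting point to experiment with); that can be combined with the shorter context.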


jmanhype commented on June 12, 2024

Thank you for the quick response again. Would StreamingLLM help with the memory issue? I understand the ~14 GB figure, but would this framework benefit from implementing this: https://github.com/mit-han-lab/streaming-llm
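
For context on the linked repo: StreamingLLM bounds KV-cache memory by keeping a few initial "attention sink" tokens plus a sliding window of recent tokens, evicting everything in between. A minimal sketch of that eviction policy (sizes are illustrative, and this is not the actual mit-han-lab implementation):

```python
# Sketch of the StreamingLLM cache policy: keep the first n_sink "attention
# sink" entries plus the most recent `window` entries, drop the middle,
# so the cache size stays bounded no matter how long generation runs.
def evict(cache, n_sink=4, window=8):
    if len(cache) <= n_sink + window:
        return cache
    return cache[:n_sink] + cache[-window:]

tokens = list(range(20))  # stand-in for per-token KV-cache entries
print(evict(tokens))      # → [0, 1, 2, 3, 12, 13, 14, 15, 16, 17, 18, 19]
```

Note this caps cache growth during long generations; it does not shrink the quadratic attention-score cost of a single long prompt, so it addresses a different part of the memory budget than --max_ref_token.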


jmanhype commented on June 12, 2024

Thank you. What about this: https://x.com/arankomatsuzaki/status/1711401381247242683?s=20


jmanhype commented on June 12, 2024

(screenshot attached: Screenshot_20231009-111810.png)

