Comments (8)

wenmengzhou commented on July 18, 2024

Run torch.cuda.is_available() and check what it prints.
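A slightly wider version of this check can help narrow down why CUDA is reported as unavailable. The following is a minimal sketch, not part of modelscope; `cuda_report` is a hypothetical helper name, and it only uses standard PyTorch attributes.

```python
def cuda_report():
    """Collect the facts that decide whether PyTorch can use a GPU."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter at all.
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version this torch build was compiled against (None for CPU-only builds).
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is None, the installed torch wheel is likely a CPU-only build, which would be consistent with the torch version problem reported in the next comment.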

from modelscope.

xiapeng1110 commented on July 18, 2024

Thanks! It turned out to be a torch version problem.

xiapeng1110 commented on July 18, 2024

Hello, after switching environments torch can now use CUDA, but running the same code as above produces a new error.

2022-11-14 03:31:30,992 - modelscope - INFO - PyTorch version 1.13.0 Found.
2022-11-14 03:31:30,997 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2022-11-14 03:31:31,018 - modelscope - INFO - Loading done! Current index file version is 1.0.3, with md5 122d6b7767ca662025493ae857fab95e
2022-11-14 03:31:34 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2022-11-14 03:31:34,899 - modelscope - INFO - initiate model from ./ofa_image-caption_coco_large_en
2022-11-14 03:31:34 | INFO | modelscope | initiate model from ./ofa_image-caption_coco_large_en
2022-11-14 03:31:34,900 - modelscope - INFO - initiate model from location ./ofa_image-caption_coco_large_en.
2022-11-14 03:31:34 | INFO | modelscope | initiate model from location ./ofa_image-caption_coco_large_en.
2022-11-14 03:31:34,903 - modelscope - INFO - initialize model from ./ofa_image-caption_coco_large_en
2022-11-14 03:31:34 | INFO | modelscope | initialize model from ./ofa_image-caption_coco_large_en
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/modelscope/utils/registry.py", line 209, in build_from_cfg
    return obj_cls._instantiate(**args)
  File "/usr/local/lib/python3.8/dist-packages/modelscope/models/base/base_model.py", line 61, in _instantiate
    return cls(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/modelscope/models/multi_modal/ofa_for_all_tasks.py", line 44, in __init__
    model = OFAModel.from_pretrained(model_dir)
  File "/usr/local/lib/python3.8/dist-packages/transformers/modeling_utils.py", line 2230, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/modelscope/models/multi_modal/ofa/modeling_ofa.py", line 1954, in __init__
    self.encoder = OFAEncoder(config, shared)
  File "/usr/local/lib/python3.8/dist-packages/modelscope/models/multi_modal/ofa/modeling_ofa.py", line 853, in __init__
    self.layernorm_embedding = LayerNorm(embed_dim)
  File "/usr/local/lib/python3.8/dist-packages/modelscope/models/multi_modal/ofa/modeling_ofa.py", line 92, in LayerNorm
    return FusedLayerNorm(normalized_shape, eps, elementwise_affine)
  File "/usr/local/lib/python3.8/dist-packages/apex/normalization/fused_layer_norm.py", line 166, in __init__
    fused_layer_norm_cuda = importlib.import_module("fused_layer_norm_cuda")
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 657, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 556, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1101, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.8/dist-packages/fused_layer_norm_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor7optionsEv

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/modelscope/utils/registry.py", line 211, in build_from_cfg
    return obj_cls(**args)
  File "/usr/local/lib/python3.8/dist-packages/modelscope/pipelines/multi_modal/image_captioning_pipeline.py", line 32, in __init__
    super().__init__(model=model)
  File "/usr/local/lib/python3.8/dist-packages/modelscope/pipelines/base.py", line 89, in __init__
    self.model = self.initiate_single_model(model)
  File "/usr/local/lib/python3.8/dist-packages/modelscope/pipelines/base.py", line 50, in initiate_single_model
    return Model.from_pretrained(
  File "/usr/local/lib/python3.8/dist-packages/modelscope/models/base/base_model.py", line 122, in from_pretrained
    model = build_model(
  File "/usr/local/lib/python3.8/dist-packages/modelscope/models/builder.py", line 30, in build_model
    return build_from_cfg(
  File "/usr/local/lib/python3.8/dist-packages/modelscope/utils/registry.py", line 214, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
ImportError: OfaForAllTasks: /usr/local/lib/python3.8/dist-packages/fused_layer_norm_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor7optionsEv

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "generate_captions.py", line 18, in <module>
    img_caption_1 = pipeline(Tasks.image_captioning, model=path_1, device='gpu:0') #, device='gpu:5'
  File "/usr/local/lib/python3.8/dist-packages/modelscope/pipelines/builder.py", line 325, in pipeline
    return build_pipeline(cfg, task_name=task)
  File "/usr/local/lib/python3.8/dist-packages/modelscope/pipelines/builder.py", line 242, in build_pipeline
    return build_from_cfg(
  File "/usr/local/lib/python3.8/dist-packages/modelscope/utils/registry.py", line 214, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
ImportError: ImageCaptioningPipeline: OfaForAllTasks: /usr/local/lib/python3.8/dist-packages/fused_layer_norm_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor7optionsEv

I hope someone can help. Thanks!

wenmengzhou commented on July 18, 2024

fused_layer_norm_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor7optionsEv

It looks like apex was not installed correctly. Could you try cloning the apex source code and installing it from source?
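The undefined symbol demangles to `at::Tensor::options() const`, a PyTorch C++ symbol, which usually suggests the apex extension was compiled against a different PyTorch build than the one now installed. A minimal sketch for confirming whether the extension loads before and after a rebuild is shown here; `try_import` is a hypothetical helper, and the module name is taken from the traceback above.

```python
import importlib


def try_import(module_name):
    """Return (True, None) if the module imports, else (False, error message)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # Covers both a missing module and a shared library that fails to load.
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    # Load torch first so its shared libraries are available to the extension.
    try_import("torch")
    ok, error = try_import("fused_layer_norm_cuda")
    print("fused_layer_norm_cuda importable:", ok)
    if not ok:
        print("reason:", error)
```

Running this in the same interpreter that runs the pipeline avoids checking a different environment by mistake.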

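xiapeng1110 commented on July 18, 2024

This problem requires reinstalling apex, and when installing apex the cuda and cudatoolkit versions must match. I changed the cuda version to 11.1, then reinstalled and recompiled apex. When I tested again there was no error, but the log still shows that the GPU cannot be used. The log output and the corresponding code are below: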

2022-11-15 04:36:49,671 - modelscope - INFO - PyTorch version 1.10.0+cu111 Found.
2022-11-15 04:36:49,780 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2022-11-15 04:36:49,780 - modelscope - INFO - No valid ast index found from /root/.cache/modelscope/ast_indexer, rebuilding ast index!
2022-11-15 04:36:49,789 - modelscope - INFO - AST-Scaning the path "/usr/local/lib/python3.8/dist-packages/modelscope" with the following sub folders ['models', 'metrics', 'pipelines', 'preprocessors', 'trainers', 'msdatasets']
2022-11-15 04:37:19,149 - modelscope - INFO - Scaning done! A number of 427 files indexed! Time consumed 29.360363721847534s
2022-11-15 04:37:19,175 - modelscope - INFO - Loading done! Current index file version is 1.0.3, with md5 122d6b7767ca662025493ae857fab95e
2022-11-15 04:37:23 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2022-11-15 04:37:23,920 - modelscope - INFO - initiate model from ./ofa_image-caption_coco_large_en
2022-11-15 04:37:23 | INFO | modelscope | initiate model from ./ofa_image-caption_coco_large_en
2022-11-15 04:37:23,922 - modelscope - INFO - initiate model from location ./ofa_image-caption_coco_large_en.
2022-11-15 04:37:23 | INFO | modelscope | initiate model from location ./ofa_image-caption_coco_large_en.
2022-11-15 04:37:23,936 - modelscope - INFO - initialize model from ./ofa_image-caption_coco_large_en
2022-11-15 04:37:23 | INFO | modelscope | initialize model from ./ofa_image-caption_coco_large_en
2022-11-15 04:37:36,234 - modelscope - INFO - cuda is not available, using cpu instead.
2022-11-15 04:37:36 | INFO | modelscope | cuda is not available, using cpu instead.
2022-11-15 04:37:36,234 - modelscope - INFO - initialize model from ./ofa_image-caption_coco_large_en
2022-11-15 04:37:36 | INFO | modelscope | initialize model from ./ofa_image-caption_coco_large_en
2022-11-15 04:38:04,987 - modelscope - INFO - cuda is not available, using cpu instead.
2022-11-15 04:38:04 | INFO | modelscope | cuda is not available, using cpu instead.
2022-11-15 04:38:05,131 - modelscope - INFO - initiate model from ./ofa_image-caption_coco_huge_en
2022-11-15 04:38:05 | INFO | modelscope | initiate model from ./ofa_image-caption_coco_huge_en
2022-11-15 04:38:05,133 - modelscope - INFO - initiate model from location ./ofa_image-caption_coco_huge_en.
2022-11-15 04:38:05 | INFO | modelscope | initiate model from location ./ofa_image-caption_coco_huge_en.
2022-11-15 04:38:05,138 - modelscope - INFO - initialize model from ./ofa_image-caption_coco_huge_en
2022-11-15 04:38:05 | INFO | modelscope | initialize model from ./ofa_image-caption_coco_huge_en
2022-11-15 04:38:39,875 - modelscope - INFO - cuda is not available, using cpu instead.
2022-11-15 04:38:39 | INFO | modelscope | cuda is not available, using cpu instead.
2022-11-15 04:38:39,877 - modelscope - INFO - initialize model from ./ofa_image-caption_coco_huge_en
2022-11-15 04:38:39 | INFO | modelscope | initialize model from ./ofa_image-caption_coco_huge_en
2022-11-15 04:40:18,908 - modelscope - INFO - cuda is not available, using cpu instead.
2022-11-15 04:40:18 | INFO | modelscope | cuda is not available, using cpu instead.

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
from modelscope.outputs import OutputKeys
import os

path_1 = './ofa_image-caption_coco_large_en'
path_2 = './ofa_image-caption_coco_huge_en'

img_dir = './val2014/'

img_caption_1 = pipeline(Tasks.image_captioning, model=path_1, device='gpu:5')
img_caption_2 = pipeline(Tasks.image_captioning, model=path_2, device='gpu:5')

list_caption_1 = []
list_caption_2 = []

for name in os.listdir(img_dir):
    img = os.path.join(img_dir, name)
    list_caption_1.append([img, img_caption_1(img)[OutputKeys.CAPTION][0]])
    list_caption_2.append([img, img_caption_2(img)[OutputKeys.CAPTION][0]])

with open('./1.txt', encoding='utf-8', mode='w') as f1:
    for i in list_caption_1:
        for j in i:
            f1.write(str(j)+' ')
        f1.write('\n')

with open('./2.txt', encoding='utf-8', mode='w') as f2:
    for i in list_caption_2:
        for j in i:
            f2.write(str(j)+' ')
        f2.write('\n')
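Since the comment above says apex needs matching cuda and cudatoolkit versions, one way to compare them is to read the release reported by `nvcc --version` and the CUDA version the installed torch was built with. This is a rough sketch under the assumption that `nvcc` is on the PATH; the helper names are hypothetical and this is not how modelscope itself decides whether CUDA is available.

```python
import re
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def toolkit_release():
    """Return the local toolkit release, or None if nvcc cannot be run."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(result.stdout)


def torch_cuda_release():
    """Return the CUDA version torch was built with, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


if __name__ == "__main__":
    toolkit = toolkit_release()
    built = torch_cuda_release()
    print("nvcc toolkit release:", toolkit)
    print("torch built with CUDA:", built)
    if toolkit and built:
        print("match:", toolkit == built)
```

A match here does not guarantee that `torch.cuda.is_available()` returns True, since that also depends on the driver and on which GPUs are visible to the process.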

xiapeng1110 commented on July 18, 2024

This is solved now, thank you. One more question: does pipeline support batch input? Running inference one sample at a time is slow.

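zhangyichang commented on July 18, 2024

This is solved now, thank you. One more question: does pipeline support batch input? Running inference one sample at a time is slow.

We will release an interface that supports batch input soon, so please stay tuned. If you want to modify the code yourself, you can refer to how the OFA trainer uses a dataset.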
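For anyone preparing their own batching in the meantime, the first step is usually grouping the image paths into fixed-size chunks. The sketch below shows only that grouping in plain Python; `chunked` is a hypothetical helper, the file names are made up, and whether a given pipeline accepts a list of inputs is not confirmed in this thread.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    # Hypothetical file names, used only to show the grouping.
    paths = [f"./val2014/img_{i}.jpg" for i in range(5)]
    for batch in chunked(paths, 2):
        print(batch)
```

Each chunk could then be handed to a dataset or collate step along the lines of the OFA trainer mentioned above.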

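7carry7 commented on July 18, 2024

modelscope - INFO - cuda is not available, using cpu instead.
When I use a modelscope model for OCR, tensorflow can access the cpu (tf is 1.15.0, paired with cuda 10.0), but the message above appears when modelscope loads. Why does this happen? Do I need to install a different cuda version? Also, I could not find any modelscope-to-cuda compatibility table online.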
