Comments (3)
I do not know whether this solution could cause other problems.
I got it to run successfully by deleting lines 446-448 of /path/to/conda/miniconda3/envs/tune/lib/python3.10/site-packages/mmengine/config/utils.py, in the function _gather_abs_import_lazyobj (mmengine version 0.10.4), as follows.
def _gather_abs_import_lazyobj(tree: ast.Module,
                               filename: Optional[str] = None):
    """Experimental implementation of gathering absolute import information."""
    if isinstance(filename, str):
        filename = filename.encode('unicode_escape').decode()
    imported = defaultdict(list)
    abs_imported = set()
    new_body: List[ast.stmt] = []
    # module2node is used to get lineno when Python < 3.10
    module2node: dict = dict()
    for node in tree.body:
        if isinstance(node, ast.Import):
            for alias in node.names:
                # Skip converting built-in module to LazyObject
                # ! LINES TO BE DELETED ! if _is_builtin_module(alias.name):
                # ! LINES TO BE DELETED !     new_body.append(node)
                # ! LINES TO BE DELETED !     continue
                module = alias.name.split('.')[0]
                module2node.setdefault(module, node)
                imported[module].append(alias)
            continue
        new_body.append(node)
    for key, value in imported.items():
        names = [_value.name for _value in value]
        if hasattr(value[0], 'lineno'):
            lineno = value[0].lineno
        else:
            lineno = module2node[key].lineno
        lazy_module_assign = ast.parse(
            f'{key} = LazyObject({names}, location="{filename}, line {lineno}")'  # noqa: E501
        )  # noqa: E501
        abs_imported.add(key)
        new_body.insert(0, lazy_module_assign.body[0])
    tree.body = new_body
    return tree, abs_imported
To emphasize again: I do not know whether this solution could cause other problems.
I also do not understand why built-in modules need to be skipped here. It might be for performance reasons.
I suggest removing all special handling of built-in modules in utils.py, including lines 158-182 (the _is_builtin_module definition), lines 318-328 in __init__ of class ImportTransformer, lines 422-423 in visit_Import, and lines 446-448 in _gather_abs_import_lazyobj (mmengine version 0.10.4).
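To make the effect of that skip branch easier to see, here is a minimal sketch using only the standard ast module. It is not mmengine's code: ASSUMED_BUILTIN is a hypothetical stand-in for _is_builtin_module (the real check is not reproduced here), and partition_imports is a name made up for this illustration. It only shows which top-level imports would be left as ordinary import statements and which would be collected for conversion to LazyObject assignments.

```python
import ast

# Hypothetical stand-in for mmengine's _is_builtin_module; the real check differs.
ASSUMED_BUILTIN = {"os", "sys", "copy"}


def partition_imports(source: str):
    """Split top-level `import x` names into (kept_as_import, lazified).

    Simplified: only the root package name is checked, and `from x import y`
    statements are ignored.
    """
    kept, lazified = [], []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root in ASSUMED_BUILTIN:
                    kept.append(root)      # skip branch: node stays a real import
                else:
                    lazified.append(root)  # would become `root = LazyObject(...)`
    return kept, lazified


kept, lazified = partition_imports("import os\nimport torch\nwork_dir = 'runs'\n")
print("kept as plain import:", kept)
print("converted to lazy objects:", lazified)
```

With the skip branch in place, os stays a plain import, so the config namespace presumably ends up holding the real module object. With the branch deleted, os would be handled the same way as torch.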
from mmengine.
I hit a similar error and found that it is caused by manually adding imports of built-in modules. For example, my config file imports os (see the snippet after the traceback).
❯ xtuner train internlm2_7b_qlora_colorist_e5_copy.py
/home/zhenwang/miniconda3/envs/tune/lib/python3.10/site-packages/mmengine/optim/optimizer/zero_optimizer.py:11: DeprecationWarning: `TorchScript` support for functional optimizers is deprecated and will be removed in a future PyTorch release. Consider using the `torch.compile` optimizer instead.
from torch.distributed.optim import \
[2024-08-08 17:16:25,504] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.4
[WARNING] using untested triton version (3.0.0), only 1.0.0 is known to be compatible
/home/zhenwang/miniconda3/envs/tune/lib/python3.10/site-packages/deepspeed/runtime/zero/linear.py:49: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
def forward(ctx, input, weight, bias=None):
/home/zhenwang/miniconda3/envs/tune/lib/python3.10/site-packages/deepspeed/runtime/zero/linear.py:67: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
def backward(ctx, grad_output):
/home/zhenwang/miniconda3/envs/tune/lib/python3.10/site-packages/mmengine/optim/optimizer/zero_optimizer.py:11: DeprecationWarning: `TorchScript` support for functional optimizers is deprecated and will be removed in a future PyTorch release. Consider using the `torch.compile` optimizer instead.
from torch.distributed.optim import \
[2024-08-08 17:16:27,826] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.4
[WARNING] using untested triton version (3.0.0), only 1.0.0 is known to be compatible
/home/zhenwang/miniconda3/envs/tune/lib/python3.10/site-packages/deepspeed/runtime/zero/linear.py:49: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
def forward(ctx, input, weight, bias=None):
/home/zhenwang/miniconda3/envs/tune/lib/python3.10/site-packages/deepspeed/runtime/zero/linear.py:67: FutureWarning: `torch.cuda.amp.custom_bwd(args...)` is deprecated. Please use `torch.amp.custom_bwd(args..., device_type='cuda')` instead.
def backward(ctx, grad_output):
Traceback (most recent call last):
File "/home/zhenwang/miniconda3/envs/tune/lib/python3.10/site-packages/xtuner/tools/train.py", line 360, in <module>
main()
File "/home/zhenwang/miniconda3/envs/tune/lib/python3.10/site-packages/xtuner/tools/train.py", line 349, in main
runner = Runner.from_cfg(cfg)
File "/home/zhenwang/miniconda3/envs/tune/lib/python3.10/site-packages/mmengine/runner/runner.py", line 461, in from_cfg
cfg = copy.deepcopy(cfg)
File "/home/zhenwang/miniconda3/envs/tune/lib/python3.10/copy.py", line 153, in deepcopy
y = copier(memo)
File "/home/zhenwang/miniconda3/envs/tune/lib/python3.10/site-packages/mmengine/config/config.py", line 1531, in __deepcopy__
super(Config, other).__setattr__(key, copy.deepcopy(value, memo))
File "/home/zhenwang/miniconda3/envs/tune/lib/python3.10/copy.py", line 153, in deepcopy
y = copier(memo)
File "/home/zhenwang/miniconda3/envs/tune/lib/python3.10/site-packages/mmengine/config/config.py", line 144, in __deepcopy__
other[copy.deepcopy(key, memo)] = copy.deepcopy(value, memo)
File "/home/zhenwang/miniconda3/envs/tune/lib/python3.10/copy.py", line 161, in deepcopy
rv = reductor(4)
TypeError: cannot pickle 'module' object
# internlm2_7b_qlora_colorist_e5_copy.py
import os
import torch
...
It seems that Config.fromfile requires modification. I will update my solution when I solve the problem.
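The TypeError itself can be reproduced without mmengine, which may help when checking a config. The sketch below assumes the loaded config behaves like a nested dict that ends up holding the os module. cfg_like and find_module_values are hypothetical names for illustration, not mmengine APIs.

```python
import copy
import os
import types


def find_module_values(mapping, prefix=""):
    """Yield dotted key paths whose values are module objects."""
    for key, value in mapping.items():
        path = f"{prefix}{key}"
        if isinstance(value, types.ModuleType):
            yield path
        elif isinstance(value, dict):
            yield from find_module_values(value, prefix=f"{path}.")


# Assumed shape: a config namespace in which `import os` left a real module.
cfg_like = {"os": os, "work_dir": "runs", "model": {"type": "Dummy"}}

try:
    copy.deepcopy(cfg_like)
    error_message = None
except TypeError as exc:
    # deepcopy falls back to the pickle protocol, which rejects modules.
    error_message = str(exc)

module_paths = list(find_module_values(cfg_like))
print("deepcopy error:", error_message)
print("keys holding modules:", module_paths)
```

If a helper like this reports a key such as os, that entry is a likely candidate for what copy.deepcopy is failing on in Runner.from_cfg.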
from mmengine.
I have the same problem. Did you solve it?
from mmengine.