Comments (2)
Thanks for your issue. Could you try pulling the most recent repo? I fixed this last week.
from colossalai.
Thanks for the answer. I pulled colossalai==0.3.7. With torch==2.2.1, the following error occurs:
File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 355, in autoparallelize
rst_to_unpack = initialize_model(
File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 265, in initialize_model
gm = ColoGraphModule(model, graph, model.__class__.__name__)
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/graph_module.py", line 110, in __init__
super().__init__(root, graph, class_name)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph_module.py", line 428, in __init__
self.graph = graph
File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1747, in __setattr__
super().__setattr__(name, value)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph_module.py", line 472, in graph
self.recompile()
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/graph_module.py", line 141, in recompile
python_code = self._graph.python_code(root_module="self")
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph.py", line 1328, in python_code
return self._python_code(root_module, namespace, verbose=verbose)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph.py", line 1331, in _python_code
return self._codegen._gen_python_code(self.nodes, root_module, namespace, verbose=verbose)
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/codegen.py", line 472, in _gen_python_code
return PythonCode(fn_code, globals_)
TypeError: __init__() missing 1 required positional argument: '_lineno_map'
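The TypeError above suggests that, in this torch version, `PythonCode` requires a third field, `_lineno_map`, while the codegen in colossalai still passes only two arguments. One way such a mismatch could be bridged is sketched below, assuming `PythonCode` is a plain dataclass. The helper `build_with_known_fields` and the two stand-in classes are hypothetical names used for illustration only; they are not part of torch or colossalai, and this is not necessarily how the library resolved the problem.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Dict, Optional


def build_with_known_fields(cls, *args, **fallbacks):
    """Instantiate dataclass `cls` from positional args, supplying a
    fallback (None unless given) for each remaining required field."""
    init_fields = [f for f in fields(cls) if f.init]
    extra = {}
    for f in init_fields[len(args):]:
        has_default = f.default is not MISSING or f.default_factory is not MISSING
        if not has_default:
            extra[f.name] = fallbacks.get(f.name)
    return cls(*args, **extra)


# Hypothetical stand-ins that only mimic the two constructor shapes;
# they are NOT the real torch.fx classes.
@dataclass
class OldPythonCode:
    src: str
    globals: Dict[str, Any]


@dataclass
class NewPythonCode:
    src: str
    globals: Dict[str, Any]
    _lineno_map: Optional[Dict[int, Optional[int]]]


fn_code = "def forward(self): pass"
old = build_with_known_fields(OldPythonCode, fn_code, {})
new = build_with_known_fields(NewPythonCode, fn_code, {})
```

With a helper like this, the same two-argument call site works against both shapes, and the extra field is simply left as `None` where it exists.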
With torch==2.1.1, the following error occurs:
File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 355, in autoparallelize
rst_to_unpack = initialize_model(
File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 267, in initialize_model
shape_prop_pass(gm, *meta_args.values())
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/passes/shape_prop.py", line 269, in shape_prop_pass
ShapeProp(module).propagate(*args, device=_current_device(module))
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/passes/shape_prop.py", line 253, in propagate
return super().run(*tree_map(wrap_fn, args))
File "/opt/conda/lib/python3.9/site-packages/torch/fx/interpreter.py", line 138, in run
self.env[node] = self.run_node(node)
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/passes/shape_prop.py", line 116, in run_node
r = getattr(self, n.op)(n.target, args, kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/interpreter.py", line 312, in call_module
return submod(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/sparse.py", line 162, in forward
return F.embedding(
File "/opt/conda/lib/python3.9/site-packages/torch/nn/functional.py", line 2202, in embedding
return handle_torch_function(
File "/opt/conda/lib/python3.9/site-packages/torch/overrides.py", line 1577, in handle_torch_function
result = torch_func_method(public_api, types, args, kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/_tensor.py", line 1386, in __torch_function__
ret = func(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/nn/functional.py", line 2233, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/_subclasses/meta_tensor.py", line 113, in __torch_dispatch__
ret = func(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/_ops.py", line 448, in __call__
return self._op(*args, **kwargs or {})
File "/opt/conda/lib/python3.9/site-packages/torch/_decomp/decompositions.py", line 1141, in embedding
return weight[indices]
File "/opt/conda/lib/python3.9/site-packages/torch/_meta_registrations.py", line 2790, in meta_index_Tensor
return self.new_empty(before_shape + replacement_shape + after_shape)
File "/opt/conda/lib/python3.9/site-packages/torch/_refs/__init__.py", line 4483, in new_empty
return torch.empty(
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/_subclasses/meta_tensor.py", line 188, in _new
return MetaTensor(
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/_subclasses/meta_tensor.py", line 60, in __new__
r = torch.Tensor._make_wrapper_subclass(
RuntimeError: !check_has_torch_dispatch(obj) INTERNAL ASSERT FAILED at "../torch/csrc/autograd/python_variable.cpp":1934, please report a bug to PyTorch. While HermeticPyObject was enabled, we attempted to create a tensor subclass with __torch_dispatch__. This violates the invariant that operations in HermeticPyObject have equivalent C++ implementations. If your operator registered from Python operator registration isn't doing anything strange, there may be an internal PyTorch bug involving not appropriately disabling TorchDispatchMode before executing Python op registration.
While executing %transformer_wte : [num_users=1] = call_module[target=transformer.wte](args = (%view,), kwargs = {})
Original traceback:
None