kyegomez / zeta
Build high-performance AI models with modular building blocks
Home Page: https://zeta.apac.ai
License: Apache License 2.0
In the VisionEmbedding example, run in a colab notebook:
ImportError Traceback (most recent call last)
[<ipython-input-10-de694ebd704c>](https://localhost:8080/#) in <cell line: 2>()
1 import torch
----> 2 from zeta.nn import VisionEmbedding
3
4 # Create an instance of VisionEmbedding
5 vision_embedding = VisionEmbedding(
ImportError: cannot import name 'VisionEmbedding' from 'zeta.nn' (/usr/local/lib/python3.10/dist-packages/zeta/nn/__init__.py)
zeta/nn/__init__.py imports:
from zeta.nn.embeddings import *
So, I thought the fix was to change the example to:
from zeta.nn.embeddings import VisionEmbedding
but that doesn't work either.
It appears VisionEmbedding isn't exported in the zeta.nn.embeddings __init__.py.
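A quick way to confirm whether a name is actually re-exported is to import the package and check for the attribute directly. The sketch below is a minimal check, not part of zeta; the helper name `is_exported` is hypothetical, and it is demonstrated on the standard-library `json` module so that it runs without zeta installed.

```python
import importlib


def is_exported(module_name: str, attr: str) -> bool:
    """Return True if `attr` is reachable as `module_name.attr` after import."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself (or one of its dependencies) failed to import.
        return False
    return hasattr(module, attr)


# Demonstrated on the standard library so the snippet is self-contained.
print(is_exported("json", "dumps"))            # True
print(is_exported("json", "VisionEmbedding"))  # False
```

With zeta installed, `is_exported("zeta.nn.embeddings", "VisionEmbedding")` would show whether the subpackage's `__init__.py` binds the name.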
import torch
# Input data (batch of 5 samples with 10 features each)
input_data = torch.randn(5, 10)
# Forward pass through the MLP
output = mlp(input_data)
fails with
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
[<ipython-input-31-8ebb46f7f9e9>](https://localhost:8080/#) in <cell line: 7>()
5
6 # Forward pass through the MLP
----> 7 output = mlp(input_data)
5 frames
[/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py](https://localhost:8080/#) in forward(self, input)
112
113 def forward(self, input: Tensor) -> Tensor:
--> 114 return F.linear(input, self.weight, self.bias)
115
116 def extra_repr(self) -> str:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (5x10 and 20x10)
Realized the example in the docs is wrong. A correct example is (patch incoming):
import torch
from zeta.nn import CustomMLP
# Define the layer sizes
layer_sizes = [5, 10, 1]
# Create the MLP
mlp = CustomMLP(layer_sizes, activation="relu", dropout=0.5)
# Create a random tensor of shape (batch_size, input_size)
x = torch.randn(32, 5)
# Pass the tensor through the MLP
output = mlp(x)
print(output)
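The error message `(5x10 and 20x10)` reads as a batch with 10 features being fed to a first layer that expects 20, since `F.linear` multiplies the input by the transposed weight. The sketch below shows the same check in plain Python; the helper `check_mlp_input` is hypothetical, and the `[20, 10]` layer sizes are an assumption inferred from the error message, not taken from the docs.

```python
from typing import Sequence, Tuple


def check_mlp_input(layer_sizes: Sequence[int], input_shape: Tuple[int, ...]) -> int:
    """Validate an input shape against an MLP's layer sizes.

    Returns the output feature count, or raises ValueError on a mismatch.
    """
    if len(layer_sizes) < 2:
        raise ValueError("layer_sizes needs at least an input and an output size")
    in_features = layer_sizes[0]
    if input_shape[-1] != in_features:
        raise ValueError(
            f"input has {input_shape[-1]} features, "
            f"but the first layer expects {in_features}"
        )
    return layer_sizes[-1]


# Matches the corrected example: 5 input features, layer_sizes = [5, 10, 1].
print(check_mlp_input([5, 10, 1], (32, 5)))  # 1

# Assumed reconstruction of the failing case: 10 features into a 20-wide first layer.
try:
    check_mlp_input([20, 10], (5, 10))
except ValueError as err:
    print(err)
```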
The RelativePositionBias example has an incorrect parameter name:
n_heads should be num_heads.
PR incoming.
TypeError Traceback (most recent call last)
[<ipython-input-18-93e65b1f0592>](https://localhost:8080/#) in <cell line: 24>()
22
23 # Example 3: Modify default configurations
---> 24 custom_rel_pos_bias = RelativePositionBias(bidirectional=False, num_buckets=64, max_distance=256, n_heads=8)
TypeError: RelativePositionBias.__init__() got an unexpected keyword argument 'n_heads'
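Mismatches like this can be caught before the call by comparing keyword names against the constructor signature. The sketch below uses `inspect.signature` on a stand-in class; `RelativePositionBiasStub`, its default values, and the helper `unexpected_kwargs` are all hypothetical, and only the parameter names come from the report above.

```python
import inspect


class RelativePositionBiasStub:
    """Stand-in with the keyword names implied by the traceback (not zeta's class)."""

    # Default values are placeholders chosen for this sketch.
    def __init__(self, bidirectional=True, num_buckets=32, max_distance=128, num_heads=1):
        self.num_heads = num_heads


def unexpected_kwargs(callable_obj, **kwargs):
    """Return the keyword names that `callable_obj` would reject."""
    params = inspect.signature(callable_obj).parameters
    # A **kwargs parameter accepts anything, so nothing is unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in kwargs if name not in params]


print(unexpected_kwargs(RelativePositionBiasStub, num_buckets=64, n_heads=8))    # ['n_heads']
print(unexpected_kwargs(RelativePositionBiasStub, num_buckets=64, num_heads=8))  # []
```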
From a container, pytest test/test_init.py:
______________________________ ERROR collecting test_init.py _______________________________
ImportError while importing test module '/usr/src/zeta/tests/test_init.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
test_init.py:1: in <module>
import zeta
E ModuleNotFoundError: No module named 'zeta'
There is no zeta package to install.
Running tests from colab, with a pip install from latest github, and v100 GPU, and from CPU system:
FAILED test_main.py::test_zetacloud_basic - AssertionError: expected call not found.
FAILED test_main.py::test_zetacloud_with_stop - AssertionError: expected call not found.
FAILED test_main.py::test_zetacloud_with_down - AssertionError: expected call not found.
FAILED test_main.py::test_zetacloud_with_status_report - AssertionError: expected call not found.
FAILED test_main.py::test_zetacloud_with_exception - Failed: DID NOT RAISE <class 'Exception'>
I was testing the TruncatedRotaryEmbedding example in Colab:
from zeta.nn.embeddings.truncated_rope import TruncatedRotaryEmbedding
import torch
# Define the parameters
dim = 64
a = 0.1
b = 0.9
rho = 0.5
seq_len = 100
device = torch.device('cuda')
# Create the TruncatedRotaryEmbedding module
trunc_rotary_emb = TruncatedRotaryEmbedding(dim, a, b, rho)
# Compute the truncated rotary embeddings for the specified sequence length
rotary_embeddings = trunc_rotary_emb(seq_len, device)
print(rotary_embeddings)
I got an error which is deeper in the code than I'm familiar with.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
[<ipython-input-77-623bbebdd7d0>](https://localhost:8080/#) in <cell line: 13>()
11
12 # Create the TruncatedRotaryEmbedding module
---> 13 trunc_rotary_emb = TruncatedRotaryEmbedding(dim, a, b, rho)
14
15 # Compute the truncated rotary embeddings for the specified sequence length
1 frames
[/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in register_buffer(self, name, tensor, persistent)
536 raise KeyError("buffer name can't be empty string \"\"")
537 elif hasattr(self, name) and name not in self._buffers:
--> 538 raise KeyError(f"attribute '{name}' already exists")
539 elif tensor is not None and not isinstance(tensor, torch.Tensor):
540 raise TypeError(f"cannot assign '{torch.typename(tensor)}' object to buffer '{name}' "
KeyError: "attribute 'inv_freq' already exists"
Hi,
I tried running vision mamba, which relies on zetascale. After following the error stack, I narrowed it down to zetascale causing the problem when trying to import it.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
For zetascale to be imported and usable
Additional context
I am working on Ubuntu 22.04, running Python 3.10.13 (I have had the same error with Python 3.9 and 3.11). I have the latest PyTorch as of February 21, 2024. I am working in a Conda virtual environment where I installed vision mamba by running 'pip install vision-mamba' which ran without issues.
I would greatly appreciate any advice on how to get zetascale to work!
Thanks,
Julian
In colab, running the example from the readme:
import torch
from torch import nn
import zeta.quant as qt
...
The error is:
AttributeError Traceback (most recent call last)
[<ipython-input-8-653fc7394b85>](https://localhost:8080/#) in <cell line: 3>()
1 import torch
2 from torch import nn
----> 3 import zeta.quant as qt
4
5 class MyModel(nn.Module):
[/usr/local/lib/python3.10/dist-packages/zeta/__init__.py](https://localhost:8080/#) in <module>
8 from zeta.training import * # noqa: F403, E402
9 from zeta.tokenizers import * # noqa: F403, E402
---> 10 from zeta.rl import * # noqa: F403, E402
11 from zeta.optim import * # noqa: F403, E402
12 from zeta.ops import * # noqa: F403, E402
AttributeError: module 'zeta.rl' has no attribute 'log_prob'
Looking at the zeta.rl __init__.py, it does not include a line to import log_prob from DPO (proposed fix):
from zeta.rl.dpo import (
    freeze_all_layers,
    log_prob_from_model_and_seq,
    log_prob,
    DPO,
)
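One common way to get an `AttributeError` from a star import is an `__all__` list that names something the module never binds. Whether that is exactly what happened in `zeta.rl` is an assumption; the sketch below reproduces the mechanism with a throwaway module called `fake_rl`, which is invented for this example.

```python
import sys
import types

# Build a throwaway module whose __all__ promises a name it never defines.
fake_rl = types.ModuleType("fake_rl")
fake_rl.DPO = type("DPO", (), {})
fake_rl.__all__ = ["DPO", "log_prob"]  # "log_prob" is listed but missing
sys.modules["fake_rl"] = fake_rl

namespace = {}
try:
    exec("from fake_rl import *", namespace)
    error_message = None
except AttributeError as err:
    error_message = str(err)

print(error_message)  # the message names the missing attribute

# Binding the missing name, as the proposed import would, lets the star import succeed.
fake_rl.log_prob = lambda *args, **kwargs: None
exec("from fake_rl import *", namespace)
print("log_prob" in namespace)  # True
```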
I think you should be dividing by the scale in the following line
zeta/zeta/nn/modules/rms_norm.py
Line 35 in 7dbb6a6
This is the scale definition:
self.scale = dim**-0.5
And RMSNorm formula
Edit:
Also, I think the normalization should be in the dim -1, not -2
zeta/zeta/nn/modules/rms_norm.py
Line 34 in 7dbb6a6
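The claim about dividing by the scale can be checked numerically. The sketch below assumes the referenced line L2-normalizes the input and then multiplies by `scale`; that is an assumption, since the line itself is not quoted here. It uses plain Python lists instead of tensors and leaves out the learned gain.

```python
import math


def l2_normalize(x):
    """Divide a vector by its L2 norm (what an L2-normalize call does on one axis)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def rms_normalize(x):
    """Divide a vector by its root mean square: x / sqrt(mean(x^2))."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return [v / rms for v in x]


x = [1.0, -2.0, 3.0, 0.5]
dim = len(x)
scale = dim ** -0.5  # same definition as quoted from rms_norm.py

divided = [v / scale for v in l2_normalize(x)]     # L2-normalize, then divide by scale
multiplied = [v * scale for v in l2_normalize(x)]  # L2-normalize, then multiply by scale
reference = rms_normalize(x)

print(all(math.isclose(a, b) for a, b in zip(divided, reference)))     # True
print(all(math.isclose(a, b) for a, b in zip(multiplied, reference)))  # False
```

With `scale = dim**-0.5`, only the divided form matches `x / RMS(x)`; multiplying would be correct if the scale were defined as `dim**0.5` instead.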
Describe the bug
Tests are failing because Attend doesn't have dim as a parameter.
ERROR zeta/tests/models/test_maxvit.py::test_maxvit_constructor - TypeError: Attend.__init__() got an unexpected keyword argument 'dim'
Attend does not have dim:
class Attend(nn.Module):
    def __init__(
        self,
        *,
        dropout=0.0,
        causal=False,
        heads=None,
        talking_heads=False,
        sparse_topk=None,
        scale=None,
        qk_norm=False,
        flash=False,
        add_zero_kv=False,
        onnxable=False,
    ):
but MaxVit needs it:
class MaxVit(nn.Module):
    def __init__(
        self,
        *,
        num_classes,
        dim,
        depth,
        dim_head: int = 32,
        dim_conv_stem=None,
        window_size: int = 7,
        mbconv_expansion_rate: int = 4,
        mbconv_shrinkage_rate=0.25,
        dropout=0.01,
        channels=3,
    ):
One way to fix this would be to add dim to Attend, but that requires a bit of logic to make it an optional parameter.
Another way to fix it would be to make a multi-modal Attend, which would be called by multi-modal models, separate from the sequential Attend. (My preferred solution, but may be hard to implement.)
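The first option, an optional `dim` parameter, could look roughly like the sketch below. `AttendStub` is a stand-in written for illustration, and using `dim` to derive a default scale is a hypothetical choice; the report does not say what `dim` should do inside Attend.

```python
class AttendStub:
    """Stand-in for an attention module; only the constructor logic is sketched."""

    def __init__(self, *, dim=None, heads=None, scale=None, dropout=0.0):
        # `dim` is optional: callers that never passed it keep working unchanged.
        if scale is None and dim is not None and heads is not None:
            # Hypothetical use of `dim`: derive the usual 1/sqrt(head_dim) scale.
            scale = (dim // heads) ** -0.5
        self.dim = dim
        self.heads = heads
        self.scale = scale
        self.dropout = dropout


sequence_style = AttendStub(heads=8, dropout=0.1)  # existing call sites, no dim
vision_style = AttendStub(dim=256, heads=8)        # MaxVit-style call site with dim

print(sequence_style.scale)  # None
print(vision_style.scale)    # (256 // 8) ** -0.5
```

Keeping the parameter keyword-only with a `None` default means existing callers are unaffected, which is the "bit of logic" the first option needs.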
The example for inverted residual block in the MBConv docs
from zeta.nn import MBConv
import torch
# Create an inverted residual block with 64 input channels, 128 output channels, and downsampling
mbconv_block = MBConv(64, 128, downsample=True)
# Create an input tensor
x = torch.randn(32, 64, 32, 32) # Example input with 32 samples and 64 channels
# Apply the inverted residual block
output = mbconv_block(x)
# Output tensor
print(output)
Throws an error
---------------------------------------------------------------------------
EinopsError Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/einops/einops.py](https://localhost:8080/#) in reduce(tensor, pattern, reduction, **axes_lengths)
521 shape = backend.shape(tensor)
--> 522 recipe = _prepare_transformation_recipe(pattern, reduction, axes_names=tuple(axes_lengths), ndim=len(shape))
523 return _apply_recipe(
9 frames
EinopsError: Non-unitary anonymous axes are not supported in rearrange (exception is length 1)
During handling of the above exception, another exception occurred:
EinopsError Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/einops/einops.py](https://localhost:8080/#) in reduce(tensor, pattern, reduction, **axes_lengths)
531 message += "\n Input is list. "
532 message += "Additional info: {}.".format(axes_lengths)
--> 533 raise EinopsError(message + "\n {}".format(e))
534
535
EinopsError: Error while processing rearrange-reduction pattern "b c -> b c 11".
Input tensor shape: torch.Size([32, 512]). Additional info: {}.
Non-unitary anonymous axes are not supported in rearrange (exception is length 1)
I first tried to rewrite the example:
# Import the necessary modules
import torch
from zeta.nn import MBConv
# Define the input and output dimensions
dim_in = 32
dim_out = 64
# Define whether to downsample
downsample = True
# Define the expansion rate and shrinkage rate
expansion_rate = 6
shrinkage_rate = 0.25
# Define the dropout rate
dropout = 0.1
# Create the inverted residual block
block = MBConv(dim_in, dim_out, downsample=downsample, expansion_rate=expansion_rate, shrinkage_rate=shrinkage_rate, dropout=dropout)
# Create a random tensor of shape (batch_size, dim_in, height, width)
x = torch.randn(16, dim_in, 32, 32)
# Pass the tensor through the block
output = block(x)
print(output.shape)
but that gave the same error.
Copilot's suggested fix is to rewrite SqueezeExcitation:
from einops import rearrange, reduce
from torch import nn

class SqueezeExcitation(nn.Module):
    def __init__(self, dim, shrinkage_rate=0.25):
        super().__init__()
        hidden_dim = int(dim * shrinkage_rate)
        self.gate = nn.Sequential(
            nn.Linear(dim, hidden_dim, bias=False),
            nn.SiLU(),
            nn.Linear(hidden_dim, dim, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        y = reduce(x, "b c h w -> b c", "mean")
        y = self.gate(y)
        y = rearrange(y, "b c -> b c () ()")
        return x * y.expand_as(x)
with the explanation:
The error is due to the rearrange operation in the SqueezeExcitation class. The operation "b c -> b c 11" is trying to reshape the tensor to have an additional dimension of size 11, but this is not possible because the input tensor only has two dimensions.
The SqueezeExcitation class is trying to apply a squeeze-and-excitation operation, which typically involves reducing the spatial dimensions of the input tensor to 1x1 (i.e., global average pooling), applying a gating mechanism, and then expanding the spatial dimensions back to their original size. However, the code seems to be trying to expand the spatial dimensions to a fixed size of 11x11, which is not correct.
Here is a corrected version of the SqueezeExcitation class:
class SqueezeExcitation(nn.Module):
    def __init__(self, dim, shrinkage_rate=0.25):
        super().__init__()
        hidden_dim = int(dim * shrinkage_rate)
        self.gate = nn.Sequential(
            nn.Linear(dim, hidden_dim, bias=False),
            nn.SiLU(),
            nn.Linear(hidden_dim, dim, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        y = reduce(x, "b c h w -> b c", "mean")
        y = self.gate(y)
        y = rearrange(y, "b c -> b c () ()")
        return x * y.expand_as(x)
In this corrected version, the forward method first applies global average pooling to the input tensor x to get a tensor y of shape (b, c), then applies the gating mechanism to y, and finally expands the spatial dimensions of y back to their original size by broadcasting. The output is the element-wise product of x and the expanded y, which is a common operation in squeeze-and-excitation blocks.
Does this look reasonable to you?
Describe the bug
Spelling
---------------------------------------------------------------------------
EinopsError Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/einops/einops.py](https://localhost:8080/#) in reduce(tensor, pattern, reduction, **axes_lengths)
521 shape = backend.shape(tensor)
--> 522 recipe = _prepare_transformation_recipe(pattern, reduction, axes_names=tuple(axes_lengths), ndim=len(shape))
523 return _apply_recipe(
10 frames
EinopsError: Wrong shape: expected 4 dims. Received 3-dim tensor.
During handling of the above exception, another exception occurred:
EinopsError Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/einops/einops.py](https://localhost:8080/#) in reduce(tensor, pattern, reduction, **axes_lengths)
531 message += "\n Input is list. "
532 message += "Additional info: {}.".format(axes_lengths)
--> 533 raise EinopsError(message + "\n {}".format(e))
534
535
EinopsError: Error while processing rearrange-reduction pattern "b p n (h d) -> b h p n d".
Input tensor shape: torch.Size([5, 4, 512]). Additional info: {'h': 8}.
Wrong shape: expected 4 dims. Received 3-dim tensor.
I'm going to try to figure out what the correct input shape is.
change the file name main.py -> test_main.py
___________________________ ERROR collecting rl/test_prioritizedreplybuffer.py ___________________________
ImportError while importing test module '/home/v/vzeta/tests/rl/test_prioritizedreplybuffer.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
rl/test_prioritizedreplybuffer.py:4: in <module>
from zeta.rl.priortized_replay_buffer import (
../../.local/lib/python3.10/site-packages/zeta/rl/priortized_replay_buffer.py:1: in <module>
from sumtree import SumTree
E ModuleNotFoundError: No module named 'sumtree'
Looks like a duplicate test in the nn/modules directory. Deleted in incoming pr.
_____________________________ ERROR collecting quant/test_bitlinear.py _____________________________
import file mismatch:
imported module 'test_bitlinear' has this __file__ attribute:
/workspaces/zeta/tests/nn/modules/test_bitlinear.py
which is not the same as the test file we want to collect:
/workspaces/zeta/tests/quant/test_bitlinear.py
HINT: remove __pycache__ / .pyc files and/or use a unique basename for your test file modules
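Basename collisions like this one can be found up front by scanning the tests directory. The sketch below is a small standalone helper; the function name `duplicate_test_basenames` is hypothetical, and it recreates the two conflicting paths from the error in a temporary directory rather than touching a real checkout.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def duplicate_test_basenames(root):
    """Map each test file basename that appears more than once to its paths."""
    seen = defaultdict(list)
    for path in sorted(Path(root).rglob("test_*.py")):
        seen[path.name].append(path.relative_to(root).as_posix())
    return {name: paths for name, paths in seen.items() if len(paths) > 1}


# Recreate the layout from the error message in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("nn/modules/test_bitlinear.py", "quant/test_bitlinear.py", "quant/test_other.py"):
        target = Path(tmp) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("")
    duplicates = duplicate_test_basenames(tmp)

print(duplicates)
# {'test_bitlinear.py': ['nn/modules/test_bitlinear.py', 'quant/test_bitlinear.py']}
```

Run against a real `tests/` directory, any non-empty result points at files that need renaming or deleting, as pytest's hint suggests.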
Hi @kyegomez, you closed kyegomez/swarms#62 about you spamming people as "not planned". Can you confirm that you do not intend to take any action?
Please take a look at:
Please stop spamming me, and others. Also see kyegomez/swarms#40
From container, with pip install zetascale.
Looking at the error, it appears the apex package installed from pip is not NVIDIA's apex [1].
The fix is to install:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir \
--global-option="--cpp_ext" --global-option="--cuda_ext" ./
following [2]
______________________________ ERROR collecting test_init.py _______________________________
ImportError while importing test module '/usr/src/zeta/tests/test_init.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
test_init.py:1: in <module>
import zeta
/usr/local/lib/python3.10/site-packages/zeta/__init__.py:5: in <module>
from zeta.nn import * # noqa: F403, E402
/usr/local/lib/python3.10/site-packages/zeta/nn/__init__.py:1: in <module>
from zeta.nn.attention import *
/usr/local/lib/python3.10/site-packages/zeta/nn/attention/__init__.py:14: in <module>
from zeta.nn.attention.mixture_attention import (
/usr/local/lib/python3.10/site-packages/zeta/nn/attention/mixture_attention.py:8: in <module>
from zeta.models.vit import exists
/usr/local/lib/python3.10/site-packages/zeta/models/__init__.py:3: in <module>
from zeta.models.andromeda import Andromeda
/usr/local/lib/python3.10/site-packages/zeta/models/andromeda.py:4: in <module>
from zeta.structs.auto_regressive_wrapper import AutoregressiveWrapper
/usr/local/lib/python3.10/site-packages/zeta/structs/__init__.py:8: in <module>
from zeta.structs.local_transformer import LocalTransformer
/usr/local/lib/python3.10/site-packages/zeta/structs/local_transformer.py:8: in <module>
from zeta.nn.modules import feedforward_network
/usr/local/lib/python3.10/site-packages/zeta/nn/modules/__init__.py:12: in <module>
from zeta.nn.modules.feedforward_network import FeedForwardNetwork
/usr/local/lib/python3.10/site-packages/zeta/nn/modules/feedforward_network.py:9: in <module>
from apex.normalization import FusedLayerNorm as LayerNorm
/usr/local/lib/python3.10/site-packages/apex/__init__.py:13: in <module>
from pyramid.session import UnencryptedCookieSessionFactoryConfig
E ImportError: cannot import name 'UnencryptedCookieSessionFactoryConfig' from 'pyramid.session' (unknown location)
[1] https://stackoverflow.com/questions/66610378/unencryptedcookiesessionfactoryconfig-error-when-importing-apex
[2] https://stackoverflow.com/a/67188946
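Besides installing the right package, a library can guard an optional import so that a missing or wrong `apex` does not break `import zeta` entirely. The sketch below shows that fallback pattern with a hypothetical helper, `first_importable`; it is demonstrated on standard-library modules and is not how zeta currently handles the import.

```python
import importlib


def first_importable(candidates):
    """Return the first `module.attr` from `candidates` that imports cleanly.

    `candidates` is a list of (module_name, attribute_name) pairs.
    """
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError):
            # Missing package, wrong package with the same name, or missing attribute.
            continue
    raise ImportError(f"none of {candidates!r} could be imported")


# Demonstration with the standard library: the first candidate does not exist,
# so the helper falls back to the second one.
loads = first_importable([("no_such_module_xyz", "loads"), ("json", "loads")])
print(loads("[1, 2]"))  # [1, 2]
```

For the case above, the candidates might be `("apex.normalization", "FusedLayerNorm")` followed by `("torch.nn", "LayerNorm")`; that pairing is a suggestion, not something the source confirms.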
ImportError Traceback (most recent call last)
[<ipython-input-11-0b8f0cee3df3>](https://localhost:8080/#) in <cell line: 2>()
1 import torch
----> 2 from zeta.nn import FusedDenseGELUDense
3
4 x = torch.randn(1, 512)
5 model = FusedDenseGELUDense(512, 1024)
ImportError: cannot import name 'FusedDenseGELUDense' from 'zeta.nn' (/content/zeta/zeta/nn/__init__.py)
The docs for PositionalEmbedding appear to be the top level docs for all of zeta, not the specific PositionalEmbedding docs.
https://zeta.apac.ai/en/latest/zeta/nn/embeddings/positional_embeddings/
This could be merely a formatting issue. They don't resemble:
https://zeta.apac.ai/en/latest/zeta/nn/embeddings/truncated_rope/
A standard docs template more like:
https://zeta.apac.ai/en/latest/zeta/nn/biases/relative_bias/
Would be helpful.
This, I believe, is a naming mismatch.
simple_vision_encoder has VisionEncoder, not SimpleVisionEncoder.
__init__.py has VisionEncoder.
test_simple_vision_encoder has SimpleVisionEncoder.
My suggested fix is to change simple_vision_encoder and __init__.py to use SimpleVisionEncoder.
_________ ERROR collecting tests/structs/test_simple_vision_encoder.py _________
ImportError while importing test module '/usr/src/zeta/tests/structs/test_simple_vision_encoder.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/structs/test_simple_vision_encoder.py:2: in <module>
from zeta.structs.simple_vision_encoder import SimpleVisionEncoder
E ModuleNotFoundError: No module named 'zeta.structs.simple_vision_encoder'
Describe the bug
This is the default template.
I think the traceback is occurring at:
# Test with mock and monkeypatch
def test_cast_tuple_with_mock_and_monkeypatch(monkeypatch):
    def mock_isinstance(val, t):
        return False

    monkeypatch.setattr("builtins.isinstance", mock_isinstance)
    assert cast_tuple((1, 2), 1) == ((1, 2),)
Traceback:
tests/utils/test_cast_tuple.py .....Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 271, in wrap_session
session.exitstatus = doit(config, session) or 0
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 325, in _main
config.hook.pytest_runtestloop(session=session)
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 493, in __call__
return self._hookexec(self.name, self._hookimpls, kwargs, firstresult)
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 115, in _hookexec
return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 152, in _multicall
return outcome.get_result()
File "/usr/local/lib/python3.10/site-packages/pluggy/_result.py", line 114, in get_result
raise exc.with_traceback(exc.__traceback__)
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 137, in _multicall
teardown.throw(outcome._exception)
AttributeError: 'tuple' object has no attribute 'throw'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 291, in wrap_session
config.notify_exception(excinfo, config.option)
File "/usr/local/lib/python3.10/site-packages/_pytest/config/__init__.py", line 1105, in notify_exception
excrepr = excinfo.getrepr(
File "/usr/local/lib/python3.10/site-packages/_pytest/_code/code.py", line 686, in getrepr
self.traceback[0]._rawentry if self.traceback else None,
File "/usr/local/lib/python3.10/site-packages/_pytest/_code/code.py", line 583, in traceback
self._traceback = Traceback(self.tb)
File "/usr/local/lib/python3.10/site-packages/_pytest/_code/code.py", line 343, in __init__
super().__init__(tb)
TypeError: 'traceback' object is not iterable
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/pytest", line 8, in <module>
sys.exit(console_main())
File "/usr/local/lib/python3.10/site-packages/_pytest/config/__init__.py", line 192, in console_main
code = main()
File "/usr/local/lib/python3.10/site-packages/_pytest/config/__init__.py", line 169, in main
ret: Union[ExitCode, int] = config.hook.pytest_cmdline_main(
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 493, in __call__
return self._hookexec(self.name, self._hookimpls, kwargs, firstresult)
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 115, in _hookexec
return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 113, in _multicall
raise exception.with_traceback(exception.__traceback__)
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 77, in _multicall
res = hook_impl.function(*args)
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 318, in pytest_cmdline_main
return wrap_session(config, _main)
File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 306, in wrap_session
config.hook.pytest_sessionfinish(
File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 493, in __call__
return self._hookexec(self.name, self._hookimpls, kwargs, firstresult)
File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 115, in _hookexec
return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 152, in _multicall
return outcome.get_result()
File "/usr/local/lib/python3.10/site-packages/pluggy/_result.py", line 114, in get_result
raise exc.with_traceback(exc.__traceback__)
File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 137, in _multicall
teardown.throw(outcome._exception)
AttributeError: 'tuple' object has no attribute 'throw'
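Replacing `builtins.isinstance` affects every caller in the process while the patch is active, including pytest and pluggy internals, which is a plausible (unconfirmed) cause of the crash above. One alternative is to make the type check injectable so the test never touches a builtin. The `cast_tuple` below is a stand-in written for this sketch; its behavior and the `is_tuple` parameter are assumptions, not zeta's implementation.

```python
def cast_tuple(val, depth, is_tuple=lambda v: isinstance(v, tuple)):
    """Return `val` if it is a tuple, otherwise repeat it `depth` times in a tuple.

    `is_tuple` is a hypothetical injection point so tests can override the check
    without touching `builtins.isinstance`.
    """
    return val if is_tuple(val) else (val,) * depth


def test_cast_tuple_with_injected_check():
    # Force the "not a tuple" branch without patching any builtin.
    assert cast_tuple((1, 2), 1, is_tuple=lambda v: False) == ((1, 2),)


def test_cast_tuple_default_check():
    assert cast_tuple((1, 2), 3) == (1, 2)
    assert cast_tuple(5, 3) == (5, 5, 5)


test_cast_tuple_with_injected_check()
test_cast_tuple_default_check()
print("ok")
```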
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
[<ipython-input-26-508a53a4cb02>](https://localhost:8080/#) in <cell line: 26>()
24
25 # Compute loss
---> 26 loss = dpo_model(preferred_seq, unpreferred_seq)
27 print(loss)
9 frames
[/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py](https://localhost:8080/#) in forward(self, input)
112
113 def forward(self, input: Tensor) -> Tensor:
--> 114 return F.linear(input, self.weight, self.bias)
115
116 def extra_repr(self) -> str:
RuntimeError: mat1 and mat2 must have the same dtype, but got Long and Float
Describe the bug
ERROR tests/structs/test_transformer.py::test_creation - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x0-expected_output_size0] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x1-expected_output_size1] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x2-expected_output_size2] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[wrong_input0] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[wrong_input1] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[string] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
AttentionLayers has a depth parameter, but it has no default value:
class AttentionLayers(nn.Module):
    def __init__(
        self,
        dim,
        depth,
        heads=8,
        causal=False,
        cross_attend=False,
        only_cross=False,
        use_scalenorm=False,
        use_rmsnorm=False,
        use_simple_rmsnorm=False,
        alibi_pos_bias=False,
        alibi_num_heads=None,
        rel_pos_bias=False,
        rel_pos_num_buckets=32,
        rel_pos_max_distance=128,
        dynamic_pos_bias=False,
        dynamic_pos_bias_log_distance=False,
        dynamic_pos_bias_mlp_depth=2,
        dynamic_pos_bias_norm=False,
        rotary_pos_emb=False,
        rotary_emb_dim=None,
        rotary_xpos=False,
        rotary_interpolation_factor=1.0,
        rotary_xpos_scale_base=512,
        rotary_base_rescale_factor=1.0,
        custom_layers=None,
        sandwich_coef=None,
        par_ratio=None,
        residual_attn=False,
        cross_residual_attn=False,
        macaron=False,
        pre_norm=True,
        pre_norm_has_final_norm=True,
        gate_residual=False,
        scale_residual=False,
        scale_residual_constant=1.0,
        deepnorm=False,
        shift_tokens=0,
        sandwich_norm=False,
        resi_dual=False,
        resi_dual_scale=1.0,
        zero_init_branch_output=False,
        layer_dropout=0.0,
        cross_attn_tokens_dropout=0.0,
        **kwargs,
    ):
So, when it's called with only a single value (as in test_transformer), there's an error:
@pytest.fixture()
def init_transformer():
    attn_layers = AttentionLayers(
        256
    )
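To see which arguments a constructor requires, the signature can be inspected directly. The sketch below lists parameters without defaults; `AttentionLayersStub` copies only the first few parameters quoted above, and both it and the helper `required_parameters` are hypothetical stand-ins, not zeta code.

```python
import inspect


def required_parameters(callable_obj):
    """List parameter names that have no default and must be supplied by the caller."""
    required = []
    for name, param in inspect.signature(callable_obj).parameters.items():
        if param.kind in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD):
            continue  # *args / **kwargs never need to be supplied
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return required


class AttentionLayersStub:
    """Stand-in that copies only the first few parameters quoted above."""

    def __init__(self, dim, depth, heads=8, causal=False, **kwargs):
        self.dim, self.depth, self.heads, self.causal = dim, depth, heads, causal


print(required_parameters(AttentionLayersStub))  # ['dim', 'depth']

# Supplying both required arguments avoids the TypeError; depth=6 is an arbitrary value.
layers = AttentionLayersStub(256, depth=6)
print(layers.depth)  # 6
```

The test fixture would need the same change: pass a `depth` value alongside `256`, or the library would need to give `depth` a default.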
_____________________________ ERROR collecting main.py _____________________________
ImportError while importing test module '/home/v/vzeta/tests/cloud/main.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
main.py:3: in <module>
from zeta.cloud.main import zetacloud
E ModuleNotFoundError: No module named 'zeta.cloud'
I have looked, and this module exists and is in __init__.py.
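When a module exists in the source tree but Python reports it missing, one possibility (an assumption here, not confirmed by the report) is that `zeta` resolves to a different installed copy than the checkout being inspected. The sketch below prints where a module name resolves; `module_origin` is a hypothetical helper, shown on standard-library names so it runs anywhere.

```python
import importlib.util


def module_origin(name):
    """Return the file a module name resolves to, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. the top-level package) is missing.
        return None
    return spec.origin if spec is not None else None


# Demonstrated on the standard library; with zeta installed one could compare
# module_origin("zeta") against the path of the source checkout.
print(module_origin("json") is not None)        # True
print(module_origin("json.no_such_submodule"))  # None
```

If `module_origin("zeta")` points into `site-packages` while the tests live in a separate clone, the installed release may simply predate the `zeta.cloud` module.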
_____________ ERROR collecting tests/quant/test_half_bit_linear.py _____________
ImportError while importing test module '/usr/src/zeta/tests/quant/test_half_bit_linear.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/quant/test_half_bit_linear.py:3: in <module>
from zeta.quant.half_bit_linear import HalfBitLinear
E ModuleNotFoundError: No module named 'zeta.quant.half_bit_linear'
from zeta.nn import MambaBlock
Traceback (most recent call last):
File "/Users/teli/www/ml/SSM/zeta_mamba.py", line 2, in <module>
from zeta.nn import MambaBlock
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.FloatTensor instead (while checking arguments for embedding)
To Reproduce
use the Readme example
Expected behavior
should run simple mamba block
The latest zeta version uses torch 2.2.
In the docs for PositionalEmbedding (https://zeta.apac.ai/en/latest/zeta/nn/embeddings/positional_embeddings/)
In the example:
from zeta.nn import PositionalEmbedding
import torch
# Create a PositionalEmbedding instance
positional_embedding = PositionalEmbedding(num_embeddings=100, embedding_dim=128)
# Generate positional embeddings for a sequence of length 10
positions = torch.arange(10)
embeddings = positional_embedding(positions)
on colab, running it has an error:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
[<ipython-input-79-85c74b797d27>](https://localhost:8080/#) in <cell line: 10>()
8 # Generate positional embeddings for a sequence of length 10
9 positions = torch.arange(10)
---> 10 embeddings = positional_embedding(positions)
2 frames
[/usr/local/lib/python3.10/dist-packages/zeta/nn/embeddings/positional.py](https://localhost:8080/#) in forward(self, x, positions, **kwargs)
26 # being consistent with Fairseq, which starts from 2.
27 positions = (
---> 28 torch.arange(2, x.size(1) + 2, device=x.device)
29 .long()
30 .unsqueeze(0)
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
__________________________________________ ERROR collecting test_test_example.py __________________________________________
ImportError while importing test module '/home/v/vzeta/tests/test_test_example.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
test_test_example.py:5: in <module>
from zeta import MultiheadAttention
E ImportError: cannot import name 'MultiheadAttention' from 'zeta' (/home/v/.local/lib/python3.10/site-packages/zeta/__init__.py)
From colab notebook, the example from the docs:
from zeta import rotate_half
import torch
# Create an input tensor
x = torch.randn(2, 3, 4)
# Rotate the input tensor
rotated_x = rotate_half(x)
fails with an import error
ImportError Traceback (most recent call last)
[<ipython-input-38-339cf597a136>](https://localhost:8080/#) in <cell line: 1>()
----> 1 from zeta import rotate_half
2 import torch
3
4 # Create an input tensor
5 x = torch.randn(2, 3, 4)
ImportError: cannot import name 'rotate_half' from 'zeta' (/usr/local/lib/python3.10/dist-packages/zeta/__init__.py)
rotate_half is implemented in several places, and I don't know which one it isn't finding.
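For reference, the convention used by many rotary-embedding implementations splits the last dimension into two halves and returns `[-x2, x1]`. The sketch below shows that convention on a flat Python list; it is an illustration only, and zeta's own implementations may differ (for example by rotating interleaved pairs instead of halves).

```python
def rotate_half(x):
    """Rotate the two halves of the last dimension: [x1, x2] -> [-x2, x1].

    Works on a flat list here; the common tensor version applies the same
    operation along the last axis.
    """
    if len(x) % 2 != 0:
        raise ValueError("the last dimension must have an even length")
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    return [-v for v in x2] + list(x1)


print(rotate_half([1.0, 2.0, 3.0, 4.0]))  # [-3.0, -4.0, 1.0, 2.0]
```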
______________________ ERROR collecting tests/test_mha.py ______________________
ImportError while importing test module '/home/v/zeta/tests/test_mha.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/test_mha.py:1: in <module>
from zeta.utils.attention.multihead_attention import MultiheadAttention
E ModuleNotFoundError: No module named 'zeta.utils.attention'
import file mismatch:
imported module 'test_bitlinear' has this file attribute:
/home/v/vzeta/tests/nn/modules/test_bitlinear.py
which is not the same as the test file we want to collect:
/home/v/vzeta/tests/quant/test_bitlinear.py
HINT: remove __pycache__ / .pyc files and/or use a unique basename for your test file modules
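The mismatch above comes from two test files sharing the basename test_bitlinear.py. A small sketch for listing every such collision under a test directory follows; `duplicate_test_basenames` is a hypothetical helper name, and the `test_*.py` pattern is an assumption about how the test files are named.

```python
from collections import defaultdict
from pathlib import Path


def duplicate_test_basenames(root):
    """Map each test file basename that occurs more than once to its paths."""
    by_name = defaultdict(list)
    for path in Path(root).rglob("test_*.py"):
        by_name[path.name].append(path)
    return {
        name: sorted(paths)
        for name, paths in by_name.items()
        if len(paths) > 1
    }
```

Running `duplicate_test_basenames("tests")` from the repository root should report test_bitlinear.py with both of its locations. Renaming one of the files, or adding `__init__.py` files to the test directories so pytest treats them as packages, are the usual ways to resolve the collision.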
Checking GCP...
Enabling Compute Engine API (free of charge; this may take a minute)...
Failed. Detailed output:
ERROR: (gcloud.services.enable) FAILED_PRECONDITION: Billing account for project '522837608576' is not found. Billing must be enabled for activation of service(s) 'compute.googleapis.com,compute.googleapis.com,compute.googleapis.com' to proceed.
Help Token: ARD_zUbFpP-0qySPromvgtJNdw33QM1HJpmBG-BiRaVI9eOZ4mMJ9-MqOcvVj8OnLeGUqI3WlHsRbqPAtxXrnnY8VB6U3Xpas5zyB6_mbqgH3oXn
- '@type': type.googleapis.com/google.rpc.PreconditionFailure
violations:
- subject: ?error_code=390001&project=522837608576&services=compute.googleapis.com&services=compute.googleapis.com&services=compute.googleapis.com
type: googleapis.com/billing-enabled
- '@type': type.googleapis.com/google.rpc.ErrorInfo
domain: serviceusage.googleapis.com/billing-enabled
metadata:
project: '522837608576'
services: compute.googleapis.com,compute.googleapis.com,compute.googleapis.com
reason: UREQ_PROJECT_BILLING_NOT_FOUND
From colab, the 2nd SinusoidalEmbeddings example
from zeta import apply_rotary_pos_emb
import torch
# Create query and key tensors
q = torch.randn(2, 3, 4)
k = torch.randn(2, 3, 4)
# Generate frequency and scale embeddings using SinusoidalEmbeddings
# Apply rotary positional embeddings
q_emb, k_emb = apply_rotary_pos_emb(q, k, freqs, scale)
Has the error
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
[<ipython-input-39-a72a64e54f5b>](https://localhost:8080/#) in <cell line: 11>()
9
10 # Apply rotary positional embeddings
---> 11 q_emb, k_emb = apply_rotary_pos_emb(q, k, freqs, scale)
[/usr/local/lib/python3.10/dist-packages/zeta/nn/embeddings/xpos_relative_position.py](https://localhost:8080/#) in apply_rotary_pos_emb(x, sin, cos, scale)
69 """
70 sin, cos = map(lambda t: duplicate_interleave(t * scale), (sin, cos))
---> 71 return (x * cos) + (rotate_every_two(x) * sin)
72
73
RuntimeError: The size of tensor a (4) must match the size of tensor b (128) at non-singleton dimension 2
This is another instance of the problem reported in #106 which I'm still testing the fix for.
Running the first XPOS example from the docs produces an import error:
ModuleNotFoundError Traceback (most recent call last)
[<ipython-input-14-f2a7ff6796e7>](https://localhost:8080/#) in <cell line: 2>()
1 import torch
----> 2 from xpos import XPOS
3
4 # Create an instance of the XPOS module
5 xpos = XPOS(head_dim=256)
ModuleNotFoundError: No module named 'xpos'
I checked the __init__.py in nn.embeddings, and it looks OK. Using the full module path in the example resolves the issue:
import torch
from zeta.nn.embeddings.xpos_relative_position import XPOS
# Create an instance of the XPOS module
xpos = XPOS(head_dim=256)
# Generate a random input tensor
x = torch.randn(1, 10, 256)
# Apply the XPOS module to the input tensor
output = xpos(x)
My zetascale version is:
zetascale 1.2.4
When I run the code in
https://github.com/kyegomez/Fuyu/tree/main/fuyu
The error is:
Traceback (most recent call last):
File "/mnt/sda/shaoyanli/fuyu/Fuyu/example.py", line 2, in <module>
from fuyu.model import Fuyu
File "/mnt/sda/shaoyanli/fuyu/Fuyu/fuyu/__init__.py", line 1, in <module>
from fuyu.model import Fuyu
File "/mnt/sda/shaoyanli/fuyu/Fuyu/fuyu/model.py", line 3, in <module>
from zeta.structs import AutoregressiveWrapper, Decoder, Transformer
File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/__init__.py", line 5, in <module>
from zeta.nn import * # noqa: F403, E402
File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/nn/__init__.py", line 1, in <module>
from zeta.nn.attention import *
File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/nn/attention/__init__.py", line 14, in <module>
from zeta.nn.attention.mixture_attention import (
File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/nn/attention/mixture_attention.py", line 8, in <module>
from zeta.models.vit import exists
File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/models/__init__.py", line 3, in <module>
from zeta.models.andromeda import Andromeda
File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/models/andromeda.py", line 4, in <module>
from zeta.structs.auto_regressive_wrapper import AutoregressiveWrapper
File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/structs/__init__.py", line 4, in <module>
from zeta.structs.hierarchical_transformer import (
File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/structs/hierarchical_transformer.py", line 13, in <module>
from zeta.structs.attn_layers import rotate_half
ModuleNotFoundError: No module named 'zeta.structs.attn_layers'
So I went to the installed zeta package path, but I can't find the file zeta/structs/attn_layers.
This file contains only '_'
It was added this way initially.
______________________________________________________ test_imports _______________________________________________________
def test_imports():
modules = [
"nn",
"structs",
"models",
"utils",
"training",
"tokenizers",
"rl",
"optim",
"ops",
"quant",
]
missing_modules = []
for module in modules:
if not hasattr(zeta, module):
missing_modules.append(module)
> assert (
not missing_modules
), f"Modules {', '.join(missing_modules)} not found in zeta package"
E AssertionError: Modules structs, quant not found in zeta package
E assert not ['structs', 'quant']
test_init.py:22: AssertionError
I looked, and the code does contain these module names, and they are in the __init__.py.
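`hasattr(package, name)` only succeeds when something has already imported that submodule and bound it on the package object, so a submodule can exist on disk and still fail this check. Whether that is what happens here is not confirmed. As a minimal sketch, the test could try the import directly instead; `missing_submodules` is a hypothetical helper, and `json` stands in for `zeta` so the snippet runs anywhere.

```python
import importlib


def missing_submodules(package_name, names):
    """Return the names that cannot be imported as package_name.<name>."""
    missing = []
    for name in names:
        try:
            importlib.import_module(f"{package_name}.{name}")
        except ImportError:
            # Covers ModuleNotFoundError and import errors raised inside the submodule.
            missing.append(name)
    return missing


# Stand-in demonstration on the standard library.
print(missing_submodules("json", ["decoder", "no_such_submodule"]))
```

With `zeta`, a non-empty result from `missing_submodules("zeta", ["structs", "quant"])` would point at a genuine import failure inside those subpackages rather than a missing attribute on the top-level package.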
__________ ERROR collecting tests/nn/modules/test_adaptive_rmsnorm.py __________
ImportError while importing test module '/usr/src/zeta/tests/nn/modules/test_adaptive_rmsnorm.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
tests/nn/modules/test_adaptive_rmsnorm.py:3: in <module>
from zeta.nn.modules.adaptive_rmsnorm import AdaptiveRMSNorm
E ModuleNotFoundError: No module named 'zeta.nn.modules.adaptive_rmsnorm'
_______________________ ERROR collecting rl/test_prioritizedsequencereplybuffer.py _______________________
ImportError while importing test module '/home/v/vzeta/tests/rl/test_prioritizedsequencereplybuffer.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
rl/test_prioritizedsequencereplybuffer.py:4: in <module>
from zeta.rl.priortized_rps import (
../../.local/lib/python3.10/site-packages/zeta/rl/priortized_rps.py:1: in <module>
from sumtree import SumTree
E ModuleNotFoundError: No module named 'sumtree'
The official package has the following problem; I noticed it when testing Lumiere.
from zeta.nn.attention.multiquery_attention import MultiQueryAttention
from zeta.nn.modules import SimpleFeedForward
2e50e6f
from zeta.nn.attention.cross_attention import CrossAtte
Hi, I have noticed that some of the code is similar to x-transformers. Are you the original author? If not, could you give credit to lucidrains?
__________________________ ERROR collecting nn/modules/test_linearactivation.py __________________________
../../.local/lib/python3.10/site-packages/_pytest/runner.py:341: in from_call
result: Optional[TResult] = func()
../../.local/lib/python3.10/site-packages/_pytest/runner.py:372: in <lambda>
call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
../../.local/lib/python3.10/site-packages/_pytest/python.py:531: in collect
self._inject_setup_module_fixture()
../../.local/lib/python3.10/site-packages/_pytest/python.py:545: in _inject_setup_module_fixture
self.obj, ("setUpModule", "setup_module")
../../.local/lib/python3.10/site-packages/_pytest/python.py:310: in obj
self._obj = obj = self._getobj()
../../.local/lib/python3.10/site-packages/_pytest/python.py:528: in _getobj
return self._importtestmodule()
../../.local/lib/python3.10/site-packages/_pytest/python.py:617: in _importtestmodule
mod = import_path(self.path, mode=importmode, root=self.config.rootpath)
../../.local/lib/python3.10/site-packages/_pytest/pathlib.py:567: in import_path
importlib.import_module(module_name)
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1050: in _gcd_import
???
<frozen importlib._bootstrap>:1027: in _find_and_load
???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:688: in _load_unlocked
???
../../.local/lib/python3.10/site-packages/_pytest/assertion/rewrite.py:178: in exec_module
exec(co, module.__dict__)
nn/modules/test_linearactivation.py:21: in <module>
@pytest.mark.parametrize("input_tensor", [(torch.tensor([1, 2, "a"]))])
E TypeError: new(): invalid data type 'str'
There's a dimensional error in the 3rd example of Xpos:
Code
import torch
from zeta import fixed_pos_embedding, apply_rotary_pos_emb
# Generate fixed positional embeddings
scale = torch.randn(10, 256)
sin, cos = fixed_pos_embedding(scale)
# Apply rotary positional embeddings to an input tensor
x = torch.randn(1, 10, 256)
output = apply_rotary_pos_emb(x, sin, cos, scale=0.5)
RuntimeError Traceback (most recent call last)
[<ipython-input-18-4d63cb090aa8>](https://localhost:8080/#) in <cell line: 10>()
8 # Apply rotary positional embeddings to an input tensor
9 x = torch.randn(1, 10, 256)
---> 10 output = apply_rotary_pos_emb(x, sin, cos, scale=0.5)
[/usr/local/lib/python3.10/dist-packages/zeta/nn/embeddings/xpos_relative_position.py](https://localhost:8080/#) in apply_rotary_pos_emb(x, sin, cos, scale)
69 """
70 sin, cos = map(lambda t: duplicate_interleave(t * scale), (sin, cos))
---> 71 return (x * cos) + (rotate_every_two(x) * sin)
72
73
RuntimeError: The size of tensor a (256) must match the size of tensor b (512) at non-singleton dimension 2
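The 256 vs 512 mismatch is consistent with `duplicate_interleave` doubling the last dimension of `sin` and `cos` before they are multiplied with `x`. The plain-Python sketch below only illustrates that shape arithmetic; it is an assumption based on the function name and the traceback, not the library's code, and it uses lists instead of tensors.

```python
def duplicate_interleave(row):
    """Repeat every element twice in order: [a, b] -> [a, a, b, b]."""
    return [value for value in row for _ in range(2)]


head_dim = 256

# The docs example effectively passes sin/cos rows of width head_dim.
full_width = [0.0] * head_dim
print(len(duplicate_interleave(full_width)))   # 512, no longer matches x

# Rows of width head_dim // 2 come back out at head_dim.
half_width = [0.0] * (head_dim // 2)
print(len(duplicate_interleave(half_width)))   # 256
```

If that reading is right, the example would need `sin` and `cos` with `head_dim // 2` columns for the final multiplication to line up with an input of width 256.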
The implementation in the original paper is wrong. This wrong implementation was copied into hf/transformers, and then fixed:
huggingface/transformers@052fa2f
https://github.com/huggingface/transformers/blob/edb170238febf7fc3e3278ed5b9ca0b2c40c70e3/src/transformers/models/gptj/modeling_flax_gptj.py#L122
zeta/nn/modules/Matrix.py installs TensorFlow from inside the code (likely a bad idea), as does zeta/utils/disable_logging.py.
(Matrix.py should be renamed.)
All the tests which depend on both of these fail, because the in-code import is not there.
ERROR tests/test_init.py
ERROR tests/cloud/test_main.py
ERROR tests/models/test_andromeda.py
ERROR tests/models/test_basemodel.py
ERROR tests/models/test_gpt4.py
ERROR tests/models/test_gpt4multimodal.py
ERROR tests/models/test_llama2.py
ERROR tests/models/test_maxvit.py
ERROR tests/models/test_megavit.py
ERROR tests/models/test_navit.py
ERROR tests/models/test_palme.py
ERROR tests/models/test_vit.py
ERROR tests/nn/attentions/test_attend.py
ERROR tests/nn/attentions/test_cross_attn.py
ERROR tests/nn/attentions/test_cross_attn_multimodal.py
ERROR tests/nn/attentions/test_local_attn_mha.py
ERROR tests/nn/attentions/test_mha.py
ERROR tests/nn/attentions/test_mhaa.py
ERROR tests/nn/attentions/test_mqa.py
ERROR tests/nn/attentions/test_shaped_attn.py
ERROR tests/nn/attentions/test_sparq_attn.py
ERROR tests/nn/attentions/test_sparse_attn.py
ERROR tests/nn/attentions/test_test_mha.py
ERROR tests/nn/attentions/test_xc_attention.py
ERROR tests/nn/biases/test_alibi.py
ERROR tests/nn/biases/test_dynamic_relative.py
ERROR tests/nn/biases/test_relative_position_bias.py
ERROR tests/nn/embeddings/test_QFTSPEmbeddings.py
ERROR tests/nn/embeddings/test_abc_pos_emb.py
ERROR tests/nn/embeddings/test_patch_embedding.py
ERROR tests/nn/embeddings/test_qftp_embeddings.py
ERROR tests/nn/embeddings/test_rope.py
ERROR tests/nn/embeddings/test_rotary.py
ERROR tests/nn/embeddings/test_sine_positional_embs.py
ERROR tests/nn/embeddings/test_truncated_rotary_emb.py
ERROR tests/nn/embeddings/test_vision_embeddings.py
ERROR tests/nn/embeddings/test_vision_lang_embeddings.py
ERROR tests/nn/embeddings/test_xpos.py
ERROR tests/nn/embeddings/test_yarn.py
ERROR tests/nn/modules/test_accurategeluactivation.py
ERROR tests/nn/modules/test_activations.py
ERROR tests/nn/modules/test_adaptive_param.py
ERROR tests/nn/modules/test_adative_layernorm.py
ERROR tests/nn/modules/test_alr_block.py
ERROR tests/nn/modules/test_avg_model_merger.py
ERROR tests/nn/modules/test_clippedgeluactivation.py
ERROR tests/nn/modules/test_cross_attn_images.py
ERROR tests/nn/modules/test_custom_mlp.py
ERROR tests/nn/modules/test_denseblock.py
ERROR tests/nn/modules/test_dualpathblock.py
ERROR tests/nn/modules/test_dynamic_module.py
ERROR tests/nn/modules/test_dynamicroutingblock.py
ERROR tests/nn/modules/test_expert.py
ERROR tests/nn/modules/test_feedbackblock.py
ERROR tests/nn/modules/test_full_feedforward.py
ERROR tests/nn/modules/test_fused_dropout_layernom.py
ERROR tests/nn/modules/test_fused_gelu_dense.py
ERROR tests/nn/modules/test_gatedresidualblock.py
ERROR tests/nn/modules/test_geluactivation.py
ERROR tests/nn/modules/test_hebbian.py
ERROR tests/nn/modules/test_highwaylayer.py
ERROR tests/nn/modules/test_image_projector.py
ERROR tests/nn/modules/test_img_patch_embed.py
ERROR tests/nn/modules/test_kv_cache.py
ERROR tests/nn/modules/test_laplaceactivation.py
ERROR tests/nn/modules/test_linearactivation.py
ERROR tests/nn/modules/test_log_ff.py
ERROR tests/nn/modules/test_mishactivation.py
ERROR tests/nn/modules/test_mlp.py
ERROR tests/nn/modules/test_mm_adapter.py
ERROR tests/nn/modules/test_newgeluactivation.py
ERROR tests/nn/modules/test_polymorphic_neuron.py
ERROR tests/nn/modules/test_pytorchgelutanh.py
ERROR tests/nn/modules/test_quantized_layernorm.py
ERROR tests/nn/modules/test_quickgeluactivation.py
ERROR tests/nn/modules/test_recursiveblock.py
ERROR tests/nn/modules/test_relusquaredactivation.py
ERROR tests/nn/modules/test_resnet.py
ERROR tests/nn/modules/test_simple_feedforward.py
ERROR tests/nn/modules/test_simple_mamba.py
ERROR tests/nn/modules/test_simple_res_block.py
ERROR tests/nn/modules/test_slerp_model_merger.py
ERROR tests/nn/modules/test_stochasticskipblock.py
ERROR tests/nn/modules/test_test_conv_lang.py
ERROR tests/nn/modules/test_test_h3_layer.py
ERROR tests/nn/modules/test_test_s4.py
ERROR tests/nn/modules/test_token_learner.py
ERROR tests/nn/modules/test_transformations.py
ERROR tests/nn/modules/test_tripleskipblock.py
ERROR tests/nn/modules/test_unet.py
ERROR tests/nn/modules/test_visual_expert.py
ERROR tests/ops/test_einops_from_to.py
ERROR tests/ops/test_einops_poly.py
ERROR tests/ops/test_mos.py
ERROR tests/optim/test_decoupled_lion.py
ERROR tests/optim/test_gradient_ascent.py
ERROR tests/optim/test_gradient_equillibrum.py
ERROR tests/optim/test_lion8b.py
ERROR tests/optim/test_stable_adamw.py
ERROR tests/quant/test_bitlinear.py
ERROR tests/quant/test_niva.py
ERROR tests/quant/test_qlora.py
ERROR tests/quant/test_quik.py
ERROR tests/quant/test_resudual_vq.py
ERROR tests/rl/test_vision_reward_model.py
ERROR tests/structs/test_autoregressive_wrapper.py
ERROR tests/structs/test_efficient_net.py
ERROR tests/structs/test_encoder_decoder.py
ERROR tests/structs/test_encoderdecoder.py
ERROR tests/structs/test_hierarchicalblock.py
ERROR tests/structs/test_localtransformer.py
ERROR tests/structs/test_paralleltransformerblock.py
ERROR tests/structs/test_simpletransformer.py
ERROR tests/structs/test_transformer.py
ERROR tests/structs/test_vitransformerwrapper.py
ERROR tests/tokenizers/test_gptx.py
ERROR tests/tokenizers/test_llama_tokenizer.py
ERROR tests/tokenizers/test_multimodal_tokenizer.py
ERROR tests/tokenizers/test_sentencepiece.py
ERROR tests/tokenizers/test_tokenmonster.py
ERROR tests/training/test_parallel_wrapper.py
ERROR tests/utils/test_absmax.py
ERROR tests/utils/test_cast_tuple.py
ERROR tests/utils/test_cosine_beta_schedule.py
ERROR tests/utils/test_default.py
ERROR tests/utils/test_disable_warnings_and_logs.py
ERROR tests/utils/test_enforce_types.py
ERROR tests/utils/test_exists.py
ERROR tests/utils/test_get_sinusoid_encoding_table.py
ERROR tests/utils/test_gif_to_tensor.py
ERROR tests/utils/test_group_by_key_prefix.py
ERROR tests/utils/test_group_dict_by_key.py
ERROR tests/utils/test_gumbel_noise.py
ERROR tests/utils/test_interpolate_pos_encoding_2d.py
ERROR tests/utils/test_log.py
ERROR tests/utils/test_maybe.py
ERROR tests/utils/test_module_device.py
ERROR tests/utils/test_once.py
ERROR tests/utils/test_pad_at_dim.py
ERROR tests/utils/test_pick_and_pop.py
ERROR tests/utils/test_print_cuda_memory_usage.py
ERROR tests/utils/test_print_main.py
ERROR tests/utils/test_print_num_params.py
ERROR tests/utils/test_save_load.py
ERROR tests/utils/test_save_load_wrapper.py
ERROR tests/utils/test_save_memory_snapshot.py
ERROR tests/utils/test_string_begins_with.py
ERROR tests/utils/test_top_a.py
ERROR tests/utils/test_top_k.py
ERROR tests/utils/test_top_p.py
ERROR tests/utils/test_track_cuda_memory_usage.py
ERROR tests/utils/test_video_tensor_to_gift.py
ERROR zeta/nn/modules/test_dense_connect.py
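For the in-code TensorFlow install reported above (Matrix.py and disable_logging.py), a common alternative is to check whether the optional dependency is present and raise a clear error if it is not, leaving installation to the user. This is a minimal sketch, not the project's code: `require_optional` is a hypothetical helper name and the install hint text is illustrative.

```python
import importlib
import importlib.util


def require_optional(module_name, install_hint):
    """Import a top-level optional dependency, or raise ImportError with a hint."""
    # find_spec returns None for a top-level module that is not installed.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{module_name} is required for this feature; "
            f"install it with: {install_hint}"
        )
    return importlib.import_module(module_name)


# Stand-in demonstration with a module that is always available.
json_module = require_optional("json", "(part of the standard library)")
print(json_module.__name__)
```

A module such as disable_logging.py could call `require_optional("tensorflow", "pip install tensorflow")` only inside the function that needs it, so that importing `zeta` itself never triggers an install or fails when TensorFlow is absent.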
The original function as written:
def pad_at_dim(t, pad, dim=-1, value=0.0):
    dims_from_right = (-dim - 1) if dim < 0 else (t.ndim - dim - 1)
    zeros = (0, 0) * dims_from_right
    return F.pad(t, (*zeros, *pad), value=value)
This assumes simple behavior. The PyTorch and TensorFlow implementations:
torch.nn.functional.pad (https://pytorch.org/docs/stable/generated/torch.nn.functional.pad.html)
https://github.com/tensorflow/tensorflow/blob/v2.14.0/tensorflow/python/ops/array_ops.py#L3452-L3508
have more complex, and correct, behavior.
I noticed this because none of the tests in test_pad_at_dim.py are passing.
pad_at_dim is used in:
playground/models/stacked_mm_bitnet
nn/biases/alibi.py
nn/modules/shift_tokens.py
zeta/structs/transformer.py
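To check what `pad_at_dim` actually hands to `F.pad`, the tuple construction can be reproduced without tensors. The sketch below assumes only the documented ordering of `torch.nn.functional.pad`, where the flat tuple lists (before, after) pairs starting from the last dimension. `pad_spec` and `padded_shape` are hypothetical helper names introduced for illustration.

```python
def pad_spec(ndim, pad, dim=-1):
    """Build the flat tuple that pad_at_dim passes to F.pad."""
    dims_from_right = (-dim - 1) if dim < 0 else (ndim - dim - 1)
    return (0, 0) * dims_from_right + tuple(pad)


def padded_shape(shape, spec):
    """Resulting shape for a flat spec of (before, after) pairs, last dimension first."""
    out = list(shape)
    for i in range(len(spec) // 2):
        out[-1 - i] += spec[2 * i] + spec[2 * i + 1]
    return tuple(out)


shape = (2, 3, 4)
spec = pad_spec(len(shape), (1, 2), dim=1)
print(spec)                       # (0, 0, 1, 2)
print(padded_shape(shape, spec))  # (2, 6, 4)
```

Comparing these expected shapes with what the cases in test_pad_at_dim.py assert may help tell whether the function or the tests encode the wrong expectation.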
Describe the bug
I ran the code in your implementation of rtx2, and got an error from the zeta package.
I have updated the package botocore and installed the dependencies classifier-free-guidance-pytorch and efficientnet-pytorch.
Did not have issue running on PyTorch 2.0+ w/ Python 3.10+.
To Reproduce
Steps to reproduce the behavior:
rtx2_example.py
Expected behavior
A successful run.
Terminal output
Trying to use first CUDA device.
Using CUDA.
torch.Size([1, 150528])
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /media/root/Toshiba XG3/works/agi_computer_control/rt_x_experiments/RT-X/rtx2_example.py:20 in │
│ <module> │
│ │
│ 17 │ except: │
│ 18 │ │ print("Could not find DirectML device.") │
│ 19 print(f"Using {device_name}.") │
│ ❱ 20 from rtx import RTX2 │
│ 21 │
│ 22 │
│ 23 def forward_new(self, img: torch.Tensor, text: torch.Tensor): │
│ │
│ /media/root/Toshiba XG3/works/agi_computer_control/rt_x_experiments/RT-X/rtx/__init__.py:1 in │
│ <module> │
│ │
│ ❱ 1 from rtx.rtx2 import RTX2 │
│ 2 │
│ │
│ /media/root/Toshiba XG3/works/agi_computer_control/rt_x_experiments/RT-X/rtx/rtx2.py:4 in │
│ <module> │
│ │
│ 1 #!pip install torch zetascale │
│ 2 │
│ 3 import torch │
│ ❱ 4 from zeta.nn.architecture import ( │
│ 5 │ AutoregressiveWrapper, │
│ 6 │ Decoder, │
│ 7 │ Encoder, │
│ │
│ /usr/local/lib/python3.9/dist-packages/zeta/__init__.py:1 in <module> │
│ │
│ ❱ 1 from zeta import nn │
│ 2 from zeta.nn.architecture.transformer import FeedForward │
│ 3 from zeta.nn.modules.layernorm import LayerNorm │
│ 4 from zeta import models │
│ │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/__init__.py:3 in <module> │
│ │
│ 1 # architecture │
│ 2 # from zeta.nn.architecture import * │
│ ❱ 3 from zeta.nn import architecture │
│ 4 │
│ 5 # Attention │
│ 6 # from zeta.nn.attention import * │
│ │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/architecture/__init__.py:2 in <module> │
│ │
│ 1 from zeta.nn.architecture.auto_regressive_wrapper import AutoregressiveWrapper │
│ ❱ 2 from zeta.nn.architecture.encoder import Encoder │
│ 3 from zeta.nn.architecture.encoder_decoder import EncoderDecoder │
│ 4 from zeta.nn.architecture.hierarchical_transformer import HierarchicalTransformer │
│ 5 from zeta.nn.architecture.local_transformer import LocalTransformer │
│ │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/architecture/encoder.py:1 in <module> │
│ │
│ ❱ 1 from zeta.nn.architecture.transformer import AttentionLayers │
│ 2 │
│ 3 │
│ 4 class Encoder(AttentionLayers): │
│ │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/architecture/transformer.py:15 in <module> │
│ │
│ 12 from einops import rearrange, reduce, repeat │
│ 13 from torch import Tensor, einsum, nn │
│ 14 │
│ ❱ 15 from zeta.nn.attention.attend import Attend, Intermediates │
│ 16 from functools import reduce │
│ 17 │
│ 18 EfficientAttentionConfig = namedtuple( │
│ │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/attention/__init__.py:14 in <module> │
│ │
│ 11 # from zeta.nn.attention.mgqa import MGQA │
│ 12 │
│ 13 # from zeta.nn.attention.spatial_linear_attention import SpatialLinearAttention │
│ ❱ 14 from zeta.nn.attention.mixture_attention import ( │
│ 15 │ MixtureOfAttention, │
│ 16 │ MixtureOfAutoregressiveAttention, │
│ 17 ) │
│ │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/attention/mixture_attention.py:8 in <module> │
│ │
│ 5 │
│ 6 from typing import Tuple, Optional │
│ 7 from einops import rearrange, repeat, reduce, pack, unpack │
│ ❱ 8 from zeta.models.vit import exists │
│ 9 from zeta.nn.architecture.transformer import RMSNorm, apply_rotary_pos_emb │
│ 10 │
│ 11 from zeta.nn.attention.attend import Attend │
│ │
│ /usr/local/lib/python3.9/dist-packages/zeta/models/__init__.py:3 in <module> │
│ │
│ 1 # Copyright (c) 2022 Agora │
│ 2 # Licensed under The MIT License [see LICENSE for details] │
│ ❱ 3 from zeta.models.andromeda import Andromeda │
│ 4 from zeta.models.base import BaseModel │
│ 5 from zeta.models.gpt4 import GPT4, GPT4MultiModal │
│ 6 from zeta.models.llama import LLama2 │
│ │
│ /usr/local/lib/python3.9/dist-packages/zeta/models/andromeda.py:5 in <module> │
│ │
│ 2 from torch.nn import Module │
│ 3 │
│ 4 from zeta.nn.architecture.auto_regressive_wrapper import AutoregressiveWrapper │
│ ❱ 5 from zeta.nn.architecture.transformer import ( │
│ 6 │ Decoder, │
│ 7 │ Transformer, │
│ 8 ) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ImportError: cannot import name 'Decoder' from partially initialized module
'zeta.nn.architecture.transformer' (most likely due to a circular import)
(/usr/local/lib/python3.9/dist-packages/zeta/nn/architecture/transformer.py)
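Reading the traceback, the cycle appears to be: transformer.py -> zeta.nn.attention -> mixture_attention.py -> zeta.models.vit -> zeta/models/__init__.py -> andromeda.py -> transformer.py, which is still only partially initialized at that point. One possible way to break it is for mixture_attention.py to stop importing the small `exists` helper from `zeta.models` and define it locally. The body below is an assumption (the usual "is not None" check); the real helper in zeta.models.vit is not shown in the report.

```python
def exists(val):
    """Return True when a value was provided (assumed semantics: not None)."""
    return val is not None


# Falsy values still count as provided; only None counts as missing.
print(exists(0), exists(""), exists(None))
```

With a local helper like this, importing `zeta.nn.attention` would no longer pull in `zeta.models`, so `transformer.py` could finish initializing before anything asks it for `Decoder`.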
FAILED models/test_andromeda.py::test_initial_parameters - AttributeError: 'Andromeda' object has no attribute 'num_tokens'
FAILED models/test_andromeda.py::test_forward_successful - ImportError: import error in zeta.models.AutoregressiveWrapper: No module named 'zeta.models.Au...
FAILED models/test_andromeda.py::test_forward_exception - ImportError: import error in zeta.models.AutoregressiveWrapper: No module named 'zeta.models.Au...
When I took a look at what Attributes Andromeda has, I don't see num_tokens, even though andromeda.py says it should be there:
['Andromeda', 'T_destination', '__annotations__', '__call__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_apply', '_backward_hooks', '_backward_pre_hooks', '_buffers', '_call_impl', '_compiled_call_impl', '_forward_hooks', '_forward_hooks_always_called', '_forward_hooks_with_kwargs', '_forward_pre_hooks', '_forward_pre_hooks_with_kwargs', '_get_backward_hooks', '_get_backward_pre_hooks', '_get_name', '_is_full_backward_hook', '_load_from_state_dict', '_load_state_dict_post_hooks', '_load_state_dict_pre_hooks', '_maybe_warn_non_full_backward_hook', '_modules', '_named_members', '_non_persistent_buffers_set', '_parameters', '_register_load_state_dict_pre_hook', '_register_state_dict_hook', '_replicate_for_data_parallel', '_save_to_state_dict', '_slow_forward', '_state_dict_hooks', '_state_dict_pre_hooks', '_version', '_wrapped_call_impl',
'add_module', 'apply',
'bfloat16', 'buffers',
'call_super_init', 'children', 'compile', 'cpu', 'cuda',
'decoder', 'double', 'dump_patches',
'eval', 'extra_repr',
'float', 'forward',
'get_buffer', 'get_extra_state', 'get_parameter', 'get_submodule',
'half',
'ipu',
'load_state_dict',
'modules',
'named_buffers', 'named_children', 'named_modules', 'named_parameters', 'parameters',
'register_backward_hook', 'register_buffer', 'register_forward_hook', 'register_forward_pre_hook', 'register_full_backward_hook', 'register_full_backward_pre_hook', 'register_load_state_dict_post_hook', 'register_module', 'register_parameter', 'register_state_dict_pre_hook', 'requires_grad_',
'set_extra_state', 'share_memory', 'state_dict',
'to', 'to_empty', 'train', 'training', 'type',
'xpu',
'zero_grad']
Andromeda is a Transformer model, but I don't see num_tokens in the Transformer's attributes either.
Do those attributes look familiar?
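One way to narrow this down is to search the model and every submodule for the attribute, rather than only calling dir() on the top-level object. The snippet below is a minimal sketch, not zeta code: find_attribute is a hypothetical helper, and FakeModel / FakeDecoder are stand-ins so it runs without torch or zeta installed. It relies only on named_modules(), which torch.nn.Module provides, so in principle the same helper could be pointed at a real Andromeda instance.

```python
def find_attribute(root, name):
    """Return the dotted paths of `root` and its submodules that expose `name`."""
    # torch.nn.Module provides named_modules(), which yields ("", root) first
    # and then every nested submodule. For other objects, check only the root.
    if hasattr(root, "named_modules"):
        candidates = list(root.named_modules())
    else:
        candidates = [("", root)]
    return [path or "<root>" for path, module in candidates if hasattr(module, name)]


# Hypothetical stand-ins (NOT zeta classes) that mimic the named_modules() protocol.
class FakeDecoder:
    num_tokens = 100  # illustrative value only


class FakeModel:
    def __init__(self):
        self.decoder = FakeDecoder()

    def named_modules(self):
        yield "", self
        yield "decoder", self.decoder


print(find_attribute(FakeModel(), "num_tokens"))  # prints ['decoder']
```

If the result is an empty list for the real model, the value was probably passed to a constructor without being stored on self, since a constructor argument only shows up in dir() when it is assigned as an attribute.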