kyegomez / zeta

Build high-performance AI models with modular building blocks

License: Apache License 2.0

Python 99.69% Shell 0.22% Dockerfile 0.09%

artificial-intelligence multi-modal transformers deep-learning gpt4 llama2 multi-agent-systems multi-modal-learning multi-platform pytorch

zeta's Issues

[BUG] VisionEmbedding readme example error

In the VisionEmbedding example, run in a colab notebook:

ImportError                               Traceback (most recent call last)
[<ipython-input-10-de694ebd704c>](https://localhost:8080/#) in <cell line: 2>()
      1 import torch
----> 2 from zeta.nn import VisionEmbedding
      3 
      4 # Create an instance of VisionEmbedding
      5 vision_embedding = VisionEmbedding(

ImportError: cannot import name 'VisionEmbedding' from 'zeta.nn' (/usr/local/lib/python3.10/dist-packages/zeta/nn/__init__.py)

zeta/nn/__init__.py imports

from zeta.nn.embeddings import *

So, I thought the fix was to change the example to:

from zeta.nn.embeddings import VisionEmbedding

but that doesn't work:
It appears VisionEmbedding isn't exported in zeta/nn/embeddings/__init__.py.
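
One way to confirm which names a package actually re-exports, before editing an example, is to inspect it at runtime. The following is a minimal sketch; `is_exported` is a hypothetical helper (not part of zeta), demonstrated on the standard-library `json` module so that it runs without zeta installed.

```python
import importlib


def is_exported(module_name: str, attr: str) -> bool:
    """Return True if `attr` is available as an attribute of `module_name`."""
    module = importlib.import_module(module_name)
    return hasattr(module, attr)


# Demonstrated on the standard library; for this issue the call would be
# is_exported("zeta.nn.embeddings", "VisionEmbedding").
print(is_exported("json", "dumps"))            # a name json really provides
print(is_exported("json", "VisionEmbedding"))  # a name it does not provide
```

If the check fails for the subpackage as well, the class is presumably defined in a module that no `__init__.py` imports, which matches the observation above.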

Upvote & Fund

We're using Polar.sh so you can upvote and help fund this issue.
We receive the funding once the issue is completed & confirmed by you.
Thank you in advance for helping prioritize & fund our backlog.

[BUG] [DOCS] Custom MLP - Forward Pass

import torch

# Input data (batch of 5 samples with 10 features each)
input_data = torch.randn(5, 10)

# Forward pass through the MLP
output = mlp(input_data)

fails with

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
[<ipython-input-31-8ebb46f7f9e9>](https://localhost:8080/#) in <cell line: 7>()
      5 
      6 # Forward pass through the MLP
----> 7 output = mlp(input_data)

5 frames
[/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py](https://localhost:8080/#) in forward(self, input)
    112 
    113     def forward(self, input: Tensor) -> Tensor:
--> 114         return F.linear(input, self.weight, self.bias)
    115 
    116     def extra_repr(self) -> str:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (5x10 and 20x10)

Realized the example in the docs is wrong. A correct example is (patch incoming):

import torch
from zeta.nn import CustomMLP

# Define the layer sizes
layer_sizes = [5, 10, 1]

# Create the MLP
mlp = CustomMLP(layer_sizes, activation="relu", dropout=0.5)

# Create a random tensor of shape (batch_size, input_size)
x = torch.randn(32, 5)

# Pass the tensor through the MLP
output = mlp(x)

print(output)
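
The original failure is a plain shape mismatch: in the message `(5x10 and 20x10)`, the first operand is the input batch and the second is the transposed weight of the first linear layer, which therefore expects 20 input features. Below is a minimal plain-Python sketch of that rule. `check_mlp_input` is a hypothetical helper, not a zeta API, and the layer sizes `[20, 10]` used for the failing case are an assumption inferred from the error message.

```python
def check_mlp_input(layer_sizes, input_shape):
    """Check that the last input dimension matches the first layer size.

    Returns the output shape the MLP would produce, or raises ValueError.
    """
    expected = layer_sizes[0]
    actual = input_shape[-1]
    if actual != expected:
        raise ValueError(
            f"input has {actual} features but the first layer expects {expected}"
        )
    return (*input_shape[:-1], layer_sizes[-1])


# Matches the corrected example: 32 samples with 5 features, layers [5, 10, 1].
print(check_mlp_input([5, 10, 1], (32, 5)))

# Mirrors the failing call: 10 features fed to a first layer expecting 20
# (the [20, 10] sizes are assumed, see the note above).
try:
    check_mlp_input([20, 10], (5, 10))
except ValueError as err:
    print(err)
```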

[BUG] [DOCS] RelativePositionBias example passes n_heads; should be num_heads

The RelativePositionBias example has an incorrect parameter name:
n_heads should be num_heads.
PR incoming.

TypeError                                 Traceback (most recent call last)
[<ipython-input-18-93e65b1f0592>](https://localhost:8080/#) in <cell line: 24>()
     22 
     23 # Example 3: Modify default configurations
---> 24 custom_rel_pos_bias = RelativePositionBias(bidirectional=False, num_buckets=64, max_distance=256, n_heads=8)

TypeError: RelativePositionBias.__init__() got an unexpected keyword argument 'n_heads'
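
Keyword-name drift like this can be caught before the constructor is called by comparing the example's keyword arguments with the constructor signature. This is a sketch under assumptions: `RelativePositionBiasStub` is a stand-in that only copies the parameter names mentioned in this issue (its default values are invented), and `unknown_kwargs` is a hypothetical helper.

```python
import inspect


class RelativePositionBiasStub:
    """Stand-in with the keyword names from the issue; not the real zeta class."""

    def __init__(self, bidirectional=True, num_buckets=32, max_distance=128, num_heads=1):
        self.num_heads = num_heads


def unknown_kwargs(cls, **kwargs):
    """Return the keyword names that cls.__init__ would reject."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter would accept anything, so nothing is unknown then.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(name for name in kwargs if name not in params)


# The spelling from the docs example is reported as unknown.
print(unknown_kwargs(RelativePositionBiasStub, num_buckets=64, n_heads=8))
# The corrected spelling passes the check.
print(unknown_kwargs(RelativePositionBiasStub, num_buckets=64, num_heads=8))
```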

[BUG] no module named zeta

From a container, pytest test/test_init.py:

______________________________ ERROR collecting test_init.py _______________________________
ImportError while importing test module '/usr/src/zeta/tests/test_init.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
test_init.py:1: in <module>
    import zeta
E   ModuleNotFoundError: No module named 'zeta'

There is no zeta package to install.

[TEST] cloud/test_main failures

Running the tests with a pip install from the latest GitHub source, both in Colab with a V100 GPU and on a CPU-only system:

FAILED test_main.py::test_zetacloud_basic - AssertionError: expected call not found.
FAILED test_main.py::test_zetacloud_with_stop - AssertionError: expected call not found.
FAILED test_main.py::test_zetacloud_with_down - AssertionError: expected call not found.
FAILED test_main.py::test_zetacloud_with_status_report - AssertionError: expected call not found.
FAILED test_main.py::test_zetacloud_with_exception - Failed: DID NOT RAISE <class 'Exception'>

[BUG] [DOCS] TruncatedRotaryEmbedding example KeyError: "attribute 'inv_freq' already exists"

I was testing the TruncatedRotaryEmbedding example in Colab:

from zeta.nn.embeddings.truncated_rope import TruncatedRotaryEmbedding
import torch

# Define the parameters
dim = 64
a = 0.1
b = 0.9
rho = 0.5
seq_len = 100
device = torch.device('cuda')

# Create the TruncatedRotaryEmbedding module
trunc_rotary_emb = TruncatedRotaryEmbedding(dim, a, b, rho)

# Compute the truncated rotary embeddings for the specified sequence length
rotary_embeddings = trunc_rotary_emb(seq_len, device)

print(rotary_embeddings)

I got an error which is deeper in the code than I'm familiar with.

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
[<ipython-input-77-623bbebdd7d0>](https://localhost:8080/#) in <cell line: 13>()
     11 
     12 # Create the TruncatedRotaryEmbedding module
---> 13 trunc_rotary_emb = TruncatedRotaryEmbedding(dim, a, b, rho)
     14 
     15 # Compute the truncated rotary embeddings for the specified sequence length

1 frames
[/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in register_buffer(self, name, tensor, persistent)
    536             raise KeyError("buffer name can't be empty string \"\"")
    537         elif hasattr(self, name) and name not in self._buffers:
--> 538             raise KeyError(f"attribute '{name}' already exists")
    539         elif tensor is not None and not isinstance(tensor, torch.Tensor):
    540             raise TypeError(f"cannot assign '{torch.typename(tensor)}' object to buffer '{name}' "

KeyError: "attribute 'inv_freq' already exists"

[BUG] Running import zeta immediately fails with "RuntimeError: mat1 and mat2 shapes cannot be multiplied (512x4 and 512x512)"

Hi,

I tried running vision mamba, which relies on zetascale. After following the error stack, I narrowed it down to zetascale causing the problem when trying to import it.

To Reproduce
Steps to reproduce the behavior:

Try running 'import zeta' in Python

Expected behavior
For zetascale to be imported and usable

Screenshots

Additional context
I am working on Ubuntu 22.04, running Python 3.10.13 (I have had the same error with Python 3.9 and 3.11). I have the latest PyTorch as of February 21, 2024. I am working in a Conda virtual environment where I installed vision mamba by running 'pip install vision-mamba' which ran without issues.

I would greatly appreciate any advice on how to get zetascale to work!

Thanks,
Julian

[BUG] [DOCS] [IMPORT] example fix - import log_prob from DPO

In colab, running the example from the readme:

import torch
from torch import nn
import zeta.quant as qt
...

The error is:

AttributeError                            Traceback (most recent call last)
[<ipython-input-8-653fc7394b85>](https://localhost:8080/#) in <cell line: 3>()
      1 import torch
      2 from torch import nn
----> 3 import zeta.quant as qt
      4 
      5 class MyModel(nn.Module):

[/usr/local/lib/python3.10/dist-packages/zeta/__init__.py](https://localhost:8080/#) in <module>
      8 from zeta.training import *  # noqa: F403, E402
      9 from zeta.tokenizers import *  # noqa: F403, E402
---> 10 from zeta.rl import *  # noqa: F403, E402
     11 from zeta.optim import *  # noqa: F403, E402
     12 from zeta.ops import *  # noqa: F403, E402

AttributeError: module 'zeta.rl' has no attribute 'log_prob'

Looking at zeta/rl/__init__.py, it does not include a line to import log_prob from DPO (proposed fix):

from zeta.rl.dpo import (
    freeze_all_layers,
    log_prob_from_model_and_seq,
    log_prob,
    DPO,
)
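
An `AttributeError` raised by `from zeta.rl import *` is what Python reports when a package's `__all__` lists a name that the package never defines. That `zeta.rl` declares `log_prob` in `__all__` is an assumption here, inferred from the traceback. The sketch below reproduces the mechanism with a throwaway module named `demo_rl` (hypothetical) rather than with zeta itself.

```python
import sys
import types

# Build a tiny in-memory module whose __all__ promises more than it defines.
demo = types.ModuleType("demo_rl")
demo.DPO = object()
demo.__all__ = ["DPO", "log_prob"]  # "log_prob" is listed but never defined
sys.modules["demo_rl"] = demo

namespace = {}
try:
    exec("from demo_rl import *", namespace)
    message = None
except AttributeError as err:
    message = str(err)
print(message)

# Defining the missing name (the analogue of the proposed import fix)
# lets the star import succeed.
demo.log_prob = lambda *args: None
namespace = {}
exec("from demo_rl import *", namespace)
print("log_prob" in namespace)
```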

[BUG] RMSNorm Implementation

I think you should be dividing by the scale in the following line

zeta/zeta/nn/modules/rms_norm.py

Line 35 in 7dbb6a6

return normed * self.scale * self.gamma

This is the scale definition

https://github.com/kyegomez/zeta/blob/7dbb6a62f83413977a922d5fc6dec1b11f734bc3/zeta/nn/modules/rms_norm.py#L29C9-L29C31

self.scale = dim**-0.5

And RMSNorm formula

Edit:

Also, I think the normalization should be over dim -1, not -2

zeta/zeta/nn/modules/rms_norm.py

Line 34 in 7dbb6a6

normed = F.normalize(x, dim=-2)
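
The scale argument can be checked numerically. L2 normalization divides by the vector norm, while RMSNorm divides by the norm over the square root of the feature count, so the two differ by a factor of `dim**0.5`, which is the same as dividing by `dim**-0.5`. The sketch below uses plain Python lists instead of tensors and leaves out the learned gain (`gamma`) and any epsilon term; both simplifications are assumptions made to keep the arithmetic visible.

```python
import math


def rms_norm(x):
    """Reference RMSNorm without a learned gain: x / sqrt(mean(x**2))."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return [v / rms for v in x]


def l2_normalize(x):
    """Unit-length scaling: x / ||x||_2 (no epsilon clamping)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


x = [1.0, -2.0, 3.0, 0.5]
dim = len(x)
scale = dim ** -0.5  # same definition as self.scale in the issue

reference = rms_norm(x)
divided = [v / scale for v in l2_normalize(x)]     # normed / scale
multiplied = [v * scale for v in l2_normalize(x)]  # normed * scale

# Dividing by the scale reproduces the reference; multiplying does not.
print(all(math.isclose(a, b) for a, b in zip(reference, divided)))
print(all(math.isclose(a, b) for a, b in zip(reference, multiplied)))
```

The same reasoning supports normalizing over the last axis: the mean of squares in RMSNorm is taken over the features of each position, which in a (batch, sequence, dim) tensor is dim -1.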

[BUG] Attend, dim, MaxVit

Describe the bug
Tests are failing because Attend doesn't have dim as a parameter.

ERROR zeta/tests/models/test_maxvit.py::test_maxvit_constructor - TypeError: Attend.__init__() got an unexpected keyword argument 'dim'

Attend does not have dim:

class Attend(nn.Module):
    def __init__(
        self,
        *,
        dropout=0.0,
        causal=False,
        heads=None,
        talking_heads=False,
        sparse_topk=None,
        scale=None,
        qk_norm=False,
        flash=False,
        add_zero_kv=False,
        onnxable=False,
    ):

but MaxVit needs it:

class MaxVit(nn.Module):
    def __init__(
        self,
        *,
        num_classes,
        dim,
        depth,
        dim_head: int = 32,
        dim_conv_stem=None,
        window_size: int = 7,
        mbconv_expansion_rate: int = 4,
        mbconv_shrinkage_rate=0.25,
        dropout=0.01,
        channels=3,
    ):

One way to fix this would be to add dim to Attend, but that requires a bit of logic to make it an optional parameter.

Another way to fix it would be to make a multi-modal Attend, which would be called by multi-modal models, separate from the sequential Attend. (My preferred solution, but it may be hard to implement.)
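
For the first option, the "bit of logic" could be as small as a keyword-only parameter that defaults to `None`. This is only a sketch: `AttendSketch` is a plain stand-in class rather than the real `Attend` module, it keeps just a few of the parameters listed above, and the divisibility check and the `dim_head` attribute are assumptions about what a caller might want.

```python
class AttendSketch:
    """Stand-in for Attend with an optional `dim`; not the real zeta class."""

    def __init__(self, *, dim=None, heads=None, dropout=0.0, causal=False):
        # `dim` defaults to None so callers that never pass it keep working.
        if dim is not None and heads is not None and dim % heads != 0:
            raise ValueError(f"dim ({dim}) must be divisible by heads ({heads})")
        self.dim = dim
        self.heads = heads
        self.dropout = dropout
        self.causal = causal
        # The per-head width is only known when both values are supplied.
        self.dim_head = dim // heads if dim is not None and heads else None


# Existing call style (no dim) and the MaxVit-style call (with dim) both work.
sequential = AttendSketch(dropout=0.1, causal=True)
vision = AttendSketch(dim=64, heads=8)
print(sequential.dim, vision.dim_head)
```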

[BUG] [DOCS] MBConv function

The example for inverted residual block in the MBConv docs

from zeta.nn import MBConv
import torch

# Create an inverted residual block with 64 input channels, 128 output channels, and downsampling
mbconv_block = MBConv(64, 128, downsample=True)

# Create an input tensor
x = torch.randn(32, 64, 32, 32)  # Example input with 32 samples and 64 channels

# Apply the inverted residual block
output = mbconv_block(x)

# Output tensor
print(output)

Throws an error

---------------------------------------------------------------------------
EinopsError                               Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/einops/einops.py](https://localhost:8080/#) in reduce(tensor, pattern, reduction, **axes_lengths)
    521         shape = backend.shape(tensor)
--> 522         recipe = _prepare_transformation_recipe(pattern, reduction, axes_names=tuple(axes_lengths), ndim=len(shape))
    523         return _apply_recipe(

9 frames
EinopsError: Non-unitary anonymous axes are not supported in rearrange (exception is length 1)

During handling of the above exception, another exception occurred:

EinopsError                               Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/einops/einops.py](https://localhost:8080/#) in reduce(tensor, pattern, reduction, **axes_lengths)
    531             message += "\n Input is list. "
    532         message += "Additional info: {}.".format(axes_lengths)
--> 533         raise EinopsError(message + "\n {}".format(e))
    534
    535 

EinopsError:  Error while processing rearrange-reduction pattern "b c -> b c 11".
 Input tensor shape: torch.Size([32, 512]). Additional info: {}.
 Non-unitary anonymous axes are not supported in rearrange (exception is length 1)

I first tried to rewrite the example:

# Import the necessary modules
import torch
from zeta.nn import MBConv

# Define the input and output dimensions
dim_in = 32
dim_out = 64

# Define whether to downsample
downsample = True

# Define the expansion rate and shrinkage rate
expansion_rate = 6
shrinkage_rate = 0.25

# Define the dropout rate
dropout = 0.1

# Create the inverted residual block
block = MBConv(dim_in, dim_out, downsample=downsample, expansion_rate=expansion_rate, shrinkage_rate=shrinkage_rate, dropout=dropout)

# Create a random tensor of shape (batch_size, dim_in, height, width)
x = torch.randn(16, dim_in, 32, 32)

# Pass the tensor through the block
output = block(x)

print(output.shape)

but that gave the same error.

Copilot's suggested fix is to rewrite SqueezeExcitation:

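from einops import rearrange, reduce
from torch import nn


class SqueezeExcitation(nn.Module):
    def __init__(self, dim, shrinkage_rate=0.25):
        super().__init__()
        hidden_dim = int(dim * shrinkage_rate)

        # Bottleneck MLP that produces one gate value per channel
        self.gate = nn.Sequential(
            nn.Linear(dim, hidden_dim, bias=False),
            nn.SiLU(),
            nn.Linear(hidden_dim, dim, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # x is expected to be (batch, channels, height, width)
        b, c, h, w = x.shape
        # Squeeze: global average pool over the spatial dimensions
        y = reduce(x, "b c h w -> b c", "mean")
        y = self.gate(y)
        # Excite: restore two unit spatial axes so the gate broadcasts over x
        y = rearrange(y, "b c -> b c () ()")
        return x * y.expand_as(x)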

with the explanation:
The error is due to the rearrange operation in the SqueezeExcitation class. The operation "b c -> b c 11" is trying to reshape the tensor to have an additional dimension of size 11, but this is not possible because the input tensor only has two dimensions.

The SqueezeExcitation class is trying to apply a squeeze-and-excitation operation, which typically involves reducing the spatial dimensions of the input tensor to 1x1 (i.e., global average pooling), applying a gating mechanism, and then expanding the spatial dimensions back to their original size. However, the code seems to be trying to expand the spatial dimensions to a fixed size of 11x11, which is not correct.

In this corrected version, the forward method first applies global average pooling to the input tensor x to get a tensor y of shape (b, c), then applies the gating mechanism to y, and finally expands the spatial dimensions of y back to their original size by broadcasting. The output is the element-wise product of x and the expanded y, which is a common operation in squeeze-and-excitation blocks.
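
The reported pattern `b c -> b c 11` differs from the usual `b c -> b c 1 1` only by a space. Because einops reads whitespace-separated axis names, `11` is taken as a single anonymous axis of length 11, which is what the "non-unitary anonymous axes" message refers to. The sketch below illustrates that tokenization in plain Python; `output_axes` and `anonymous_lengths` are hypothetical helpers and do not reproduce the einops parser.

```python
def output_axes(pattern: str):
    """Split the right-hand side of an einops-style pattern into axis tokens."""
    _, right = pattern.split("->")
    return right.split()


def anonymous_lengths(pattern: str):
    """Return the integer (anonymous) axes on the right-hand side."""
    return [int(tok) for tok in output_axes(pattern) if tok.isdigit()]


# The pattern from the traceback has one anonymous axis of length 11.
print(output_axes("b c -> b c 11"))
print(anonymous_lengths("b c -> b c 11"))

# With the space restored there are two unit axes, which rearrange accepts.
print(anonymous_lengths("b c -> b c 1 1"))
```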

Does this look reasonable to you?

[BUG] spelling: SpacialTransformer should be SpatialTransformer

Describe the bug
Spelling

[BUG] [DOCS] VitTransformerBlock examples have einops error

---------------------------------------------------------------------------
EinopsError                               Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/einops/einops.py](https://localhost:8080/#) in reduce(tensor, pattern, reduction, **axes_lengths)
    521         shape = backend.shape(tensor)
--> 522         recipe = _prepare_transformation_recipe(pattern, reduction, axes_names=tuple(axes_lengths), ndim=len(shape))
    523         return _apply_recipe(

10 frames
EinopsError: Wrong shape: expected 4 dims. Received 3-dim tensor.

During handling of the above exception, another exception occurred:

EinopsError                               Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/einops/einops.py](https://localhost:8080/#) in reduce(tensor, pattern, reduction, **axes_lengths)
    531             message += "\n Input is list. "
    532         message += "Additional info: {}.".format(axes_lengths)
--> 533         raise EinopsError(message + "\n {}".format(e))
    534 
    535

EinopsError:  Error while processing rearrange-reduction pattern "b p n (h d) -> b h p n d".
 Input tensor shape: torch.Size([5, 4, 512]). Additional info: {'h': 8}.
 Wrong shape: expected 4 dims. Received 3-dim tensor.

I'm going to try to figure out what the correct input shape is.
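
The pattern in the error, `b p n (h d) -> b h p n d`, fixes two requirements: the input must have four dimensions, and its last dimension must be divisible by the number of heads (`h` is 8 in the traceback). A plain-Python sketch of that shape rule follows. `split_heads_shape` is a hypothetical helper, and the 4-D shape `(5, 4, 16, 512)` is an illustrative assumption, not the shape the docs intend.

```python
def split_heads_shape(shape, heads):
    """Output shape for the pattern 'b p n (h d) -> b h p n d'."""
    if len(shape) != 4:
        raise ValueError(f"expected 4 dims (b, p, n, h*d), got {len(shape)}: {shape}")
    b, p, n, hd = shape
    if hd % heads != 0:
        raise ValueError(f"last dim {hd} is not divisible by heads={heads}")
    return (b, heads, p, n, hd // heads)


# The shape from the traceback has only three dimensions, so it is rejected.
try:
    split_heads_shape((5, 4, 512), heads=8)
except ValueError as err:
    print(err)

# A 4-D input with one extra axis satisfies the pattern (assumed shape).
print(split_heads_shape((5, 4, 16, 512), heads=8))
```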

[BUG] fix main.py -> test_main.py

change the file name main.py -> test_main.py

[BUG] ERROR collecting rl/test_prioritizedreplybuffer.py - ModuleNotFoundError: No module named 'sumtree'

___________________________ ERROR collecting rl/test_prioritizedreplybuffer.py ___________________________
ImportError while importing test module '/home/v/vzeta/tests/rl/test_prioritizedreplybuffer.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
rl/test_prioritizedreplybuffer.py:4: in <module>
    from zeta.rl.priortized_replay_buffer import (
../../.local/lib/python3.10/site-packages/zeta/rl/priortized_replay_buffer.py:1: in <module>
    from sumtree import SumTree
E   ModuleNotFoundError: No module named 'sumtree'
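
A missing third-party dependency such as `sumtree` can be detected without importing the package that needs it. This is a minimal sketch using only the standard library. `missing_modules` is a hypothetical helper, and the second module name in the demo is deliberately nonexistent.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is deliberately nonexistent.
print(missing_modules(["json", "surely_not_an_installed_module_xyz"]))
```

In a test module, `pytest.importorskip("sumtree")` is another option: it skips the module instead of failing collection when the dependency is absent.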

[BUG] test_bitlinear duplicated.

Looks like a duplicate test in the nn/modules directory. Deleted in an incoming PR.

_____________________________ ERROR collecting quant/test_bitlinear.py _____________________________
import file mismatch:
imported module 'test_bitlinear' has this __file__ attribute:
  /workspaces/zeta/tests/nn/modules/test_bitlinear.py
which is not the same as the test file we want to collect:
  /workspaces/zeta/tests/quant/test_bitlinear.py
HINT: remove __pycache__ / .pyc files and/or use a unique basename for your test file modules
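
Collisions like this can be found ahead of time by grouping test files by basename. The sketch below rebuilds the two paths from the error message inside a temporary directory and reports any name that occurs in more than one place. `duplicate_test_basenames` is a hypothetical helper, and `quant/test_other.py` is an invented extra file that shows unique names are ignored.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def duplicate_test_basenames(root):
    """Map each test file name that occurs more than once to its directories."""
    seen = defaultdict(list)
    for path in sorted(Path(root).rglob("test_*.py")):
        seen[path.name].append(path.parent.relative_to(root).as_posix())
    return {name: dirs for name, dirs in seen.items() if len(dirs) > 1}


# Rebuild the layout from the error message in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in (
        "nn/modules/test_bitlinear.py",
        "quant/test_bitlinear.py",
        "quant/test_other.py",
    ):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    duplicates = duplicate_test_basenames(root)

print(duplicates)
```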

Confirm you do not plan to stop spamming people.

Hi @kyegomez, you closed kyegomez/swarms#62 about you spamming people as "not planned". Can you confirm that you do not intend to take any action?

Please take a look at:

It would be great if you stopped.

Please stop spamming me, and others. Also see kyegomez/swarms#40

[BUG] pip apex is not nvidia apex

From container, with pip install zetascale.

Looking at the error, it appears the apex package on pip is not NVIDIA's apex [1].
The fix is to install:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir \
--global-option="--cpp_ext" --global-option="--cuda_ext" ./

following [2]

______________________________ ERROR collecting test_init.py _______________________________
ImportError while importing test module '/usr/src/zeta/tests/test_init.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
test_init.py:1: in <module>
    import zeta
/usr/local/lib/python3.10/site-packages/zeta/__init__.py:5: in <module>
    from zeta.nn import *  # noqa: F403, E402
/usr/local/lib/python3.10/site-packages/zeta/nn/__init__.py:1: in <module>
    from zeta.nn.attention import *
/usr/local/lib/python3.10/site-packages/zeta/nn/attention/__init__.py:14: in <module>
    from zeta.nn.attention.mixture_attention import (
/usr/local/lib/python3.10/site-packages/zeta/nn/attention/mixture_attention.py:8: in <module>
    from zeta.models.vit import exists
/usr/local/lib/python3.10/site-packages/zeta/models/__init__.py:3: in <module>
    from zeta.models.andromeda import Andromeda
/usr/local/lib/python3.10/site-packages/zeta/models/andromeda.py:4: in <module>
    from zeta.structs.auto_regressive_wrapper import AutoregressiveWrapper
/usr/local/lib/python3.10/site-packages/zeta/structs/__init__.py:8: in <module>
    from zeta.structs.local_transformer import LocalTransformer
/usr/local/lib/python3.10/site-packages/zeta/structs/local_transformer.py:8: in <module>
    from zeta.nn.modules import feedforward_network
/usr/local/lib/python3.10/site-packages/zeta/nn/modules/__init__.py:12: in <module>
    from zeta.nn.modules.feedforward_network import FeedForwardNetwork
/usr/local/lib/python3.10/site-packages/zeta/nn/modules/feedforward_network.py:9: in <module>
    from apex.normalization import FusedLayerNorm as LayerNorm
/usr/local/lib/python3.10/site-packages/apex/__init__.py:13: in <module>
    from pyramid.session import UnencryptedCookieSessionFactoryConfig
E   ImportError: cannot import name 'UnencryptedCookieSessionFactoryConfig' from 'pyramid.session' (unknown location)

[1] https://stackoverflow.com/questions/66610378/unencryptedcookiesessionfactoryconfig-error-when-importing-apex
[2] https://stackoverflow.com/a/67188946

[BUG] FusedDenseGELUDense, FusedDropoutLayerNorm not exported in zeta/nn/__init__.py

ImportError                               Traceback (most recent call last)
[<ipython-input-11-0b8f0cee3df3>](https://localhost:8080/#) in <cell line: 2>()
      1 import torch
----> 2 from zeta.nn import FusedDenseGELUDense
      3 
      4 x = torch.randn(1, 512)
      5 model = FusedDenseGELUDense(512, 1024)

ImportError: cannot import name 'FusedDenseGELUDense' from 'zeta.nn' (/content/zeta/zeta/nn/__init__.py)

[BUG] [DOCS] missing/misplaced docs

The docs for PositionalEmbedding appear to be the top level docs for all of zeta, not the specific PositionalEmbedding docs.

https://zeta.apac.ai/en/latest/zeta/nn/embeddings/positional_embeddings/

This could be merely a formatting issue. They don't resemble:
https://zeta.apac.ai/en/latest/zeta/nn/embeddings/truncated_rope/

A standard docs template more like:
https://zeta.apac.ai/en/latest/zeta/nn/biases/relative_bias/
Would be helpful.

[TEST]

This, I believe, is a naming mismatch:
simple_vision_encoder has VisionEncoder, not SimpleVisionEncoder.
__init__.py has VisionEncoder.
test_simple_vision_encoder has SimpleVisionEncoder.

My suggested fix is to rename the class to SimpleVisionEncoder in both simple_vision_encoder and __init__.py.

_________ ERROR collecting tests/structs/test_simple_vision_encoder.py _________
ImportError while importing test module '/usr/src/zeta/tests/structs/test_simple_vision_encoder.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/structs/test_simple_vision_encoder.py:2: in <module>
    from zeta.structs.simple_vision_encoder import SimpleVisionEncoder
E   ModuleNotFoundError: No module named 'zeta.structs.simple_vision_encoder'

[CONF] generator-generic-ossf-slsa3-publish.yml - needs initial config

Describe the bug
This is the default template.

[TEST] test_cast_tuple.py - failing test - suggest removing

I think the traceback is occurring at:

# Test with mock and monkeypatch
def test_cast_tuple_with_mock_and_monkeypatch(monkeypatch):
    def mock_isinstance(val, t):
        return False

    monkeypatch.setattr("builtins.isinstance", mock_isinstance)
    assert cast_tuple((1, 2), 1) == ((1, 2),)
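
Replacing `builtins.isinstance` changes the function for every caller in the process, pytest and pluggy included, which would be consistent with the internal errors in this issue's traceback (an inference, not something confirmed here). A narrower alternative is to shadow `isinstance` only in the module under test, since module globals are looked up before builtins. The sketch below shows this with an in-memory module. `demo_utils` and the `cast_tuple` body are assumptions standing in for the real zeta code.

```python
import types

# Assumed implementation of the function under test (not copied from zeta).
SOURCE = '''
def cast_tuple(val, depth):
    return val if isinstance(val, tuple) else (val,) * depth
'''

demo_utils = types.ModuleType("demo_utils")
exec(SOURCE, demo_utils.__dict__)

# Normal behaviour: a tuple is returned unchanged.
before = demo_utils.cast_tuple((1, 2), 1)

# Shadow isinstance inside this module only; builtins stay untouched.
demo_utils.isinstance = lambda val, t: False
during = demo_utils.cast_tuple((1, 2), 1)

# Remove the shadow so the module falls back to the real builtin again.
del demo_utils.isinstance
after = demo_utils.cast_tuple((1, 2), 1)

print(before, during, after)
print(isinstance((1, 2), tuple))  # the real builtin was never replaced
```

With pytest, the equivalent would be `monkeypatch.setattr(module, "isinstance", fake, raising=False)` on the module that defines `cast_tuple`, which is undone automatically after the test.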

Traceback

tests/utils/test_cast_tuple.py .....Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 271, in wrap_session
    session.exitstatus = doit(config, session) or 0
  File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 325, in _main
    config.hook.pytest_runtestloop(session=session)
  File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 493, in __call__
    return self._hookexec(self.name, self._hookimpls, kwargs, firstresult)
  File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 115, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 152, in _multicall
    return outcome.get_result()
  File "/usr/local/lib/python3.10/site-packages/pluggy/_result.py", line 114, in get_result
    raise exc.with_traceback(exc.__traceback__)
  File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 137, in _multicall
    teardown.throw(outcome._exception)
AttributeError: 'tuple' object has no attribute 'throw'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 291, in wrap_session
    config.notify_exception(excinfo, config.option)
  File "/usr/local/lib/python3.10/site-packages/_pytest/config/__init__.py", line 1105, in notify_exception
    excrepr = excinfo.getrepr(
  File "/usr/local/lib/python3.10/site-packages/_pytest/_code/code.py", line 686, in getrepr
    self.traceback[0]._rawentry if self.traceback else None,
  File "/usr/local/lib/python3.10/site-packages/_pytest/_code/code.py", line 583, in traceback
    self._traceback = Traceback(self.tb)
  File "/usr/local/lib/python3.10/site-packages/_pytest/_code/code.py", line 343, in __init__
    super().__init__(tb)
TypeError: 'traceback' object is not iterable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/pytest", line 8, in <module>
    sys.exit(console_main())
  File "/usr/local/lib/python3.10/site-packages/_pytest/config/__init__.py", line 192, in console_main
    code = main()
  File "/usr/local/lib/python3.10/site-packages/_pytest/config/__init__.py", line 169, in main
    ret: Union[ExitCode, int] = config.hook.pytest_cmdline_main(
  File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 493, in __call__
    return self._hookexec(self.name, self._hookimpls, kwargs, firstresult)
  File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 115, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 113, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 77, in _multicall
    res = hook_impl.function(*args)
  File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 318, in pytest_cmdline_main
    return wrap_session(config, _main)
  File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 306, in wrap_session
    config.hook.pytest_sessionfinish(
  File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 493, in __call__
    return self._hookexec(self.name, self._hookimpls, kwargs, firstresult)
  File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 115, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 152, in _multicall
    return outcome.get_result()
  File "/usr/local/lib/python3.10/site-packages/pluggy/_result.py", line 114, in get_result
    raise exc.with_traceback(exc.__traceback__)
  File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 137, in _multicall
    teardown.throw(outcome._exception)
AttributeError: 'tuple' object has no attribute 'throw'

[BUG] [DPO - Direct Preference Optimization] [RuntimeError: mat1 and mat2 must have the same dtype, but got Long and Float]

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
[<ipython-input-26-508a53a4cb02>](https://localhost:8080/#) in <cell line: 26>()
     24 
     25 # Compute loss
---> 26 loss = dpo_model(preferred_seq, unpreferred_seq)
     27 print(loss)

9 frames
[/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py](https://localhost:8080/#) in forward(self, input)
    112 
    113     def forward(self, input: Tensor) -> Tensor:
--> 114         return F.linear(input, self.weight, self.bias)
    115 
    116     def extra_repr(self) -> str:

RuntimeError: mat1 and mat2 must have the same dtype, but got Long and Float
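
The message suggests that an integer (Long) tensor reached an `nn.Linear` layer whose weights are Float. Below is a minimal sketch of that mismatch, using a standalone `nn.Linear` rather than the DPO model itself. The layer sizes and the `linear_accepts` helper are illustrative assumptions, not taken from the zeta example.

```python
import torch
from torch import nn

# Stand-in for whichever linear layer the DPO example reaches (assumption).
layer = nn.Linear(4, 2)

# torch.randint returns int64 (Long) tensors by default.
token_ids = torch.randint(0, 10, (3, 4))


def linear_accepts(x: torch.Tensor) -> bool:
    """Return True if the layer can process x, False on a RuntimeError."""
    try:
        layer(x)
        return True
    except RuntimeError:
        return False


long_ok = linear_accepts(token_ids)           # Long input, Float weights
float_ok = linear_accepts(token_ids.float())  # dtypes now match
print(long_ok, float_ok)
```

Whether a cast is the right fix depends on the model. Token ids normally go through an embedding layer first, so the example may need float inputs or an embedding in front of the linear layer instead.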

[BUG] no default value for depth of AttentionLayers

Describe the bug

ERROR tests/structs/test_transformer.py::test_creation - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x0-expected_output_size0] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x1-expected_output_size1] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward[x2-expected_output_size2] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[wrong_input0] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[wrong_input1] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'
ERROR tests/structs/test_transformer.py::test_forward_exception[string] - TypeError: AttentionLayers.__init__() missing 1 required positional argument: 'depth'

AttentionLayers takes a depth argument, but it has no default value:

class AttentionLayers(nn.Module):
    def __init__(
        self,
        dim,
        depth,
        heads=8,
        causal=False,
        cross_attend=False,
        only_cross=False,
        use_scalenorm=False,
        use_rmsnorm=False,
        use_simple_rmsnorm=False,
        alibi_pos_bias=False,
        alibi_num_heads=None,
        rel_pos_bias=False,
        rel_pos_num_buckets=32,
        rel_pos_max_distance=128,
        dynamic_pos_bias=False,
        dynamic_pos_bias_log_distance=False,
        dynamic_pos_bias_mlp_depth=2,
        dynamic_pos_bias_norm=False,
        rotary_pos_emb=False,
        rotary_emb_dim=None,
        rotary_xpos=False,
        rotary_interpolation_factor=1.0,
        rotary_xpos_scale_base=512,
        rotary_base_rescale_factor=1.0,
        custom_layers=None,
        sandwich_coef=None,
        par_ratio=None,
        residual_attn=False,
        cross_residual_attn=False,
        macaron=False,
        pre_norm=True,
        pre_norm_has_final_norm=True,
        gate_residual=False,
        scale_residual=False,
        scale_residual_constant=1.0,
        deepnorm=False,
        shift_tokens=0,
        sandwich_norm=False,
        resi_dual=False,
        resi_dual_scale=1.0,
        zero_init_branch_output=False,
        layer_dropout=0.0,
        cross_attn_tokens_dropout=0.0,
        **kwargs,
    ):

So, when it's called with only a single value (as in test_transformer), there's an error:

@pytest.fixture()
def init_transformer():
    attn_layers = AttentionLayers(
        256
    )
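
The failure can be reproduced without zeta. The sketch below uses a hypothetical `AttentionLayersStub` that mirrors only the first few parameters of the signature quoted above. It lists the constructor arguments that have no default, then shows the call succeeding once `depth` is supplied. The value `depth=6` is an arbitrary choice for illustration.

```python
import inspect


class AttentionLayersStub:
    """Hypothetical stand-in that mirrors only the first few parameters."""

    def __init__(self, dim, depth, heads=8, causal=False, **kwargs):
        self.dim = dim
        self.depth = depth
        self.heads = heads
        self.causal = causal


def required_params(cls):
    """Names of constructor parameters that have no default value."""
    sig = inspect.signature(cls.__init__)
    return [
        name
        for name, p in sig.parameters.items()
        if name != "self"
        and p.default is inspect.Parameter.empty
        and p.kind is inspect.Parameter.POSITIONAL_OR_KEYWORD
    ]


required = required_params(AttentionLayersStub)

try:
    AttentionLayersStub(256)  # same call shape as the failing fixture
    error = None
except TypeError as exc:
    error = str(exc)

layers = AttentionLayersStub(256, depth=6)  # depth=6 is an arbitrary choice
print(required, error)
```

Either the fixture passes `depth` explicitly or the class gains a default. Which of the two is intended is a decision for the maintainers.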

[BUG] ModuleNotFoundError: No module named 'zeta.cloud'

_____________________________ ERROR collecting main.py _____________________________
ImportError while importing test module '/home/v/vzeta/tests/cloud/main.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
main.py:3: in <module>
    from zeta.cloud.main import zetacloud
E   ModuleNotFoundError: No module named 'zeta.cloud'
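
A quick way to check whether a dotted module path resolves in the current environment is `importlib.util.find_spec`. The helper name `module_available` is an assumption for illustration, and standard-library module names are used so the sketch runs without zeta installed.

```python
import importlib.util


def module_available(dotted_name: str) -> bool:
    """Return True if dotted_name can be located by the import system."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example "zeta") is itself missing.
        return False


# Standard-library names stand in for the zeta modules here.
print(module_available("json.decoder"))         # a submodule that exists
print(module_available("json.no_such_module"))  # a submodule that does not
```

Calling `module_available("zeta.cloud")` in the failing environment would show whether the installed package ships that subpackage at all.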

[TEST] half_bit_linear - no module named

I have looked, and this module exists, and is in __init__.py

_____________ ERROR collecting tests/quant/test_half_bit_linear.py _____________
ImportError while importing test module '/usr/src/zeta/tests/quant/test_half_bit_linear.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/quant/test_half_bit_linear.py:3: in <module>
    from zeta.quant.half_bit_linear import HalfBitLinear
E   ModuleNotFoundError: No module named 'zeta.quant.half_bit_linear'

[BUG] MambaBlock not working on macOS, even on import

Describe the bug

from zeta.nn import MambaBlock

Traceback (most recent call last):
  File "/Users/teli/www/ml/SSM/zeta_mamba.py", line 2, in <module>
    from zeta.nn import MambaBlock

RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.FloatTensor instead (while checking arguments for embedding)

To Reproduce
Use the README example.

Expected behavior
A simple Mamba block should run.

[BUG] How can I find a zeta version that is compatible with Python 3.8 and torch 1.12?

The latest zeta version uses torch 2.2.

[BUG] [DOCS] PositionalEmbedding example IndexError: Dimension out of range

In the docs for PositionalEmbedding (https://zeta.apac.ai/en/latest/zeta/nn/embeddings/positional_embeddings/)

In the example:

from zeta.nn import PositionalEmbedding
import torch

# Create a PositionalEmbedding instance
positional_embedding = PositionalEmbedding(num_embeddings=100, embedding_dim=128)

# Generate positional embeddings for a sequence of length 10
positions = torch.arange(10)
embeddings = positional_embedding(positions)

On Colab, running it raises an error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
[<ipython-input-79-85c74b797d27>](https://localhost:8080/#) in <cell line: 10>()
      8 # Generate positional embeddings for a sequence of length 10
      9 positions = torch.arange(10)
---> 10 embeddings = positional_embedding(positions)

2 frames
[/usr/local/lib/python3.10/dist-packages/zeta/nn/embeddings/positional.py](https://localhost:8080/#) in forward(self, x, positions, **kwargs)
     26             # being consistent with Fairseq, which starts from 2.
     27             positions = (
---> 28                 torch.arange(2, x.size(1) + 2, device=x.device)
     29                 .long()
     30                 .unsqueeze(0)

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
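
The traceback shows `forward` calling `x.size(1)`, which requires at least two dimensions, while `torch.arange(10)` has only one. The sketch below isolates that behaviour with plain torch calls. It does not use `PositionalEmbedding`, so it only explains the error and does not confirm what input shape the module expects.

```python
import torch

positions = torch.arange(10)       # shape (10,): a single dimension
print(positions.dim())

try:
    positions.size(1)              # the call that fails inside forward()
    size_error = None
except IndexError as exc:
    size_error = str(exc)

batched = positions.unsqueeze(0)   # shape (1, 10): batch dimension added
seq_len = batched.size(1)          # dimension 1 now exists
print(size_error, tuple(batched.shape), seq_len)
```

If the module does expect a `(batch, seq_len)` input, the docs example would need a batched tensor. That is an assumption drawn from the traceback, not from the zeta source.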

[BUG] test_test_example: ImportError: cannot import name 'MultiheadAttention' from 'zeta' (/home/v/.local/lib/python3.10/site-packages/zeta/__init__.py)

__________________________________________ ERROR collecting test_test_example.py __________________________________________
ImportError while importing test module '/home/v/vzeta/tests/test_test_example.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
test_test_example.py:5: in <module>
    from zeta import MultiheadAttention
E   ImportError: cannot import name 'MultiheadAttention' from 'zeta' (/home/v/.local/lib/python3.10/site-packages/zeta/__init__.py)

[BUG] [DOCS] SinusoidalEmbeddings rotate_half

From colab notebook, the example from the docs:

from zeta import rotate_half
import torch

# Create an input tensor
x = torch.randn(2, 3, 4)

# Rotate the input tensor
rotated_x = rotate_half(x)

fails with an import error

ImportError                               Traceback (most recent call last)

[<ipython-input-38-339cf597a136>](https://localhost:8080/#) in <cell line: 1>()
----> 1 from zeta import rotate_half
      2 import torch
      3 
      4 # Create an input tensor
      5 x = torch.randn(2, 3, 4)

ImportError: cannot import name 'rotate_half' from 'zeta' (/usr/local/lib/python3.10/dist-packages/zeta/__init__.py)

rotate_half is implemented in several places, and I don't know which one it isn't finding.

[BUG] ModuleNotFoundError: No module named 'zeta.utils.attention'

______________________ ERROR collecting tests/test_mha.py ______________________
ImportError while importing test module '/home/v/zeta/tests/test_mha.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_mha.py:1: in <module>
    from zeta.utils.attention.multihead_attention import MultiheadAttention
E   ModuleNotFoundError: No module named 'zeta.utils.attention'

[BUG] remove __pycache__ / .pyc files and/or use a unique basename for your test file modules

import file mismatch:
imported module 'test_bitlinear' has this __file__ attribute:
  /home/v/vzeta/tests/nn/modules/test_bitlinear.py
which is not the same as the test file we want to collect:
  /home/v/vzeta/tests/quant/test_bitlinear.py
HINT: remove __pycache__ / .pyc files and/or use a unique basename for your test file modules
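
With pytest's default import mode, test files in directories without an `__init__.py` are imported as top-level modules named after their basename, so two files called `test_bitlinear.py` collide. Below is a small sketch for finding such collisions. The helper name and the extra `quant/test_other.py` file are hypothetical, and the tree is built in a temporary directory rather than read from the zeta repository.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def duplicate_test_basenames(root: Path) -> dict:
    """Map each test file name that appears more than once to its paths."""
    seen = defaultdict(list)
    for path in sorted(root.rglob("test_*.py")):
        seen[path.name].append(path.relative_to(root).as_posix())
    return {name: paths for name, paths in seen.items() if len(paths) > 1}


# Build a throwaway tree that mirrors the two paths from the report.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in (
        "nn/modules/test_bitlinear.py",
        "quant/test_bitlinear.py",
        "quant/test_other.py",  # hypothetical file with a unique name
    ):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("")
    duplicates = duplicate_test_basenames(root)

print(duplicates)
```

Renaming one of the files, or adding `__init__.py` files so the test directories become packages, are the usual ways to avoid the clash.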

[BUG] ERROR: (gcloud.services.enable) FAILED_PRECONDITION:

  Checking GCP...
Enabling Compute Engine API (free of charge; this may take a minute)...
Failed. Detailed output:
ERROR: (gcloud.services.enable) FAILED_PRECONDITION: Billing account for project '522837608576' is not found. Billing must be enabled for activation of service(s) 'compute.googleapis.com,compute.googleapis.com,compute.googleapis.com' to proceed.
Help Token: ARD_zUbFpP-0qySPromvgtJNdw33QM1HJpmBG-BiRaVI9eOZ4mMJ9-MqOcvVj8OnLeGUqI3WlHsRbqPAtxXrnnY8VB6U3Xpas5zyB6_mbqgH3oXn
- '@type': type.googleapis.com/google.rpc.PreconditionFailure
  violations:
  - subject: ?error_code=390001&project=522837608576&services=compute.googleapis.com&services=compute.googleapis.com&services=compute.googleapis.com
    type: googleapis.com/billing-enabled
- '@type': type.googleapis.com/google.rpc.ErrorInfo
  domain: serviceusage.googleapis.com/billing-enabled
  metadata:
    project: '522837608576'
    services: compute.googleapis.com,compute.googleapis.com,compute.googleapis.com
  reason: UREQ_PROJECT_BILLING_NOT_FOUND

[BUG] SinusoidalEmbedding rotate_every_two, related to #106

From Colab, the second SinusoidalEmbeddings example:

from zeta import apply_rotary_pos_emb
import torch

# Create query and key tensors
q = torch.randn(2, 3, 4)
k = torch.randn(2, 3, 4)

# Generate frequency and scale embeddings using SinusoidalEmbeddings

# Apply rotary positional embeddings
q_emb, k_emb = apply_rotary_pos_emb(q, k, freqs, scale)

Has the error

---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

[<ipython-input-39-a72a64e54f5b>](https://localhost:8080/#) in <cell line: 11>()
      9 
     10 # Apply rotary positional embeddings
---> 11 q_emb, k_emb = apply_rotary_pos_emb(q, k, freqs, scale)

[/usr/local/lib/python3.10/dist-packages/zeta/nn/embeddings/xpos_relative_position.py](https://localhost:8080/#) in apply_rotary_pos_emb(x, sin, cos, scale)
     69     """
     70     sin, cos = map(lambda t: duplicate_interleave(t * scale), (sin, cos))
---> 71     return (x * cos) + (rotate_every_two(x) * sin)
     72 
     73 

RuntimeError: The size of tensor a (4) must match the size of tensor b (128) at non-singleton dimension 2

This is another instance of the problem reported in #106 which I'm still testing the fix for.

[BUG] [DOCS] [IMPORT] XPOS

Running the first XPOS example from the docs raises an import error:

ModuleNotFoundError                       Traceback (most recent call last)
[<ipython-input-14-f2a7ff6796e7>](https://localhost:8080/#) in <cell line: 2>()
      1 import torch
----> 2 from xpos import XPOS
      3 
      4 # Create an instance of the XPOS module
      5 xpos = XPOS(head_dim=256)

ModuleNotFoundError: No module named 'xpos'

I checked the __init__.py in nn.embeddings, and it looks OK. Using the full import path in the example resolves the issue:

import torch
from zeta.nn.embeddings.xpos_relative_position import XPOS

# Create an instance of the XPOS module
xpos = XPOS(head_dim=256)

# Generate a random input tensor
x = torch.randn(1, 10, 256)

# Apply the XPOS module to the input tensor
output = xpos(x)

[BUG] ModuleNotFoundError: No module named 'zeta.structs.attn_layers'

My zetascale version is:
zetascale 1.2.4
When I run the code in
https://github.com/kyegomez/Fuyu/tree/main/fuyu
The error is:

Traceback (most recent call last):
  File "/mnt/sda/shaoyanli/fuyu/Fuyu/example.py", line 2, in <module>
    from fuyu.model import Fuyu
  File "/mnt/sda/shaoyanli/fuyu/Fuyu/fuyu/__init__.py", line 1, in <module>
    from fuyu.model import Fuyu
  File "/mnt/sda/shaoyanli/fuyu/Fuyu/fuyu/model.py", line 3, in <module>
    from zeta.structs import AutoregressiveWrapper, Decoder, Transformer
  File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/__init__.py", line 5, in <module>
    from zeta.nn import *  # noqa: F403, E402
  File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/nn/__init__.py", line 1, in <module>
    from zeta.nn.attention import *
  File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/nn/attention/__init__.py", line 14, in <module>
    from zeta.nn.attention.mixture_attention import (
  File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/nn/attention/mixture_attention.py", line 8, in <module>
    from zeta.models.vit import exists
  File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/models/__init__.py", line 3, in <module>
    from zeta.models.andromeda import Andromeda
  File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/models/andromeda.py", line 4, in <module>
    from zeta.structs.auto_regressive_wrapper import AutoregressiveWrapper
  File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/structs/__init__.py", line 4, in <module>
    from zeta.structs.hierarchical_transformer import (
  File "/mnt/sda/anaconda3/envs/fuyu/lib/python3.10/site-packages/zeta/structs/hierarchical_transformer.py", line 13, in <module>
    from zeta.structs.attn_layers import rotate_half
ModuleNotFoundError: No module named 'zeta.structs.attn_layers'

So I went to the zeta package path, but I can't find the file zeta/structs/attn_layers.

[BUG] zeta/nn/modules/multiclass_label.py is empty

This file contains only '_'

It was added this way initially.

[BUG] test_init.py: AssertionError: Modules structs, quant not found in zeta package / assert not ['structs', 'quant']

______________________________________________________ test_imports _______________________________________________________

    def test_imports():
        modules = [
            "nn",
            "structs",
            "models",
            "utils",
            "training",
            "tokenizers",
            "rl",
            "optim",
            "ops",
            "quant",
        ]
        missing_modules = []
        for module in modules:
            if not hasattr(zeta, module):
                missing_modules.append(module)
    
>       assert (
            not missing_modules
        ), f"Modules {', '.join(missing_modules)} not found in zeta package"
E       AssertionError: Modules structs, quant not found in zeta package
E       assert not ['structs', 'quant']

test_init.py:22: AssertionError
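
One possible explanation, not confirmed in the issue, is that `zeta/__init__.py` does not import the `structs` and `quant` subpackages. A subpackage only becomes an attribute of its parent once something imports it, so `hasattr` can be False even though the directory exists. The sketch below shows this with a throwaway package; the name `demo_pkg_zz` is made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Throwaway package "demo_pkg_zz" with one subpackage "structs".
    # Its __init__.py is empty, so it imports nothing on its own.
    pkg = Path(tmp) / "demo_pkg_zz"
    (pkg / "structs").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "structs" / "__init__.py").write_text("")

    sys.path.insert(0, tmp)
    try:
        demo = importlib.import_module("demo_pkg_zz")
        before = hasattr(demo, "structs")   # subpackage not imported yet
        importlib.import_module("demo_pkg_zz.structs")
        after = hasattr(demo, "structs")    # attribute set by the import
    finally:
        sys.path.remove(tmp)

print(before, after)
```

If that is the cause here, the test could import each submodule with `importlib.import_module` instead of relying on `hasattr`, or the package's `__init__.py` could import the subpackages explicitly.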

[TEST] adaptive_rmsnorm - no module named

I looked, and the code does contain this module name, and it is in the __init__.py.

__________ ERROR collecting tests/nn/modules/test_adaptive_rmsnorm.py __________
ImportError while importing test module '/usr/src/zeta/tests/nn/modules/test_adaptive_rmsnorm.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/nn/modules/test_adaptive_rmsnorm.py:3: in <module>
    from zeta.nn.modules.adaptive_rmsnorm import AdaptiveRMSNorm
E   ModuleNotFoundError: No module named 'zeta.nn.modules.adaptive_rmsnorm'

[BUG] ERROR collecting rl/test_prioritizedsequencereplybuffer.py - ModuleNotFoundError: No module named 'sumtree'

_______________________ ERROR collecting rl/test_prioritizedsequencereplybuffer.py _______________________
ImportError while importing test module '/home/v/vzeta/tests/rl/test_prioritizedsequencereplybuffer.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
rl/test_prioritizedsequencereplybuffer.py:4: in <module>
    from zeta.rl.priortized_rps import (
../../.local/lib/python3.10/site-packages/zeta/rl/priortized_rps.py:1: in <module>
    from sumtree import SumTree
E   ModuleNotFoundError: No module named 'sumtree'

[BUG] Pip should be updated

The official pip package has the following problem, which I noticed when testing Lumiere:

<<<<<<< HEAD
from zeta.nn.attention.multiquery_attention import (
    MultiQueryAttention,
)
from zeta.nn.modules.simple_feedforward import SimpleFeedForward
=======
from zeta.nn.attention.multiquery_attention import MultiQueryAttention
from zeta.nn.modules import SimpleFeedForward
>>>>>>> 2e50e6f
from zeta.nn.attention.cross_attention import CrossAtte
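
The pasted snippet appears to contain leftover Git merge-conflict markers that were shipped in the published package. Below is a sketch of a pre-release check that flags such lines. The function name is an assumption, and the sample text is abbreviated from the snippet above.

```python
import re

# Lines that begin with seven <, = or > characters are Git conflict markers.
CONFLICT_MARKER = re.compile(r"^(<{7}|={7}|>{7})(\s|$)")


def conflict_marker_lines(source: str) -> list:
    """Return 1-based line numbers that look like Git conflict markers."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if CONFLICT_MARKER.match(line)
    ]


sample = "\n".join([
    "<<<<<<< HEAD",
    "from zeta.nn.modules.simple_feedforward import SimpleFeedForward",
    "=======",
    "from zeta.nn.modules import SimpleFeedForward",
    ">>>>>>> 2e50e6f",
])
found = conflict_marker_lines(sample)
print(found)
```

Running a check like this over the package sources before publishing would catch the problem early, since a file with these markers cannot be imported.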

Relationship with x-transformers

Hi, I have noticed that some of the code is similar to x-transformers. Are you the original author? If not, could you give credit to lucidrains?

[BUG] TypeError: new(): invalid data type 'str'

__________________________ ERROR collecting nn/modules/test_linearactivation.py __________________________
../../.local/lib/python3.10/site-packages/_pytest/runner.py:341: in from_call
    result: Optional[TResult] = func()
../../.local/lib/python3.10/site-packages/_pytest/runner.py:372: in <lambda>
    call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
../../.local/lib/python3.10/site-packages/_pytest/python.py:531: in collect
    self._inject_setup_module_fixture()
../../.local/lib/python3.10/site-packages/_pytest/python.py:545: in _inject_setup_module_fixture
    self.obj, ("setUpModule", "setup_module")
../../.local/lib/python3.10/site-packages/_pytest/python.py:310: in obj
    self._obj = obj = self._getobj()
../../.local/lib/python3.10/site-packages/_pytest/python.py:528: in _getobj
    return self._importtestmodule()
../../.local/lib/python3.10/site-packages/_pytest/python.py:617: in _importtestmodule
    mod = import_path(self.path, mode=importmode, root=self.config.rootpath)
../../.local/lib/python3.10/site-packages/_pytest/pathlib.py:567: in import_path
    importlib.import_module(module_name)
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1050: in _gcd_import
    ???
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:688: in _load_unlocked
    ???
../../.local/lib/python3.10/site-packages/_pytest/assertion/rewrite.py:178: in exec_module
    exec(co, module.__dict__)
nn/modules/test_linearactivation.py:21: in <module>
    @pytest.mark.parametrize("input_tensor", [(torch.tensor([1, 2, "a"]))])
E   TypeError: new(): invalid data type 'str'
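
The error is raised during collection because decorator arguments are evaluated when the module is imported, before any test runs. The sketch below shows the same effect with plain Python. `parametrize` and `bad_value` are hypothetical stand-ins for `pytest.mark.parametrize` and `torch.tensor([1, 2, "a"])`, so the example runs without either library.

```python
def parametrize(name, values):
    """Hypothetical stand-in for a parametrizing decorator; stores values."""
    def wrap(func):
        func.params = (name, values)
        return func
    return wrap


def bad_value():
    # Stand-in for torch.tensor([1, 2, "a"]), which raises on construction.
    raise TypeError("new(): invalid data type 'str'")


try:
    @parametrize("input_tensor", [bad_value()])   # evaluated immediately
    def check_eager(input_tensor):
        pass
    eager_error = None
except TypeError as exc:
    eager_error = str(exc)


@parametrize("make_input", [bad_value])           # pass the factory instead
def check_deferred(make_input):
    make_input()                                  # raises only when called


try:
    check_deferred(bad_value)
    deferred_error = None
except TypeError as exc:
    deferred_error = str(exc)

print(eager_error, deferred_error)
```

In a real test, passing the raw list and constructing the tensor inside the test body, wrapped in `pytest.raises(TypeError)`, would move the failure from collection into the test, where it can be asserted.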

[BUG] [DOCS] XPOS

There's a dimension mismatch error in the third XPOS example. Code:

import torch
from zeta import fixed_pos_embedding, apply_rotary_pos_emb

# Generate fixed positional embeddings
scale = torch.randn(10, 256)
sin, cos = fixed_pos_embedding(scale)

# Apply rotary positional embeddings to an input tensor
x = torch.randn(1, 10, 256)
output = apply_rotary_pos_emb(x, sin, cos, scale=0.5)

RuntimeError                              Traceback (most recent call last)
[<ipython-input-18-4d63cb090aa8>](https://localhost:8080/#) in <cell line: 10>()
      8 # Apply rotary positional embeddings to an input tensor
      9 x = torch.randn(1, 10, 256)
---> 10 output = apply_rotary_pos_emb(x, sin, cos, scale=0.5)

[/usr/local/lib/python3.10/dist-packages/zeta/nn/embeddings/xpos_relative_position.py](https://localhost:8080/#) in apply_rotary_pos_emb(x, sin, cos, scale)
     69     """
     70     sin, cos = map(lambda t: duplicate_interleave(t * scale), (sin, cos))
---> 71     return (x * cos) + (rotate_every_two(x) * sin)
     72 
     73 

RuntimeError: The size of tensor a (256) must match the size of tensor b (512) at non-singleton dimension 2

The implementation in the original paper is wrong. This wrong implementation was copied into hf/transformers, and then fixed:
huggingface/transformers@052fa2f
https://github.com/huggingface/transformers/blob/edb170238febf7fc3e3278ed5b9ca0b2c40c70e3/src/transformers/models/gptj/modeling_flax_gptj.py#L122
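
The size mismatch in the traceback (256 versus 512) is consistent with `duplicate_interleave` doubling the last dimension of the sin/cos tables. Below is a pure-Python sketch of that effect. The list-based `duplicate_interleave` is a stand-in written for illustration; zeta's actual implementation is not shown in this issue.

```python
def duplicate_interleave(row):
    """Stand-in (assumption): repeat every element twice, keeping order."""
    return [value for value in row for _ in range(2)]


head_dim = 256

full_width = [0.0] * head_dim          # table as wide as the input
half_width = [0.0] * (head_dim // 2)   # table half as wide as the input

full_after = len(duplicate_interleave(full_width))
half_after = len(duplicate_interleave(half_width))
example = duplicate_interleave([1, 2, 3])
print(full_after, half_after, example)
```

Under that assumption, the sin and cos tables would need a last dimension of `head_dim // 2` for the element-wise product with `x` to line up.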

[BUG] tensorflow imported

zeta/nn/modules/Matrix.py installs tensorflow, in-code (likely a bad idea), as does zeta/utils/disable_logging.py

(Matrix.py should be renamed.)
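
The issue does not show the code in question, so the following is only a sketch of a common alternative to installing a package from inside library code: treat the dependency as optional and degrade gracefully when it is missing. The helper name and the made-up package name `not_installed_pkg_zz` are assumptions for illustration.

```python
import importlib


def optional_import(name: str):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# "not_installed_pkg_zz" is a made-up name that stands in for a missing
# optional dependency such as tensorflow.
missing = optional_import("not_installed_pkg_zz")
present = optional_import("logging")

if missing is None:
    status = "optional dependency unavailable; skipping its setup"
else:
    status = "optional dependency configured"
print(status)
```

With a guard like this, importing the package no longer fails when the optional dependency is absent, and tests that do not need it can still be collected.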

All the tests that depend on these two modules fail, because the tensorflow import they perform is not available.

ERROR tests/test_init.py
ERROR tests/cloud/test_main.py
ERROR tests/models/test_andromeda.py
ERROR tests/models/test_basemodel.py
ERROR tests/models/test_gpt4.py
ERROR tests/models/test_gpt4multimodal.py
ERROR tests/models/test_llama2.py
ERROR tests/models/test_maxvit.py
ERROR tests/models/test_megavit.py
ERROR tests/models/test_navit.py
ERROR tests/models/test_palme.py
ERROR tests/models/test_vit.py
ERROR tests/nn/attentions/test_attend.py
ERROR tests/nn/attentions/test_cross_attn.py
ERROR tests/nn/attentions/test_cross_attn_multimodal.py
ERROR tests/nn/attentions/test_local_attn_mha.py
ERROR tests/nn/attentions/test_mha.py
ERROR tests/nn/attentions/test_mhaa.py
ERROR tests/nn/attentions/test_mqa.py
ERROR tests/nn/attentions/test_shaped_attn.py
ERROR tests/nn/attentions/test_sparq_attn.py
ERROR tests/nn/attentions/test_sparse_attn.py
ERROR tests/nn/attentions/test_test_mha.py
ERROR tests/nn/attentions/test_xc_attention.py
ERROR tests/nn/biases/test_alibi.py
ERROR tests/nn/biases/test_dynamic_relative.py
ERROR tests/nn/biases/test_relative_position_bias.py
ERROR tests/nn/embeddings/test_QFTSPEmbeddings.py
ERROR tests/nn/embeddings/test_abc_pos_emb.py
ERROR tests/nn/embeddings/test_patch_embedding.py
ERROR tests/nn/embeddings/test_qftp_embeddings.py
ERROR tests/nn/embeddings/test_rope.py
ERROR tests/nn/embeddings/test_rotary.py
ERROR tests/nn/embeddings/test_sine_positional_embs.py
ERROR tests/nn/embeddings/test_truncated_rotary_emb.py
ERROR tests/nn/embeddings/test_vision_embeddings.py
ERROR tests/nn/embeddings/test_vision_lang_embeddings.py
ERROR tests/nn/embeddings/test_xpos.py
ERROR tests/nn/embeddings/test_yarn.py
ERROR tests/nn/modules/test_accurategeluactivation.py
ERROR tests/nn/modules/test_activations.py
ERROR tests/nn/modules/test_adaptive_param.py
ERROR tests/nn/modules/test_adative_layernorm.py
ERROR tests/nn/modules/test_alr_block.py
ERROR tests/nn/modules/test_avg_model_merger.py
ERROR tests/nn/modules/test_clippedgeluactivation.py
ERROR tests/nn/modules/test_cross_attn_images.py
ERROR tests/nn/modules/test_custom_mlp.py
ERROR tests/nn/modules/test_denseblock.py
ERROR tests/nn/modules/test_dualpathblock.py
ERROR tests/nn/modules/test_dynamic_module.py
ERROR tests/nn/modules/test_dynamicroutingblock.py
ERROR tests/nn/modules/test_expert.py
ERROR tests/nn/modules/test_feedbackblock.py
ERROR tests/nn/modules/test_full_feedforward.py
ERROR tests/nn/modules/test_fused_dropout_layernom.py
ERROR tests/nn/modules/test_fused_gelu_dense.py
ERROR tests/nn/modules/test_gatedresidualblock.py
ERROR tests/nn/modules/test_geluactivation.py
ERROR tests/nn/modules/test_hebbian.py
ERROR tests/nn/modules/test_highwaylayer.py
ERROR tests/nn/modules/test_image_projector.py
ERROR tests/nn/modules/test_img_patch_embed.py
ERROR tests/nn/modules/test_kv_cache.py
ERROR tests/nn/modules/test_laplaceactivation.py
ERROR tests/nn/modules/test_linearactivation.py
ERROR tests/nn/modules/test_log_ff.py
ERROR tests/nn/modules/test_mishactivation.py
ERROR tests/nn/modules/test_mlp.py
ERROR tests/nn/modules/test_mm_adapter.py
ERROR tests/nn/modules/test_newgeluactivation.py
ERROR tests/nn/modules/test_polymorphic_neuron.py
ERROR tests/nn/modules/test_pytorchgelutanh.py
ERROR tests/nn/modules/test_quantized_layernorm.py
ERROR tests/nn/modules/test_quickgeluactivation.py
ERROR tests/nn/modules/test_recursiveblock.py
ERROR tests/nn/modules/test_relusquaredactivation.py
ERROR tests/nn/modules/test_resnet.py
ERROR tests/nn/modules/test_simple_feedforward.py
ERROR tests/nn/modules/test_simple_mamba.py
ERROR tests/nn/modules/test_simple_res_block.py
ERROR tests/nn/modules/test_slerp_model_merger.py
ERROR tests/nn/modules/test_stochasticskipblock.py
ERROR tests/nn/modules/test_test_conv_lang.py
ERROR tests/nn/modules/test_test_h3_layer.py
ERROR tests/nn/modules/test_test_s4.py
ERROR tests/nn/modules/test_token_learner.py
ERROR tests/nn/modules/test_transformations.py
ERROR tests/nn/modules/test_tripleskipblock.py
ERROR tests/nn/modules/test_unet.py
ERROR tests/nn/modules/test_visual_expert.py
ERROR tests/ops/test_einops_from_to.py
ERROR tests/ops/test_einops_poly.py
ERROR tests/ops/test_mos.py
ERROR tests/optim/test_decoupled_lion.py
ERROR tests/optim/test_gradient_ascent.py
ERROR tests/optim/test_gradient_equillibrum.py
ERROR tests/optim/test_lion8b.py
ERROR tests/optim/test_stable_adamw.py
ERROR tests/quant/test_bitlinear.py
ERROR tests/quant/test_niva.py
ERROR tests/quant/test_qlora.py
ERROR tests/quant/test_quik.py
ERROR tests/quant/test_resudual_vq.py
ERROR tests/rl/test_vision_reward_model.py
ERROR tests/structs/test_autoregressive_wrapper.py
ERROR tests/structs/test_efficient_net.py
ERROR tests/structs/test_encoder_decoder.py
ERROR tests/structs/test_encoderdecoder.py
ERROR tests/structs/test_hierarchicalblock.py
ERROR tests/structs/test_localtransformer.py
ERROR tests/structs/test_paralleltransformerblock.py
ERROR tests/structs/test_simpletransformer.py
ERROR tests/structs/test_transformer.py
ERROR tests/structs/test_vitransformerwrapper.py
ERROR tests/tokenizers/test_gptx.py
ERROR tests/tokenizers/test_llama_tokenizer.py
ERROR tests/tokenizers/test_multimodal_tokenizer.py
ERROR tests/tokenizers/test_sentencepiece.py
ERROR tests/tokenizers/test_tokenmonster.py
ERROR tests/training/test_parallel_wrapper.py
ERROR tests/utils/test_absmax.py
ERROR tests/utils/test_cast_tuple.py
ERROR tests/utils/test_cosine_beta_schedule.py
ERROR tests/utils/test_default.py
ERROR tests/utils/test_disable_warnings_and_logs.py
ERROR tests/utils/test_enforce_types.py
ERROR tests/utils/test_exists.py
ERROR tests/utils/test_get_sinusoid_encoding_table.py
ERROR tests/utils/test_gif_to_tensor.py
ERROR tests/utils/test_group_by_key_prefix.py
ERROR tests/utils/test_group_dict_by_key.py
ERROR tests/utils/test_gumbel_noise.py
ERROR tests/utils/test_interpolate_pos_encoding_2d.py
ERROR tests/utils/test_log.py
ERROR tests/utils/test_maybe.py
ERROR tests/utils/test_module_device.py
ERROR tests/utils/test_once.py
ERROR tests/utils/test_pad_at_dim.py
ERROR tests/utils/test_pick_and_pop.py
ERROR tests/utils/test_print_cuda_memory_usage.py
ERROR tests/utils/test_print_main.py
ERROR tests/utils/test_print_num_params.py
ERROR tests/utils/test_save_load.py
ERROR tests/utils/test_save_load_wrapper.py
ERROR tests/utils/test_save_memory_snapshot.py
ERROR tests/utils/test_string_begins_with.py
ERROR tests/utils/test_top_a.py
ERROR tests/utils/test_top_k.py
ERROR tests/utils/test_top_p.py
ERROR tests/utils/test_track_cuda_memory_usage.py
ERROR tests/utils/test_video_tensor_to_gift.py
ERROR zeta/nn/modules/test_dense_connect.py

[BUG] utils/main/pad_at_dim - recommend refactoring to use torch.nn.functional.pad

The original function as written:

import torch.nn.functional as F

def pad_at_dim(t, pad, dim=-1, value=0.0):
    dims_from_right = (-dim - 1) if dim < 0 else (t.ndim - dim - 1)
    zeros = (0, 0) * dims_from_right
    return F.pad(t, (*zeros, *pad), value=value)

This assumes simple behavior. The PyTorch and TensorFlow implementations:

torch.nn.functional.pad (https://pytorch.org/docs/stable/generated/torch.nn.functional.pad.html)
https://github.com/tensorflow/tensorflow/blob/v2.14.0/tensorflow/python/ops/array_ops.py#L3452-L3508

have more complex, and correct, behavior.

I noticed this because none of the tests in test_pad_at_dim.py are passing.
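
To make it easier to see what the quoted function actually hands to F.pad, here is a minimal sketch that reproduces only its tuple-building step in plain Python, so it runs without PyTorch. `pad_spec` is a hypothetical helper name, not part of zeta. It relies on the documented F.pad convention that the `pad` argument is read in (left, right) pairs starting from the last dimension.

```python
def pad_spec(ndim, pad, dim=-1):
    """Return the flat tuple that the quoted pad_at_dim would pass to F.pad.

    ndim stands in for t.ndim; pad is the (left, right) pair for `dim`.
    """
    # Number of dimensions to the right of `dim` that must be left unpadded.
    dims_from_right = (-dim - 1) if dim < 0 else (ndim - dim - 1)
    # One (0, 0) pair per skipped trailing dimension.
    zeros = (0, 0) * dims_from_right
    return (*zeros, *pad)


# For a 3-D tensor, padding 1 on the left and 2 on the right:
print(pad_spec(3, (1, 2), dim=-1))  # last dimension
print(pad_spec(3, (1, 2), dim=1))   # middle dimension
print(pad_spec(3, (1, 2), dim=0))   # first dimension
```

Comparing these tuples with what each failing test in test_pad_at_dim.py expects may help narrow down whether the mismatch is in the function or in the test's assumptions.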

pad_at_dim is used in:

playground/models/stacked_mm_bitnet
nn/biases/alibi.py
nn/modules/shift_tokens.py
zeta/structs/transformer.py

Circular import on Python 3.9 and PyTorch 1.12.1+cu113 [BUG]

Describe the bug

I ran the code in your implementation of rtx2 and got an error from the zeta package.

I have updated the botocore package and installed the dependencies classifier-free-guidance-pytorch and efficientnet-pytorch.

I did not have this issue when running on PyTorch 2.0+ with Python 3.10+.

To Reproduce
Steps to reproduce the behavior:

Run the script rtx2_example.py
The error below is raised

Expected behavior
A successful run.

Terminal output

Trying to use first CUDA device.
Using CUDA.
torch.Size([1, 150528])
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /media/root/Toshiba XG3/works/agi_computer_control/rt_x_experiments/RT-X/rtx2_example.py:20 in   │
│ <module>                                                                                         │
│                                                                                                  │
│   17 │   except:                                                                                 │
│   18 │   │   print("Could not find DirectML device.")                                            │
│   19 print(f"Using {device_name}.")                                                              │
│ ❱ 20 from rtx import RTX2                                                                        │
│   21                                                                                             │
│   22                                                                                             │
│   23 def forward_new(self, img: torch.Tensor, text: torch.Tensor):                               │
│                                                                                                  │
│ /media/root/Toshiba XG3/works/agi_computer_control/rt_x_experiments/RT-X/rtx/__init__.py:1 in    │
│ <module>                                                                                         │
│                                                                                                  │
│ ❱ 1 from rtx.rtx2 import RTX2                                                                    │
│   2                                                                                              │
│                                                                                                  │
│ /media/root/Toshiba XG3/works/agi_computer_control/rt_x_experiments/RT-X/rtx/rtx2.py:4 in        │
│ <module>                                                                                         │
│                                                                                                  │
│     1 #!pip install torch zetascale                                                              │
│     2                                                                                            │
│     3 import torch                                                                               │
│ ❱   4 from zeta.nn.architecture import (                                                         │
│     5 │   AutoregressiveWrapper,                                                                 │
│     6 │   Decoder,                                                                               │
│     7 │   Encoder,                                                                               │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/zeta/__init__.py:1 in <module>                            │
│                                                                                                  │
│ ❱  1 from zeta import nn                                                                         │
│    2 from zeta.nn.architecture.transformer import FeedForward                                    │
│    3 from zeta.nn.modules.layernorm import LayerNorm                                             │
│    4 from zeta import models                                                                     │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/__init__.py:3 in <module>                         │
│                                                                                                  │
│    1 # architecture                                                                              │
│    2 # from zeta.nn.architecture import *                                                        │
│ ❱  3 from zeta.nn import architecture                                                            │
│    4                                                                                             │
│    5 # Attention                                                                                 │
│    6 # from zeta.nn.attention import *                                                           │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/architecture/__init__.py:2 in <module>            │
│                                                                                                  │
│    1 from zeta.nn.architecture.auto_regressive_wrapper import AutoregressiveWrapper              │
│ ❱  2 from zeta.nn.architecture.encoder import Encoder                                            │
│    3 from zeta.nn.architecture.encoder_decoder import EncoderDecoder                             │
│    4 from zeta.nn.architecture.hierarchical_transformer import HierarchicalTransformer           │
│    5 from zeta.nn.architecture.local_transformer import LocalTransformer                         │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/architecture/encoder.py:1 in <module>             │
│                                                                                                  │
│ ❱ 1 from zeta.nn.architecture.transformer import AttentionLayers                                 │
│   2                                                                                              │
│   3                                                                                              │
│   4 class Encoder(AttentionLayers):                                                              │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/architecture/transformer.py:15 in <module>        │
│                                                                                                  │
│     12 from einops import rearrange, reduce, repeat                                              │
│     13 from torch import Tensor, einsum, nn                                                      │
│     14                                                                                           │
│ ❱   15 from zeta.nn.attention.attend import Attend, Intermediates                                │
│     16 from functools import reduce                                                              │
│     17                                                                                           │
│     18 EfficientAttentionConfig = namedtuple(                                                    │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/attention/__init__.py:14 in <module>              │
│                                                                                                  │
│   11 # from zeta.nn.attention.mgqa import MGQA                                                   │
│   12                                                                                             │
│   13 # from zeta.nn.attention.spatial_linear_attention import SpatialLinearAttention             │
│ ❱ 14 from zeta.nn.attention.mixture_attention import (                                           │
│   15 │   MixtureOfAttention,                                                                     │
│   16 │   MixtureOfAutoregressiveAttention,                                                       │
│   17 )                                                                                           │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/zeta/nn/attention/mixture_attention.py:8 in <module>      │
│                                                                                                  │
│     5                                                                                            │
│     6 from typing import Tuple, Optional                                                         │
│     7 from einops import rearrange, repeat, reduce, pack, unpack                                 │
│ ❱   8 from zeta.models.vit import exists                                                         │
│     9 from zeta.nn.architecture.transformer import RMSNorm, apply_rotary_pos_emb                 │
│    10                                                                                            │
│    11 from zeta.nn.attention.attend import Attend                                                │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/zeta/models/__init__.py:3 in <module>                     │
│                                                                                                  │
│    1 # Copyright (c) 2022 Agora                                                                  │
│    2 # Licensed under The MIT License [see LICENSE for details]                                  │
│ ❱  3 from zeta.models.andromeda import Andromeda                                                 │
│    4 from zeta.models.base import BaseModel                                                      │
│    5 from zeta.models.gpt4 import GPT4, GPT4MultiModal                                           │
│    6 from zeta.models.llama import LLama2                                                        │
│                                                                                                  │
│ /usr/local/lib/python3.9/dist-packages/zeta/models/andromeda.py:5 in <module>                    │
│                                                                                                  │
│     2 from torch.nn import Module                                                                │
│     3                                                                                            │
│     4 from zeta.nn.architecture.auto_regressive_wrapper import AutoregressiveWrapper             │
│ ❱   5 from zeta.nn.architecture.transformer import (                                             │
│     6 │   Decoder,                                                                               │
│     7 │   Transformer,                                                                           │
│     8 )                                                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ImportError: cannot import name 'Decoder' from partially initialized module 
'zeta.nn.architecture.transformer' (most likely due to a circular import) 
(/usr/local/lib/python3.9/dist-packages/zeta/nn/architecture/transformer.py)
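
The traceback shows transformer.py starting to load, importing from zeta.nn.attention, and that chain eventually reaching zeta/models/andromeda.py, which asks for `Decoder` from the still half-loaded transformer.py. The sketch below reproduces that shape with two throwaway modules written to a temporary directory. The names `transformer_demo` and `attend_demo` are hypothetical stand-ins, not zeta modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def reproduce_cycle():
    """Create two modules that import each other and return the error text."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Stand-in for transformer.py: imports another module BEFORE defining Decoder.
        (root / "transformer_demo.py").write_text(
            "import attend_demo\n"
            "class Decoder:\n"
            "    pass\n"
        )
        # Stand-in for the module that reaches back for Decoder too early.
        (root / "attend_demo.py").write_text(
            "from transformer_demo import Decoder\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("transformer_demo")
        except ImportError as exc:
            return str(exc)
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            for name in ("transformer_demo", "attend_demo"):
                sys.modules.pop(name, None)
    return None


message = reproduce_cycle()
print(message)
```

In the traceback, the link that pulls in zeta/models/__init__.py is `from zeta.models.vit import exists` in mixture_attention.py. A common way to break a loop like this is to import such a helper from a module that does not import back into the package, or to defer the import until it is needed. Whether that matches the fix the maintainers prefer is an assumption.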

[TEST] models/test_andromeda num_tokens attribute

FAILED models/test_andromeda.py::test_initial_parameters - AttributeError: 'Andromeda' object has no attribute 'num_tokens'
FAILED models/test_andromeda.py::test_forward_successful - ImportError: import error in zeta.models.AutoregressiveWrapper: No module named 'zeta.models.Au...
FAILED models/test_andromeda.py::test_forward_exception - ImportError: import error in zeta.models.AutoregressiveWrapper: No module named 'zeta.models.Au...

When I looked at the attributes Andromeda has, I didn't see num_tokens, even though andromeda.py says it should be there:

['Andromeda', 'T_destination', '__annotations__', '__call__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_apply', '_backward_hooks', '_backward_pre_hooks', '_buffers', '_call_impl', '_compiled_call_impl', '_forward_hooks', '_forward_hooks_always_called', '_forward_hooks_with_kwargs', '_forward_pre_hooks', '_forward_pre_hooks_with_kwargs', '_get_backward_hooks', '_get_backward_pre_hooks', '_get_name', '_is_full_backward_hook', '_load_from_state_dict', '_load_state_dict_post_hooks', '_load_state_dict_pre_hooks', '_maybe_warn_non_full_backward_hook', '_modules', '_named_members', '_non_persistent_buffers_set', '_parameters', '_register_load_state_dict_pre_hook', '_register_state_dict_hook', '_replicate_for_data_parallel', '_save_to_state_dict', '_slow_forward', '_state_dict_hooks', '_state_dict_pre_hooks', '_version', '_wrapped_call_impl', 
'add_module', 'apply', 
'bfloat16', 'buffers', 
'call_super_init', 'children', 'compile', 'cpu', 'cuda', 
'decoder', 'double', 'dump_patches', 
'eval', 'extra_repr', 
'float', 'forward', 
'get_buffer', 'get_extra_state', 'get_parameter', 'get_submodule', 
'half', 
'ipu', 
'load_state_dict', 
'modules', 
'named_buffers', 'named_children', 'named_modules', 'named_parameters', 'parameters', 
'register_backward_hook', 'register_buffer', 'register_forward_hook', 'register_forward_pre_hook', 'register_full_backward_hook', 'register_full_backward_pre_hook', 'register_load_state_dict_post_hook', 'register_module', 'register_parameter', 'register_state_dict_pre_hook', 'requires_grad_', 
'set_extra_state', 'share_memory', 'state_dict', 
'to', 'to_empty', 'train', 'training', 'type', 
'xpu', 
'zero_grad']
Andromeda is a Transformer model, but I don't see num_tokens on Transformer either.

Do those attributes look familiar?
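
One possible explanation, offered only as an assumption, is that the constructor argument is forwarded to a sub-module rather than stored on the object itself. In that case it would not show up in `dir()` of the outer object. The sketch below illustrates this with plain Python classes. `Inner`, `Outer`, and `find_attr` are hypothetical names and do not reflect zeta's actual code.

```python
class Inner:
    def __init__(self, num_tokens):
        self.num_tokens = num_tokens


class Outer:
    def __init__(self, num_tokens):
        # The argument is forwarded to a child object, not kept on self.
        self.inner = Inner(num_tokens)


def find_attr(obj, name, depth=2):
    """Return dotted paths where `name` exists, searching instance attributes."""
    found = []
    if hasattr(obj, name):
        found.append(name)
    if depth > 0:
        for child_name, child in vars(obj).items():
            # Only descend into objects that carry their own attributes.
            if hasattr(child, "__dict__"):
                found += [
                    f"{child_name}.{path}"
                    for path in find_attr(child, name, depth - 1)
                ]
    return found


model = Outer(num_tokens=50432)
print(hasattr(model, "num_tokens"))
print(find_attr(model, "num_tokens"))
```

For a real torch.nn.Module, sub-modules are stored in `_modules` rather than directly in the instance dictionary, so iterating over `named_modules()` and calling `hasattr` on each would be the closer equivalent.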

<<<<<<< HEAD
from zeta.nn.attention.multiquery_attention import (
MultiQueryAttention,
)
from zeta.nn.modules.simple_feedforward import SimpleFeedForward