kyegomez / screenai
Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding"
Home Page: https://discord.gg/GYbXvDGevY
License: MIT License
Describe the bug
After running pip install screenai, the line from screenai.main import ScreenAI in the default example raises a runtime error:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (512x4 and 512x512)
To Reproduce
Steps to reproduce the behavior:
pip install screenai, then run the default example.
Expected behavior
The default example runs without error.
Screenshots
`---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_20976\3292023021.py in <cell line: 2>()
1 import torch
----> 2 from screenai.main import ScreenAI
3
4 # Create a tensor for the image
5 image = torch.rand(1, 3, 224, 224)
~\AppData\Local\Programs\Python\Python39\lib\site-packages\screenai\__init__.py in <module>
----> 1 from screenai.main import (
2 CrossAttention,
3 MultiModalEncoder,
4 MultiModalDecoder,
5 ScreenAI,
~\AppData\Local\Programs\Python\Python39\lib\site-packages\screenai\main.py in <module>
5 from torch import Tensor, einsum, nn
6 from torch.autograd import Function
----> 7 from zeta.nn import (
8 SwiGLU,
9 FeedForward,
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\__init__.py in <module>
26 logger.addFilter(f)
27
---> 28 from zeta.nn import *
29 from zeta.models import *
30 from zeta.utils import *
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\__init__.py in <module>
1 from zeta.nn.attention import *
2 from zeta.nn.embeddings import *
----> 3 from zeta.nn.modules import *
4 from zeta.nn.biases import *
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\modules\__init__.py in <module>
45 from zeta.nn.modules.s4 import s4d_kernel
46 from zeta.nn.modules.h3 import H3Layer
---> 47 from zeta.nn.modules.mlp_mixer import MLPMixer
48 from zeta.nn.modules.leaky_relu import LeakyRELU
49 from zeta.nn.modules.adaptive_layernorm import AdaptiveLayerNorm
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\modules\mlp_mixer.py in <module>
143 1, 512, 32, 32
144 ) # Batch size of 1, 512 channels, 32x32 image
--> 145 output = mlp_mixer(example_input)
146 print(
147 output.shape
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
1519
1520 def _call_impl(self, *args, **kwargs):
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *args, **kwargs)
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1528
1529 try:
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\modules\mlp_mixer.py in forward(self, x)
123 x = rearrange(x, "n c h w -> n (h w) c")
124 for mixer_block in self.mixer_blocks:
--> 125 x = mixer_block(x)
126 x = self.pred_head_layernorm(x)
127 x = x.mean(dim=1)
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
1519
1520 def _call_impl(self, *args, **kwargs):
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *args, **kwargs)
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1528
1529 try:
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\modules\mlp_mixer.py in forward(self, x)
61 y = self.norm1(x)
62 y = rearrange(y, "n c t -> n t c")
---> 63 y = self.tokens_mlp(y)
64 y = rearrange(y, "n t c -> n c t")
65 x = x + y
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
1519
1520 def _call_impl(self, *args, **kwargs):
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *args, **kwargs)
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1528
1529 try:
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\modules\mlp_mixer.py in forward(self, x)
28 torch.Tensor: description
29 """
---> 30 y = self.dense1(x)
31 y = F.gelu(y)
32 return self.dense2(y)
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
1519
1520 def _call_impl(self, *args, **kwargs):
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *args, **kwargs)
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1528
1529 try:
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\linear.py in forward(self, input)
112
113 def forward(self, input: Tensor) -> Tensor:
--> 114 return F.linear(input, self.weight, self.bias)
115
116 def extra_repr(self) -> str:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (512x4 and 512x512)`
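For anyone reading the traceback: the final message says that a linear layer expecting 512 input features (mat2 is 512x512) received a tensor whose last dimension is 4 (mat1 is 512x4). The sketch below is plain Python, not PyTorch, and only mimics that shape rule so the message is easier to read. The helper name `linear_output_shape` is hypothetical, and the input shape `(1, 512, 4)` is an assumption chosen to match the message, not a value taken from the zeta source.

```python
def linear_output_shape(input_shape, in_features, out_features):
    """Mimic the shape rule of a linear layer.

    The last dimension of the input must equal in_features; every
    leading dimension is treated as a batch dimension.
    """
    if len(input_shape) == 0:
        raise ValueError("input must have at least one dimension")

    last = input_shape[-1]
    if last != in_features:
        # Leading dimensions are flattened into rows, which is why the
        # error reports a 2-D shape even for a 3-D input.
        rows = 1
        for dim in input_shape[:-1]:
            rows *= dim
        raise ValueError(
            "mat1 and mat2 shapes cannot be multiplied "
            f"({rows}x{last} and {in_features}x{out_features})"
        )
    return (*input_shape[:-1], out_features)


# Compatible: the last dimension (512) matches in_features.
print(linear_output_shape((1, 4, 512), 512, 512))

# Incompatible: the two trailing axes are swapped (assumed shape).
try:
    linear_output_shape((1, 512, 4), 512, 512)
except ValueError as err:
    print(err)
```

If this reading is right, the tensor reaching the layer has its feature axis in the wrong position. The traceback also shows the failing call sitting at module level in mlp_mixer.py (lines 143 to 147), which would explain why a bare import is enough to trigger it.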
Hello @kyegomez ,
Thank you so much for this awesome repo. I'm very excited to test this project. I tried the example code, but it gives me the error below:
SyntaxError: Non-UTF-8 code starting with '\xff' in file C:\Users\alamj\Downloads\screenai.py on line 1, but no encoding declared; see https://peps.python.org/pep-0263/ for details
Could you provide a real example that takes an input image and text, converts them to vectors, and feeds them to the model? I'd really like to try it out.
Thank you!
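A note on the SyntaxError above: a first byte of `\xff` is consistent with a UTF-16 little-endian byte-order mark (`FF FE`), which some Windows tools write when saving or redirecting text. That is only a plausible cause, not a confirmed one. Assuming it applies, here is a minimal sketch that detects a UTF-16 or UTF-32 BOM and rewrites the file as UTF-8. The function name `reencode_to_utf8` is hypothetical, and the file path is whatever script triggers the error.

```python
import codecs
from pathlib import Path

# UTF-32 is checked first because its little-endian BOM (FF FE 00 00)
# begins with the UTF-16 little-endian BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def reencode_to_utf8(path):
    """Rewrite a BOM-marked UTF-16/UTF-32 file as UTF-8.

    Returns True if the file was rewritten, False if no such BOM was found.
    """
    file_path = Path(path)
    raw = file_path.read_bytes()
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            # The "utf-16" and "utf-32" codecs read the BOM to pick the
            # byte order and drop it from the decoded text.
            text = raw.decode(encoding)
            file_path.write_bytes(text.encode("utf-8"))
            return True
    return False
```

Calling `reencode_to_utf8` on the script that fails overwrites it in place, so keep a copy first. If it returns False, the file is in some other encoding and this sketch does not apply.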