Comments (9)

Yingrjimsch commented on August 22, 2024

Same here on Windows 10.

Edit: I solved this by running pip uninstall zetascale and then reinstalling with pip install zetascale. In my case an ancient version (0.9.xyz) had been installed; after I installed the newest version, 2.2.7, it worked.
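
A quick way to confirm which version pip actually installed is to ask Python's package metadata. This is a minimal sketch using only the standard library; the helper names installed_version and version_tuple are made up for illustration, and the 2.2.7 threshold is simply the version reported to work in this comment, not an official minimum.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(text: str):
    """Turn '2.2.7' into (2, 2, 7); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


found = installed_version("zetascale")
if found is None:
    print("zetascale is not installed")
elif version_tuple(found) < version_tuple("2.2.7"):
    print(f"zetascale {found} is older than 2.2.7; consider: pip install --upgrade zetascale")
else:
    print(f"zetascale {found} looks recent enough")
```

If the reported version is still old after reinstalling, it may be worth checking that pip and the Python interpreter you run belong to the same environment.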

@kyegomez, it might be good to update the README example with the actual example from example.py. After solving this issue I ran into more problems with the README example because:

  1. there was no num_tokens defined
  2. there was no max_seq_len defined
  3. image and text were not initialized with the right dimensions

Another question I've got is, how did you choose num_tokens and max_seq_len?
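
On the input side, the working example later in this thread treats num_tokens as the vocabulary size and max_seq_len as the longest sequence the model can handle. Under that reading, a token sequence is valid when every id lies in [0, num_tokens) and its length does not exceed max_seq_len. The sketch below checks just that with plain Python lists; check_text_ids is a hypothetical helper, not part of screenai, and 2000 and 1024 are example settings rather than recommendations.

```python
def check_text_ids(ids, num_tokens, max_seq_len):
    """Return a list of problems found in a token-id sequence (empty if none)."""
    problems = []
    # The sequence must fit into the model's maximum length.
    if len(ids) > max_seq_len:
        problems.append(f"length {len(ids)} exceeds max_seq_len={max_seq_len}")
    # Every id must be a valid index into a vocabulary of size num_tokens.
    bad = [i for i in ids if not 0 <= i < num_tokens]
    if bad:
        problems.append(f"{len(bad)} id(s) outside [0, {num_tokens})")
    return problems


print(check_text_ids([5, 17, 1999], num_tokens=2000, max_seq_len=1024))
print(check_text_ids([5, 2000], num_tokens=2000, max_seq_len=1024))
```

In practice num_tokens would normally follow from whichever tokenizer produces the ids, so it is less a free choice than a property of the tokenizer; that is a general observation, not something the maintainers have confirmed here.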

from screenai.

github-actions commented on August 22, 2024

Hello there, thank you for opening an issue! 🙏🏻 The team was notified and they will get back to you ASAP.

DevChrisRoth commented on August 22, 2024

Got the same issue on an M1 Mac.

emarashliev commented on August 22, 2024

Same here on an Intel Mac.

carlitose commented on August 22, 2024

Same on an M2 Mac.

zhaixiaowai commented on August 22, 2024

Same with Windows 11 and WSL.

github-actions commented on August 22, 2024

Stale issue message

MElmardi commented on August 22, 2024

Same with Linux Ubuntu 24 LTS

RokiRan commented on August 22, 2024

After some modifications I got working code; I hope it solves your problem.

import torch
from screenai.main import ScreenAI

# Create the image tensor
image = torch.rand(1, 3, 224, 224)

# Create an instance of the ScreenAI model
model = ScreenAI(
    num_tokens=2000,
    max_seq_len=1024,
    patch_size=16,
    image_size=224,
    dim=512,
    depth=6,
    heads=8,
    vit_depth=4,
    multi_modal_encoder_depth=4,
    llm_decoder_depth=4,
    mm_encoder_ff_mult=4,
)

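# Assume your text has already been converted to token indices; here we use random integers to simulate that
# num_tokens is your vocabulary size; max_seq_len is the maximum sequence length the model can handle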
text_indices = torch.randint(0, model.num_tokens, (1, model.max_seq_len))

# Convert the text index tensor to a long (int64) tensor
text = text_indices.long()

# Run a forward pass of the model with the given text and image tensors
out = model(text, image)

# Print the output tensor (use out.shape to see only its shape)
print(out)
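
For readers wondering how image_size and patch_size relate: if the vision encoder splits the image into non-overlapping square patches, as a standard ViT does (an assumption here; the thread does not confirm how ScreenAI patches its input), then image_size=224 with patch_size=16 gives 14 × 14 = 196 patches. A small sketch of that arithmetic, with patch_grid as a hypothetical helper:

```python
def patch_grid(image_size: int, patch_size: int):
    """Return (patches_per_side, total_patches) for non-overlapping square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


print(patch_grid(224, 16))  # (14, 196)
```

Under the same assumption, an image_size that is not a multiple of patch_size would be rejected by this helper, which may be one reason mismatched image dimensions cause shape errors.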
