
Comments (6)

Yingrjimsch commented on August 22, 2024

Hello,

I was able to run it on an actual image with the following code:

import torch
from torchvision.io import read_image
from screenai.main import ScreenAI

# Create a tensor for the image
image = read_image('test.png').unsqueeze(0).to(torch.float32)
# Create a tensor for the text
text = torch.randint(0, 20000, (1, 1028))

# Create an instance of the ScreenAI model with specified parameters
model = ScreenAI(
    num_tokens = 20000,
    max_seq_len = 1028,
    patch_size=16,
    image_size=224,
    dim=512,
    depth=6,
    heads=8,
    vit_depth=4,
    multi_modal_encoder_depth=4,
    llm_decoder_depth=4,
    mm_encoder_ff_mult=4,
)

# Perform forward pass of the model with the given text and image tensors
out = model(text, image)

# Print the output tensor
print(out)

You also need a test image, which must be 224 x 224 pixels to match image_size.
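A quick way to catch a wrongly sized image before the forward pass is to check the tensor's shape against the model settings. This is only a sketch: `check_image_shape` is a hypothetical helper, not part of screenai, and it assumes the model splits the image into square, non-overlapping patches, which the `patch_size` and `image_size` parameters suggest but the thread does not confirm.

```python
def check_image_shape(shape, image_size=224, patch_size=16):
    """Validate a (batch, channels, height, width) shape and return the patch count.

    Hypothetical helper for illustration; assumes square, non-overlapping patches.
    """
    if len(shape) != 4:
        raise ValueError(f"expected 4 dimensions (B, C, H, W), got {len(shape)}")
    _, _, height, width = shape
    if (height, width) != (image_size, image_size):
        raise ValueError(
            f"image is {height} x {width}, expected {image_size} x {image_size}"
        )
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side


# With the settings from the snippet above: 224 / 16 = 14 patches per side.
num_patches = check_image_shape((1, 3, 224, 224))
print(num_patches)
```

With a real tensor you would pass `tuple(image.shape)`.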

Maybe this helps.

from screenai.

github-actions commented on August 22, 2024

Hello there, thank you for opening an Issue ! 🙏🏻 The team was notified and they will get back to you asap.


JamshedAlamQaderi commented on August 22, 2024

@Yingrjimsch thank you so much for the help. Can you also tell me whether I can pass in prompt text by encoding it to a tensor, and how to decode the output tensor?


Yingrjimsch commented on August 22, 2024

Hi @JamshedAlamQaderi, I haven't had time to try that yet, but I would suggest using the Hugging Face Transformers library to find a tokenizer. Run the tokenizer on your input text and set num_tokens and max_seq_len to match the tokenizer's specs (vocabulary size and maximum sequence length). If I have time I'll try it as well and keep you updated.
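To make the idea concrete, here is a minimal sketch of the encode/decode round trip. It deliberately swaps in a tiny whitespace tokenizer (`ToyTokenizer`, a hypothetical class written for this example) instead of a real Hugging Face tokenizer, so that it runs without downloads; the point is only how `num_tokens` and `max_seq_len` relate to the tokenizer.

```python
class ToyTokenizer:
    """Hypothetical whitespace tokenizer; a stand-in for a real tokenizer."""

    PAD, UNK = 0, 1

    def __init__(self, corpus, max_seq_len):
        # Build a fixed vocabulary from the example corpus.
        words = sorted({w for line in corpus for w in line.lower().split()})
        self.id_to_word = ["<pad>", "<unk>"] + words
        self.word_to_id = {w: i for i, w in enumerate(self.id_to_word)}
        self.max_seq_len = max_seq_len

    @property
    def vocab_size(self):
        return len(self.id_to_word)

    def encode(self, text):
        # Map words to ids, then truncate and pad to exactly max_seq_len.
        ids = [self.word_to_id.get(w, self.UNK) for w in text.lower().split()]
        ids = ids[: self.max_seq_len]
        return ids + [self.PAD] * (self.max_seq_len - len(ids))

    def decode(self, ids):
        # Drop padding and map ids back to words.
        return " ".join(self.id_to_word[i] for i in ids if i != self.PAD)


tokenizer = ToyTokenizer(
    ["click the login button", "open the settings menu"], max_seq_len=8
)
ids = tokenizer.encode("Click the settings button")
print(ids)
print(tokenizer.decode(ids))
```

In this setup the model would be created with `num_tokens=tokenizer.vocab_size` and `max_seq_len=8`, and the text input would be `torch.tensor([ids])`, giving shape (1, max_seq_len). Whether the model's output can be decoded the same way depends on what `out` contains, which this thread does not establish.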


Barney-Steven commented on August 22, 2024

Hi @JamshedAlamQaderi, this repo is not the official implementation. You can see the definition behind "from screenai.main import ScreenAI"; it is a very simple structure. ScreenAI is not open source for now. I found something similar on Hugging Face: try moondream2.


JamshedAlamQaderi commented on August 22, 2024

Thank you guys for helping me

