kyegomez / palm-e
Implementation of "PaLM-E: An Embodied Multimodal Language Model"
Home Page: https://discord.gg/GYbXvDGevY
License: Apache License 2.0
[2023-08-08 03:29:00,661] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["id2label"]` will be overriden.
`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["bos_token_id"]` will be overriden.
`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["eos_token_id"]` will be overriden.
Non-A100 GPU detected, using math or mem efficient attention if input tensor is on cuda
torch.Size([1, 257, 1024])
torch.Size([1, 64, 1024])
torch.Size([1, 64, 50304])
torch.Size([1, 50, 50304])
torch.Size([1, 114, 50304])
Error duing forward pass: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.FloatTensor instead (while checking arguments for embedding)
Output: None
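For anyone hitting the same message: this is the error `torch.nn.Embedding` raises when it receives a floating-point tensor instead of integer token ids. The sketch below reproduces it outside the PalmE code. The vocabulary size 50304 is taken from the shapes in the log; the embedding width of 8 and the variable names are assumptions for illustration only. One plausible reading of the log is that float logits were passed where token ids were expected, but that cannot be confirmed from this output alone.

```python
import torch
import torch.nn as nn

# Embedding table: 50304 matches the last dimension in the log above;
# embedding_dim=8 is an arbitrary choice for this sketch.
embed = nn.Embedding(num_embeddings=50304, embedding_dim=8)

# Token ids stored as floats, which is what triggers the error.
float_ids = torch.randint(0, 50304, (1, 64)).float()

try:
    embed(float_ids)
    raised = False
except (RuntimeError, TypeError) as exc:
    raised = True
    print("float input rejected:", exc)

# Casting to an integer dtype makes the lookup valid.
long_ids = float_ids.long()
out = embed(long_ids)
print(out.shape)  # torch.Size([1, 64, 8])
```

If the tensor reaching the embedding layer holds logits rather than ids, casting is not the right fix; the ids would have to be recovered first (for example with `argmax` over the vocabulary dimension).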
Hey,
I would like to know the minimum GPU memory needed to train the PaLM-E model. I have a PC with an RTX A6000 GPU (48 GB) and I don't know whether it's possible to train it on that.
Thank you very much.
Qing
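A rough way to reason about the question above is to estimate the memory taken by the training state alone. The sketch below uses the common rule of thumb for mixed-precision training with Adam: about 16 bytes per parameter (fp16 weights and gradients, plus fp32 master weights and two fp32 optimizer moments). The parameter count is a hypothetical placeholder, since the size of the model in this repo is not stated here, and activations, buffers, and fragmentation are not counted, so the real requirement is higher.

```python
# Bytes per parameter for mixed-precision Adam (rule of thumb):
#   fp16 weights (2) + fp16 gradients (2)
#   + fp32 master weights (4) + Adam first moment (4) + Adam second moment (4)
BYTES_PER_PARAM = 2 + 2 + 4 + 4 + 4  # = 16

GIB = 2 ** 30


def training_state_gib(num_params: int) -> float:
    """Memory for weights, gradients, and optimizer state, in GiB."""
    return num_params * BYTES_PER_PARAM / GIB


def max_params_for(memory_gib: float) -> int:
    """Largest parameter count whose training state fits in memory_gib."""
    return int(memory_gib * GIB // BYTES_PER_PARAM)


# Hypothetical 1-billion-parameter model (an assumption, not the repo's size).
print(f"{training_state_gib(1_000_000_000):.2f} GiB")  # 14.90 GiB

# Upper bound on parameters for a 48 GiB card, ignoring activations.
print(max_params_for(48))  # 3221225472
```

By this estimate a 48 GB card holds the training state of a model of up to roughly 3 billion parameters, before any memory for activations is set aside.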
I think "from palme.model import PALME" should be "from palme.model import PalmE"?
Do you know how I can train it on a custom dataset?
Please tell me why you're deleting the palme source for the training step.
I tried to open the training SOP README linked from the main README, but the file is not found.
Many thanks for creating the library. I'm not used to the transformers library; how can I unpack the output?
For example: if I have an image (img) and a caption, how can I get text as output that answers the caption based on the image? I don't know how to work with tensor objects.
Another issue: where can I find the robotics datasets? Thanks
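On the question above about unpacking the output: if the model returns logits of shape (batch, sequence, vocabulary), as the shapes printed in the log earlier on this page suggest, a greedy decode takes the highest-scoring id at each position and maps it back to text. This is a minimal sketch with a made-up six-word vocabulary and hand-built logits; `id_to_word` is hypothetical, and real use would need the tokenizer that matches the model.

```python
import torch

# Hypothetical vocabulary; a real model would use its own tokenizer.
id_to_word = ["<pad>", "a", "red", "block", "on", "table"]

# Fake logits of shape (batch=1, sequence=3, vocab=6).
logits = torch.full((1, 3, len(id_to_word)), -1.0)
logits[0, 0, 2] = 5.0  # position 0 favours "red"
logits[0, 1, 3] = 5.0  # position 1 favours "block"
logits[0, 2, 5] = 5.0  # position 2 favours "table"

# Greedy decoding: pick the most likely token id at each position.
token_ids = logits.argmax(dim=-1)  # shape (1, 3), dtype torch.long

# Convert the first sequence in the batch to plain Python ints, then to words.
words = [id_to_word[i] for i in token_ids[0].tolist()]
print(" ".join(words))  # red block table
```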
Hello,
From the example we can see that the basic usage requires an image and text embeddings. But there is no information regarding how to extract these embeddings from text.
Let's say I have a question in natural language. How do I convert it into a torch tensor?
Thanks a lot
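To illustrate the general idea behind the question above: text is first split into tokens, each token is mapped to an integer id, and the ids are packed into a `torch.long` tensor with a batch dimension. The sketch below uses a toy whitespace tokenizer; `build_vocab` and `encode` are hypothetical helpers, not part of this repo, and a trained model needs the same tokenizer it was trained with.

```python
import torch


def build_vocab(texts):
    """Assign an integer id to every distinct lowercase word (toy tokenizer)."""
    vocab = {"<unk>": 0}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(text, vocab):
    """Map a sentence to a (1, sequence_length) tensor of integer token ids."""
    ids = [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]
    return torch.tensor(ids, dtype=torch.long).unsqueeze(0)


question = "what is on the table"
vocab = build_vocab([question])
token_ids = encode(question, vocab)
print(token_ids)  # tensor([[1, 2, 3, 4, 5]])
```

The integer dtype matters: embedding layers reject float inputs, so ids should stay `torch.long` until they have been embedded.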
Thanks for your great work!
Also, can you tell me how to turn the output into a plan?
Thanks. But how do you know your code is right without training it?
I'm getting the following error after running the example.py:
  File "palme/example.py", line 8, in <module>
    model = PalmE()
            ^^^^^^^
  File "palme/palme/model.py", line 122, in __init__
    self.encoder = ViTransformerWrapper(
    ^^^^^^^^^^^^
  File "python3.11/site-packages/torch/nn/modules/module.py", line 1643, in __setattr__
    raise AttributeError(
AttributeError: cannot assign module before Module.__init__() call
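PyTorch raises this `AttributeError` when a submodule is assigned to `self` before `nn.Module.__init__()` has run, which usually means `super().__init__()` is missing or called too late in the class's `__init__`. Line 122 of `model.py` is not shown here, so the sketch below only demonstrates the usual cause and fix; the class names `Broken` and `Fixed` are made up for illustration.

```python
import torch.nn as nn


class Broken(nn.Module):
    def __init__(self):
        # Assigning a submodule before nn.Module is initialised fails.
        self.encoder = nn.Linear(4, 4)
        super().__init__()


class Fixed(nn.Module):
    def __init__(self):
        # Initialise nn.Module first, then register submodules.
        super().__init__()
        self.encoder = nn.Linear(4, 4)


try:
    Broken()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)  # cannot assign module before Module.__init__() call

model = Fixed()  # constructs without error
```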
I have programmed my own robot in Python on a Raspberry Pi 4. I would like to know if I can use this project to make it intelligent, like Google's PaLM-E robot. The hardware is totally different, but it has functions to move forward, turn left, move the arms, and so on. Thanks.
NOTE: It has a camera and an ultrasonic sensor for trigonometry
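As described in the PaLM-E paper, the model emits its plan as text, and a separate low-level controller carries out each step. For a custom robot like the one above, one possible glue layer is a lookup table from step text to the robot's existing functions. Everything in this sketch is hypothetical: the skill names, the numbered-list plan format, and the stand-in functions that only record what they were asked to do.

```python
executed = []


# Stand-ins for the robot's own motion functions (hypothetical names).
def move_forward():
    executed.append("forward")


def turn_left():
    executed.append("left")


def raise_arm():
    executed.append("arm_up")


# Map plan-step text to the function that performs it.
SKILLS = {
    "move forward": move_forward,
    "turn left": turn_left,
    "raise arm": raise_arm,
}


def run_plan(plan_text):
    """Execute each known step in order; return the steps with no matching skill."""
    unknown = []
    for line in plan_text.splitlines():
        # Drop list numbering such as "1. " and normalise case.
        step = line.strip().lower().lstrip("0123456789. ").strip()
        if not step:
            continue
        skill = SKILLS.get(step)
        if skill is None:
            unknown.append(step)
        else:
            skill()
    return unknown


plan = "1. Move forward\n2. Turn left\n3. Open the drawer"
unknown_steps = run_plan(plan)
print(executed)       # ['forward', 'left']
print(unknown_steps)  # ['open the drawer']
```

Returning the unmatched steps instead of raising lets the caller decide whether to stop, ask the model to re-plan, or skip the step.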
Hi @kyegomez
Could you please provide more usage examples? It would be very helpful!