Comments (10)
Hey, thanks for reaching out! You are doing everything right; unfortunately, this is the "streaming" that Google provides. My first guess was that because of safety, Google needs to first ensure the output is safe before delivering it to you; that's why it hangs and then floods you with all events at once (just a hypothesis, I have no evidence of this being a fact).
I tried disabling the safety stuff to see if it would get better, but no luck:
client.stream_generate_content({
  contents: { role: 'user', parts: { text: 'hi!' } },
  safetySettings: [
    {
      category: 'HARM_CATEGORY_SEXUALLY_EXPLICIT',
      threshold: 'BLOCK_NONE'
    },
    {
      category: 'HARM_CATEGORY_HATE_SPEECH',
      threshold: 'BLOCK_NONE'
    },
    {
      category: 'HARM_CATEGORY_HARASSMENT',
      threshold: 'BLOCK_NONE'
    },
    {
      category: 'HARM_CATEGORY_DANGEROUS_CONTENT',
      threshold: 'BLOCK_NONE'
    }
  ]
})
I wrote more about this here: LBPE Score: A New Perspective for Evaluating AI LLMs
And compared Gemini "streaming" with other providers:
"Gemini Pro’s “streaming” is mostly waiting, then a burst of activity — it’s not genuinely streaming."
From my research and experiments, we are doing everything right according to the documentation (?alt=sse), and this is how it works. I would be happy to find out we are missing something, but so far, it sounds like it is what it is.
from gemini-ai.
Oh, that's super helpful. I will try to investigate that; maybe there are some internals in Faraday that we can tweak to make it work that way! Also, if you find something, please share it as well.
from gemini-ai.
I also tried prompting gemini-pro with Python using the google-cloud-aiplatform package, and it seems to support streaming as well.
screencast.2024-01-22.23-41-16.mp4
from gemini-ai.
Great job @gbaptista ! Thanks for resolving this.
from gemini-ai.
Hey @gbaptista,
Thanks for your descriptive response. I tried to implement my own client with a streaming response using a couple of tools, such as Faraday and Net::HTTP, and I got the same result. However, if you request Gemini with curl, you actually get streaming-like behavior.
curl -N -X POST -H "Content-Type: application/json" -d '{"contents": [{"role": "user", "parts": [{"text": "tell me short story"}]}]}' https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:streamGenerateContent\?key\=API_KEY 2> /dev/null | grep "text"
So I was wondering whether the server behaves differently when you make a request with curl, or whether something within Ruby buffers the response.
from gemini-ai.
Ok, found it. Add this gem:
gem 'faraday-typhoeus', '~> 1.1'
Add this before your code:
require 'faraday'
require 'faraday/typhoeus'
Faraday.default_adapter = :typhoeus
Streaming should work now.
Probably related to this:
I'm going to give some thought to how to include this in the gem. Typhoeus was the first one that worked, but I need to consider whether we want to provide a specific alternative default adapter. If we decide to do so, we need to choose which adapter would be the best choice.
from gemini-ai.
Oh, cool! I was also able to make it work with the typhoeus gem using Typhoeus::Request.new. It's great that this is possible just by changing Faraday's adapter.
IMO the gemini-ai gem could just switch the adapter to typhoeus for you when a streaming response is requested. Something like:
response = Faraday.new(request: @request_options) do |faraday|
  faraday.response :raise_error
  # Pick the Typhoeus adapter for this connection only when streaming is enabled.
  faraday.adapter :typhoeus if server_sent_events_enabled
end.post do |request|
  ...
from gemini-ai.
Another challenge with the streaming response from gemini-pro is that it arrives as a single JSON array split into chunks.
So in the first chunk you get something like:
[{
"candidates": [...]
}
Then the next one can be:
,
{
"candidates": [...]
}
I've even seen cases where the candidates object was split between two chunks. So by default these chunks are likely not parsable JSON on their own. One would need to buffer and analyze the string received from the server before it could be parsed and a callback block called.
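To make the buffering idea concrete, here is a minimal sketch in plain Ruby. It is not how gemini-ai handles this; JsonArrayStreamBuffer is a hypothetical helper name. It assumes the stream is one JSON array of objects, as in the chunks above, and that each chunk arrives as text that is not cut in the middle of a multi-byte character. It tracks brace depth and string state so that an object split across chunks is only parsed once its closing brace arrives.

```ruby
require 'json'

# Hypothetical helper (not part of gemini-ai): buffers raw chunks of a
# streamed JSON array and calls the block once per complete top-level object.
class JsonArrayStreamBuffer
  def initialize(&on_object)
    @on_object = on_object
    @buffer = +''       # text of the object currently being assembled
    @depth = 0          # nesting depth of {} inside the current object
    @in_string = false  # true while inside a JSON string literal
    @escaped = false    # true right after a backslash inside a string
  end

  # Feed one raw chunk exactly as it was received from the server.
  def <<(chunk)
    chunk.each_char { |char| consume(char) }
    self
  end

  private

  def consume(char)
    # Between objects, skip the array brackets, commas, and whitespace.
    return if @depth.zero? && char != '{'

    @buffer << char

    if @in_string
      track_string(char)
    elsif char == '"'
      @in_string = true
    elsif char == '{'
      @depth += 1
    elsif char == '}'
      @depth -= 1
      emit if @depth.zero?
    end
  end

  # Braces inside string literals must not change the depth counter.
  def track_string(char)
    if @escaped
      @escaped = false
    elsif char == '\\'
      @escaped = true
    elsif char == '"'
      @in_string = false
    end
  end

  def emit
    @on_object.call(JSON.parse(@buffer))
    @buffer = +''
  end
end

# Usage: the first object is split across two chunks.
events = []
stream = JsonArrayStreamBuffer.new { |event| events << event }
stream << '[{"candidates": [{"text": "Once upon'
stream << ' a time"}]}'
stream << ', {"candidates": [{"text": "The end"}]}]'
events.each { |event| puts event['candidates'].first['text'] }
```

Scanning character by character is slower than a real streaming parser, but it keeps the sketch dependency-free and makes the "wait until the object is complete" rule easy to see.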
from gemini-ai.
@alchaplinsky That's true, but fortunately, I already solved this problem (partial JSON responses) in another gem, so I know how to deal with it. Let's get it done; I will prepare a PR.
from gemini-ai.
@alchaplinsky done:
from gemini-ai.