This is a no-nonsense async Scala client for the OpenAI API, supporting all the available endpoints and params, including streaming, the newest ChatGPT completion, vision, and voice routines (as defined here), provided in a single, convenient service called OpenAIService. The supported calls are:
- Models: listModels and retrieveModel
- Completions: createCompletion
- Chat Completions: createChatCompletion (new: also with GPT vision support!), createChatFunCompletion (deprecated), and createChatToolCompletion (new)
- Edits: createEdit (deprecated)
- Images: createImage, createImageEdit, and createImageVariation
- Embeddings: createEmbeddings
- Audio: createAudioTranscription, createAudioTranslation, and createAudioSpeech (new)
- Files: listFiles, uploadFile, deleteFile, retrieveFile, and retrieveFileContent
- Fine-tunes: createFineTune, listFineTunes, retrieveFineTune, cancelFineTune, listFineTuneEvents, and deleteFineTuneModel
- Moderations: createModeration
- Threads (new): createThread, retrieveThread, modifyThread, and deleteThread
- Thread Messages (new): createThreadMessage, retrieveThreadMessage, modifyThreadMessage, listThreadMessages, retrieveThreadMessageFile, and listThreadMessageFiles
Note that, to be consistent with the OpenAI API naming, the service function names match the API endpoint titles/descriptions exactly, in camelCase.
Also, we aimed for the lib to be self-contained with the fewest dependencies possible; therefore, we ended up using only two libs, play-ahc-ws-standalone and play-ws-standalone-json (at the top level). Additionally, if dependency injection is required, we use the scala-guice lib as well.
This lib also supports "OpenAI-API-compatible" providers such as FastChat (an umbrella for open-source LLMs - Vicuna, Alpaca, LLaMA, fastchat-t5-3b-v1.0, mpt-7b-chat, etc.), Azure, or any other similar service with a custom URL. Check the examples below for more details.
For background information, read an article about the lib/client on Medium.
Also try out our Scala client for the Pinecone vector database, or use both clients together! This demo project shows how to generate and store OpenAI embeddings (with the text-embedding-ada-002 model) into Pinecone and query them afterward. The OpenAI + Pinecone combo is commonly used for autonomous AI agents, such as babyAGI and AutoGPT.
Important: this is a "community-maintained" library and, as such, has no relation to the OpenAI company.
The currently supported Scala versions are 2.12, 2.13, and 3.
To pull the library you have to add the following dependency to your build.sbt
"io.cequence" %% "openai-scala-client" % "1.0.0.RC.1"
or to pom.xml (if you use Maven)
<dependency>
    <groupId>io.cequence</groupId>
    <artifactId>openai-scala-client_2.12</artifactId>
    <version>1.0.0.RC.1</version>
</dependency>
If you want streaming support, use "io.cequence" %% "openai-scala-client-stream" % "1.0.0.RC.1" instead.
Config
- Env. variables: OPENAI_SCALA_CLIENT_API_KEY and optionally also OPENAI_SCALA_CLIENT_ORG_ID (if you have one)
- File config (default): openai-scala-client.conf
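Because the default config resolves the API key from the environment, it can help to check the variables up front and fail with a readable message. The following is only a minimal sketch using the Scala standard library; the `EnvCheck` object and `readCredentials` helper are hypothetical names and not part of this lib.

```scala
object EnvCheck {
  // Variable names as listed in the Config section.
  val ApiKeyVar = "OPENAI_SCALA_CLIENT_API_KEY"
  val OrgIdVar = "OPENAI_SCALA_CLIENT_ORG_ID"

  // Returns Right((apiKey, optionalOrgId)) or Left(error message).
  // The environment is a parameter so the check can be exercised without touching the real one.
  def readCredentials(
    env: Map[String, String] = sys.env
  ): Either[String, (String, Option[String])] =
    env.get(ApiKeyVar).filter(_.nonEmpty) match {
      case Some(key) => Right((key, env.get(OrgIdVar).filter(_.nonEmpty)))
      case None      => Left(s"Environment variable $ApiKeyVar is not set")
    }
}
```

The returned pair could then be handed to the "Without config" factory variant shown in the next section.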
I. Obtaining OpenAIService
First, you need to provide an implicit execution context as well as an Akka materializer, e.g., as
implicit val ec: ExecutionContext = ExecutionContext.global
implicit val materializer: Materializer = Materializer(ActorSystem())
Then you can obtain a service in one of the following ways.
- Default config (expects env. variable(s) to be set as defined in the Config section)
val service = OpenAIServiceFactory()
- Custom config
val config = ConfigFactory.load("path_to_my_custom_config")
val service = OpenAIServiceFactory(config)
- Without config
val service = OpenAIServiceFactory(
  apiKey = "your_api_key",
  orgId = Some("your_org_id") // if you have one
)
- Minimal OpenAICoreService supporting listModels, createCompletion, createChatCompletion, and createEmbeddings calls - e.g., a FastChat service running on port 8000
val service = OpenAICoreServiceFactory("http://localhost:8000/v1/")
- For Azure with API Key
val service = OpenAIServiceFactory.forAzureWithApiKey(
  resourceName = "your-resource-name",
  deploymentId = "your-deployment-id", // usually model name such as "gpt-35-turbo"
  apiVersion = "2023-05-15", // newest version
  apiKey = "your_api_key"
)
- For Azure with Access Token
val service = OpenAIServiceFactory.forAzureWithAccessToken(
  resourceName = "your-resource-name",
  deploymentId = "your-deployment-id", // usually model name such as "gpt-35-turbo"
  apiVersion = "2023-05-15", // newest version
  accessToken = "your_access_token"
)
Important: If you want streaming support, use OpenAIServiceStreamedFactory or OpenAICoreServiceStreamedFactory from the openai-scala-client-stream lib instead of OpenAIServiceFactory (in the three examples above). Three additional functions - createCompletionStreamed, createChatCompletionStreamed, and listFineTuneEventsStreamed (deprecated) - provided by OpenAIServiceStreamedExtra will then be available.
New: Note that it is now possible to use a streamed service also with a non-OpenAI provider, e.g., as:
val service = OpenAICoreServiceStreamedFactory.customInstance("http://localhost:8000/v1/")
- Via dependency injection (requires the openai-scala-guice lib)
class MyClass @Inject() (openAIService: OpenAIService) {...}
II. Calling functions
Full documentation of each call with its respective inputs and settings is provided in OpenAIService. Since all the calls are async, they return responses wrapped in Future.
New: There is a new project, openai-scala-client-examples, where you can find a lot of ready-to-use examples!
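Since every call returns a Future, failures surface as failed futures rather than thrown exceptions. The sketch below shows one common way to handle that with the standard library only; `fakeCall` is a hypothetical stand-in for a service call such as listModels, so the block runs without an API key.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FutureHandling {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-in for an async service call (not part of the lib).
  def fakeCall(fail: Boolean): Future[Seq[String]] =
    if (fail) Future.failed(new RuntimeException("simulated API error"))
    else Future.successful(Seq("model-a", "model-b"))

  // Turn a failed call into a readable message instead of an unhandled failure.
  def describe(call: Future[Seq[String]]): Future[String] =
    call
      .map(models => s"${models.size} models")
      .recover { case e: Exception => s"call failed: ${e.getMessage}" }

  // Blocking is acceptable at the very end of a script or in a test,
  // not inside request-handling code.
  def await[T](f: Future[T]): T = Await.result(f, 5.seconds)
}
```

In a short script, `FutureHandling.await(FutureHandling.describe(...))` keeps the JVM alive until the result arrives; in a long-running application you would normally keep composing futures instead of blocking.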
Examples:
- List models
service.listModels.map(models =>
  models.foreach(println)
)
- Retrieve model
service.retrieveModel(ModelId.text_davinci_003).map(model =>
  println(model.getOrElse("N/A"))
)
- Create completion
val text = """Extract the name and mailing address from this email:
  |Dear Kelly,
  |It was great to talk to you at the seminar. I thought Jane's talk was quite good.
  |Thank you for the book. Here's my address 2111 Ash Lane, Crestview CA 92002
  |Best,
  |Maya
""".stripMargin
service.createCompletion(text).map(completion =>
  println(completion.choices.head.text)
)
- Create completion with a custom setting
val text = """Extract the name and mailing address from this email:
  |Dear Kelly,
  |It was great to talk to you at the seminar. I thought Jane's talk was quite good.
  |Thank you for the book. Here's my address 2111 Ash Lane, Crestview CA 92002
  |Best,
  |Maya
""".stripMargin
service.createCompletion(
  text,
  settings = CreateCompletionSettings(
    model = ModelId.text_davinci_001,
    max_tokens = Some(1500),
    temperature = Some(0.9),
    presence_penalty = Some(0.2),
    frequency_penalty = Some(0.2)
  )
).map(completion =>
  println(completion.choices.head.text)
)
- New: Count the tokens to be used before calling createChatCompletion or createChatFunCompletion. This helps you select a proper model (e.g., gpt-3.5-turbo or gpt-3.5-turbo-16k) and reduce costs. This is an experimental feature and it may not work for all models.
import io.cequence.openaiscala.service.OpenAICountTokensHelper
import io.cequence.openaiscala.domain.{ChatRole, FunMessageSpec, FunctionSpec}
class MyCompletionService extends OpenAICountTokensHelper {
  def exec = {
    val messages: Seq[FunMessageSpec] = ??? // messages to be sent to OpenAI
    val function: FunctionSpec = ??? // function to be called
    val tokens = countFunMessageTokens(messages, List(function), Some(function.name))
  }
}
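Once a token count is available, choosing between the two models mentioned above comes down to a comparison against each model's context window. The following is a hedged sketch, not part of the lib: `ModelPicker` is a hypothetical helper, and the window sizes (4,096 and 16,384 tokens) are the values OpenAI documented for these models, so verify them before relying on them.

```scala
object ModelPicker {
  // Assumed context window sizes, ordered from the smallest (cheapest) model up.
  private val contextWindows: Seq[(String, Int)] =
    Seq("gpt-3.5-turbo" -> 4096, "gpt-3.5-turbo-16k" -> 16384)

  // Picks the first model whose window fits the prompt plus the tokens
  // reserved for the reply; None means the request is too large for both.
  def pick(promptTokens: Int, maxReplyTokens: Int): Option[String] =
    contextWindows.collectFirst {
      case (model, window) if promptTokens + maxReplyTokens <= window => model
    }
}
```

Here `promptTokens` would be the value returned by the counting helper and `maxReplyTokens` the max_tokens setting you intend to send.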
- Create completion with streaming and a custom setting
val source = service.createCompletionStreamed(
  prompt = "Write me a Shakespeare poem about two cats playing baseball in Russia using at least 2 pages",
  settings = CreateCompletionSettings(
    model = ModelId.text_davinci_003,
    max_tokens = Some(1500),
    temperature = Some(0.9),
    presence_penalty = Some(0.2),
    frequency_penalty = Some(0.2)
  )
)
source.map(completion =>
  println(completion.choices.head.text)
).runWith(Sink.ignore)
For this to work, you need to use OpenAIServiceStreamedFactory from the openai-scala-client-stream lib.
- Create chat completion
val createChatCompletionSettings = CreateChatCompletionSettings(
  model = ModelId.gpt_3_5_turbo
)
val messages = Seq(
  SystemMessage("You are a helpful assistant."),
  UserMessage("Who won the world series in 2020?"),
  AssistantMessage("The Los Angeles Dodgers won the World Series in 2020."),
  UserMessage("Where was it played?"),
)
service.createChatCompletion(
  messages = messages,
  settings = createChatCompletionSettings
).map { chatCompletion =>
  println(chatCompletion.choices.head.message.content)
}
- Create chat completion for functions
val messages = Seq(
  FunMessageSpec(role = ChatRole.User, content = Some("What's the weather like in Boston?")),
)
// as a param type we can use "number", "string", "boolean", "object", "array", and "null"
val functions = Seq(
  FunctionSpec(
    name = "get_current_weather",
    description = Some("Get the current weather in a given location"),
    parameters = Map(
      "type" -> "object",
      "properties" -> Map(
        "location" -> Map(
          "type" -> "string",
          "description" -> "The city and state, e.g. San Francisco, CA",
        ),
        "unit" -> Map(
          "type" -> "string",
          "enum" -> Seq("celsius", "fahrenheit")
        )
      ),
      "required" -> Seq("location"),
    )
  )
)
// if we want to force the model to use the above function as a response
// we can do so by passing: responseFunctionName = Some("get_current_weather")
service.createChatFunCompletion(
  messages = messages,
  functions = functions,
  responseFunctionName = None
).map { response =>
  val chatFunCompletionMessage = response.choices.head.message
  val functionCall = chatFunCompletionMessage.function_call
  println("function call name : " + functionCall.map(_.name).getOrElse("N/A"))
  println("function call arguments : " + functionCall.map(_.arguments).getOrElse("N/A"))
}
Note that instead of MessageSpec, the function_call version of the chat completion uses the FunMessageSpec class to define messages - both as part of the request and the response.
This extension of the standard chat completion is currently supported by the following 0613 models, all conveniently available in the ModelId object:
gpt-3.5-turbo-0613 (default), gpt-3.5-turbo-16k-0613, gpt-4-0613, and gpt-4-32k-0613.
Important Note: After you are done using the service, you should close it by calling service.close. Otherwise, the underlying resources/threads won't be released.
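Because the calls are asynchronous, the service should be closed only after the returned Future has completed, and it should be closed on failure too. One way to arrange this is a small loan-style helper; the `withClosing` function below is a hypothetical sketch built on the standard library, and it takes the close action as a parameter so it makes no assumption about the service's type.

```scala
import scala.concurrent.{ExecutionContext, Future}

object CloseAfterUse {
  // Runs `use` and calls `close` once the resulting Future completes,
  // whether it succeeded or failed. Starting from Future.unit ensures that
  // an exception thrown synchronously by `use` also ends up in the Future.
  def withClosing[S, T](service: S)(close: S => Unit)(use: S => Future[T])(
    implicit ec: ExecutionContext
  ): Future[T] =
    Future.unit
      .flatMap(_ => use(service))
      .andThen { case _ => close(service) }
}
```

With a real service this might be used as `CloseAfterUse.withClosing(service)(_.close())(_.listModels)`.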
III. Using multiple services
- Load distribution with OpenAIMultiServiceAdapter - round robin (rotation) type
val service1 = OpenAIServiceFactory("your-api-key1")
val service2 = OpenAIServiceFactory("your-api-key2")
val service3 = OpenAIServiceFactory("your-api-key3")
val service = OpenAIMultiServiceAdapter.ofRoundRobinType(service1, service2, service3)
service.listModels.map { models =>
  models.foreach(println)
  service.close()
}
- Load distribution with OpenAIMultiServiceAdapter - random order type
val service1 = OpenAIServiceFactory("your-api-key1")
val service2 = OpenAIServiceFactory("your-api-key2")
val service3 = OpenAIServiceFactory("your-api-key3")
val service = OpenAIMultiServiceAdapter.ofRandomOrderType(service1, service2, service3)
service.listModels.map { models =>
  models.foreach(println)
  service.close()
}
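To make the "round robin (rotation)" idea concrete, here is a minimal, self-contained sketch of rotating over a fixed set of items. It only illustrates the selection strategy; `RoundRobin` is a hypothetical class and is not how OpenAIMultiServiceAdapter is necessarily implemented.

```scala
import java.util.concurrent.atomic.AtomicLong

// Illustration only: hands out items in a fixed rotation.
final class RoundRobin[A](items: IndexedSeq[A]) {
  require(items.nonEmpty, "at least one item is required")

  private val counter = new AtomicLong(0)

  // Thread-safe: each call atomically takes the next position and wraps around.
  def next(): A = {
    val index = counter.getAndIncrement() % items.size
    items(index.toInt)
  }
}
```

A random-order variant would differ only in `next()`, picking an index with `scala.util.Random.nextInt(items.size)` instead of a counter.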
- Create completion and retry on transient errors (e.g. rate limit error)
import akka.actor.{ActorSystem, Scheduler}
import io.cequence.openaiscala.RetryHelpers
import io.cequence.openaiscala.RetryHelpers.RetrySettings
import io.cequence.openaiscala.domain.{ChatRole, MessageSpec}
import io.cequence.openaiscala.service.{OpenAIService, OpenAIServiceFactory}
import javax.inject.Inject
import scala.concurrent.duration.DurationInt
import scala.concurrent.{ExecutionContext, Future}
class MyCompletionService @Inject() (
  val actorSystem: ActorSystem,
  implicit val ec: ExecutionContext,
  implicit val scheduler: Scheduler
)(val apiKey: String)
    extends RetryHelpers {
  val service: OpenAIService = OpenAIServiceFactory(apiKey)
  implicit val retrySettings: RetrySettings =
    RetrySettings(interval = 10.seconds)
  def ask(prompt: String): Future[String] =
    for {
      completion <- service
        .createChatCompletion(
          List(MessageSpec(ChatRole.User, prompt))
        )
        .retryOnFailure
    } yield completion.choices.head.message.content
}
- Retries with OpenAIRetryServiceAdapter
val serviceAux = ... // your service
implicit val retrySettings: RetrySettings =
  RetrySettings(maxRetries = 10).constantInterval(10.seconds)
// wrap it with the retry adapter
val service = OpenAIRetryServiceAdapter(serviceAux)
service.listModels.map { models =>
  models.foreach(println)
  service.close
}
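For readers who want to see what "retry with a constant interval" amounts to, the sketch below re-runs a Future-returning operation a bounded number of times with a fixed delay in between. It is a simplified illustration using only the JDK and the Scala standard library; `SimpleRetry` is a hypothetical helper, not the lib's RetryHelpers or OpenAIRetryServiceAdapter, and unlike a production retry policy it retries on every failure rather than only on transient ones.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object SimpleRetry {
  // Daemon thread so the scheduler never keeps the JVM alive on its own.
  private val scheduler = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r)
      t.setDaemon(true)
      t
    }
  })

  // Completes after `delay` without blocking a thread.
  private def after(delay: FiniteDuration): Future[Unit] = {
    val promise = Promise[Unit]()
    scheduler.schedule(
      new Runnable { def run(): Unit = promise.success(()) },
      delay.toMillis,
      TimeUnit.MILLISECONDS
    )
    promise.future
  }

  // Re-runs `op` after a constant interval until it succeeds
  // or `maxRetries` additional attempts have been used up.
  def retry[T](maxRetries: Int, interval: FiniteDuration)(op: () => Future[T])(
    implicit ec: ExecutionContext
  ): Future[T] =
    op().recoverWith {
      case _ if maxRetries > 0 =>
        after(interval).flatMap(_ => retry(maxRetries - 1, interval)(op))
    }
}
```

The operation is passed as a function (`() => Future[T]`) because a Future that has already failed cannot be re-run; each attempt has to create a fresh one.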
- Wen Scala 3?
  Feb 2023. You are right; we chose the shortest month to do so :) Done!
- I got a timeout exception. How can I change the timeout setting?
  You can do it either by passing the timeouts param to OpenAIServiceFactory or, if you use your own configuration file, by adding it there as:
  openai-scala-client {
    timeouts {
      requestTimeoutSec = 200
      readTimeoutSec = 200
      connectTimeoutSec = 5
      pooledConnectionIdleTimeoutSec = 60
    }
  }
- I got an exception like com.typesafe.config.ConfigException$UnresolvedSubstitution: openai-scala-client.conf @ jar:file:.../io/cequence/openai-scala-client_2.13/0.0.1/openai-scala-client_2.13-0.0.1.jar!/openai-scala-client.conf: 4: Could not resolve substitution to a value: ${OPENAI_SCALA_CLIENT_API_KEY}. What should I do?
  Set the env. variable OPENAI_SCALA_CLIENT_API_KEY. If you don't have one, register here.
- It all looks cool. I want to chat with you about your research and development.
  Just shoot us an email at [email protected].
This library is available and published as open source under the terms of the MIT License.
This project is open-source and welcomes any contribution or feedback (here).
Development of this library has been supported by Cequence.io - The future of contracting.
Created and maintained by Peter Banda.