Comments (5)
This can be particularly annoying when you are just trying to fix a typo in the YAML and have to wait 30s for the download just to learn that you missed a spot and still have a typo.
from langstream.
This works for me. @cdbartholomew are you able to provide a full application example?
Sure. Here is my application:
module: "module-1"
id: "pipeline-1"
topics:
  - name: "questions-topic"
    creation-mode: create-if-not-exists
  - name: "answers-topic"
    creation-mode: create-if-not-exists
  - name: "log-topic"
    creation-mode: create-if-not-exists
errors:
  on-failure: "fail"
pipeline:
  - name: "convert-to-structure"
    id: "convert-to-structure"
    type: "document-to-json"
    input: "questions-topic"
    configuration:
      text-field: "question"
      copy-properties: true
  - name: "compute-embeddings"
    id: "compute-embeddings"
    type: "compute-ai-embeddings"
    configuration:
      model: "text-embedding-ada-002" # This needs to match the name of the model deployment, not the base model
      embeddings-field: "value.question_embeddings"
      text: "{{% value.question }}"
  - name: "lookup-related-documents-in-llm"
    type: "query"
    configuration:
      datasource: "AstraDatasource"
      query: "SELECT text FROM chatbot.documents ORDER BY embeddings_vector ANN OF ? LIMIT 15"
      fields:
        - "value.question_embeddings"
      output-field: "value.related_documents"
  - name: "Query Chat History"
    id: query-chat-history
    type: "query"
    configuration:
      datasource: "AstraDatasource"
      query: "select question,answer from chatbot.chatbot_history where session = ? limit 5"
      output-field: "value.history"
      fields:
        - "value.sessionId"
  - name: "ai-chat-completions"
    type: "ai-chat-completions"
    configuration:
      model: "gpt-4" # This needs to be set to the model deployment name, not the base name
      completion-field: "value.answer"
      log-field: "value.prompt"
      messages:
        - role: system
          content: |
            You are the assistant to Guillermo Rauch. Your role is to answer questions about Guillermo's life and perspectives in a friendly yet informative manner.
            Instruction:
            - Your answer should ONLY include information from the Interview Extracts listed below. Make sure to stay consistent and avoid contradictions. This is very important.
            - If you don't know something, state clearly that you don't have enough information to answer that question accurately.
            - Your responses should be concise but clear. Limit the answer to at most one paragraph.
            - Maintain a relaxed, honest, and friendly tone, as if speaking to a friend.
            - Do not cite the context you were given unless specifically asked.
            - Avoid elaborating, adding emphasis, or hyperbole. Speak plainly and avoid editorializing.
            - Feel free to ask for clarification if a question you are given is not clear.
            - When responding, take into consideration the Chat History listed below. Note that the history is in reverse chronological order, so the most recent message is at the top.
            Examples:
            - Q: "What is Guillermo's favorite programming language?"
              A: "Guillermo prefers JavaScript for most of his projects."
            - Q: "Where did he study?"
              A: "I'm not sure about Guillermo's educational background."
            Interview Extracts:
            ====================
            {{%# value.related_documents}}
            {{% text}}
            -----------------------------------------------
            {{%/ value.related_documents}}
            Chat History:
            =============
            {{%# value.history}}
            User: {{% question}} Assistant: {{% answer}}
            -----------------------------------------------
            {{%/ value.history}}
        - role: user
          content: "User question: {{% value.question}}"
  - name: "cleanup-response"
    type: "drop-fields"
    output: "log-topic"
    configuration:
      fields:
        - "question_embeddings"
        - "related_documents"
  - name: "keep-only-the-answer"
    type: "compute"
    input: "log-topic"
    output: "answers-topic"
    configuration:
      fields:
        - name: "value"
          expression: "value.answer"
          type: STRING
  - name: "Write history to Cassandra"
    type: "sink"
    input: "log-topic"
    configuration:
      connector.class: com.datastax.oss.kafka.sink.CassandraSinkConnector
      key.converter: org.apache.kafka.connect.storage.StringConverter
      value.converter: org.apache.kafka.connect.storage.StringConverter
      cloud.secureConnectBundle: "{{ secrets.cassandra.secure-connect-bundle }}"
      auth.username: "{{ secrets.cassandra.username }}"
      auth.password: "{{ secrets.cassandra.password }}"
      topic.log-topic.chatbot.chatbot_history.mapping: "session=value.sessionId,question=value.question,answer=value.answer,prompt=value.prompt,timestamp=now()"
      name: cassandra-sink-history
config:
configuration:
  resources:
    - type: "datasource"
      name: "AstraDatasource"
      configuration:
        service: "astra"
        username: "{{{ secrets.cassandra.username }}}"
        password: "{{{ secrets.cassandra.password }}}"
        secureBundle: "{{{ secrets.cassandra.secure-connect-bundle }}}"
    - type: "open-ai-configuration"
      name: "OpenAI Azure configuration"
      configuration:
        url: "{{ secrets.open-ai.url }}"
        access-key: "{{ secrets.open-ai.access-key }}"
        provider: "azure"
  dependencies:
    - name: "Kafka Connect Sink for Apache Cassandra from DataStax"
      url: "https://github.com/datastax/kafka-sink/releases/download/1.5.0/kafka-connect-cassandra-sink-1.5.0.jar"
      sha512sum: "242bf60363d36bd426232451cac836a24ae8d510857372a128f601503ad77aa9eabf14c4f484ca0830b6a68d9e8664e3820739ad8dd3deee2c58e49a94a20a3c"
      type: "java-library"
instance:
instance:
  streamingCluster:
    type: "kafka"
    configuration:
      admin:
        bootstrap.servers: kafka-gcp-useast4.dev.streaming.datastax.com:9093
        security.protocol: SASL_SSL
        sasl.jaas.config: "org.apache.kafka.common.security.plain.PlainLoginModule required username='{{ secrets.astra-token.tenant }}' password='token:{{ secrets.astra-token.token }}';"
        sasl.mechanism: PLAIN
        session.timeout.ms: "45000"
Here is how I deploy it:
./bin/sga-cli --conf=conf/cli.yaml apps deploy i-love-ai -app examples/applications/i-love-ai/ -i examples/instances/astra.yaml -s examples/applications/i-love-ai/secrets.yaml
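The `dependencies` entry in the configuration pins the jar with a `sha512sum`, which is enough information to decide locally whether a download is needed at all. As a rough sketch of that idea (this is not how the LangStream CLI is implemented, and `sha512_of` and `needs_download` are hypothetical helper names), a client could hash an already-cached copy and skip the download when the digest matches:

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(path: Path, expected_sha512: str) -> bool:
    """True if the cached file is missing or its digest does not match.

    Hypothetical helper: `expected_sha512` would come from the
    `sha512sum` field of a dependency entry.
    """
    if not path.is_file():
        return True
    return sha512_of(path) != expected_sha512.lower()
```

With a check like this, a second deploy that only fixes a YAML typo would hash the local jar instead of fetching it again.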
@cdbartholomew I'm not able to reproduce this with the latest CLI version. Would you mind testing it again? Maybe I fixed it already.
Fixed with the latest version.