langfuse / langfuse

🪢 Open source LLM engineering platform: Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

Home Page: https://langfuse.com/docs

License: Other

Languages: TypeScript 98.23% · JavaScript 1.00% · Dockerfile 0.24% · PLpgSQL 0.20% · CSS 0.17% · Shell 0.16%
Topics: analytics, llm, llmops, gpt, large-language-models, openai, self-hosted, ycombinator, monitoring, observability

langfuse's Introduction


Langfuse uses GitHub Discussions for support and feature requests.
We're hiring. Join us in Backend Engineering, Product Engineering, and Developer Relations.

MIT License · Y Combinator W23 · Docker Image · langfuse npm package · langfuse Python package on PyPI

Overview

Overview video (unmute for voice-over): langfuse-overview-3min.mp4

Develop

Monitor

Test

  • Experiments: Track and test app behaviour before deploying a new version

Get started

Langfuse Cloud

Managed deployment by the Langfuse team, generous free-tier (hobby plan), no credit card required.

» Langfuse Cloud

Localhost (docker)

# Clone repository
git clone https://github.com/langfuse/langfuse.git
cd langfuse

# Run server and database
docker compose up -d

→ Learn more about deploying locally

Self-host (docker)

Langfuse is simple to self-host and keep updated. It currently requires only a single docker container. → Self Hosting Instructions

Templated deployments: Railway, GCP Cloud Run, AWS Fargate, Kubernetes and others

Get Started

API Keys

You need a Langfuse public and secret key to get started. Sign up here and find them in your project settings.

Ingesting Data · Instrumenting Your Application

Note: We recommend using our fully async, typed SDKs that allow you to instrument any LLM application with any underlying model. They are available in Python (Decorators) & JS/TS. The SDKs will always be the most fully featured and stable way to ingest data into Langfuse.

You may want to use another integration to get started quickly or to implement a use case that we do not yet support. However, we recommend migrating to the Langfuse SDKs over time to ensure performance and stability.

See the → Quickstart to integrate Langfuse.

Integrations

Integration Supports Description
SDK Python, JS/TS Manual instrumentation using the SDKs for full flexibility.
OpenAI Python, JS/TS Automated instrumentation using drop-in replacement of OpenAI SDK.
Langchain Python, JS/TS Automated instrumentation by passing callback handler to Langchain application.
LlamaIndex Python Automated instrumentation via LlamaIndex callback system.
Haystack Python Automated instrumentation via Haystack content tracing system.
LiteLLM Python, JS/TS (proxy only) Use any LLM as a drop in replacement for GPT. Use Azure, OpenAI, Cohere, Anthropic, Ollama, VLLM, Sagemaker, HuggingFace, Replicate (100+ LLMs).
API Directly call the public API. OpenAPI spec available.

Packages that integrate with Langfuse:

Name Description
Instructor Library to get structured LLM outputs (JSON, Pydantic)
Mirascope Python toolkit for building LLM applications.
AI SDK by Vercel Typescript SDK that makes streaming LLM outputs super easy.
Flowise JS/TS no-code builder for customized LLM flows.
Langflow Python-based UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows.
Superagent Open Source AI Assistant Framework & API for prototyping and deployment of agents.

Questions and feedback

Ideas and roadmap

Support and feedback

In order of preference, the best ways to communicate with us are:

Contributing to Langfuse

  • Vote on Ideas
  • Raise and comment on Issues
  • Open a PR - see CONTRIBUTING.md for details on how to setup a development environment.

License

This repository is MIT licensed, except for the ee folders. See LICENSE and docs for more details.

Misc

GET API to export your data

GET routes to use data in downstream applications (e.g. embedded analytics). You can also access them conveniently via the SDKs (docs).
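As a rough sketch, a GET route could be called with the Python standard library; the exact path and the basic-auth scheme (public key as username, secret key as password) are assumptions here, not confirmed route details:

```python
import base64
import urllib.request

# Hypothetical sketch: build a request against a GET route of the public API
# using HTTP basic auth. The path "/api/public/traces" is illustrative.
def build_request(host: str, public_key: str, secret_key: str,
                  path: str = "/api/public/traces") -> urllib.request.Request:
    credentials = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    request = urllib.request.Request(host + path)
    request.add_header("Authorization", "Basic " + credentials)
    return request

request = build_request("https://cloud.langfuse.com", "pk-lf-...", "sk-lf-...")
# urllib.request.urlopen(request) would execute the call
```

In practice, the SDKs wrap these routes for you, so hand-rolled requests are mainly useful for embedded analytics or other downstream tooling.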

Security & Privacy

We take data security and privacy seriously. Please refer to our Security and Privacy page for more information.

Telemetry

By default, Langfuse automatically reports basic usage statistics of self-hosted instances to a centralized server (PostHog).

This helps us to:

  1. Understand how Langfuse is used and improve the most relevant features.
  2. Track overall usage for internal and external (e.g. fundraising) reporting.

None of the data is shared with third parties, and it does not include any sensitive information. We want to be fully transparent about this, and you can find the exact data we collect here.

You can opt-out by setting TELEMETRY_ENABLED=false.

langfuse's People

Contributors

18feb06, clemra, dependabot[bot], diwakarkashyap, eltociear, fancyweb, flxwu, frankendeba, functorism, gitstart-app[bot], gokuljs, hassiebp, inosrahul, jhlopen, khareyash05, knok16, kpcofgs, ladislasdellinger, marcklingen, marliessophie, maxdeichmann, mishushakov, mukesh1811, p5, porter-deployment-app[bot], richardkruemmel, sebhs, simonstorms, tonoy30, yarikoptic


langfuse's Issues

feat: Disable user sign up based on environment variable

Describe the feature you'd like to request

I would like to self-host langfuse in the cloud on a public URL and limit who can log in to langfuse.
Currently, if the system/URL is public, anybody could sign up and log in.

Describe the solution you'd like to see

I'm not sure what a good solution would be for this.

An example solution I can think of:

  • having a setting to disable sign up

First, I would deploy with sign-up enabled and create an account.
Then I would disable sign-up and deploy again.

Usage:

docker run --name langfuse \
  -e DATABASE_URL=... \
  ...
  -e SIGN_UP_DISABLED=TRUE \
  ghcr.io/langfuse/langfuse:latest

Additional information

No response

Contribute

  • Yes, I can implement this and raise a PR

[LFC-178] feat: delete project

Describe the feature you'd like to request

Add delete project button to project settings

Describe the solution you'd like to see

  • dialog
    • use shadcn ui button component
    • modal to confirm deletion
  • trpc route for deletion
  • project:delete as new scope only granted to project owners
    • hide button in frontend if not owner
    • enforce in trpc route that user needs to be owner to delete project
  • cascading deletion of observation/traces/scores/memberships in db

Additional information

No response

Contribute

  • Yes, I can implement this and raise a PR

From SyncLinear.com | LFC-178

[LFC-215] Page through items in lists (traces, generations)

Goal: simplify it to jump from one trace/generation to the next in a list

Example (rough sketch)

Screenshot 2023-10-02 at 14 45 48@2x

Nice to have: keyboard navigation (k & j)

Thoughts on potential implementation

  • Challenge: how to store state of (filtered) table to actually page through items of the previously visited table view
    • store list & link to table in
      • session storage
      • query param

LFC-215

feat: Add CONTRIBUTING.md File

Describe the feature you'd like to request

Add CONTRIBUTING.md File

Describe the solution you'd like to see

A CONTRIBUTING.md file would be helpful for new contributors.

Additional information

No response

[LFC-85] [LFC-6] feat: Add "change password" flow to user settings

Describe the feature you'd like to request

Currently users can only be created and the initial password that is chosen at /auth/sign-up is hashed and saved to user table. Users cannot change or reset the password.

Suggested core feature

  • New page for user settings accessible via the user menu that currently only includes "sign out"
  • Form to set a new password while being logged in

Users who do not know their current password cannot change it -> this needs a password reset flow, which would require adding transactional emails and is probably out of scope if we want to keep this easy to finish. Happy to contribute if you want to go for it though.

Describe the solution you'd like to see

  • Add update route to user trpc router, user id of signed in user is available in context (ctx.session.user.id) -> src/server/api/routers/users.ts
    • Update user table using prisma

Additional information

The project uses:

  • shadcn/ui for ui components, find them in src/components/ui
  • trpc for typed APIs used by the frontend, check out how the creation of new API keys works for reference
    • src/features/publicApi/components/CreateApiKeyButton.tsx
    • src/features/publicApi/server/apiKeyRouter.ts

From SyncLinear.com | LFC-6

LFC-85

[LFC-181] [Python-SDK] Create Langchain handler based on Spans

Currently, we are able to generate Langfuse callback handlers based on traces. See docs here.

I think it would be great to have the same for observations. We have users who want to call Langchain multiple times per trace, so we need a way to wrap Langchain calls into spans. Right now, in the callback handler, we always create a Trace if none exists, and create spans and generations based on the hierarchy given by Langchain via run_id and parent_run_id. For this, we would need to:

  • be able to instantiate the Langchain Callback handler with an observationId
  • In all places, where we usually generate a Trace and Span (root of a Langchain execution), instead of generating a Trace, only generate the span with the observationId as parentObservationId

LFC-181

feat: PR template for contributions

Describe the feature you'd like to request

I suggest having a proper template for PRs. When a person contributes and opens a PR, the template would prompt for the required information, such as a description of the feature/bug, which issue it resolves, etc.

Describe the solution you'd like to see

A template for PR like :

Description
Describe what changes you made.

Issue Resolved # (issue)

Checks

  • My code passes all tests
  • I have formatted the code
  • It generates no new warnings
  • I have written documentation or added comments for it

Additional information

No response

Contribute

  • Yes, I can implement this and raise a PR

feat: Add metadata to manually created spans / observations

Describe the feature you'd like to request

Would love to be able to set metadata fields on manually created spans:

Describe the solution you'd like to see

langfuse_handler.setNextSpan(next_span_id, metadata_field_1="foo", metadata_field_2="bar")

Additional information

No response

Contribute

  • Yes, I can implement this and raise a PR

feat: Avoid type errors on Create* pydantic models

Describe the feature you'd like to request

Currently, the Create* models, e.g. CreateGeneration, declare all arguments as required to the static type checker. This is unhelpful, because most of them accept None as the default value and work fine.
Example error message from VSCode

I think the core confusion is around typing.Optional used in the models. It means the field can be set to None, but the field is still considered required, unless there is a default provided.

Describe the solution you'd like to see

Here's a minimal example to consider. The current code looks like this:

from pydantic import BaseModel
from typing import Optional

class Model(BaseModel):
    field: Optional[str]

r = Model()  # error, the field is required
r = Model(field=None)  # OK
r = Model(field="foo")  # OK

It could be changed to this instead:

class Model(BaseModel):
    field: Optional[str] = None

r = Model()  # OK
r = Model(field=None)  # OK, same result

Additional information

No response

Contribute

  • Yes, I can implement this and raise a PR

[LFC-185] [JS-SDK] Create Langchain handler based on a Trace

In the Python package, users can create Langchain handlers based on a trace. Here are the docs.

The same behavior does not exist for this JS/TS package yet, but we want to change that. The core langfuse SDK is built in a way to support es5. Langchain does not support es5, hence we have to build it into the langchain-langfuse package.

I would take the following approach:

  • Wrap the Langchain SDK (LangfuseCore and LangfuseTraceClient) in the langchain-langfuse package
  • Add a method to LangfuseTraceClient to export the callback handler based on a trace
  • For LangfuseCore::trace, return the LangfuseTraceClient
  • Change the callback handler to take a traceId. Within the callback handler, before creating a trace, ensure that the traceId is not set. Otherwise, use the traceId to generate spans and generations

LFC-185

[LFC-193] Analytics: self-serve and open source

Currently, there is an analytics alpha based on Looker. The next step is to replace it with a full open source analytics stack.

Goals

  • Fast
  • Fully OSS
  • Self-serve chart builder
  • Support for some metrics/semantic layer of prebuilt aggregations
  • Prebuilt dashboards

LFC-193

[LFC-195] bug: need to refresh page when session changes (signup/login/logout), self-hosting

Describe the bug

After starting self-hosted Langfuse, interaction with the signup/login mechanism hangs. Usually a refresh of the page will resolve the issue.

To reproduce

Here's my docker compose file:

version: '3.8'
services:
  <other-service-A>:
    ...
  <other-service-B>:
    ...
  postgres:
    image: postgres:latest
    environment:
      POSTGRES_USER: xxxx
      POSTGRES_PASSWORD: xxxx
      POSTGRES_DB: xxxx
    ports:
      - "5432:5432"
    volumes:
      - ${HOME}/data/postgres:/var/lib/postgresql/data
    networks:
      - network1
  langfuse:
    image: ghcr.io/langfuse/langfuse:latest
    environment:
      DATABASE_URL: postgresql://xxxx:xxxx@postgres:5432/postgres
      NEXTAUTH_SECRET: mysecret
      NEXTAUTH_URL: http://XXX.XX.0.1:3002
    ports:
      - 3002:3000
    networks:
      - network1
networks:
  network1:
    name: my-network
    attachable: true
    ipam:
      driver: default
      config:
        - subnet: XXX.XX.0.0/16
          ip_range: XXX.XX.5.0/24
          gateway: XXX.XX.0.1

Additional information

Network

image

LFC-195

[LFC-183] [Python-SDK] Add update function to traces

Users want to be able to update their traces as they are able to update spans and generations. The user should have the following experience:

trace = langfuse.trace(CreateTrace(...))
trace = trace.update(UpdateTrace(...))

# ability to generate everything that can be a child of a trace
span = trace.span(...)
generation = trace.generation(...)

  • Create a StatefulTraceClient which is a StatefulClient but is also able to update Traces.
    • To update the Traces, we can use the same request as when creating the Trace, as the API route upserts.
    • Langfuse::trace should return the StatefulTraceClient
    • Langfuse::score should return the StatefulTraceClient in case we scored a Trace
  • Add an integration test for the added functionality
  • Add docs to the docs repo (https://github.com/langfuse/langfuse-docs)

LFC-183

[LFC-164] Add random token to increase security of link sharing traces

Currently, traces can be shared publicly via /public/traces/:traceid

While the traces are intended to be public when shared, the ids can be guessed (if they are incremental) or brute forced (if they are short). Trace ids default to uuids but can be set manually via the API/SDKs leading to potentially short/sequential ids.

Example: https://cloud.langfuse.com/public/traces/lf.docs.conversation.4AO8dBC

Desired outcome

Before: /public/traces/my-guessable-id-1

After: /public/traces/my-guessable-id-1?token=7v8dnt782itbguidt7t7c6ds9t

Suggested implementation

  • New column publicToken in table traces (via Prisma ORM)
  • Create new random token whenever trace is shared via trpc route; delete when unshared
  • Add token to url of public link as queryparam (?token=…)
    • Add it to url that is copied when creating the sharable link
    • Check if token is correct in byIdPublic trpc route on /public/traces/:traceid page
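For illustration, such a token could be generated with Python's stdlib secrets module; this is only a sketch of the token itself, the column/route wiring is described above:

```python
import secrets

# URL-safe random value to append to the public link as ?token=...
# 16 bytes of entropy encode to roughly 22 characters.
def generate_share_token(nbytes: int = 16) -> str:
    return secrets.token_urlsafe(nbytes)

token = generate_share_token()
share_url = f"/public/traces/my-guessable-id-1?token={token}"
```

A cryptographically secure source (secrets, not random) matters here, since the token is the only thing standing between a guessable trace id and public access.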

From SyncLinear.com | LFC-164

[LFC-182] [Python-SDK] Track OpenAI response headers as metadata in Langchain Python integration

Many users have problems with OpenAI rate limiting and OpenAI provides response headers to track current usage of the API.

Details: https://platform.openai.com/docs/guides/rate-limits/rate-limits-in-headers
Screenshot 2023-09-12 at 17 54 05@2x

Suggested change

  • Capture the known response headers for OpenAI generations in the Langchain Callback handler (on_llm_end)
  • Add them as key-value pairs to the metadata of the generation in Langfuse Update (UpdateGeneration)
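The capture step could look roughly like this in Python; `headers` is a plain dict standing in for the HTTP response headers available in on_llm_end, and the helper name is hypothetical:

```python
# Known OpenAI rate-limit response headers (documented by OpenAI).
RATE_LIMIT_HEADERS = [
    "x-ratelimit-limit-requests",
    "x-ratelimit-remaining-requests",
    "x-ratelimit-limit-tokens",
    "x-ratelimit-remaining-tokens",
]

# Copy only the rate-limit headers into a metadata dict for the generation.
def rate_limit_metadata(headers: dict) -> dict:
    return {k: v for k, v in headers.items() if k in RATE_LIMIT_HEADERS}

metadata = rate_limit_metadata({
    "x-ratelimit-remaining-requests": "3499",
    "content-type": "application/json",
})
```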

Happy to support on this, ping me on discord!

LFC-182

feat: Add "change password" flow to user settings

Describe the feature you'd like to request

Currently users can only be created and the initial password that is chosen at /auth/sign-up is hashed and saved to user table. Users cannot change or reset the password.

Suggested core feature

  • New page for user settings accessible via the user menu that currently only includes "sign out"
  • Form to set a new password while being logged in

Users who do not know their current password cannot change it -> this needs a password reset flow, which would require adding transactional emails and is probably out of scope if we want to keep this easy to finish. Happy to contribute if you want to go for it though.

Describe the solution you'd like to see

  • Add update route to user trpc router, user id of signed in user is available in context (ctx.session.user.id) -> src/server/api/routers/users.ts
    • Update user table using prisma

Additional information

The project uses:

  • shadcn/ui for ui components, find them in src/components/ui
  • trpc for typed APIs used by the frontend, check out how the creation of new API keys works for reference
    • src/features/publicApi/components/CreateApiKeyButton.tsx
    • src/features/publicApi/server/apiKeyRouter.ts

LF-655

[LFC-202] Add playwright tests (to CI)

We should start testing the frontend application as well to catch regressions.

To get started, I'd suggest adding Playwright testing to the tests run on CI (npm run test). Ideas for test cases:

From SyncLinear.com | LFC-202

[LFC-184] [JS-SDK] OpenAI SDK wrapper

For users who just want to log OpenAI calls to Langfuse, getting started with Langfuse could be as simple as:

// openai
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'my api key',
});

// openai with Langfuse
import OpenAI from 'langfuse-openai';

const openai = new OpenAI({
  apiKey: 'my api key',
  langfuseSecretKey: 'lf-sk-...',
  langfusePublicKey: 'lf-pk-...',
});

LFC-184

[LFC-192] OpenTelemetry

OTel as (1) additional data source for more execution details, and (2) alternative integration to replace current SDKs

LFC-192

bug: Cannot access Analytics report

Describe the bug

When clicking Analytics on the live demo, the Usage, Latency, and Scores tabs always report "Please sign in to your Google account, then reload this page to view this report", but I have already logged in to my Google account. Am I missing anything?

To reproduce

  • Login to Langfuse
  • navigate to langfuse-docs project
  • click Analytics, you will see the message

Additional information

No response

[LFC-171] Scroll to observations when opening an observation

Current behavior

When clicking on specific Generations or Spans in the detail view of traces, we adjust the link. The Link has the following structure: https://cloud.langfuse.com/project/<project-id>/traces/<trace-id>?observation=<observation-id>

The link changes according to the navigation within the trace.

When opening this Link, the Trace Detail view renders with the trace navigation scrolled all the way up. In the following picture, we cannot see the currently opened Generation, which should have a grey background.

Desired behavior

Traces can become very long and finding the specific Generation or Span can be tedious. Hence it would be great to scroll to the right element on page load as in the following picture

From SyncLinear.com | LFC-171

[LFC-194] [Docs] Add new integrations to docs

[LFC-189] Dataset management for testing/experimentation

  • Manage set of reference prompts/inputs to run tests on
  • Add to the dataset from production traces
  • Use within SDKs to run experiments

Workflows from SDK

  1. Get all data set items by data set name (input & expected output), all that are not archived
  2. User creates run (name) on data set
  3. User runs app on each item and traces it with langfuse; needs to provide data set item id and run id (can both be from sdk state maybe) that the execution trace belongs to
  4. Optionally: user runs evals and adds scores to the observation or trace

Todo

  • Average score on run
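The workflow above can be sketched as plain Python; `app` and `evaluate` are hypothetical stand-ins for the user's application and eval logic, not SDK calls:

```python
# Hypothetical sketch of the dataset-run workflow described above.
def run_experiment(items, app, evaluate):
    scores = []
    for item in items:
        if item.get("archived"):          # step 1: skip archived items
            continue
        output = app(item["input"])       # step 3: run app on each item
        scores.append(evaluate(output, item["expected"]))  # step 4: score
    return sum(scores) / len(scores)      # todo: average score on the run

average = run_experiment(
    items=[{"input": "2+2", "expected": "4"},
           {"input": "3+3", "expected": "6"}],
    app=lambda q: str(eval(q)),
    evaluate=lambda out, exp: 1.0 if out == exp else 0.0,
)
```

In the real feature, each `app` execution would be traced with langfuse and linked to the dataset item id and run id.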

From SyncLinear.com | LFC-189

bug: Construction icon overlap the AlertTitle in analytics page

Provide environment information

[email protected]
Ok to proceed? (y)

System:
OS: Windows 10 10.0.22621
CPU: (8) x64 AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx
Memory: 3.95 GB / 13.95 GB
Binaries:
Node: 18.13.0 - C:\Program Files\nodejs\node.EXE
Yarn: 1.22.19 - ~\AppData\Roaming\npm\yarn.CMD
npm: 9.6.7 - C:\Program Files\nodejs\npm.CMD
pnpm: 7.15.0 - ~\AppData\Roaming\npm\pnpm.CMD

Describe the bug

image

Link to reproduction

https://cloud.langfuse.com/project/clkpwwm0m000gmm094odg11gi/analytics

To reproduce

check the above link

Additional information

No response

[LF-657] feat: POST Completion request to langfuse

Describe the feature you'd like to request

I'm trying to make a callback to send data to langfuse.
This will allow me to use langfuse with other tools like promptlayer, helicone, etc.

Describe the solution you'd like to see

Can you expose an API endpoint that allows me to pass langfuse the ChatCompletion() kwargs, response, etc.?

Additional information

No response

LF-657

[LFC-162] Share trace as public link

Why

  • Good for docs / marketing
  • Helps people share with coworkers

How

  • public flag on trace
  • can be set via button in frontend
  • New /public/traces/:traceId page
  • Alternative tRPC router to get the trace when not being authenticated as long as the trace is public

From SyncLinear.com | LFC-162

[LFC-201] [SDKs] Use env vars to configure the Langfuse Client

Currently the Python and JS/TS SDK expect a private and secret key in their respective constructor. It would be nice to set these via environment variables, e.g. for use in staging and CI:

Envs

  • LANGFUSE_PUBLIC_KEY, in JS also check NEXT_PUBLIC_LANGFUSE_PUBLIC_KEY as there are many users using Next.JS for their applications
  • LANGFUSE_SECRET_KEY
  • LANGFUSE_HOST

This will lead to not requiring these params in the constructors. Log info to console that Langfuse is disabled if public key or secret key are missing.

Edit: envs should be an alternative way to set public/secret key and host. We need to support constructor arguments for backwards compatibility. Thus, they become optional.

  • Python SDK
  • JS SDK
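The fallback behavior could be sketched as follows; `resolve` is a hypothetical helper, and only the env var names come from this issue:

```python
import os

# Proposed resolution order: an explicit constructor argument wins,
# otherwise fall back to the first non-empty environment variable.
def resolve(arg, *env_names):
    if arg is not None:
        return arg
    for name in env_names:
        value = os.environ.get(name)
        if value:
            return value
    return None  # caller logs that Langfuse is disabled

os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
public_key = resolve("pk-explicit", "LANGFUSE_PUBLIC_KEY",
                     "NEXT_PUBLIC_LANGFUSE_PUBLIC_KEY")
secret_key = resolve(None, "LANGFUSE_SECRET_KEY")
```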

From SyncLinear.com | LFC-201
