Semantic Video Search

semantic_video_search

This Python plugin lets you semantically search your video datasets by frame or by video!

🔎 With a single prompt, find exactly what you are looking for across every frame in your dataset!

Installation

fiftyone plugins download https://github.com/danielgural/semantic_video_search

Operators

semantic_video_search

Sorts your video dataset by similarity to a prompt, using a similarity index of your choice. Use Twelve Labs to create an index of your videos for semantic video search. You can generate the index with the create_semantic_video_index operator. Search on four different embeddings: visual, text in video, conversations, and logos.

For more info on Twelve Labs Search, see the Twelve Labs documentation. Twelve Labs features a free tier, so it's easy to get started right away!

semantic_frames_search

Sorts your video dataset by similarity to a prompt, using single-frame embeddings. You will need to generate a similarity index with FiftyOne Brain with textual embeddings beforehand. You can even use vector database backends such as Qdrant, Pinecone, Milvus, or LanceDB for frame-based search. Sort your frames as seen below, or by full video!

frames

create_semantic_video_index

Generates a Twelve Labs similarity index for your video dataset. You can change several parameters, such as the types of embeddings to include. An embedding type must be in the index before you can search on it. You can also choose to upload your entire dataset, the selected samples, or the current view. Delegating this operator is highly recommended because of its long runtime. Also note that videos must be at least four seconds long!
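Because of the four-second minimum, it can be useful to filter out short clips before building the index. The following is a minimal sketch, not part of the plugin: the helper names `long_enough_ids` and `uploadable_view` are hypothetical, and it assumes video durations are available through FiftyOne's `metadata.duration` field once `compute_metadata()` has run.

```python
MIN_SECONDS = 4.0  # minimum video length stated in the note above


def long_enough_ids(ids, durations, min_seconds=MIN_SECONDS):
    """Return the IDs whose duration is known and at least min_seconds."""
    return [
        sample_id
        for sample_id, duration in zip(ids, durations)
        if duration is not None and duration >= min_seconds
    ]


def uploadable_view(dataset, min_seconds=MIN_SECONDS):
    """Return a view of `dataset` that skips videos that are too short.

    `dataset` is assumed to be a FiftyOne video dataset or view.
    """
    # Populate sample.metadata (including duration) where it is missing
    dataset.compute_metadata()
    ids, durations = dataset.values(["id", "metadata.duration"])
    return dataset.select(long_enough_ids(ids, durations, min_seconds))
```

Opening the returned view in the app and running the operator on the current view would then upload only the videos that meet the length requirement.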

Video Semantic Search

To start using Video Semantic Search, first generate a Twelve Labs API token. The plugin uses the token to access the Twelve Labs API. The token and the API URL are passed as environment variables, so define them before running the app.

export API_KEY=<YOUR_API_KEY>
export API_URL=https://api.twelvelabs.io/v1.1

Alternatively, if you are running the operator as a delegated operator, you can pass them in through the app.
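Before launching the app, it can help to confirm that both variables are visible to Python. This is a minimal sketch under stated assumptions: the function name `load_twelve_labs_settings` is hypothetical and not part of the plugin; only the variable names `API_KEY` and `API_URL` come from the export commands above.

```python
import os


def load_twelve_labs_settings(env=os.environ):
    """Read API_KEY and API_URL, failing early if either is unset or empty."""
    missing = [name for name in ("API_KEY", "API_URL") if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    # Drop any trailing slash so endpoint paths can be appended cleanly
    return {"api_key": env["API_KEY"], "api_url": env["API_URL"].rstrip("/")}
```

Calling `load_twelve_labs_settings()` in the same shell session where the exports were made raises an error naming whichever variable is missing.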

index

Choose the index name and the embeddings you would like to include in your index. After execution, your videos will be uploaded and the index will be created. Additionally, a new field named with Twelve Labs + Index Name will be added to each sample to link it to its Twelve Labs UID.

After your index has been created, it's time to search! To use semantic video search, enter the name of the index you want to search, the prompt you are searching for, and the embeddings to search on. Your index must contain these embeddings for the search to work. Afterwards, your dataset will be sorted so that the most similar samples come first!

Here's a quick demo below!

sort

Semantic Frames Search

A text similarity index must be computed beforehand using FiftyOne Brain's Compute Similarity. The index should be built on the frames of the videos so that image embeddings are generated for each frame.

Here's an example:

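import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.brain as fob

# Load a sample video dataset and convert it to one sample per frame
dataset = foz.load_zoo_dataset("quickstart-video")
frames = dataset.to_frames(sample_frames=True)

# Index the frames with CLIP so they can be queried with text prompts
results = fob.compute_similarity(
    frames,
    model="clip-vit-base32-torch",
    brain_key="sim",
)

# Launch the app on the frames view, then run the plugin
session = fo.launch_app(frames)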

Afterwards, choose to sort by frames or videos. This determines what type of view, video or frames, is returned. Pass a prompt in and sort your dataset!
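The same two result types can be approximated directly in Python with FiftyOne's `sort_by_similarity()`. This is a sketch under assumptions, not the plugin's actual implementation: the helper names are hypothetical, `frames` is assumed to be the same frames view the index was computed on, and ordering videos by their best-ranked frame is only one possible way to go from frame hits to videos.

```python
def first_seen_order(ids):
    """Return unique IDs in the order they first appear."""
    return list(dict.fromkeys(ids))


def search_frames(frames, prompt, k=25, brain_key="sim"):
    """Return a view of the k frames most similar to a text prompt."""
    # Text prompts require an index built with a model that can embed
    # text, such as CLIP
    return frames.sort_by_similarity(prompt, k=k, brain_key=brain_key)


def search_videos(dataset, frames, prompt, k=25, brain_key="sim"):
    """Return the videos that own the best-matching frames, best first."""
    hits = search_frames(frames, prompt, k=k, brain_key=brain_key)
    # Each frame sample stores the ID of its parent video in `sample_id`
    video_ids = first_seen_order(str(i) for i in hits.values("sample_id"))
    # ordered=True keeps the videos in ranked order
    return dataset.select(video_ids, ordered=True)
```

For example, `search_videos(dataset, frames, "a person riding a bike")` would return a video view, while `search_frames(frames, "a person riding a bike")` would return a frames view; the prompt here is only an illustration.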

sort_frames
