Swarm Search

A set of modules that augment Google and YouTube search pages with data from OpenSearch-compatible engines.

Getting Started

Google Search

The dapplet works on Google search pages. In the "All" tab, results of any content type are shown; in the "Videos" tab, only videos.

Watch the video

YouTube Search

On YouTube, only video search results are available.

Watch the video

Indexing via Swarm Gateway

Swarm Gateway is a website that lets any user upload a small file to Swarm for free. We created the "Swarm Indexer" dapplet to augment this page and collect metadata from the user.

Watch the video

Indexing via Media Downloader

Media Downloader is a dapplet created at the Liberate Data Week Hackathon. We added a feature that lets you add a video to the index and make it available via the Swarm Search dapplet.

Watch the video

Change Search Engine

Two search engines were tested during development:

DeviantArt's backend, which contains a huge collection of media content.

https://backend.deviantart.com/rss.xml?q={searchTerms}&offset={startIndex}&limit={count}

Swarm Search server, written as a mock of a search engine that has not been developed yet. It indexes files uploaded to Swarm via the dapplets that work on Swarm Gateway and Media Downloader.

https://swarm-search-server.herokuapp.com/rss?q={searchTerms}&count={count}&offset={startIndex}&type={type?}

Any OpenSearch-compatible search engine can be specified in the dapplet's settings.
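
To illustrate how URL templates like the two above are turned into a request URL, here is a minimal sketch. The function name fillTemplate and the SearchParams type are hypothetical and are not taken from the project's code; the sketch only assumes the usual OpenSearch convention that {name} is a required parameter and {name?} is an optional one.

```typescript
// Values for the template parameters; keys match the placeholder names.
type SearchParams = Record<string, string | number | undefined>;

// Replace every {name} or {name?} placeholder with its URL-encoded value.
// An optional placeholder ({name?}) without a value becomes an empty string;
// a required placeholder without a value is an error.
function fillTemplate(template: string, params: SearchParams): string {
  return template.replace(/\{(\w+)(\?)?\}/g, (_match, name: string, optional?: string) => {
    const value = params[name];
    if (value === undefined) {
      if (optional) return "";
      throw new Error(`Missing required parameter: ${name}`);
    }
    return encodeURIComponent(String(value));
  });
}

// Example: a video search for "bee hive", first page of 10 results.
const exampleUrl = fillTemplate(
  "https://swarm-search-server.herokuapp.com/rss?q={searchTerms}&count={count}&offset={startIndex}&type={type?}",
  { searchTerms: "bee hive", count: 10, startIndex: 0, type: "video" }
);
// exampleUrl is
// "https://swarm-search-server.herokuapp.com/rss?q=bee%20hive&count=10&offset=0&type=video"
```

Leaving out type yields a URL ending in "&type=", which a server can treat as "no filter".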

The following video shows how to change the search engine in the dapplet's settings.

Watch the video

Project Architecture

Communication diagram

Actors

  • Uploader - a user uploading files to Swarm.
  • Searcher - a user searching for something with the Search Dapplet activated.

Components

  • Indexer Dapplet - augments Swarm Gateway to collect metadata from the Uploader.
  • Search Dapplet - injects search results into third-party websites.
  • Search API - a server that proxies the Elasticsearch engine and transforms data to an OpenSearch-compatible format.
  • Elasticsearch - an engine implementing full-text search.
  • Swarm Gateway - a website for uploading files to Swarm for free.
  • Bee Nodes - nodes of the Swarm network, which stores data in a decentralized way.

Bee Nodes

A: File uploading (indexing)

A1: An Uploader (user) attaches a file in the Swarm Gateway and fills out the manifest form for indexing.

A2: Swarm Gateway sends the file to a Bee node.

A3: The Bee node returns a Swarm reference hash.

A4: The Indexer Dapplet intercepts the uploaded file and the Swarm reference.

A5: The Indexer Dapplet sends the file, reference and metadata to the Search API.

A6: Search API forwards the indexing request to Elasticsearch.
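
As a rough illustration of step A6, the sketch below builds (without sending) an Elasticsearch indexing request. It relies on the "attachment" pipeline and the "fs_index" index described in the Elasticsearch Installation section. The function name, the metadata fields, and the choice of the Swarm reference as the document ID are assumptions made for this example, not the project's actual implementation.

```typescript
// Metadata collected from the Uploader; these field names are hypothetical.
interface UploadMetadata {
  title: string;
  description?: string;
}

interface IndexRequest {
  method: "PUT";
  url: string;
  body: string;
}

// Build the request that stores one uploaded file in Elasticsearch.
function buildIndexRequest(
  elasticsearchUrl: string, // must end with "/", e.g. "http://localhost:9200/"
  reference: string,        // Swarm reference hash returned by the Bee node
  base64Data: string,       // file content, already base64-encoded by the caller
  metadata: UploadMetadata
): IndexRequest {
  // Assumption: the Swarm reference doubles as the document ID.
  const id = encodeURIComponent(reference);
  return {
    method: "PUT",
    // ?pipeline=attachment runs the ingest pipeline on this document.
    url: `${elasticsearchUrl}fs_index/_doc/${id}?pipeline=attachment`,
    // "data" is the field the attachment processor reads; the pipeline removes it afterwards.
    body: JSON.stringify({ data: base64Data, reference, ...metadata }),
  };
}
```

The attachment processor expects the source field to be base64-encoded, which is why the sketch takes the file content in that form.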

B: File searching

B1: Searcher opens the website and sends a query.

B2: Search Dapplet intercepts the entered query from the website.

B3: Search Dapplet sends an OpenSearch-compatible query to the Search API to fetch search results.

B4: Search API receives the OpenSearch query and transforms it into an Elasticsearch request.

B5: Elasticsearch returns search results in JSON format.

B6: Search API transforms the JSON into OpenSearch XML and returns it to the dapplet.

B7: Search Dapplet injects the search results into the website.

B8: Searcher can see the external search results and open them.
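
Step B6 can be sketched as follows: a list of hits is rendered as an RSS 2.0 feed carrying the OpenSearch 1.1 response elements. The Hit shape and both function names are hypothetical, and the feed is deliberately minimal; the actual server may emit more fields.

```typescript
// One search hit, reduced to the fields this sketch needs (hypothetical shape).
interface Hit {
  title: string;
  link: string;
  description: string;
}

// Escape the five characters that are special in XML text and attributes.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Render hits as an RSS 2.0 feed with OpenSearch response elements.
function toOpenSearchRss(query: string, total: number, startIndex: number, hits: Hit[]): string {
  const items = hits.map((hit) =>
    [
      "<item>",
      `<title>${escapeXml(hit.title)}</title>`,
      `<link>${escapeXml(hit.link)}</link>`,
      `<description>${escapeXml(hit.description)}</description>`,
      "</item>",
    ].join("\n")
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">',
    "<channel>",
    `<title>${escapeXml(query)}</title>`,
    `<opensearch:totalResults>${total}</opensearch:totalResults>`,
    `<opensearch:startIndex>${startIndex}</opensearch:startIndex>`,
    `<opensearch:itemsPerPage>${hits.length}</opensearch:itemsPerPage>`,
    ...items,
    "</channel>",
    "</rss>",
  ].join("\n");
}
```

Escaping matters here because titles and descriptions come from user-supplied metadata and may contain characters such as & or <.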

Custom OpenSearch Query

The Search Dapplet uses an additional type parameter to filter search results by content type.

This parameter is not defined by the OpenSearch specification and must be implemented by the search server if you want content-type-specific search.

The valid value of this parameter is video.

/rss?q={searchTerms}&count={count}&offset={startIndex}&type={type?}
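
One way a search server might honor the type parameter is sketched below: the parsed request is translated into an Elasticsearch search body, with a content-type filter added only when type is video. The function name, the searched fields (title, attachment.content), and the filter field attachment.content_type.keyword are all assumptions that depend on the index mapping; this is not the project's actual query.

```typescript
// Parsed query-string values of /rss?q=...&count=...&offset=...&type=...
interface RssQuery {
  q: string;
  count: number;
  offset: number;
  type?: string;
}

// Translate the request into an Elasticsearch search body.
function toElasticsearchBody(query: RssQuery): Record<string, unknown> {
  // An empty string (from an unfilled {type?}) is treated as "no filter".
  const type = query.type || undefined;
  if (type !== undefined && type !== "video") {
    throw new Error(`Unsupported type: ${type}`);
  }
  // Assumed mapping: the attachment processor's content_type has a keyword sub-field.
  const filter =
    type === "video" ? [{ prefix: { "attachment.content_type.keyword": "video/" } }] : [];
  return {
    from: query.offset, // OpenSearch startIndex
    size: query.count,  // OpenSearch count
    query: {
      bool: {
        // Hypothetical field list: metadata title plus extracted file content.
        must: [{ multi_match: { query: query.q, fields: ["title", "attachment.content"] } }],
        filter,
      },
    },
  };
}
```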

Development

Build Project

This project is designed as a monorepo, so the npm Workspaces feature (npm 7 or later) is required to install dependencies.

npm install

To start the development server, use the command:

npm start

Elasticsearch Installation

  1. Install Elasticsearch by following the official guide.

  2. Install the Ingest Attachment plugin, which allows searching by file content.

  3. Create the pipeline and add processors that enable searching by file content and remove unused source fields.

PUT http://localhost:9200/_ingest/pipeline/attachment
{
    "description": "Extract attachment information",
    "processors": [
        {
            "attachment": {
                "field": "data",
                "target_field": "attachment"
            }
        },
        {
            "remove": {
                "field": "data"
            }
        }
    ]
}
  4. Create the index.
PUT http://localhost:9200/fs_index
  5. Create the /packages/search-server/.env file with the URL of the Elasticsearch HTTP API and start development!

The URL must end with a slash (/).

ELASTICSEARCH_URL=http://localhost:9200/
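
If request paths are built by appending to this value, a missing slash produces a broken URL such as http://localhost:9200fs_index. A small guard like the one below could catch that early. The function name is hypothetical, and whether the search server performs such a check is not stated in this README.

```typescript
// Return the configured URL with a trailing slash appended if it is missing.
function normalizeElasticsearchUrl(raw: string | undefined): string {
  const trimmed = (raw ?? "").trim();
  if (trimmed === "") {
    throw new Error("ELASTICSEARCH_URL is not set");
  }
  return trimmed.endsWith("/") ? trimmed : trimmed + "/";
}

// Paths can then be appended directly to the normalized base.
const indexUrl = normalizeElasticsearchUrl("http://localhost:9200") + "fs_index";
// indexUrl is "http://localhost:9200/fs_index"
```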
