aws-audio-recog

Serverless audio recognition task

Steps of the implementation

Audio file creation

The audio file was generated at https://www.narakeet.com/ from the following text, taken from the official AWS docs:

Lambda runs your code on a high-availability compute infrastructure and performs all of the administration of the compute resources, including server and operating system maintenance, capacity provisioning and automatic scaling, and logging.

Cloud infrastructure entities

  1. API Gateway invoke URL: https://lvydkx8cvb.execute-api.us-east-1.amazonaws.com/dev
  2. S3 bucket apalii-audio-samples with the audio sample
  3. S3 bucket apalii-recognition-results with the transcripts
  4. Lambda: recognition-task-consumer
  5. Lambda: recognition-task-producer
  6. Lambda: recognition-post-processing
  7. Lambda: recognition-results
  8. SQS queue: recognition
  9. DynamoDB: table recognition-results

Flow

  • The first Lambda, recognition-task-producer, takes its arguments from the POST request and creates a task in the SQS queue.
  • The SQS queue triggers a second Lambda, recognition-task-consumer, which creates a record in DynamoDB and starts a job in the AWS Transcribe service.
  • AWS Transcribe writes a file with the results, which triggers the third Lambda, recognition-post-processing.
  • recognition-post-processing reads the file, searches it for the requested substrings, and saves the results in the DynamoDB table recognition-results.
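
The substring search in the last step could look roughly like the sketch below. It is not the repository's actual code: the function name `find_sentences` and the sample data are hypothetical, and using the job id as the Transcribe job name is an assumption. It relies only on the standard shape of a Transcribe output file, where the full text sits under `results.transcripts`. Reading the file from S3 and writing to DynamoDB are left out.

```python
from typing import Dict, List


def find_sentences(transcribe_output: dict, sentences: List[str]) -> Dict[str, bool]:
    """Map each requested sentence to whether it occurs in the transcript."""
    # Transcribe's output JSON keeps the full text under results.transcripts.
    transcript = " ".join(
        part["transcript"] for part in transcribe_output["results"]["transcripts"]
    ).lower()
    # Case-insensitive substring match; surrounding whitespace is ignored.
    return {s: s.strip().lower() in transcript for s in sentences}


# Hypothetical, shortened Transcribe output for the sample audio file.
sample_output = {
    "jobName": "52fd082e-4e99-4189-9283-f420d63c5132",  # assumed to equal the job id
    "results": {
        "transcripts": [
            {
                "transcript": "Lambda runs your code on a high-availability compute "
                "infrastructure and performs all of the administration of the "
                "compute resources, including server and operating system "
                "maintenance, capacity provisioning and automatic scaling, and logging."
            }
        ]
    },
}

found = find_sentences(
    sample_output,
    ["including server and operating system", "can you hear me?"],
)
print(found)
```

A plain substring match is sensitive to punctuation and to how Transcribe spells words, so a real implementation might normalize both sides further before comparing.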

Request example

curl -X POST \
  https://lvydkx8cvb.execute-api.us-east-1.amazonaws.com/dev \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/json' \
  -H 'Host: lvydkx8cvb.execute-api.us-east-1.amazonaws.com' \
  -d '{
    "audio_url": "s3://apalii-audio-samples/Lambda.mp3",
    "sentences": [
        "including server and operating system",
        "can you hear me?"
    ]
}'
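
For illustration, a request body like this one might be handled by recognition-task-producer along the following lines. This is a minimal sketch under assumptions, not the repository's code: it assumes API Gateway's Lambda proxy integration (the body arrives as a JSON string in `event["body"]`), a hypothetical `QUEUE_URL` environment variable, and a response that returns a generated `job_id`. The `send` parameter is injectable so that the sketch runs without AWS access.

```python
import json
import uuid


def build_task(body: str) -> dict:
    """Validate the POST body and turn it into a queue task."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    audio_url = payload.get("audio_url")
    sentences = payload.get("sentences")
    if not isinstance(audio_url, str) or not audio_url.startswith("s3://"):
        raise ValueError("audio_url must be an s3:// URI")
    if (
        not isinstance(sentences, list)
        or not sentences
        or not all(isinstance(s, str) for s in sentences)
    ):
        raise ValueError("sentences must be a non-empty list of strings")
    # The job id format (UUID4) is an assumption based on the results URL.
    return {"job_id": str(uuid.uuid4()), "audio_url": audio_url, "sentences": sentences}


def handler(event, context, send=None):
    """Lambda entry point: validate the request and enqueue one task."""
    try:
        task = build_task(event.get("body") or "{}")
    except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
        return {"statusCode": 400, "body": json.dumps({"error": str(err)})}
    if send is None:
        # Imported lazily so the sketch can be exercised without boto3.
        import os
        import boto3

        sqs = boto3.client("sqs")
        queue_url = os.environ["QUEUE_URL"]  # hypothetical variable name

        def send(message: str) -> None:
            sqs.send_message(QueueUrl=queue_url, MessageBody=message)

    send(json.dumps(task))
    return {"statusCode": 202, "body": json.dumps({"job_id": task["job_id"]})}
```

Passing a stub such as `send=some_list.append` makes it possible to check the handler locally before wiring it to the real queue.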

Results endpoint example

https://lvydkx8cvb.execute-api.us-east-1.amazonaws.com/dev/result?job_id=52fd082e-4e99-4189-9283-f420d63c5132
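
A client could build and call this URL as in the sketch below. The helper names `result_url` and `fetch_result` are hypothetical, and the sketch assumes the endpoint answers with a JSON body, which the README does not state.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://lvydkx8cvb.execute-api.us-east-1.amazonaws.com/dev"


def result_url(job_id: str, base_url: str = BASE_URL) -> str:
    """Build the results URL for a job, escaping the query parameter."""
    query = urllib.parse.urlencode({"job_id": job_id})
    return f"{base_url}/result?{query}"


def fetch_result(job_id: str, timeout: float = 10.0) -> dict:
    """Fetch the stored result; assumes the response body is JSON."""
    with urllib.request.urlopen(result_url(job_id), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Because transcription is asynchronous, a caller would likely need to retry `fetch_result` until the job has finished.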

Contributors

apalii
