
deepgram-python-sdk's Introduction

Deepgram Python SDK


Official Python SDK for Deepgram. Power your apps with world-class speech and Language AI models.

Documentation

You can learn more about the Deepgram API at developers.deepgram.com.

Getting an API Key

🔑 To access the Deepgram API you will need a free Deepgram API Key.

Requirements

Python (version ^3.10)

Installation

To install the latest available version (note that which version this resolves to will change over time):

pip install deepgram-sdk

If you are writing an application that consumes this SDK, it is highly recommended (and a programming staple) to pin to at least a major version of the SDK (i.e. ==2.*) or, with due diligence, to a minor or specific version (i.e. ==2.1.* or ==2.12.0, respectively). If you are unfamiliar with semantic versioning (semver), it's a must-read.

In a requirements.txt file, pinning to a major (or minor) version, for example if you want to stick to the SDK v2 releases, can be done like this:

deepgram-sdk==2.*

Or using pip:

pip install deepgram-sdk==2.*

Pinning to a specific version can be done like this in a requirements.txt file:

deepgram-sdk==2.12.0

Or using pip:

pip install deepgram-sdk==2.12.0

We guarantee that major interfaces will not break within a given major semver release (i.e. 2.*). However, all bets are off when moving from a 2.* to a 3.* major release. This follows standard semver best practices.

Quickstarts

This SDK aims to reduce complexity and abstract/hide some internal Deepgram details that clients shouldn't need to know about. However, you can still tweak options and settings if you need to.

PreRecorded Audio Transcription Quickstart

You can find a walkthrough on our documentation site. Transcribing Pre-Recorded Audio can be done using the following sample code:

from deepgram import DeepgramClient, PrerecordedOptions, ClientOptionsFromEnv

AUDIO_URL = {
    "url": "https://static.deepgram.com/examples/Bueller-Life-moves-pretty-fast.wav"
}

# STEP 1: Create a Deepgram client using the API key from environment variables
deepgram: DeepgramClient = DeepgramClient("", ClientOptionsFromEnv())

# STEP 2: Call the transcribe_url method on the prerecorded class
options: PrerecordedOptions = PrerecordedOptions(
    model="nova-2",
    smart_format=True,
)
response = deepgram.listen.prerecorded.v("1").transcribe_url(AUDIO_URL, options)
print(f"response: {response}\n\n")
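The transcript itself sits several levels deep in the response, under results → channels → alternatives. As a minimal sketch (first_transcript is a hypothetical helper, not part of the SDK; depending on the SDK version you may need to convert the response object to a dict first), extracting it defensively might look like:

```python
def first_transcript(response: dict) -> str:
    """Pull the first transcript out of a Deepgram-style response dict,
    returning an empty string if any level of the structure is missing."""
    channels = response.get("results", {}).get("channels", [])
    if not channels:
        return ""
    alternatives = channels[0].get("alternatives", [])
    return alternatives[0].get("transcript", "") if alternatives else ""

# A trimmed-down example of the response shape:
sample = {
    "results": {
        "channels": [
            {"alternatives": [{"transcript": "Life moves pretty fast."}]}
        ]
    }
}
print(first_transcript(sample))  # Life moves pretty fast.
```

Guarding each level with .get() avoids KeyError/IndexError crashes when a request fails or returns an empty channel.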

Live Audio Transcription Quickstart

You can find a walkthrough on our documentation site. Transcribing Live Audio can be done using the following sample code:

from deepgram import DeepgramClient, LiveOptions, LiveTranscriptionEvents, Microphone

deepgram: DeepgramClient = DeepgramClient()

dg_connection = deepgram.listen.live.v("1")

def on_open(self, open, **kwargs):
    print(f"\n\n{open}\n\n")

def on_message(self, result, **kwargs):
    sentence = result.channel.alternatives[0].transcript
    if len(sentence) == 0:
        return
    print(f"speaker: {sentence}")

def on_metadata(self, metadata, **kwargs):
    print(f"\n\n{metadata}\n\n")

def on_speech_started(self, speech_started, **kwargs):
    print(f"\n\n{speech_started}\n\n")

def on_utterance_end(self, utterance_end, **kwargs):
    print(f"\n\n{utterance_end}\n\n")

def on_error(self, error, **kwargs):
    print(f"\n\n{error}\n\n")

def on_close(self, close, **kwargs):
    print(f"\n\n{close}\n\n")

dg_connection.on(LiveTranscriptionEvents.Open, on_open)
dg_connection.on(LiveTranscriptionEvents.Transcript, on_message)
dg_connection.on(LiveTranscriptionEvents.Metadata, on_metadata)
dg_connection.on(LiveTranscriptionEvents.SpeechStarted, on_speech_started)
dg_connection.on(LiveTranscriptionEvents.UtteranceEnd, on_utterance_end)
dg_connection.on(LiveTranscriptionEvents.Error, on_error)
dg_connection.on(LiveTranscriptionEvents.Close, on_close)

options: LiveOptions = LiveOptions(
    model="nova-2",
    punctuate=True,
    language="en-US",
    encoding="linear16",
    channels=1,
    sample_rate=16000,
    # To get UtteranceEnd, the following must be set:
    interim_results=True,
    utterance_end_ms="1000",
    vad_events=True,
)
dg_connection.start(options)

# Create a microphone that streams audio to the live connection
microphone = Microphone(dg_connection.send)

# Start the microphone
microphone.start()

# Wait until finished
input("Press Enter to stop recording...\n\n")

# Wait for the microphone to close
microphone.finish()

# Indicate that we've finished
dg_connection.finish()

print("Finished")
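With interim_results=True, the on_message callback receives both interim and finalized results for the same audio; only results flagged is_final should be kept. A hedged sketch of this pattern (TranscriptCollector is a hypothetical helper; it works on plain dicts mirroring the event payload, whereas the callback above reads the same fields from the result object):

```python
class TranscriptCollector:
    """Accumulate only finalized transcript segments from live results."""

    def __init__(self):
        self.parts = []

    def add(self, result: dict) -> None:
        # Same channel/alternatives layout as the live result payload
        alternatives = result.get("channel", {}).get("alternatives", [{}])
        text = alternatives[0].get("transcript", "")
        if text and result.get("is_final"):
            self.parts.append(text)

    def text(self) -> str:
        return " ".join(self.parts)

collector = TranscriptCollector()
# Interim result: superseded later, so it is skipped
collector.add({"is_final": False,
               "channel": {"alternatives": [{"transcript": "life moves"}]}})
# Final result: kept
collector.add({"is_final": True,
               "channel": {"alternatives": [{"transcript": "life moves pretty fast"}]}})
print(collector.text())  # life moves pretty fast
```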

Examples

There are examples for every API call in this SDK. You can find all of these examples in the examples folder at the root of this repo.

These examples provide:

Analyze Text:

PreRecorded Audio:

Live Audio Transcription:

Management API exercise the full CRUD operations for:

To run each example, set the DEEPGRAM_API_KEY environment variable, then cd into each example folder and execute the example: python main.py.

Logging

This SDK provides logging as a means to troubleshoot and debug issues you encounter. By default, this SDK enables Information level messages and higher (i.e. Warning, Error, etc.) when you initialize the library as follows:

deepgram: DeepgramClient = DeepgramClient()

To increase the logging output/verbosity for debug or troubleshooting purposes, you can set the DEBUG level by using this code:

config: DeepgramClientOptions = DeepgramClientOptions(
    verbose=logging.DEBUG,
)
deepgram: DeepgramClient = DeepgramClient("", config)

Backwards Compatibility

Older SDK versions will receive Priority 1 (P1) bug support only. Security issues, both in our code and dependencies, are promptly addressed. Significant bugs without clear workarounds are also given priority attention.

Development and Contributing

Interested in contributing? We ❤️ pull requests!

To make sure our community is safe for all, be sure to review and agree to our Code of Conduct. Then see the Contribution guidelines for more information.

Prerequisites

To develop new features for the SDK itself, first uninstall any previous installation of deepgram-sdk, install the dependencies contained in requirements.txt, and then instruct Python (via pip) to use your local copy of the SDK by installing it in editable mode.

From the root of the repo, that would entail:

pip uninstall deepgram-sdk
pip install -r requirements.txt
pip install -e .

Testing

If you are looking to contribute or modify pytest code, then you need to install the following dependencies:

pip install -r requirements-dev.txt

Getting Help

We love to hear from you, so if you have questions or comments, or find a bug in the project, let us know!

deepgram-python-sdk's People

Contributors

ambergeldar, bd-g, bekahhw, briancbarrow, brianhillis-dg, damiendeepgram, dependabot[bot], dotaadarsh, dvonthenen, frumsdotxyz, geekchick, jcdyer, jjmaldonis, jkroll-deepgram, jpvajda, lugia19, lukeocodes, michaeljolley, mirceapasoi, phazonoverload, robertlent, roerohan, saarthshah, samgdf, sandrarodgers, shirgoldbird, tomprimozic, universaldevgenius


deepgram-python-sdk's Issues

Could not open socket

What is the current behavior?

When running we get the error "Could not open socket"

Steps to reproduce

 Just run the line:
 deepgramLive = await deepgram.transcription.live(PARAMS)


Expected behavior

  We should get more details on why the socket could not be opened. 
  Side Note: I suspect this might be due to HTTPs issues, is there a way to ignore SSL errors?

Error details to help debug the core issue

Please tell us about your environment

 Python 3.10 on MacOs



Add Docstrings in the Python SDK

Proposed changes

Add docstrings in the Python SDK for all the classes and functions in the deepgram -> transcription.py file. Please add @geekchick as a reviewer on your pull request.

Context

This change is important so it can help other developers using the SDK to quickly understand what's happening in a class or function, therefore decreasing development time and helping junior developers get up and running.

Second initialization of LiveTranscription doesn't send data to deepgram

What is the current behavior?

I have an interaction with a stream of audio that should be transcribed. It's not like a podcast, where there is a single large stream; rather, it's more conversational, where there could be a sentence, a pause of 30 seconds or more, and then another sentence. I figure the best thing to do is use a unique instance for each interaction.

The problem is that subsequent instances of the LiveTranscription object don't seem to send data. The first time through the code it works fine but when deepgramLive is initialized a second time, the TRANSCRIPT_RECEIVED handler is never fired until eventually it closes, that is, the CLOSE handler is called.

One thing to note is if I don't reinitialize deepgramLive, the stream can pick up subsequent phrases.

Steps to reproduce

@app.websocket("/ws/test")
async def test(websocket: WebSocket):
    await websocket.accept()
    async def receive_deepgram_transcript(msg):
        nonlocal transcript
        if msg.get("is_final"):
            transcript = (
                msg.get("channel", {})
                .get("alternatives", [{}])[0]
                .get("transcript", "")
            )
            if transcript != "":
                print(transcript)

    deepgramLive = None
    transcript = None
    while True:
        while not transcript:
            if not deepgramLive:
                deepgramLive = await dg_client.transcription.live({ 'punctuate': True, 'interim_results': False, 'language': 'en-US' })
                deepgramLive.registerHandler(deepgramLive.event.CLOSE, lambda c: print(f'Connection closed with code {c}.'))
                deepgramLive.registerHandler(deepgramLive.event.TRANSCRIPT_RECEIVED, receive_deepgram_transcript)
            data = await websocket.receive()
            deepgramLive.send(data['bytes'])
        else:
            print("closing")
            await deepgramLive.finish()
            deepgramLive = None

Unable to use DeepGram Python Sdk on Windows

I am trying to transcribe a file using the Deepgram Python SDK, but I am getting an SSL certification error. This error occurs when I try to run a transcription of a file on Windows. I expected it to return a string of the audio. I am using a Python 3.10 virtual environment.

  • Operating System/Version: Windows 10
  • Language: Python
  • Browser: Chrome

N best fails for batch transcription

What is the current behavior?

I need to get as many N-best alternatives as possible, with timestamps and scores, but when I ask for more than 2 alternatives it fails. The failure is intermittent, even with the same audio file and the same requested number of alternatives.

I am using Windows 10 64 bit 21H2 19044.2965 Windows Feature Experience Pack 1000.19041.1000.0

this is a little python file and I am using a wave file I recorded that lasts just a few seconds

What's happening that seems wrong?

I get a bunch of error messages and it crashes

Traceback (most recent call last):
  File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\Lib\Main.py", line 47, in <module>
    main()
  File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\Lib\Main.py", line 42, in main
    response = deepgram.transcription.sync_prerecorded(source, {'punctuate': True, 'alternatives': 5})
  File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram\transcription.py", line 355, in sync_prerecorded
    return SyncPrerecordedTranscription(
  File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram\transcription.py", line 111, in __call__
    return _sync_request(
  File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram\_utils.py", line 156, in _sync_request
    raise exc  # stream is now invalid as payload
  File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram\_utils.py", line 153, in _sync_request
    return attempt()
  File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram\_utils.py", line 148, in attempt
    raise (Exception(f'DG: {exc}') if exc.status < 500 else exc)
  File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram\_utils.py", line 139, in attempt
    with urllib.request.urlopen(req) as resp:
  File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 523, in open
    response = meth(req, response)
  File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 632, in http_response
    response = self.parent.error(
  File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 561, in error
    return self._call_chain(*args)
  File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 494, in _call_chain
    result = func(*args)
  File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 503: Service Unavailable

Process finished with exit code 1

Steps to reproduce

run this program after you add code to set:

DEEPGRAM_API_KEY =

PATH_TO_FILE =

from deepgram import Deepgram
import json
import os

def main():
    # Initializes the Deepgram SDK
    deepgram = Deepgram(DEEPGRAM_API_KEY)
    # Open the audio file
    with open(PATH_TO_FILE, 'rb') as audio:
        # ...or replace mimetype as appropriate
        source = {'buffer': audio, 'mimetype': 'audio/wav'}
        response = deepgram.transcription.sync_prerecorded(source, {'punctuate': True, 'alternatives': 5})
        print(json.dumps(response, indent=4))

main()

If you reduce 'alternatives': 5 to 2, it will work.
If you reduce it to 3 or 4, it may run some times but not others.
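Until the root cause of the intermittent 503s is addressed server-side, this kind of failure can be worked around client-side with retries. A generic sketch under stated assumptions (call_with_retry and the flaky demo function are hypothetical, not part of the SDK):

```python
import time

def call_with_retry(fn, attempts=3, delay=0.5, backoff=2.0,
                    retryable=(Exception,)):
    """Call fn(), retrying on the given exception types with
    exponential backoff; re-raise after the final attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except retryable:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
            delay *= backoff

# Demo: a flaky function that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("HTTP Error 503: Service Unavailable")
    return "ok"

print(call_with_retry(flaky, delay=0.01))  # ok
```

In practice you would wrap the sync_prerecorded call in a lambda and restrict retryable to the 5xx errors you actually want to retry.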

Error when importing deepgram (unhashable type: 'list')

Current behavior

I'm trying to test the deepgram-sdk. However, I keep on encountering this issue:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\mycomputer\.virtualenvs\Python_Stuff-EaH8Zb-h\lib\site-packages\deepgram\__init__.py", line 2, in <module>
    from ._types import Options
  File "C:\Users\mycomputer\.virtualenvs\Python_Stuff-EaH8Zb-h\lib\site-packages\deepgram\_types.py", line 172, in <module>
    EventHandler = Union[Callable[[Any], None], Callable[[Any], Awaitable[None]]]
  File "c:\users\mycomputer\appdata\local\programs\python\python39\lib\typing.py", line 243, in inner
    return func(*args, **kwds)
  File "c:\users\mycomputer\appdata\local\programs\python\python39\lib\typing.py", line 316, in __getitem__
    return self._getitem(self, parameters)
  File "c:\users\mycomputer\appdata\local\programs\python\python39\lib\typing.py", line 421, in Union
    parameters = _remove_dups_flatten(parameters)
  File "c:\users\mycomputer\appdata\local\programs\python\python39\lib\typing.py", line 215, in _remove_dups_flatten
    all_params = set(params)
TypeError: unhashable type: 'list'

Steps to reproduce

  • Open terminal and run pipenv shell in an empty directory (or any directory of your choice)
  • Run pipenv install deepgram-sdk
  • Run python then type from deepgram import Deepgram or import deepgram. Press Enter afterwards.
  • The error should show by now

My environment

  • Operating System/Version: Windows 10
  • Language: Python 3.9.0
  • Virtual Environment used: Pipenv

Getting error in pre-recorded audio

from deepgram import Deepgram
import asyncio
import json

# Your Deepgram API Key
apiKey = "******"

# Name and extension of the file you downloaded (e.g., sample.wav)
PATH_TO_FILE = 'Voice_sample_1.m4a'


async def transcribe_audio(audio_file):
    # Initialize the deepgram SDK
    dg_client = Deepgram(apiKey)
    # Open the audio file
    with open(audio_file, 'rb') as audio:
        # Replace mimetype as appropriate
        source = {'buffer': audio, 'mimetype': 'audio/m4a'}
        response = await dg_client.transcription.prerecorded(source, options={"punctuate": True})
        print(json.dumps(response, indent=4))
        return json.dumps(response, indent=4)


transcribedData = asyncio.run(transcribe_audio(PATH_TO_FILE))
with open("trancribedData.json", 'w') as jsonfile:
    jsonfile.write(transcribedData)
print("Transcribed data ready!")
jsonFile = open("trancribedData.json")
fileData = json.load(jsonFile)
print(fileData["results"]["channels"][0]["alternatives"][0]["transcript"])
filedata1 = fileData["results"]["channels"]
print(filedata1[0]["alternatives"][0]["transcript"])

I'm getting this after executing the above code. Please help me understand what I did wrong.

raise ClientConnectorCertificateError(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorCertificateError: Cannot connect to host api.deepgram.com:443 ssl:True [SSLCertVerificationError: (1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1091)')]

Static Type Checking via mypy - PEP 561 support

Proposed changes

I'm new to python so excuse me if I'm missing something obvious.

You've carefully added type hints to the SDK, amazing. When I add a static type checker to my project, like mypy, it can't seem to find them. According to mypy's documentation, packages are supposed to declare their compatibility with a py.typed file in the package directory. See here: https://mypy.readthedocs.io/en/stable/installed_packages.html#creating-pep-561-compatible-packages

Context

Static type checking dramatically reduces developer errors. You've already created types, adding compatibility to PEP 561 would allow us to use static checkers

Possible Implementation

See above

Other information

Am I missing anything? Another way to run a static type checker perhaps?
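For context, the PEP 561 fix the issue asks for is small: ship an empty py.typed marker file inside the package and make sure the build includes it. A hedged setup.py fragment (assuming a setuptools build; the empty file deepgram/py.typed must exist in the source tree, and the names here are illustrative):

```python
# setup.py (fragment): include the PEP 561 marker so static type
# checkers such as mypy pick up the package's inline annotations.
from setuptools import setup, find_packages

setup(
    name="deepgram-sdk",
    packages=find_packages(),
    # Ship the empty deepgram/py.typed marker with the wheel/sdist
    package_data={"deepgram": ["py.typed"]},
    include_package_data=True,
)
```

Without the marker, PEP 561 tells type checkers to treat the package as untyped even when annotations are present.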

Exception: DG: server rejected WebSocket connection: HTTP 401 (OS:MAC)

import pyaudio
from deepgram import Deepgram
import deepgram

client = Deepgram("api key")

# Set the sample rate and chunk size for the audio stream
SAMPLE_RATE = 16000
CHUNK_SIZE = 1024

# Create a new PyAudio object
audio = pyaudio.PyAudio()
audio.get_device_info_by_index(0)

# Open the audio stream using PyAudio
stream = audio.open(
    format=pyaudio.paInt16,
    channels=1,
    rate=SAMPLE_RATE,
    input=True,
    frames_per_buffer=CHUNK_SIZE
)

async def getaudiotranscript(data):
    deepgramLive = await client.transcription.live({'punctuate': True, 'interim_results': False})
    deepgramLive.registerHandler(deepgramLive.event.CLOSE, lambda c: print(f'Connection closed with code {c}.'))
    deepgramLive.send(data)
    result = client.transcription(data)
    print(result.text)

# Loop until the user stops the program
while True:
    # Read a chunk of audio from the stream
    data = stream.read(CHUNK_SIZE)

    # Send the audio chunk to Deepgram for recognition
    reposnse = await getaudiotranscript(data)

    # Print the recognized text to the console

# Close the audio stream
stream.close()

ssl.SSLCertVerificationError for batch and streaming starter code of deepgram


Support Persian(Farsi) language

Proposed changes


Adding Persian(Farsi) recognition model.


Typing in _type.py is out of date with what is actually returned from API

What is the current behavior?

The types in _types.py don't actually represent what is returned from the API. Specifically, I am using the Paragraphs feature; the API returns paragraphs.paragraphs.speaker, which I would like to access, but the typing does not list speaker as a field (it is optionally present if you specify diarization).

Steps to reproduce

  • Use the Python SDK
  • Specify Smart formatting and Diarization
  • Transcribe something
  • Use typing
  • Try to access the speaker field from within the paragraphs, and your type checker will complain because the SDK doesn't actually contain the possible fields returned from the API.

Expected behavior

I would expect to see speaker: Optional[int] here

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

  • Operating System/Version: macOS 12.5.1
  • Language: Python
  • Browser: Chrome

Other information

deepgram-sdk==2.11.0

Balance deduction in summary report

Proposed changes

The Summarize Usage API returns the following details:
end, start, requests, hours

However, we can't tell from this how much we spend each day.
Please add how much money we were charged for that day, for the total requests or total minutes.

Exception on import python3.9

What is the current behavior?

from deepgram import Deepgram results in an exception

TypeError: unhashable type: 'list'
Most relevant lines:

File ".../python3.9/site-packages/deepgram/_types.py", line 171, in <module>
    EventHandler = Union[Callable[[Any], None], Callable[[Any], Awaitable[None]]]

Steps to reproduce

Python3.9, pip install, from deepgram import Deepgram

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.
Ubuntu 21.04

Other information

Is solved by changing
EventHandler = Union[Callable[[Any], None], Callable[[Any], Awaitable[None]]]
into
EventHandler = Union[Callable[Any, None], Callable[Any, Awaitable[None]]]

in deepgram/_types.py

Full trace:
Traceback (most recent call last):
  File "/home/script/tmi_archive/manage.py", line 22, in <module>
    main()
  File "/home/script/tmi_archive/manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/django/core/management/__init__.py", line 425, in execute_from_command_line
    utility.execute()
  File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/django/core/management/__init__.py", line 419, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/django/core/management/__init__.py", line 263, in fetch_command
    klass = load_command_class(app_name, subcommand)
  File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/django/core/management/__init__.py", line 39, in load_command_class
    module = import_module('%s.management.commands.%s' % (app_name, name))
  File "/home/opt/anaconda/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 790, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/script/tmi_archive/talks/management/commands/transcribe-deepgram.py", line 2, in <module>
    from deepgram import Deepgram
  File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/deepgram/__init__.py", line 2, in <module>
    from ._types import Options
  File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/deepgram/_types.py", line 171, in <module>
    EventHandler = Union[Callable[[Any], None], Callable[[Any], Awaitable[None]]]
  File "/home/opt/anaconda/lib/python3.9/typing.py", line 243, in inner
    return func(*args, **kwds)
  File "/home/opt/anaconda/lib/python3.9/typing.py", line 316, in __getitem__
    return self._getitem(self, parameters)
  File "/home/opt/anaconda/lib/python3.9/typing.py", line 421, in Union
    parameters = _remove_dups_flatten(parameters)
  File "/home/opt/anaconda/lib/python3.9/typing.py", line 215, in _remove_dups_flatten
    all_params = set(params)
TypeError: unhashable type: 'list'

LiveTranscription

What is the current behavior?

I keep seeing the following error emitted by my server

Task exception was never retrieved
future: <Task finished name='Task-192191' coro=<LiveTranscription._start() done, defined at /Users/sishaar/.pyenv/versions/3.11.4/lib/python3.11/site-packages/deepgram/transcription.py:178> exception=ConnectionClosedOK(Close(code=1000, reason=''), Close(code=1000, reason=''), True)>
Traceback (most recent call last):
  File "/Users/sishaar/.pyenv/versions/3.11.4/lib/python3.11/site-packages/deepgram/transcription.py", line 222, in _start
    await self._socket.send(body)
  File "/Users/sishaar/.pyenv/versions/3.11.4/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 635, in send
    await self.ensure_open()
  File "/Users/sishaar/.pyenv/versions/3.11.4/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 935, in ensure_open
    raise self.connection_closed_exc()
websockets.exceptions.ConnectionClosedOK: received 1000 (OK); then sent 1000 (OK)

Here is the file I use to manage my Deepgram connections - am I not registering/handling errors correctly?

class _DeepgramConnectionPool:
  def __init__(self):
    self.connections: LiveTranscription = []
    self.deepgram_hosted = Deepgram('<elided>')
    self.deepgram_onprem = Deepgram({'api_key': '', 'api_url': '<elided>'})

  async def get_connection(self, config: dict = {}):
    # default to use hosted deepgram
    self.deepgram = self.deepgram_hosted
    if config.get("enable_to_use_deepgram_on_premise") == "true":
      self.deepgram = self.deepgram_onprem
    try:
      deepgram_config = config.get('country_code_configs', {})
      model = deepgram_config.get("model", '<elided>')
      tier = 'enhanced' if model == 'nooks' else deepgram_config.get("tier", 'base')
      version = 'v1' if model == 'nooks' else deepgram_config.get('version', 'latest')

      async def _get_connection():
        return await self.deepgram.transcription.live(
          encoding='mulaw',
          model=model,
          tier=tier,
          version=version,
          sample_rate=8000,
          punctuate=True,
          interim_results=True,
          language='en-US',
          times=False,
        )

      connection: LiveTranscription = await Retry(_get_connection, delay_s=0.1, factor=1.2)
      connection.register_handler(LiveTranscriptionEvent.ERROR, lambda error: logger.error('Error during connection', exception=error))
      return connection
    except ConnectionClosedOK:
      logger.log('Deepgram websocket connection closed')
    except Exception as e:
      logger.error('Error creating connection: ', exception=e)
      raise e

DeepgramConnectionPool = _DeepgramConnectionPool()

Later on, I close the connection with the following piece of code

    async def close_deepgram(self):
        self.logger.debug("Closing Deepgram Connection")
        if self.deepgram:
            await self.deepgram.finish()
            self.logger.log('Closed Deepgram Connection')

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

  • Operating System/Version: MacOS
  • Language: Python

Import Error

What is the current behavior?

from deepgram import Deepgram gives below error:
SyntaxError: future feature annotations is not defined

Steps to reproduce

Python version 3.6.8
pip install deepgram-sdk
from deepgram import Deepgram

Expected behavior

Successful import of Deepgram class

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

  • Operating System/Version: Ubuntu 18.04
  • Language: python 3.6.8

calling sync_prerecorded fails with ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)

What is the current behavior?

    dg_client = Deepgram(DEEPGRAM_API_KEY)

    with open(PATH_TO_FILE, 'rb') as audio:
        source = {'buffer': audio, 'mimetype': MIMETYPE}
        options = { "punctuate": True, "model": "base"}
        response = dg_client.transcription.sync_prerecorded(source, options)

What's happening that seems wrong?

fails with

Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.10/site-packages/deepgram/_utils.py", line 139, in attempt
    with urllib.request.urlopen(req) as resp:
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error EOF occurred in violation of protocol (_ssl.c:2396)>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/mikhailkuznetcov/Developer/speech-rec-python/deepgram_version.py", line 36, in <module>
    main()
  File "/Users/mikhailkuznetcov/Developer/speech-rec-python/deepgram_version.py", line 28, in main
    response = dg_client.transcription.sync_prerecorded(source, options)
  File "/opt/homebrew/lib/python3.10/site-packages/deepgram/transcription.py", line 355, in sync_prerecorded
    return SyncPrerecordedTranscription(
  File "/opt/homebrew/lib/python3.10/site-packages/deepgram/transcription.py", line 111, in __call__
    return _sync_request(
  File "/opt/homebrew/lib/python3.10/site-packages/deepgram/_utils.py", line 153, in _sync_request
    return attempt()
  File "/opt/homebrew/lib/python3.10/site-packages/deepgram/_utils.py", line 148, in attempt
    raise (Exception(f'DG: {exc}') if exc.status < 500 else exc)
AttributeError: 'URLError' object has no attribute 'status'
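The traceback shows two problems: the TLS connection is cut off (the SSLEOFError), and the SDK's retry wrapper then masks that failure with an AttributeError, because urllib.error.URLError has no .status attribute (only HTTPError carries a status code). A minimal sketch of a defensive fix, using hypothetical helper names rather than the SDK's actual internals:

```python
from urllib.error import HTTPError, URLError


def http_status(exc):
    # HTTPError exposes a status code (.status on newer Pythons,
    # .code on older ones); transport-level URLError exposes neither.
    return getattr(exc, "status", getattr(exc, "code", None))


def wrap_error(exc):
    # Mirror the SDK's intent: wrap client-side (4xx) failures in a
    # friendlier Exception, but return server/transport errors as-is.
    status = http_status(exc)
    if status is not None and status < 500:
        return Exception(f"DG: {exc}")
    return exc
```

With a guard like this, the underlying URLError from the TLS failure would propagate intact instead of being hidden behind an AttributeError.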

Steps to reproduce

To make it faster to diagnose the root problem, tell us how we can reproduce the bug.

Expected behavior

  • transcribe the file

What would you expect to happen when following the steps above?

Please tell us about your environment

  • Operating System/Version: Mac M1
  • Language: Python

Other information

Anything else we should know? (e.g. detailed explanation, stack traces, related issues, suggestions on how to fix, links for us to have context, e.g. Stack Overflow, CodePen, etc.)

Passing keywords parameter fails to url encode

What's happening that seems wrong?

Passing the keywords parameter as a list of strings fails:

        response = await dg_client.transcription.prerecorded(
            source,
            {
                "punctuate": True,
                "diarize": True,
                "numerals": True,
                "utterances": True,
                "keywords": [
                    "10-Q",
                    "non-GAAP",
                    "GAAP",
                    "CRM",

ValueError: not enough values to unpack (expected 2, got 1)

The data looks like

[('keywords', '10-Q')]
[('keywords', 'non-GAAP')]

and it should look like

('keywords', '10-Q')
('keywords', 'non-GAAP')

So an extra list is being wrapped around each item before encoding.
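For context, the desired flattening can be achieved by expanding list-valued options into repeated (key, value) pairs before URL-encoding. This is a sketch of the idea, not the SDK's actual query-building code (`build_query` is a hypothetical name):

```python
from urllib.parse import urlencode


def build_query(options):
    # Expand list values into repeated (key, value) pairs so that
    # urlencode emits keywords=...&keywords=... instead of failing
    # to unpack nested single-element lists.
    pairs = []
    for key, value in options.items():
        if isinstance(value, (list, tuple)):
            pairs.extend((key, item) for item in value)
        else:
            pairs.append((key, value))
    return urlencode(pairs)


build_query({"punctuate": True, "keywords": ["10-Q", "non-GAAP"]})
# → 'punctuate=True&keywords=10-Q&keywords=non-GAAP'
```

The standard library's `urlencode(..., doseq=True)` handles list values directly and is another way to get the same repeated-key output.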

Wrong speaker detection / wrong labeling of speakers

Speaker detection/labeling is wrong when I try to transcribe an mp4 video file with the Python SDK. I am using the settings below:

{
    'tier': 'enhanced',
    'punctuate': True,
    'diarize': True,
    'utterances': True,
    'utt_split': 0.3,
}
I am attaching the expected output file and the actual output file:
current_whatsapp.docx
expected_whatsapp.docx

Can someone solve my problem?
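For anyone debugging similar diarization output, here is a small sketch of how the utterances in a response like the one below can be rendered as speaker-labeled lines, which makes mislabeled speaker turns easy to spot (`speaker_lines` is a hypothetical helper, not an SDK function):

```python
def speaker_lines(response):
    # Collapse consecutive utterances from the same speaker into
    # "Speaker N: ..." lines, using the speaker field that diarize=True
    # attaches to each utterance in the response.
    lines = []
    for utt in response["results"]["utterances"]:
        label = f"Speaker {utt['speaker']}"
        if lines and lines[-1][0] == label:
            lines[-1] = (label, lines[-1][1] + " " + utt["transcript"])
        else:
            lines.append((label, utt["transcript"]))
    return [f"{label}: {text}" for label, text in lines]
```

In the response below every word carries speaker 0, often with a low speaker_confidence, which is why the rendered output never switches speakers.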

Below is the JSON response of the transcription:

{'metadata': {'transaction_key': 'deprecated', 'request_id': '2a124439-6333-4b0f-9ac7-0063b303e6ba', 'sha256': '2af7b928fe91cfca4b51126b54d19c69af3b5c39db65da7d9d87d01e74faf7ca', 'created': '2022-11-10T05:27:39.785Z', 'duration': 59.94669, 'channels': 1, 'models': ['125125fb-e391-458e-a227-a60d6426f5d6'], 'model_info': {'125125fb-e391-458e-a227-a60d6426f5d6': {'name': 'general-enhanced', 'version': '2022-05-18.0', 'tier': 'enhanced'}}}, 'results': {'channels': [{'alternatives': [{'transcript': "Hello, Kamiji. How are you? I'm fine. And you? Fine. So tell me about yourself. myself. I'm a software engineer. I'm working in a global IT app as project manager. Okay. And I have more than ten year experience in PHP. During my experience, I worked on PHP. So which projects are you working on? Right now I'm working on? Right now I'm working on single project. That is it. that is ASR means speech recognition on that. Which challenges are you facing? Right now, I'm facing some challenges related to the speaker changes. Like, when I transcribe the video into text format. Sometimes the speak speaker labeling are wrong. Okay. 
I'm disconnecting the", 'confidence': 0.94970703, 'words': [{'word': 'hello', 'start': 1.1992188, 'end': 1.5195312, 'confidence': 0.75268555, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'Hello,'}, {'word': 'kamiji', 'start': 1.5195312, 'end': 1.9990234, 'confidence': 0.6906738, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'Kamiji.'}, {'word': 'how', 'start': 1.9990234, 'end': 2.1582031, 'confidence': 0.9946289, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'How'}, {'word': 'are', 'start': 2.1582031, 'end': 2.3183594, 'confidence': 0.98095703, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'are'}, {'word': 'you', 'start': 2.3183594, 'end': 2.8183594, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'you?'}, {'word': "i'm", 'start': 3.0390625, 'end': 3.2773438, 'confidence': 0.79833984, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'fine', 'start': 3.2773438, 'end': 3.5976562, 'confidence': 0.78344727, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'fine.'}, {'word': 'and', 'start': 3.5976562, 'end': 3.9179688, 'confidence': 0.5410156, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'And'}, {'word': 'you', 'start': 3.9179688, 'end': 4.4179688, 'confidence': 0.8601074, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'you?'}, {'word': 'fine', 'start': 4.5585938, 'end': 5.0585938, 'confidence': 0.9968262, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'Fine.'}, {'word': 'so', 'start': 5.3554688, 'end': 5.5976562, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'So'}, {'word': 'tell', 'start': 5.5976562, 'end': 5.8359375, 'confidence': 0.93359375, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'tell'}, {'word': 'me', 'start': 5.8359375, 'end': 5.9960938, 'confidence': 
0.99902344, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'me'}, {'word': 'about', 'start': 5.9960938, 'end': 6.2382812, 'confidence': 0.9980469, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'about'}, {'word': 'yourself', 'start': 6.2382812, 'end': 6.7382812, 'confidence': 0.7758789, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'yourself.'}, {'word': 'myself', 'start': 7.6835938, 'end': 8.183594, 'confidence': 0.61417645, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'myself.'}, {'word': "i'm", 'start': 8.8828125, 'end': 9.0859375, 'confidence': 0.85961914, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'a', 'start': 9.0859375, 'end': 9.203125, 'confidence': 0.99609375, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'a'}, {'word': 'software', 'start': 9.203125, 'end': 9.640625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'software'}, {'word': 'engineer', 'start': 9.640625, 'end': 10.140625, 'confidence': 0.9729004, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'engineer.'}, {'word': "i'm", 'start': 10.2421875, 'end': 10.484375, 'confidence': 0.8195801, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 10.484375, 'end': 10.84375, 'confidence': 0.94970703, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'working'}, {'word': 'in', 'start': 10.84375, 'end': 11.125, 'confidence': 0.99609375, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'in'}, {'word': 'a', 'start': 11.125, 'end': 11.203125, 'confidence': 0.6328125, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'a'}, {'word': 'global', 'start': 11.203125, 'end': 11.640625, 'confidence': 0.98828125, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'global'}, {'word': 'it', 'start': 
11.640625, 'end': 12.0, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'IT'}, {'word': 'app', 'start': 12.0, 'end': 12.203125, 'confidence': 0.5722656, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'app'}, {'word': 'as', 'start': 12.203125, 'end': 12.703125, 'confidence': 0.9086914, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'as'}, {'word': 'project', 'start': 13.640625, 'end': 14.0, 'confidence': 0.6201172, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'project'}, {'word': 'manager', 'start': 14.0, 'end': 14.5, 'confidence': 0.9519043, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'manager.'}, {'word': 'okay', 'start': 15.2265625, 'end': 15.546875, 'confidence': 0.7788086, 'speaker': 0, 'speaker_confidence': 0.041602314, 'punctuated_word': 'Okay.'}, {'word': 'and', 'start': 15.546875, 'end': 16.046875, 'confidence': 0.79345703, 'speaker': 0, 'speaker_confidence': 0.041602314, 'punctuated_word': 'And'}, {'word': 'i', 'start': 17.09375, 'end': 17.21875, 'confidence': 0.95654297, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'I'}, {'word': 'have', 'start': 17.21875, 'end': 17.46875, 'confidence': 0.99560547, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'have'}, {'word': 'more', 'start': 17.46875, 'end': 17.65625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'more'}, {'word': 'than', 'start': 17.65625, 'end': 17.90625, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'than'}, {'word': 'ten', 'start': 17.90625, 'end': 18.09375, 'confidence': 0.9941406, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'ten'}, {'word': 'year', 'start': 18.09375, 'end': 18.34375, 'confidence': 0.9902344, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'year'}, {'word': 'experience', 
'start': 18.34375, 'end': 18.84375, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'experience'}, {'word': 'in', 'start': 18.90625, 'end': 19.140625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'in'}, {'word': 'php', 'start': 19.140625, 'end': 19.640625, 'confidence': 0.90063477, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'PHP.'}, {'word': 'during', 'start': 20.421875, 'end': 20.703125, 'confidence': 0.9951172, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'During'}, {'word': 'my', 'start': 20.703125, 'end': 20.90625, 'confidence': 0.9506836, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'my'}, {'word': 'experience', 'start': 20.90625, 'end': 21.40625, 'confidence': 0.96850586, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'experience,'}, {'word': 'i', 'start': 21.46875, 'end': 21.625, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'I'}, {'word': 'worked', 'start': 21.625, 'end': 21.90625, 'confidence': 0.4501953, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'worked'}, {'word': 'on', 'start': 21.90625, 'end': 22.0625, 'confidence': 0.9453125, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'on'}, {'word': 'php', 'start': 22.0625, 'end': 22.5625, 'confidence': 0.28125, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'PHP.'}, {'word': 'so', 'start': 25.96875, 'end': 26.140625, 'confidence': 0.13635254, 'speaker': 0, 'speaker_confidence': 0.019748092, 'punctuated_word': 'So'}, {'word': 'which', 'start': 26.140625, 'end': 26.328125, 'confidence': 0.9736328, 'speaker': 0, 'speaker_confidence': 0.019748092, 'punctuated_word': 'which'}, {'word': 'projects', 'start': 26.328125, 'end': 26.78125, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'projects'}, 
{'word': 'are', 'start': 26.78125, 'end': 26.890625, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'are'}, {'word': 'you', 'start': 26.890625, 'end': 27.015625, 'confidence': 0.9824219, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'you'}, {'word': 'working', 'start': 27.015625, 'end': 27.296875, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'working'}, {'word': 'on', 'start': 27.296875, 'end': 27.796875, 'confidence': 0.8391113, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'on?'}, {'word': 'right', 'start': 28.375, 'end': 28.65625, 'confidence': 0.86621094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 28.65625, 'end': 28.703125, 'confidence': 0.99316406, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'now'}, {'word': "i'm", 'start': 28.703125, 'end': 28.75, 'confidence': 0.77368164, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 28.75, 'end': 28.796875, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'working'}, {'word': 'on', 'start': 28.796875, 'end': 28.84375, 'confidence': 0.75878906, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'on?'}, {'word': 'right', 'start': 28.84375, 'end': 28.890625, 'confidence': 0.62402344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 28.890625, 'end': 28.9375, 'confidence': 0.99121094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'now'}, {'word': "i'm", 'start': 28.9375, 'end': 29.171875, 'confidence': 0.76293945, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 29.171875, 'end': 29.53125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'working'}, 
{'word': 'on', 'start': 29.53125, 'end': 30.03125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'on'}, {'word': 'single', 'start': 30.65625, 'end': 31.015625, 'confidence': 0.74316406, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'single'}, {'word': 'project', 'start': 31.015625, 'end': 31.5, 'confidence': 0.7504883, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'project.'}, {'word': 'that', 'start': 31.5, 'end': 31.65625, 'confidence': 0.8725586, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'That'}, {'word': 'is', 'start': 31.65625, 'end': 31.84375, 'confidence': 0.95654297, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'is'}, {'word': 'it', 'start': 31.84375, 'end': 32.34375, 'confidence': 0.8676758, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'it.'}, {'word': 'that', 'start': 32.96875, 'end': 33.21875, 'confidence': 0.38134766, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'that'}, {'word': 'is', 'start': 33.21875, 'end': 33.71875, 'confidence': 0.89990234, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'is'}, {'word': 'asr', 'start': 34.0625, 'end': 34.5625, 'confidence': 0.87939453, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'ASR'}, {'word': 'means', 'start': 35.4375, 'end': 35.9375, 'confidence': 0.6923828, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'means'}, {'word': 'speech', 'start': 35.9375, 'end': 36.21875, 'confidence': 0.52246094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'speech'}, {'word': 'recognition', 'start': 36.21875, 'end': 36.71875, 'confidence': 0.92529297, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'recognition'}, {'word': 'on', 'start': 37.71875, 'end': 37.84375, 'confidence': 0.5708008, 'speaker': 0, 'speaker_confidence': 0.61906564, 
'punctuated_word': 'on'}, {'word': 'that', 'start': 37.84375, 'end': 38.34375, 'confidence': 0.9433594, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'that.'}, {'word': 'which', 'start': 38.625, 'end': 38.875, 'confidence': 0.9770508, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'Which'}, {'word': 'challenges', 'start': 38.875, 'end': 39.34375, 'confidence': 0.8046875, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'challenges'}, {'word': 'are', 'start': 39.34375, 'end': 39.5, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'are'}, {'word': 'you', 'start': 39.5, 'end': 39.625, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'you'}, {'word': 'facing', 'start': 39.625, 'end': 40.125, 'confidence': 0.99658203, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'facing?'}, {'word': 'right', 'start': 41.71875, 'end': 41.96875, 'confidence': 0.63623047, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 41.96875, 'end': 42.15625, 'confidence': 0.81274414, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'now,'}, {'word': "i'm", 'start': 42.15625, 'end': 42.5625, 'confidence': 0.98950195, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': "I'm"}, {'word': 'facing', 'start': 42.5625, 'end': 43.0, 'confidence': 0.9868164, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'facing'}, {'word': 'some', 'start': 43.0, 'end': 43.25, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'some'}, {'word': 'challenges', 'start': 43.25, 'end': 43.75, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'challenges'}, {'word': 'related', 'start': 44.03125, 'end': 44.53125, 'confidence': 0.98583984, 'speaker': 0, 'speaker_confidence': 0.57842153, 
'punctuated_word': 'related'}, {'word': 'to', 'start': 44.53125, 'end': 44.625, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'to'}, {'word': 'the', 'start': 44.625, 'end': 44.84375, 'confidence': 0.9716797, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'the'}, {'word': 'speaker', 'start': 44.84375, 'end': 45.1875, 'confidence': 0.9902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'speaker'}, {'word': 'changes', 'start': 45.1875, 'end': 45.6875, 'confidence': 0.9050293, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'changes.'}, {'word': 'like', 'start': 46.34375, 'end': 46.84375, 'confidence': 0.9001465, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'Like,'}, {'word': 'when', 'start': 47.0625, 'end': 47.3125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'when'}, {'word': 'i', 'start': 47.3125, 'end': 47.8125, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'I'}, {'word': 'transcribe', 'start': 48.5625, 'end': 49.0625, 'confidence': 0.9313965, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'transcribe'}, {'word': 'the', 'start': 49.3125, 'end': 49.56, 'confidence': 0.9580078, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'the'}, {'word': 'video', 'start': 49.8125, 'end': 50.3125, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'video'}, {'word': 'into', 'start': 50.59375, 'end': 51.0, 'confidence': 0.91308594, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'into'}, {'word': 'text', 'start': 51.0, 'end': 51.3125, 'confidence': 0.7885742, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'text'}, {'word': 'format', 'start': 51.3125, 'end': 51.8125, 'confidence': 0.78125, 'speaker': 0, 'speaker_confidence': 0.5727463, 
'punctuated_word': 'format.'}, {'word': 'sometimes', 'start': 52.6875, 'end': 53.1875, 'confidence': 0.9589844, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'Sometimes'}, {'word': 'the', 'start': 53.1875, 'end': 53.375, 'confidence': 0.49975586, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'the'}, {'word': 'speak', 'start': 53.375, 'end': 53.625, 'confidence': 0.84521484, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'speak'}, {'word': 'speaker', 'start': 53.84375, 'end': 54.25, 'confidence': 0.71191406, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'speaker'}, {'word': 'labeling', 'start': 54.25, 'end': 54.71875, 'confidence': 0.9682617, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'labeling'}, {'word': 'are', 'start': 54.71875, 'end': 54.96875, 'confidence': 0.48291016, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'are'}, {'word': 'wrong', 'start': 54.96875, 'end': 55.46875, 'confidence': 0.9030762, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'wrong.'}, {'word': 'okay', 'start': 56.40625, 'end': 56.90625, 'confidence': 0.77368164, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'Okay.'}, {'word': "i'm", 'start': 58.0625, 'end': 58.40625, 'confidence': 0.95996094, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': "I'm"}, {'word': 'disconnecting', 'start': 58.40625, 'end': 58.90625, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'disconnecting'}, {'word': 'the', 'start': 59.1875, 'end': 59.6875, 'confidence': 0.88964844, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'the'}]}]}], 'utterances': [{'start': 1.1992188, 'end': 6.7382812, 'confidence': 0.87348634, 'channel': 0, 'transcript': "Hello, Kamiji. How are you? I'm fine. And you? Fine. 
So tell me about yourself.", 'words': [{'word': 'hello', 'start': 1.1992188, 'end': 1.5195312, 'confidence': 0.75268555, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'Hello,'}, {'word': 'kamiji', 'start': 1.5195312, 'end': 1.9990234, 'confidence': 0.6906738, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'Kamiji.'}, {'word': 'how', 'start': 1.9990234, 'end': 2.1582031, 'confidence': 0.9946289, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'How'}, {'word': 'are', 'start': 2.1582031, 'end': 2.3183594, 'confidence': 0.98095703, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'are'}, {'word': 'you', 'start': 2.3183594, 'end': 2.8183594, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'you?'}, {'word': "i'm", 'start': 3.0390625, 'end': 3.2773438, 'confidence': 0.79833984, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'fine', 'start': 3.2773438, 'end': 3.5976562, 'confidence': 0.78344727, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'fine.'}, {'word': 'and', 'start': 3.5976562, 'end': 3.9179688, 'confidence': 0.5410156, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'And'}, {'word': 'you', 'start': 3.9179688, 'end': 4.4179688, 'confidence': 0.8601074, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'you?'}, {'word': 'fine', 'start': 4.5585938, 'end': 5.0585938, 'confidence': 0.9968262, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'Fine.'}, {'word': 'so', 'start': 5.3554688, 'end': 5.5976562, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'So'}, {'word': 'tell', 'start': 5.5976562, 'end': 5.8359375, 'confidence': 0.93359375, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'tell'}, {'word': 'me', 'start': 5.8359375, 'end': 5.9960938, 'confidence': 0.99902344, 
'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'me'}, {'word': 'about', 'start': 5.9960938, 'end': 6.2382812, 'confidence': 0.9980469, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'about'}, {'word': 'yourself', 'start': 6.2382812, 'end': 6.7382812, 'confidence': 0.7758789, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'yourself.'}], 'speaker': 0, 'id': 'efde132f-d302-4900-b368-901c67ad5c72'}, {'start': 7.6835938, 'end': 8.183594, 'confidence': 0.61417645, 'channel': 0, 'transcript': 'myself.', 'words': [{'word': 'myself', 'start': 7.6835938, 'end': 8.183594, 'confidence': 0.61417645, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'myself.'}], 'speaker': 0, 'id': '61f3c5d5-e229-4520-9e0f-f4abf924173c'}, {'start': 8.8828125, 'end': 12.703125, 'confidence': 0.89109296, 'channel': 0, 'transcript': "I'm a software engineer. I'm working in a global IT app as", 'words': [{'word': "i'm", 'start': 8.8828125, 'end': 9.0859375, 'confidence': 0.85961914, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'a', 'start': 9.0859375, 'end': 9.203125, 'confidence': 0.99609375, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'a'}, {'word': 'software', 'start': 9.203125, 'end': 9.640625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'software'}, {'word': 'engineer', 'start': 9.640625, 'end': 10.140625, 'confidence': 0.9729004, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'engineer.'}, {'word': "i'm", 'start': 10.2421875, 'end': 10.484375, 'confidence': 0.8195801, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 10.484375, 'end': 10.84375, 'confidence': 0.94970703, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'working'}, {'word': 'in', 'start': 10.84375, 'end': 11.125, 'confidence': 0.99609375, 'speaker': 0, 
'speaker_confidence': 0.64976156, 'punctuated_word': 'in'}, {'word': 'a', 'start': 11.125, 'end': 11.203125, 'confidence': 0.6328125, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'a'}, {'word': 'global', 'start': 11.203125, 'end': 11.640625, 'confidence': 0.98828125, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'global'}, {'word': 'it', 'start': 11.640625, 'end': 12.0, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'IT'}, {'word': 'app', 'start': 12.0, 'end': 12.203125, 'confidence': 0.5722656, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'app'}, {'word': 'as', 'start': 12.203125, 'end': 12.703125, 'confidence': 0.9086914, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'as'}], 'speaker': 0, 'id': '7030b852-6e2b-4271-be1f-c000668762b0'}, {'start': 13.640625, 'end': 14.5, 'confidence': 0.78601074, 'channel': 0, 'transcript': 'project manager.', 'words': [{'word': 'project', 'start': 13.640625, 'end': 14.0, 'confidence': 0.6201172, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'project'}, {'word': 'manager', 'start': 14.0, 'end': 14.5, 'confidence': 0.9519043, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'manager.'}], 'speaker': 0, 'id': '0433d626-1ab4-4e51-a7a0-e346801b2720'}, {'start': 15.2265625, 'end': 16.046875, 'confidence': 0.7861328, 'channel': 0, 'transcript': 'Okay. 
And', 'words': [{'word': 'okay', 'start': 15.2265625, 'end': 15.546875, 'confidence': 0.7788086, 'speaker': 0, 'speaker_confidence': 0.041602314, 'punctuated_word': 'Okay.'}, {'word': 'and', 'start': 15.546875, 'end': 16.046875, 'confidence': 0.79345703, 'speaker': 0, 'speaker_confidence': 0.041602314, 'punctuated_word': 'And'}], 'speaker': 0, 'id': 'f9754b97-95c0-4cfe-813b-4ed088bf4df3'}, {'start': 17.09375, 'end': 19.640625, 'confidence': 0.98147243, 'channel': 0, 'transcript': 'I have more than ten year experience in PHP.', 'words': [{'word': 'i', 'start': 17.09375, 'end': 17.21875, 'confidence': 0.95654297, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'I'}, {'word': 'have', 'start': 17.21875, 'end': 17.46875, 'confidence': 0.99560547, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'have'}, {'word': 'more', 'start': 17.46875, 'end': 17.65625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'more'}, {'word': 'than', 'start': 17.65625, 'end': 17.90625, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'than'}, {'word': 'ten', 'start': 17.90625, 'end': 18.09375, 'confidence': 0.9941406, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'ten'}, {'word': 'year', 'start': 18.09375, 'end': 18.34375, 'confidence': 0.9902344, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'year'}, {'word': 'experience', 'start': 18.34375, 'end': 18.84375, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'experience'}, {'word': 'in', 'start': 18.90625, 'end': 19.140625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'in'}, {'word': 'php', 'start': 19.140625, 'end': 19.640625, 'confidence': 0.90063477, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'PHP.'}], 'speaker': 0, 'id': '2555d72a-3eb6-4b03-9417-424c452d7780'}, 
{'start': 20.421875, 'end': 22.5625, 'confidence': 0.7983747, 'channel': 0, 'transcript': 'During my experience, I worked on PHP.', 'words': [{'word': 'during', 'start': 20.421875, 'end': 20.703125, 'confidence': 0.9951172, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'During'}, {'word': 'my', 'start': 20.703125, 'end': 20.90625, 'confidence': 0.9506836, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'my'}, {'word': 'experience', 'start': 20.90625, 'end': 21.40625, 'confidence': 0.96850586, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'experience,'}, {'word': 'i', 'start': 21.46875, 'end': 21.625, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'I'}, {'word': 'worked', 'start': 21.625, 'end': 21.90625, 'confidence': 0.4501953, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'worked'}, {'word': 'on', 'start': 21.90625, 'end': 22.0625, 'confidence': 0.9453125, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'on'}, {'word': 'php', 'start': 22.0625, 'end': 22.5625, 'confidence': 0.28125, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'PHP.'}], 'speaker': 0, 'id': '61a1bffa-9c00-41e5-a0cb-d060f3e2d4ca'}, {'start': 25.96875, 'end': 27.796875, 'confidence': 0.8465925, 'channel': 0, 'transcript': 'So which projects are you working on?', 'words': [{'word': 'so', 'start': 25.96875, 'end': 26.140625, 'confidence': 0.13635254, 'speaker': 0, 'speaker_confidence': 0.019748092, 'punctuated_word': 'So'}, {'word': 'which', 'start': 26.140625, 'end': 26.328125, 'confidence': 0.9736328, 'speaker': 0, 'speaker_confidence': 0.019748092, 'punctuated_word': 'which'}, {'word': 'projects', 'start': 26.328125, 'end': 26.78125, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'projects'}, {'word': 'are', 'start': 26.78125, 'end': 26.890625, 'confidence': 0.9995117, 'speaker': 0, 
'speaker_confidence': 0.0, 'punctuated_word': 'are'}, {'word': 'you', 'start': 26.890625, 'end': 27.015625, 'confidence': 0.9824219, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'you'}, {'word': 'working', 'start': 27.015625, 'end': 27.296875, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'working'}, {'word': 'on', 'start': 27.296875, 'end': 27.796875, 'confidence': 0.8391113, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'on?'}], 'speaker': 0, 'id': 'f9004e9d-9cde-4a64-b032-4070a409bfb6'}, {'start': 28.375, 'end': 30.03125, 'confidence': 0.87666017, 'channel': 0, 'transcript': "Right now I'm working on? Right now I'm working on", 'words': [{'word': 'right', 'start': 28.375, 'end': 28.65625, 'confidence': 0.86621094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 28.65625, 'end': 28.703125, 'confidence': 0.99316406, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'now'}, {'word': "i'm", 'start': 28.703125, 'end': 28.75, 'confidence': 0.77368164, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 28.75, 'end': 28.796875, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'working'}, {'word': 'on', 'start': 28.796875, 'end': 28.84375, 'confidence': 0.75878906, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'on?'}, {'word': 'right', 'start': 28.84375, 'end': 28.890625, 'confidence': 0.62402344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 28.890625, 'end': 28.9375, 'confidence': 0.99121094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'now'}, {'word': "i'm", 'start': 28.9375, 'end': 29.171875, 'confidence': 0.76293945, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 29.171875, 'end': 
29.53125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'working'}, {'word': 'on', 'start': 29.53125, 'end': 30.03125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'on'}], 'speaker': 0, 'id': 'f6d42af6-2873-4bac-8083-4179fcb9d676'}, {'start': 30.65625, 'end': 32.34375, 'confidence': 0.83808595, 'channel': 0, 'transcript': 'single project. That is it.', 'words': [{'word': 'single', 'start': 30.65625, 'end': 31.015625, 'confidence': 0.74316406, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'single'}, {'word': 'project', 'start': 31.015625, 'end': 31.5, 'confidence': 0.7504883, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'project.'}, {'word': 'that', 'start': 31.5, 'end': 31.65625, 'confidence': 0.8725586, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'That'}, {'word': 'is', 'start': 31.65625, 'end': 31.84375, 'confidence': 0.95654297, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'is'}, {'word': 'it', 'start': 31.84375, 'end': 32.34375, 'confidence': 0.8676758, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'it.'}], 'speaker': 0, 'id': 'dec794ef-efb7-47ac-84f7-0a69a179acf9'}, {'start': 32.96875, 'end': 33.71875, 'confidence': 0.640625, 'channel': 0, 'transcript': 'that is', 'words': [{'word': 'that', 'start': 32.96875, 'end': 33.21875, 'confidence': 0.38134766, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'that'}, {'word': 'is', 'start': 33.21875, 'end': 33.71875, 'confidence': 0.89990234, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'is'}], 'speaker': 0, 'id': '1fe6444e-bdde-4f77-9b86-39bf61f0f322'}, {'start': 34.0625, 'end': 34.5625, 'confidence': 0.87939453, 'channel': 0, 'transcript': 'ASR', 'words': [{'word': 'asr', 'start': 34.0625, 'end': 34.5625, 'confidence': 0.87939453, 'speaker': 0, 'speaker_confidence': 0.61906564, 
'punctuated_word': 'ASR'}], 'speaker': 0, 'id': '6cf98d9a-55f4-4683-9b0e-30edc887eb7a'}, {'start': 35.4375, 'end': 36.71875, 'confidence': 0.7133789, 'channel': 0, 'transcript': 'means speech recognition', 'words': [{'word': 'means', 'start': 35.4375, 'end': 35.9375, 'confidence': 0.6923828, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'means'}, {'word': 'speech', 'start': 35.9375, 'end': 36.21875, 'confidence': 0.52246094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'speech'}, {'word': 'recognition', 'start': 36.21875, 'end': 36.71875, 'confidence': 0.92529297, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'recognition'}], 'speaker': 0, 'id': '17cb8e5d-03e2-481d-b19c-4d145e37502d'}, {'start': 37.71875, 'end': 40.125, 'confidence': 0.8987165, 'channel': 0, 'transcript': 'on that. Which challenges are you facing?', 'words': [{'word': 'on', 'start': 37.71875, 'end': 37.84375, 'confidence': 0.5708008, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'on'}, {'word': 'that', 'start': 37.84375, 'end': 38.34375, 'confidence': 0.9433594, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'that.'}, {'word': 'which', 'start': 38.625, 'end': 38.875, 'confidence': 0.9770508, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'Which'}, {'word': 'challenges', 'start': 38.875, 'end': 39.34375, 'confidence': 0.8046875, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'challenges'}, {'word': 'are', 'start': 39.34375, 'end': 39.5, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'are'}, {'word': 'you', 'start': 39.5, 'end': 39.625, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'you'}, {'word': 'facing', 'start': 39.625, 'end': 40.125, 'confidence': 0.99658203, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'facing?'}], 'speaker': 0, 'id': 
'40604b98-dace-46af-b209-809840722b07'}, {'start': 41.71875, 'end': 45.6875, 'confidence': 0.9342374, 'channel': 0, 'transcript': "Right now, I'm facing some challenges related to the speaker changes.", 'words': [{'word': 'right', 'start': 41.71875, 'end': 41.96875, 'confidence': 0.63623047, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 41.96875, 'end': 42.15625, 'confidence': 0.81274414, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'now,'}, {'word': "i'm", 'start': 42.15625, 'end': 42.5625, 'confidence': 0.98950195, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': "I'm"}, {'word': 'facing', 'start': 42.5625, 'end': 43.0, 'confidence': 0.9868164, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'facing'}, {'word': 'some', 'start': 43.0, 'end': 43.25, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'some'}, {'word': 'challenges', 'start': 43.25, 'end': 43.75, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'challenges'}, {'word': 'related', 'start': 44.03125, 'end': 44.53125, 'confidence': 0.98583984, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'related'}, {'word': 'to', 'start': 44.53125, 'end': 44.625, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'to'}, {'word': 'the', 'start': 44.625, 'end': 44.84375, 'confidence': 0.9716797, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'the'}, {'word': 'speaker', 'start': 44.84375, 'end': 45.1875, 'confidence': 0.9902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'speaker'}, {'word': 'changes', 'start': 45.1875, 'end': 45.6875, 'confidence': 0.9050293, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'changes.'}], 'speaker': 0, 'id': 'a005d2b7-9517-488f-82d4-9d66d3187e6d'}, {'start': 46.34375, 
'end': 47.8125, 'confidence': 0.96622723, 'channel': 0, 'transcript': 'Like, when I', 'words': [{'word': 'like', 'start': 46.34375, 'end': 46.84375, 'confidence': 0.9001465, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'Like,'}, {'word': 'when', 'start': 47.0625, 'end': 47.3125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'when'}, {'word': 'i', 'start': 47.3125, 'end': 47.8125, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'I'}], 'speaker': 0, 'id': 'ed305b95-e17c-4da2-a11f-263615a605ad'}, {'start': 48.5625, 'end': 51.8125, 'confidence': 0.8953044, 'channel': 0, 'transcript': 'transcribe the video into text format.', 'words': [{'word': 'transcribe', 'start': 48.5625, 'end': 49.0625, 'confidence': 0.9313965, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'transcribe'}, {'word': 'the', 'start': 49.3125, 'end': 49.56, 'confidence': 0.9580078, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'the'}, {'word': 'video', 'start': 49.8125, 'end': 50.3125, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'video'}, {'word': 'into', 'start': 50.59375, 'end': 51.0, 'confidence': 0.91308594, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'into'}, {'word': 'text', 'start': 51.0, 'end': 51.3125, 'confidence': 0.7885742, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'text'}, {'word': 'format', 'start': 51.3125, 'end': 51.8125, 'confidence': 0.78125, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'format.'}], 'speaker': 0, 'id': 'f521d234-ad31-4ea6-9443-16e74f40b1cc'}, {'start': 52.6875, 'end': 55.46875, 'confidence': 0.7671596, 'channel': 0, 'transcript': 'Sometimes the speak speaker labeling are wrong.', 'words': [{'word': 'sometimes', 'start': 52.6875, 'end': 53.1875, 'confidence': 0.9589844, 'speaker': 0, 'speaker_confidence': 
0.5727463, 'punctuated_word': 'Sometimes'}, {'word': 'the', 'start': 53.1875, 'end': 53.375, 'confidence': 0.49975586, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'the'}, {'word': 'speak', 'start': 53.375, 'end': 53.625, 'confidence': 0.84521484, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'speak'}, {'word': 'speaker', 'start': 53.84375, 'end': 54.25, 'confidence': 0.71191406, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'speaker'}, {'word': 'labeling', 'start': 54.25, 'end': 54.71875, 'confidence': 0.9682617, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'labeling'}, {'word': 'are', 'start': 54.71875, 'end': 54.96875, 'confidence': 0.48291016, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'are'}, {'word': 'wrong', 'start': 54.96875, 'end': 55.46875, 'confidence': 0.9030762, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'wrong.'}], 'speaker': 0, 'id': '87a5b8f7-069c-4ffd-ac46-0ac464cdc7e0'}, {'start': 56.40625, 'end': 56.90625, 'confidence': 0.77368164, 'channel': 0, 'transcript': 'Okay.', 'words': [{'word': 'okay', 'start': 56.40625, 'end': 56.90625, 'confidence': 0.77368164, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'Okay.'}], 'speaker': 0, 'id': 'f70db353-b977-4cda-94f8-819dfdb53505'}, {'start': 58.0625, 'end': 59.6875, 'confidence': 0.94954425, 'channel': 0, 'transcript': "I'm disconnecting the", 'words': [{'word': "i'm", 'start': 58.0625, 'end': 58.40625, 'confidence': 0.95996094, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': "I'm"}, {'word': 'disconnecting', 'start': 58.40625, 'end': 58.90625, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'disconnecting'}, {'word': 'the', 'start': 59.1875, 'end': 59.6875, 'confidence': 0.88964844, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'the'}], 'speaker': 0, 'id': 
'2697a274-547d-4486-abb3-9b5e7229a40c'}]}}

Sentiment (Listen) supported

  • old param analyze_sentiment should be kept for backwards compatibility.
  • new param sentiment added to enable sentiment analysis on /listen requests.

Not able to install deepgram-sdk on Mac

Unable to install deepgram-sdk and import it

I've tried running the following commands
pip install deepgram-sdk
as well as
pip3 install deepgram-sdk
to install it using Python3

I even checked the libraries installed by typing:
pip list
and
pip3 list

And I was able to see that deepgram-sdk 0.3.0 was installed on both.
But I am still not able to import it.

Please help

Topics (Listen) supported

  • old param detect_topics should be kept for backwards compatibility.
  • new param topics added to enable topic detection on /listen requests.

Version bump automation

Proposed changes

We need the version to update automatically with the CI/CD pipeline.

Context

We want to keep the version in sync without having to manually type in the new version every time there is an update.

Possible Implementation

Possibly a GitHub Action.

Other information

This may be updated in the setup.py file, or it may be updated in the entry point file (__init__.py).

Deepgram Python speech Live. __call__ does not start coroutine for sending

What is the current behavior?

deepgram.transcription.live()

In some cases the code ignores the send() call.
So recognition does not start until finish() is called or the request times out.

Steps to reproduce

This happens when the main data-sending loop is itself a coroutine.

Expected behavior

Expected to work as in the example.


Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

  • Operating System/Version: Linux, any
  • Language: Python
  • Browser: n/a

Other information

After extensive debugging was found that at def __call__ there is line

asyncio.create_task(self._start())

This creates the coroutine as a task, but the task never actually starts if the main loop is itself a coroutine and nothing like asyncio.sleep() is ever awaited. If the current coroutine never suspends or waits, the new task never gets a chance to run: create_task() is not thread.start()!

Suggested change: after that line add
await asyncio.sleep(0)

Or change the code sample for streaming to do deepgramLive.send() via await
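The scheduling behavior described above can be demonstrated with a minimal, self-contained sketch (not SDK code): a task created with create_task() only runs once the current coroutine yields control back to the event loop.

```python
import asyncio

async def worker(results):
    # The task body only runs once the event loop regains control.
    results.append("worker ran")

async def main():
    results = []
    asyncio.create_task(worker(results))
    # The task is scheduled but has NOT run yet:
    # create_task() does not preempt the current coroutine.
    assert results == []
    # Yielding for one loop iteration lets the pending task run.
    await asyncio.sleep(0)
    assert results == ["worker ran"]
    return results

asyncio.run(main())
```

Without the `await asyncio.sleep(0)`, the first assertion would still hold at the end of `main()`, which is exactly the "send() silently ignored" symptom reported here.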

Include webvtt and srt formatting

Proposed changes

Provide methods to transform a Deepgram prerecorded transcription response into WebVTT or SRT captions.
We've decided to do this as a separate package.

Context

This feature was available in V2 as to_WebVTT() and to_SRT(). We need to continue to provide this feature in V3.

Possible Implementation

See V2 implementation: https://github.com/deepgram/deepgram-python-sdk/blob/main/deepgram/extra.py

Take this V2 implementation and create a standalone python package to create captions. The standalone package can be a dependency in this SDK so users can use it from the SDK, or they can use it independently.
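As a rough illustration of what such a captions package could do (the helper names here are hypothetical, not the V2 API), SRT output can be built from the start, end, and transcript fields of each utterance in the response:

```python
def srt_timestamp(seconds):
    # SRT timestamps use the HH:MM:SS,mmm format.
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def to_srt(utterances):
    # utterances: list of dicts with 'start', 'end', 'transcript' keys,
    # matching the shape of Deepgram's utterances array.
    blocks = []
    for i, u in enumerate(utterances, start=1):
        blocks.append(
            f"{i}\n"
            f"{srt_timestamp(u['start'])} --> {srt_timestamp(u['end'])}\n"
            f"{u['transcript']}\n"
        )
    return "\n".join(blocks)
```

A WebVTT variant would differ mainly in the `WEBVTT` header and the use of `.` instead of `,` in timestamps.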

Other information

Deepgram timeouts

Exception: DG: 408, message='Request Timeout', url=URL('https://api.deepgram.com/v1/listen?smart_format=true&utterances=true&summarize=v2&model=video&diarize=true')

Worked fine till a few hours ago.

Python API | Option to add username, secret and key

What is the current behavior?

We consume DG in our production. I have been provided with username, key and secret for testing some features for prod build.

What's happening that seems wrong?

I read the docs, but I don't see examples where I can provide a key, secret, and username. The instantiation example contains api_key and api_url in the params dictionary. By giving just the key and url, it throws an unauthorized exception.

sync_prerecorded doesn't return...

What is the current behavior?

Calling sync_prerecorded sometimes doesn't return, even though the request shows as processed on the dashboard.

DEEPGRAM_TRANSCRIPT_API_OPTIONS = {
    "model": "general",
    "tier": "enhanced",
    "utterances": True,
    "punctuate": True,
    "smart_format": True,
    "paragraphs": True,
    "diarize": True,
    "language": "en",
}
deepgram = Deepgram(API_KEY)
with open("test.mp3", "rb") as fp:
    source = {"buffer": fp, "mimetype": "audio/wav"}
    print("printed")
    res = deepgram.transcription.sync_prerecorded(
        source, DEEPGRAM_TRANSCRIPT_API_OPTIONS
    )
    print("not printed")

Expected behavior

sync_prerecorded should either return or throw an exception.

Please tell us about your environment

  • Operating System/Version: Windows 10 Pro
  • Language: Python 3.10.11

Other information

I'm sending .mp3 files up to 500 MB

Update SDK for pre-Python v3 or notate it is for v3+

Received some feedback on the SDK. If not overly complicated, we should think about making it accessible to more versions of Python. If we are using something that requires 3+, we should denote that in the README.md

Hey for the Python SDK
We should specify it works for Python 3+
Had a prospect just run into an issue running it on 2.9
So I recommend trying on 3.6+, and they said that worked
I think it's due to a from __future__ import annotations

AttributeError: 'URLError' object has no attribute 'status'

What is the current behavior?

Using the following code, we get an error. This is because Deepgram Nova doesn't support Spanish. However, the error relates to parsing the response incorrectly. Which appears to be a bug.

(not the actual code from the ticket)

from deepgram import Deepgram
import json
import os

DEEPGRAM_API_KEY = os.getenv('DEEPGRAM_API_KEY')

PATH_TO_FILE = 'test.mp3'
MIMETYPE = 'audio/mp3'

def main():
    dg_client = Deepgram(DEEPGRAM_API_KEY)

    with open(PATH_TO_FILE, 'rb') as audio:
        source = {'buffer': audio, 'mimetype': MIMETYPE}
        options = {"utterances": True, "diarize": True, "model": "nova", "language": "en-US", "paragraphs": True, "smart_format": True}

        # ...

        response = dg_client.transcription.sync_prerecorded(source, options)
        print(response['results'])

main()

The error:

Traceback (most recent call last):

// ...

AttributeError: 'URLError' object has no attribute 'status'

Update the setup.py file

Proposed changes

The setup.py file is out of date. It needs to be updated for this new version release.

Context

We have changed some of the dependencies, such as using httpx instead of aiohttp to make HTTP requests. The author field is also out of date. Basically, setup.py needs a possible rewrite due to the significant number of changes we are making for v3.

Possible Implementation

Research what is needed in a setup.py file and make appropriate changes.

Other information

Need to consider the version - this needs to stay in sync as the SDK is updated.

Cannot import Deepgram

Hi, I've installed deepgram via "pip install deepgram-sdk"

The installation is successful, and the deepgram-sdk appears in my Python Lib/site-packages/ folder with the __init__.py file inside.

After the successful install, I attempt to import to my script with "from deepgram import Deepgram"

But Eclipse keeps showing "Unresolved import: Deepgram"

Not sure exactly why this is happening, as this is the first time something like this has happened to me. Any help would be greatly appreciated.

False language detection

The detected language is sometimes wrong

Language detection works for most audio files,
but sometimes it detects the wrong language.
I have multiple English audio files, but for some of them it detects the language as Hindi or Russian.

Use BlackFormatter for VSCode

Proposed changes

It has been recommended that we add a code formatter to the project:

Black is a common one for Python. It would be nice to run this through a code formatter all in one go, so we don't end up needing to make handfuls of formatting fixes which can clutter PR diffs later.

Context

This will help to keep the project formatted consistently no matter who is contributing.

Possible Implementation

Black

Remove MIME type for Batch

Proposed changes

Our ASR API never requires a MIME type unless you are making a fetch request with a JSON body, in which case it must be application/json.

Context

NA

Possible Implementation

NA

Other information

NA

For captions, use punctuated_word if it exists in the response

Proposed changes

Currently, the to_caption function uses "word" instead of "punctuated_word" to create the captions, even when "punctuated_word" exists in the response. I would like that changed so that "punctuated_word" is used when it exists.

Context

My proposed change is actually the existing behavior of the toSRT function in deepgram-node-sdk.
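A minimal sketch of the proposed fallback (the helper name is hypothetical):

```python
def caption_word(word_obj):
    # Prefer the punctuated form when the response provides it,
    # otherwise fall back to the raw word.
    return word_obj.get("punctuated_word") or word_obj["word"]
```

For example, given the word entry `{'word': 'on', 'punctuated_word': 'on?'}` from a response, the caption would use `'on?'`; an entry without the punctuated field falls back to the raw word.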

`ValueError: I/O operation on closed file`

What is the current behavior?

A ValueError: I/O operation on closed file is raised when calling dg_client.transcription.prerecorded and the following conditions satisfy

  • The passed buffer is a stream object (e.g. file)
  • The request failed for some reason (e.g. bad token)

Such an error is misleading: it is a side effect of the retry behavior described under expected behavior.

What's happening that seems wrong?

The SDK automatically reattempts requests if they fail. However, stream objects (often) cannot be re-read from the beginning.

Steps to reproduce


DEEPGRAM_API_KEY = "BAD_TOKEN"
dg_client = Deepgram(DEEPGRAM_API_KEY)
with open("some_file_that_exists.wav", "rb") as f:
    await dg_client.transcription.prerecorded({"buffer": f, "mimetype": "audio/wav"})

Expected behavior

If the buffer is a stream object, the SDK should not automatically retry the request because streams cannot be directly restarted from the beginning. Retrying the request will cause ValueError: I/O operation on closed file exception because the stream is fully consumed and hence closed after the first attempt. Instead, the request should be only made at most once. If it fails, the real exception is thrown (e.g. in case of bad token, an Unauthorized exception should be thrown).


Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

  • Operating System/Version: Not relevant
  • Language: Python
  • Browser: Not relevant

Other information


Suggested fix

In _request defined in _utils.py of the SDK, there is a retry logic.

async def _request(
    path: str, options: Options,
    method: str = 'GET', payload: Payload = None,
    headers: Optional[Mapping[str, str]] = None
) -> Any:
    # ...
    tries = RETRY_COUNT
    while tries > 0:
        try:
            return await attempt()
        except Exception as exc:
            print(exc)
            tries -= 1
            continue

To fix the problem, check the type of payload: if it is stream-like, attempt the request only once by assigning 1 to tries.
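A sketch of that check (RETRY_COUNT stands in for the SDK's constant, and the stream-detection test shown is one possible heuristic, not the SDK's code):

```python
import io

RETRY_COUNT = 3  # placeholder for the SDK's retry constant

def max_attempts(payload):
    # File objects and other streams cannot be rewound after a failed
    # attempt, so they should be sent at most once; rewindable payloads
    # (bytes, URL dicts) can keep the normal retry budget.
    if isinstance(payload, io.IOBase) or hasattr(payload, "read"):
        return 1
    return RETRY_COUNT
```

Inside `_request`, `tries` would then be initialized with `max_attempts(payload)` instead of `RETRY_COUNT`, so a failed stream upload surfaces the real exception (e.g. Unauthorized) immediately.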

Add the PyTest Testing Framework to the Python SDK

Proposed changes

Add the PyTest testing tool to the Python SDK. Once it's added, write a test for the PrerecordedTranscription class and the async prerecorded function inside the Transcription class in transcription.py. Make sure your tests run and pass locally before committing them. When all your tests pass, please open a pull request and assign @geekchick as a reviewer.

Context

Currently, the Python SDK doesn't have any tests, and these are important. Adding PyTest would be an appreciated contribution so we can have test coverage. This would allow developers using the Python SDK to have a better experience as well as to help our Deepgram developers catch defects before releasing to production.

Why is this change important to you? How would you use it? How can it benefit other users?

This change is significant because we can finally have test coverage.
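For illustration, a pytest-style test for an async prerecorded call might look like the sketch below. The coroutine here is a stand-in, not the SDK's actual implementation: the real method performs an HTTP request, which a test would stub or mock out.

```python
import asyncio

# Hypothetical stand-in for Transcription.prerecorded from transcription.py.
async def prerecorded(source, options):
    # A real test would mock the HTTP layer; here we return a canned response.
    return {"results": {"channels": []}}

def test_prerecorded_returns_results():
    # pytest discovers functions named test_*; async code is driven
    # explicitly here (pytest-asyncio is another option).
    response = asyncio.run(
        prerecorded({"url": "https://example.com/audio.wav"}, {"punctuate": True})
    )
    assert "results" in response
```

Running `pytest` in the repository root would pick this up automatically once it lives in a `test_*.py` file.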

Idling transcriber

What is the current behavior?

I try to transcribe some audio with the help of LiveTranscription and sometimes, the program idles.

⚠️ I am not really transcribing audio live and I think the issue is related to that. I want to test the performance of DG's streaming API on recorded data. What I have been doing so far is sending audio chunks without trying to emulate live timing (I just send them one after the other, i.e. faster than real time).

From what I have seen, the issue might be related to the LiveTranscription object not receiving the closing response from the server after sending the finish message. Because I send the data faster than real time, there is quite a gap between the moment I call the finish() method and the moment the server actually processes it. This seems to lead to the websocket being closed (I can see this in the debug log) before the client can get/process the closing signal from the server. I am not familiar with websockets, so I am not completely sure here.

Steps to reproduce

Try to send recorded audio data without waiting between two chunks to simulate real time conditions.

Expected behavior

Transcription should proceed normally.

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

  • Operating System/Version: Ubuntu 22.04
  • Language: Python
