
sushi's Introduction

Sushi

Automatic shifter for SRT and ASS subtitles based on audio streams.

Purpose

Imagine you've got a subtitle file synced to one video file, but you want to use these subtitles with some other video you've got via totally legal means. Common examples are TV vs. BD releases, PAL vs. NTSC video, and releases from different countries. In a lot of cases, the subtitles won't match right away and you need to sync them.

The purpose of this script is to avoid all the hassle of manual syncing. It attempts to synchronize subtitles by finding similarities in audio streams. The script is very fast and can be used right when you want to watch something.

Downloads

The latest Windows binary release can always be found in the releases section. You need the 7z archive in the top entry.

How it works

You need to provide two audio files and a subtitle file that matches one of those files. For every line of the subtitles, the script will extract corresponding audio from the source audio stream and will try to find the closest similar pattern in the destination audio stream, obtaining a shift value which is later applied to the subtitles.
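
The matching step described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: audio is a plain list of samples, the comparison is a brute-force sum of squared differences, and the names `find_shift`, `sample_rate` and `window` are made up for the example. Sushi itself depends on OpenCV (see Requirements), and this loop is not its implementation.

```python
def find_shift(src, dst, start, end, sample_rate, window):
    """Return (shift_in_seconds, diff) for one subtitle line.

    src, dst   -- sequences of audio samples
    start, end -- line boundaries in seconds, relative to src
    window     -- how far (in seconds) to search around the original position
    """
    first = int(start * sample_rate)
    last = int(end * sample_rate)
    pattern = src[first:last]
    radius = int(window * sample_rate)

    best_offset, best_diff = None, float("inf")
    lowest = max(0, first - radius)
    highest = min(len(dst) - len(pattern), first + radius)
    for offset in range(lowest, highest + 1):
        candidate = dst[offset:offset + len(pattern)]
        # Smaller diff means the two audio fragments are more alike.
        diff = sum((a - b) ** 2 for a, b in zip(pattern, candidate))
        if diff < best_diff:
            best_offset, best_diff = offset, diff

    if best_offset is None:
        return None, None
    return (best_offset - first) / float(sample_rate), best_diff


# Toy data: the destination is the source delayed by 3 samples (0.75 s at 4 Hz).
src = [0, 0, 1, 5, -3, 2, 0, 0, 0, 0, 0, 0]
dst = [0, 0, 0] + src[:-3]
shift, diff = find_shift(src, dst, start=0.5, end=1.5, sample_rate=4, window=1.0)
print("shift: %.2f s, diff: %s" % (shift, diff))
```

The returned shift is what would then be added to the start and end time of the line.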

A detailed explanation of the Sushi workflow and a description of the command-line arguments can be found in the wiki.

Usage

The minimal command line looks like this:

python sushi.py --src hdtv.wav --dst bluray.wav --script subs.ass

The output file name is optional; "{destination_path}.sushi.{subtitles_format}" is used by default. See the usage page of the wiki for further examples.

Do note that WAV is not the only format Sushi can work with. It can process audio/video files directly and decode various audio formats, provided that ffmpeg is available. For additional info refer to the Demuxing part of the wiki.
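
If you prefer to prepare the WAV files yourself, a call to FFmpeg along these lines should do. This is a sketch, not the command Sushi runs internally: the helper names and the choice of mono 16-bit PCM are assumptions made for the example, and the sample rate is left to the caller.

```python
import subprocess


def build_ffmpeg_command(input_path, output_wav, sample_rate):
    """Return an FFmpeg argument list that extracts mono 16-bit PCM audio."""
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-i", input_path,
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to mono
        "-ar", str(sample_rate),  # resample to the requested rate
        "-acodec", "pcm_s16le",   # plain 16-bit PCM, i.e. a regular WAV
        output_wav,
    ]


def demux_audio(input_path, output_wav, sample_rate):
    """Run FFmpeg; raises CalledProcessError if FFmpeg reports a failure."""
    subprocess.check_call(build_ffmpeg_command(input_path, output_wav, sample_rate))


command = build_ffmpeg_command("episode.mkv", "episode.wav", 24000)
print(" ".join(command))
```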

Requirements

Sushi should work on Windows, Linux and OS X. Please open an issue if it doesn't. To run it, you have to have the following installed:

  1. Python 2.7.x
  2. NumPy (1.8 or newer)
  3. OpenCV 2.4.x or newer (on Windows putting this file in the same folder as Sushi should be enough, assuming you use x86 Python)

Optionally, you might want:

  1. FFmpeg for any kind of demuxing
  2. MkvExtract for faster timecode extraction when demuxing
  3. SCXvid-standalone if you want Sushi to make keyframes
  4. Colorama to add colors to console output on Windows

The provided Windows binaries include all required components and Colorama so you don't have to install them if you use the binary distribution. You still have to download other applications yourself if you want to use Sushi's demuxing capabilities.

Installation on Mac OS X

No binary packages are provided for OS X right now, so you'll have to use the script form. Assuming you have Python 2, pip and Homebrew installed, run the following:

brew tap homebrew/science
brew install git opencv
pip install numpy
git clone https://github.com/tp7/sushi
# create a symlink if you want to run sushi globally
ln -s `pwd`/sushi/sushi.py /usr/local/bin/sushi
# install some optional dependencies
brew install ffmpeg mkvtoolnix

If you don't have pip, you can install numpy with homebrew, but that will probably add a few more dependencies.

brew tap homebrew/python
brew install numpy

Installation on Linux

If you have apt-get available, the installation process is trivial.

sudo apt-get update
sudo apt-get install git python python-numpy python-opencv
git clone https://github.com/tp7/sushi
ln -s `pwd`/sushi/sushi.py /usr/local/bin/sushi

Limitations

This script will never be able to properly handle frame-by-frame typesetting. If the underlying video stream changes (e.g. has a different telecine pattern), you might get incorrect output.

This script cannot improve bad timing. If the original lines are mistimed, they will be mistimed in the output file too.

In short, while this might be safe for immediate viewing, you probably shouldn't use it to blindly shift subtitles for permanent storage.

sushi's People

Contributors

shinchiro, tomato39, tp7


sushi's Issues

Audio Support!

I'll try to phrase this in a way that makes sense. Say you have a release from Japan and you want to maintain the original video (source tends to be darker) but you want to include the dub from NA. I would compare the Japanese tracks against each other to determine the offset required for the English dub to properly line up with the Japanese video source.

This is because some regions tend to add random delays to some releases, or chain several episodes together in the same m2ts file on Blu-ray, which makes it time-consuming to split them without offsetting the audio.

At the moment I am using this program on my subtitles to gauge the offset required and then shoving the value in eac3to to offset my audio to match the new video stream. This would be an amazing feature to add to Sushi!
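
The workaround described above can be partly automated by reading the per-line shifts from Sushi's log and reducing them to one number. The sketch below assumes the log line format seen in logs posted in other issues ("shift: ..., diff: ..."); the function name is made up, and a single offset is only meaningful when the shifts are roughly constant across the episode. Which sign the audio tool expects is left to the user.

```python
import re

# Matches the "shift: <seconds>, diff:" part of a Sushi log line.
SHIFT_RE = re.compile(r"shift:\s*(-?\d+(?:\.\d+)?),\s*diff:")


def median_shift_ms(log_text):
    """Return the median per-line shift in milliseconds, or None if none found."""
    shifts = sorted(float(m.group(1)) for m in SHIFT_RE.finditer(log_text))
    if not shifts:
        return None
    middle = len(shifts) // 2
    if len(shifts) % 2:
        median = shifts[middle]
    else:
        median = (shifts[middle - 1] + shifts[middle]) / 2.0
    return int(round(median * 1000))


log = """\
0:03:19.33-0:03:23.88: shift: -10.0749166667, diff: 0.0160427261
0:03:39.81-0:03:45.12: shift: -10.0746666667, diff: 0.0311006922
0:03:53.59-0:03:55.11: shift: -10.0743333333, diff: 0.0179431252
"""
print(median_shift_ms(log))  # -10075
```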

Lead-in disappear

The problem I spotted is that when Sushi looks for the best match, it removes the lead-in from dialogue lines. After shifting, some dialogue lines start exactly when the voice appears, whereas in the original script they start before the voice appears.

A dirty workaround is to add a ~100-200 ms lead-in after it calculates the diff, but it's not an ideal solution.
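
For reference, the workaround amounts to something like the following. The event representation (tuples of start, end and text, in seconds) and the function name are assumptions made for this sketch; it is applied to already shifted lines and is not part of Sushi.

```python
def add_lead_in(events, lead_in=0.15):
    """Move every start time earlier by lead_in seconds, never below zero."""
    adjusted = []
    for start, end, text in events:
        adjusted.append((max(0.0, start - lead_in), end, text))
    return adjusted


shifted = [(0.10, 2.00, "First line"), (10.00, 12.00, "Second line")]
for start, end, text in add_lead_in(shifted):
    print("%.2f -> %.2f  %s" % (start, end, text))
```

A blanket lead-in like this also affects lines that were timed without one in the original script, which is why it is only a workaround.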

Audio Fingerprint

How about using audio fingerprints for comparing the two audio files? Is acoustic fingerprinting not used because of the short duration of subtitle lines?

"This application has requested the Runtime to terminate it in an unusual way"

Hi
Whenever I'm trying to run the script, I'm getting this error:

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

And I'm also getting an "Automatic subtitle shifter based on audio stopped working" window. I tried reinstalling Visual C++ runtime and it didn't help. Too bad nothing tells me the name of runtime that stopped working, so I could investigate more.
This happens after wav files are loaded. Demuxing and converting works fine.
Any ideas?

Deal with unsupported operand type(s) for -: 'NoneType' and 'float'

Hello,
Could you handle errors like this one in this port?

  File "sushi/__main__.py", line 139, in parse_args_and_run
    run(args)
  File "sushi/__init__.py", line 635, in run
    calculate_shifts(src_stream, dst_stream, search_groups,
  File "sushi/__init__.py", line 430, in calculate_shifts
    shift = new_time - original_time
TypeError: unsupported operand type(s) for -: 'NoneType' and 'float'

Or issues like #33, #32
Thanks
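
The traceback shows that new_time was None at the point of the subtraction. One possible way to handle it is sketched below. Both helpers are hypothetical and not taken from Sushi's code; whether unmatched lines should inherit a neighbouring shift at all is a design decision that this sketch simply assumes.

```python
def compute_shift(new_time, original_time):
    """Return new_time - original_time, or None when no match was found."""
    if new_time is None:
        return None
    return new_time - original_time


def fill_missing_shifts(shifts):
    """Replace None with the closest earlier valid shift (or the first valid one)."""
    valid = [s for s in shifts if s is not None]
    if not valid:
        raise ValueError("no line could be matched")
    filled, last = [], valid[0]
    for shift in shifts:
        if shift is not None:
            last = shift
        filled.append(last)
    return filled


raw = [compute_shift(None, 1.0), compute_shift(3.5, 2.5), compute_shift(None, 4.0)]
print(fill_missing_shifts(raw))
```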

Exception thrown when shift is 0

Exception thrown when shift is 0:

DEBUG: 0:00:01.19-0:00:02.66: shift: 0.000000000000, diff: 0.002688802546
DEBUG: 0:00:04.32-0:00:07.07: shift: 0.000000000000, diff: 0.002874625614
DEBUG: 0:00:07.39-0:00:09.66: shift: 0.000000000000, diff: 0.002503768541
DEBUG: 0:00:09.66-0:00:10.79: shift: 0.000000000000, diff: 0.002807354787
DEBUG: 0:00:11.15-0:00:12.64: shift: 0.000000000000, diff: 0.002250483725
DEBUG: 0:00:13.56-0:00:14.86: shift: 0.000000000000, diff: 0.001567815663
DEBUG: 0:00:15.03-0:00:15.85: shift: 0.000000000000, diff: 0.001549927867
DEBUG: 0:00:17.85-0:00:19.76: shift: 0.000000000000, diff: 0.002090329072
DEBUG: 0:00:19.90-0:00:20.76: shift: 0.000000000000, diff: 0.001576033654
DEBUG: 0:00:21.18-0:00:23.31: shift: 0.000000000000, diff: 0.002960107289
DEBUG: 0:00:23.70-0:00:25.95: shift: 0.000000000000, diff: 0.001852342742
DEBUG: 0:00:26.12-0:00:28.41: shift: 0.000000000000, diff: 0.002203550423
DEBUG: 0:00:29.77-0:00:33.89: shift: 0.000000000000, diff: 0.001649969025
DEBUG: 0:00:34.67-0:00:37.25: shift: 0.000000000000, diff: 0.002339180093
DEBUG: 0:00:37.65-0:00:39.25: shift: 0.000000000000, diff: 0.002496596426
DEBUG: 0:00:39.75-0:00:43.15: shift: 0.000000000000, diff: 0.001602056785
DEBUG: 0:00:43.40-0:00:45.73: shift: 0.000000000000, diff: 0.001993585844
DEBUG: 0:00:45.84-0:00:47.31: shift: 0.000000000000, diff: 0.003977150191
DEBUG: 0:00:47.73-0:00:51.47: shift: 0.000000000000, diff: 0.001395597355
DEBUG: 0:00:51.97-0:00:54.01: shift: 0.000000000000, diff: 0.002481085947
DEBUG: 0:00:54.20-0:00:55.75: shift: 0.000000000000, diff: 0.002320632571
DEBUG: 0:00:58.02-0:00:58.98: shift: 0.000000000000, diff: 0.002469737083
DEBUG: 0:00:59.59-0:01:01.21: shift: 0.000000000000, diff: 0.002223457908
DEBUG: 0:01:02.26-0:01:04.93: shift: 0.000000000000, diff: 0.004037849605
DEBUG: 0:01:09.48-0:01:10.83: shift: 0.000000000000, diff: 0.003072518157
DEBUG: 0:01:12.93-0:01:14.63: shift: 0.000000000000, diff: 0.001762418309
DEBUG: 0:01:14.78-0:01:15.34: shift: 0.000000000000, diff: 0.002920626896
DEBUG: 0:01:15.74-0:01:19.14: shift: 0.000000000000, diff: 0.001746656490
DEBUG: 0:01:19.56-0:01:22.23: shift: 0.000000000000, diff: 0.001691341284
DEBUG: 0:01:22.34-0:01:23.45: shift: 0.000000000000, diff: 0.001279613003
DEBUG: 0:01:23.83-0:01:24.81: shift: 0.000000000000, diff: 0.002340727253
DEBUG: 0:01:25.26-0:01:28.55: shift: 0.000000000000, diff: 0.002036283957
DEBUG: 0:01:29.14-0:01:30.99: shift: 0.000000000000, diff: 0.001907807891
DEBUG: 0:01:31.79-0:01:33.19: shift: 0.000000000000, diff: 0.002912359079
DEBUG: 0:01:33.78-0:01:36.97: shift: 0.000000000000, diff: 0.001371849445
DEBUG: 0:01:38.76-0:01:42.01: shift: 0.000000000000, diff: 0.001710817683
DEBUG: 0:01:42.39-0:01:43.88: shift: 0.000000000000, diff: 0.001745783142
DEBUG: 0:01:44.13-0:01:44.60: shift: 0.000000000000, diff: 0.001525572385
DEBUG: 0:01:44.85-0:01:47.70: shift: 0.000000000000, diff: 0.001737685176
DEBUG: 0:01:47.70-0:01:49.38: shift: 0.000000000000, diff: 0.002091805451
DEBUG: 0:01:53.50-0:01:57.60: shift: 0.000000000000, diff: 0.001467410126
DEBUG: 0:01:57.72-0:02:00.96: shift: 0.000000000000, diff: 0.001545166830
DEBUG: 0:02:01.27-0:02:05.32: shift: 0.000000000000, diff: 0.001293733832
DEBUG: 0:02:09.12-0:02:10.47: shift: 0.000000000000, diff: 0.001630330808
DEBUG: 0:03:51.06-0:03:53.83: shift: 0.000000000000, diff: 0.001127057709
DEBUG: 0:03:54.36-0:03:58.35: shift: 0.000000000000, diff: 0.000927194254
DEBUG: 0:03:59.77-0:04:01.81: shift: 0.000000000000, diff: 0.001083734096
DEBUG: 0:04:02.34-0:04:04.56: shift: 0.000000000000, diff: 0.000963501865
DEBUG: 0:04:04.56-0:04:06.80: shift: 0.000000000000, diff: 0.000431542780
DEBUG: Fixing 0 events near start
DEBUG: Fixing 0 events near end
INFO: Group (start: 0:00:01.19, end: 0:04:06.56, lines: 50), shifts (start: 0.0, end: 0.0, average: 0.0)
Traceback (most recent call last):
  File "sushi.py", line 714, in <module>
  File "sushi.py", line 706, in parse_args_and_run
  File "sushi.py", line 602, in run
  File "sushi.py", line 276, in snap_groups_to_keyframes
  File "numpy\core\fromnumeric.pyo", line 2716, in mean
  File "numpy\core\_methods.pyo", line 62, in _mean
TypeError: unsupported operand type(s) for +: 'float' and 'NoneType'
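
The final TypeError is what you get when summing a sequence that contains None, which suggests (but does not prove) that one of the values passed to the mean in snap_groups_to_keyframes was None. A pure-Python reproduction, with a hypothetical helper that skips missing values:

```python
def mean_ignoring_none(values, default=0.0):
    """Average the values that are not None; return default if there are none."""
    present = [v for v in values if v is not None]
    if not present:
        return default
    return sum(present) / float(len(present))


try:
    sum([0.0, None])
except TypeError as error:
    # Same kind of failure as in the traceback above.
    print(error)

print(mean_ignoring_none([0.0, None, 0.5]))
```

Whether skipping the missing values is the right fix depends on why they are missing in the first place.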

Custom charset support for subtitles

Demux/open/write subtitles with user-specified charset.

Something like this should work for FFmpeg (cp1252 example):

args.extend(['-scodec', 'copy', '-sub_charenc', 'cp1252'])

And then use cp1252 while opening the file instead of utf-8-sig.

This might not be actually needed as subtitles in weird charsets seem to be quite rare.
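
On the reading side, the change could look roughly like this. The function name and the charset parameter are hypothetical; utf-8-sig is kept as the fallback, as mentioned above.

```python
import io
import os
import tempfile


def read_subtitles(path, charset=None):
    """Read a subtitle file as text, using utf-8-sig unless a charset is given."""
    with io.open(path, "r", encoding=charset or "utf-8-sig") as handle:
        return handle.read()


# Round trip with a cp1252-encoded temporary file.
fd, path = tempfile.mkstemp(suffix=".srt")
with os.fdopen(fd, "wb") as handle:
    handle.write(u"caf\u00e9".encode("cp1252"))
text = read_subtitles(path, charset="cp1252")
os.remove(path)
print(text == u"caf\u00e9")
```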

Link between window parameter and sample rate

I don't understand something.

I have two audio tracks. The destination track begins 4950 ms after the source track, so I first shifted my subs like this:

sushi --src 001.wav --dst 002.wav --script 001.srt

However, the first seconds before the intro are too late in the subs (the rest of the subs after the intro are perfect).

So after reading the wiki, I understood that increasing the window would help.

I tried :

sushi --src 001.wav --dst 002.wav --window 15 --script 001.srt

But the resulting .ass file is the exact same.

So I just took the line from the wiki, adding the sample rate, without knowing what I was doing:

sushi --src 001.wav --dst 002.wav --window 15 --sample-rate 24000 --script 001.srt

And now the shifted subtitle file is perfectly synced at the beginning. So:

1) Why do I have to specify a sample rate, and what is its role?
2) Does it have to be 24 kHz every time?
3) Also, what is the unit of the window? Is window 15 = 15 ms or 15 seconds?

Sushi failed to execute [traceback and sample provided]

When I try to run Sushi on the fourth episode of a show I have, it gives me the following error:

WARNING: Detected possibly broken segment starting at 0:45:47.56, increasing the window from 10 to 30
Traceback (most recent call last):
  File "sushi.py", line 840, in <module>
  File "sushi.py", line 834, in parse_args_and_run
  File "sushi.py", line 675, in run
  File "sushi.py", line 467, in calculate_shifts
TypeError: unsupported operand type(s) for -: 'NoneType' and 'float'
Failed to execute script sushi

I've re-checked the files and also tried demuxing the audio streams (aac source and flac destination) and using the audio as an input instead of the video files but nothing helped.

Audio files, script and output log here: https://drive.google.com/open?id=1AhdPiWiV6sox5Cl_5p9bOobwJAfH_Yym

Automatically select default script in video

It would be nice if, when Sushi finds multiple scripts in a video and --src-script is not specified, it automatically selected the script marked as default. Only when no script is marked as default should it throw an error, as the current version does.
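
The requested selection logic could be sketched like this. The stream representation (dicts with "id" and "default" keys) and the function name are hypothetical and do not reflect how Sushi stores demuxing information; treating several default streams as an error is also an assumption.

```python
def select_script_stream(streams, requested_id=None):
    """Pick a subtitle stream: explicit id first, then the only one, then the default."""
    if requested_id is not None:
        for stream in streams:
            if stream["id"] == requested_id:
                return stream
        raise ValueError("no subtitle stream with id %s" % requested_id)
    if len(streams) == 1:
        return streams[0]
    defaults = [stream for stream in streams if stream.get("default")]
    if len(defaults) == 1:
        return defaults[0]
    # None or several marked as default: ask the user, as the current version does.
    raise ValueError("multiple subtitle streams found, specify --src-script")


streams = [
    {"id": 2, "default": False},
    {"id": 3, "default": True},
]
print(select_script_stream(streams)["id"])  # 3
```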

OpenCV Assertion failed

Relevant wavs, script and log here: http://www.mediafire.com/download/mo42o9y398myrxg/smile18.7z

OpenCV Error: Assertion failed (corrsize.height <= img.rows + templ.rows - 1 && corrsize.width <= img.cols + templ.cols - 1) in cv::crossCorr, file ..\..\..\..\opencv\modules\imgproc\src\templmatch.cpp, line 70
Traceback (most recent call last):
  File "sushi.py", line 714, in <module>
  File "sushi.py", line 706, in parse_args_and_run
  File "sushi.py", line 565, in run
  File "sushi.py", line 401, in calculate_shifts
  File "wav.pyo", line 175, in find_substream
cv2.error: ..\..\..\..\opencv\modules\imgproc\src\templmatch.cpp:70: error: (-215) corrsize.height <= img.rows + templ.rows - 1 && corrsize.width <= img.cols + templ.cols - 1 in function cv::crossCorr    

Change main .py shebang line

I wonder if you would mind changing sushi.py's shebang line to #!/usr/bin/env python2.
On any system that has python3 as its main Python interpreter, the following error is thrown as things are right now:

Traceback (most recent call last):
  File "/usr/local/bin/sushi", line 9, in <module>
    from itertools import takewhile, izip, chain
ImportError: cannot import name 'izip'
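
Besides the shebang change, the specific import that fails here can be made to work on both interpreters with a common compatibility pattern. This only addresses this one import; the rest of the script would likely need further changes to run on Python 3.

```python
try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3: the built-in zip is already lazy

pairs = list(izip([1, 2, 3], "abc"))
print(pairs)
```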

Non-dialogue lines throw off the timing

I processed subtitles for a film, the first few lines were a translation for opening credits.

I checked the first (voiced) line and Sushi throws it off by a minute despite it being delayed by roughly a second in reality.

Aegisub screenshots:
DVD audio with DVD subtitles: http://0x0.st/No.png
BD audio with DVD subtitles: http://0x0.st/Ni.png (the noise right after the audio selection is the voice line in question)
BD audio with Sushi output: http://0x0.st/N-.png (completely off)

After removing lines at the start of the subtitles that corresponded to opening credits, the output was accurate. But the same thing happened later with subtitles for signs and other non-dialogue lines.

Negative start time

Hello,

I used Sushi to adjust some fansubs to my DVD rip and it works fantastically. So much effort saved! The only annoyance is that a Comment from the script gets set to a negative time which breaks pysubs2.

Some of the original script:

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Comment: 0,0:00:05.00,0:00:25.00,Dialogue,,0000,0000,0000,,Shion no Ou Episode 1
Comment: 0,0:00:26.00,0:00:46.00,Dialogue,,0000,0000,0000,,Translated by Beloculus and Shirokaze





Dialogue: 0,0:03:19.33,0:03:23.88,Dialogue,Glasses Guy,0000,0000,0000,,That's Ishiwatari Shion. Her parents were murdered seven years ago.
Dialogue: 0,0:03:24.76,0:03:30.41,Dialogue,,0000,0000,0000,,Now Yasuoka, the 8th-dan professional kishi, is raising her as his adopted daughter, Yasuoka Shion.
Dialogue: 0,0:03:24.76,0:03:30.41,TL-note,,0000,0000,0000,,[kishi - shougi player; dan - professional rank]
Dialogue: 0,0:03:31.32,0:03:32.62,Dialogue,Guy2,0000,0000,0000,,She looks like she's doing well to me.
Dialogue: 0,0:03:33.49,0:03:37.26,Dialogue,,0000,0000,0000,,From looking at her, you'd never guess what she suffered.
Dialogue: 0,0:03:39.81,0:03:45.12,Dialogue,Glasses Guy,0000,0000,0000,,She was so shocked, she lost the ability to speak.
Dialogue: 0,0:03:53.59,0:03:55.11,Dialogue,OldD00d,0000,0000,0000,,You're early, Shion-chan!
...

The corresponding lines after Sushi:

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:03:09.26,0:03:13.81,Dialogue,Glasses Guy,0000,0000,0000,,That's Ishiwatari Shion. Her parents were murdered seven years ago.
Dialogue: 0,0:03:14.69,0:03:20.34,Dialogue,,0000,0000,0000,,Now Yasuoka, the 8th-dan professional kishi, is raising her as his adopted daughter, Yasuoka Shion.
Dialogue: 0,0:03:14.69,0:03:20.34,TL-note,,0000,0000,0000,,[kishi - shougi player; dan - professional rank]
Dialogue: 0,0:03:21.25,0:03:22.59,Dialogue,Guy2,0000,0000,0000,,She looks like she's doing well to me.
Dialogue: 0,0:03:23.42,0:03:27.19,Dialogue,,0000,0000,0000,,From looking at her, you'd never guess what she suffered.
Dialogue: 0,0:03:29.74,0:03:35.05,Dialogue,Glasses Guy,0000,0000,0000,,She was so shocked, she lost the ability to speak.
Dialogue: 0,0:03:43.52,0:03:45.04,Dialogue,OldD00d,0000,0000,0000,,You're early, Shion-chan!
...
Comment: 0,-1:59:54.93,0:00:14.93,Dialogue,,0000,0000,0000,,Shion no Ou Episode 1
Comment: 0,0:00:15.93,0:00:35.93,Dialogue,,0000,0000,0000,,Translated by Beloculus and Shirokaze
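
One plausible explanation for the odd -1:59:54.93 value, not verified against Sushi's source: the comment originally starts at 5.00 s, the shift is about -10.07 s, and formatting the resulting negative time with floor division produces exactly that string. The sketch below reproduces the effect and shows a clamped variant; both function names are made up for the example.

```python
def format_time_floor(seconds):
    """Format seconds as H:MM:SS.cc using floor division (breaks on negatives)."""
    centiseconds = int(round(seconds * 100))
    hours, centiseconds = divmod(centiseconds, 360000)
    minutes, centiseconds = divmod(centiseconds, 6000)
    secs, centiseconds = divmod(centiseconds, 100)
    return "%d:%02d:%02d.%02d" % (hours, minutes, secs, centiseconds)


def format_time_clamped(seconds):
    """Same format, but negative times are clamped to zero first."""
    return format_time_floor(max(0.0, seconds))


shift = -10.07
print(format_time_floor(5.00 + shift))    # -1:59:54.93
print(format_time_clamped(5.00 + shift))  # 0:00:00.00
print(format_time_floor(25.00 + shift))   # 0:00:14.93
```

Clamping keeps the output parseable, at the cost of the comment starting later than the shift alone would put it.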

There is a warning in the log but I'm not sure it's relevant:

Done reading WAV [Live-eviL]_Shion_no_Ou_-_01_[D9EA7B12]_(HD_1280x720_x264).mkv.sushi.wav in 0.258970022202s
Done reading WAV title00.mkv.sushi.wav in 0.300753831863s
WARNING: Events from 0:03:19.33 to 0:03:32.62 will most likely be broken!
0:03:19.33-0:03:23.88: shift: -10.0749166667, diff: 0.0160427261
0:03:24.76-0:03:30.41: shift: -10.0749166667, diff: 0.0160427261
0:03:31.32-0:03:32.62: shift: -10.0749166667, diff: 0.0160427261
0:03:33.49-0:03:37.26: shift: -10.0749166667, diff: 0.0160427261
0:03:39.81-0:03:45.12: shift: -10.0746666667, diff: 0.0311006922
0:03:53.59-0:03:55.11: shift: -10.0743333333, diff: 0.0179431252
0:03:55.66-0:03:57.36: shift: -10.0742500000, diff: 0.0264426861
0:03:59.96-0:04:01.61: shift: -10.0741666667, diff: 0.0026668445
0:04:01.77-0:04:02.70: shift: -10.0740833333, diff: 0.0133727500
0:04:03.91-0:04:04.99: shift: -10.0740833333, diff: 0.0323387235
0:04:05.20-0:04:06.88: shift: -10.0740000000, diff: 0.0089343116
0:04:10.61-0:04:12.10: shift: -10.0738333333, diff: 0.0110946260
0:04:12.90-0:04:14.44: shift: -10.0737500000, diff: 0.0145030702
0:04:14.71-0:04:18.21: shift: -10.0736666667, diff: 0.0153479120
0:04:18.32-0:04:18.96: shift: -10.0735000000, diff: 0.0106523000
0:04:19.46-0:04:20.94: shift: -10.0735833333, diff: 0.0126280347
0:04:21.35-0:04:23.24: shift: -10.0735000000, diff: 0.0202134904
0:04:30.23-0:04:30.95: shift: -10.0732500000, diff: 0.0054914616
0:04:31.69-0:04:34.03: shift: -10.0731666667, diff: 0.0190706756
0:04:34.34-0:04:36.72: shift: -10.0731666667, diff: 0.0179015696
0:04:37.08-0:04:39.64: shift: -10.0730000000, diff: 0.0108878547
0:04:39.64-0:04:42.00: shift: -10.0729166667, diff: 0.0398460627
0:04:42.39-0:04:43.02: shift: -10.0728333333, diff: 0.0015222272
0:04:44.54-0:04:48.45: shift: -10.0728333333, diff: 0.0169096291
0:04:48.99-0:04:52.40: shift: -10.0726666667, diff: 0.0272329599
0:04:57.73-0:04:58.35: shift: -10.0724166667, diff: 0.0266115982
0:04:59.55-0:05:01.47: shift: -10.0724166667, diff: 0.0128994361
0:05:02.21-0:05:03.95: shift: -10.0723333333, diff: 0.0040576793
0:05:10.64-0:05:14.73: shift: -10.0720000000, diff: 0.0023407829
0:05:16.58-0:05:17.19: shift: -10.0719166667, diff: 0.0046165138
0:05:17.97-0:05:19.43: shift: -10.0718333333, diff: 0.0033922039
0:05:25.86-0:05:26.35: shift: -10.0715833333, diff: 0.0063417838
0:05:27.21-0:05:28.24: shift: -10.0716666667, diff: 0.0056696692
0:05:29.39-0:05:30.58: shift: -10.0715000000, diff: 0.0450871587
0:05:33.07-0:05:34.35: shift: -10.0713333333, diff: 0.0305093545
0:05:34.64-0:05:36.67: shift: -10.0712500000, diff: 0.0113493856
0:05:42.65-0:05:44.53: shift: -10.0710833333, diff: 0.0086083282
0:05:44.53-0:05:46.25: shift: -10.0710833333, diff: 0.0130630946
0:05:46.25-0:05:48.52: shift: -10.0709166667, diff: 0.0061518694
0:05:49.79-0:05:53.67: shift: -10.0707500000, diff: 0.0262013394
0:05:54.59-0:05:57.18: shift: -10.0705833333, diff: 0.0136909895
0:06:01.01-0:06:02.27: shift: -10.0705000000, diff: 0.0161587354
0:06:03.25-0:06:04.37: shift: -10.0704166667, diff: 0.0133024054
0:06:05.93-0:06:10.07: shift: -10.0702500000, diff: 0.0145463217
0:06:19.60-0:06:20.21: shift: -10.0699166667, diff: 0.0328927599
0:06:19.60-0:06:21.96: shift: -10.0699166667, diff: 0.0122795328
0:06:22.51-0:06:24.47: shift: -10.0698333333, diff: 0.0259894282
0:06:29.14-0:06:30.72: shift: -10.0696666667, diff: 0.0404760838
0:06:30.94-0:06:33.11: shift: -10.0695833333, diff: 0.0568382964
0:06:34.25-0:06:35.09: shift: -10.0695000000, diff: 0.0090527572
0:06:37.27-0:06:38.69: shift: -10.0694166667, diff: 0.0662252530
0:06:41.84-0:06:42.90: shift: -10.0693333333, diff: 0.0384984612
0:06:44.47-0:06:46.94: shift: -10.0692500000, diff: 0.0523613170
0:06:48.19-0:06:49.89: shift: -10.0691666667, diff: 0.0353762619
0:06:51.22-0:06:51.90: shift: -10.0690833333, diff: 0.0023499622
0:06:52.35-0:06:53.62: shift: -10.0690000000, diff: 0.0069617578
0:06:55.43-0:06:56.04: shift: -10.0690000000, diff: 0.0184131581
0:06:56.44-0:06:57.31: shift: -10.0689166667, diff: 0.0072663231
0:06:57.44-0:06:58.21: shift: -10.0689166667, diff: 0.0220487416
0:06:58.81-0:06:59.50: shift: -10.0688333333, diff: 0.0027468833
0:07:00.13-0:07:00.82: shift: -10.0687500000, diff: 0.0165236462
0:07:00.82-0:07:01.62: shift: -10.0687500000, diff: 0.0141194491
0:07:02.24-0:07:03.85: shift: -10.0687500000, diff: 0.0115574179
0:07:04.66-0:07:05.95: shift: -10.0686666667, diff: 0.0022046498
0:07:06.62-0:07:07.84: shift: -10.0685833333, diff: 0.0092978003
0:07:16.24-0:07:19.12: shift: -10.0682500000, diff: 0.0147325993
0:07:20.21-0:07:22.66: shift: -10.0681666667, diff: 0.0058768694
0:07:24.66-0:07:26.22: shift: -10.0680833333, diff: 0.0113335755
0:07:26.99-0:07:29.22: shift: -10.0680000000, diff: 0.0180891547
0:07:29.75-0:07:30.32: shift: -10.0679166667, diff: 0.0239957497
0:07:33.18-0:07:35.61: shift: -10.0677500000, diff: 0.0258373059
0:07:36.26-0:07:36.67: shift: -10.0676666667, diff: 0.0174219012
0:07:37.91-0:07:39.51: shift: -10.0676666667, diff: 0.0226128381
0:07:40.18-0:07:41.08: shift: -10.0675833333, diff: 0.0032307969
0:07:42.18-0:07:43.36: shift: -10.0675000000, diff: 0.0198013633
0:07:43.69-0:07:45.44: shift: -10.0675000000, diff: 0.0365006849
0:07:45.62-0:07:46.92: shift: -10.0674166667, diff: 0.0361319780
0:07:47.41-0:07:49.92: shift: -10.0673333333, diff: 0.0169233661
0:07:50.23-0:07:53.01: shift: -10.0672500000, diff: 0.0154723478
0:07:54.53-0:07:56.39: shift: -10.0671666667, diff: 0.0052956082
0:07:58.81-0:08:01.09: shift: -10.0670000000, diff: 0.0046345782
0:08:01.09-0:08:02.14: shift: -10.0670000000, diff: 0.0045773699
0:08:02.60-0:08:05.15: shift: -10.0669166667, diff: 0.0193675105
0:08:07.56-0:08:09.17: shift: -10.0667500000, diff: 0.0073132259
0:08:10.34-0:08:11.52: shift: -10.0666666667, diff: 0.0064636790
0:08:12.28-0:08:14.75: shift: -10.0666666667, diff: 0.0182234142
0:08:14.75-0:08:15.86: shift: -10.0665833333, diff: 0.0089578992
0:08:26.71-0:08:29.00: shift: -10.0661666667, diff: 0.0132888760
0:08:29.65-0:08:31.15: shift: -10.0660833333, diff: 0.0071075330
0:08:31.66-0:08:33.46: shift: -10.0660833333, diff: 0.0080763577
0:08:34.54-0:08:37.35: shift: -10.0659166667, diff: 0.0414445214
0:08:54.15-0:08:55.00: shift: -10.0654166667, diff: 0.0557024665
0:08:56.22-0:08:57.31: shift: -10.0653333333, diff: 0.0178125538
0:08:58.83-0:09:00.08: shift: -10.0652500000, diff: 0.0015951162
0:09:00.94-0:09:01.80: shift: -10.0651666667, diff: 0.0054818452
0:09:02.53-0:09:03.63: shift: -10.0651666667, diff: 0.0214655474
0:09:08.90-0:09:09.40: shift: -10.0649166667, diff: 0.0095842239
0:09:09.96-0:09:10.38: shift: -10.0649166667, diff: 0.0022488737
0:09:11.08-0:09:11.38: shift: -10.0649166667, diff: 0.0029593774
0:09:12.06-0:09:12.57: shift: -10.0649166667, diff: 0.0026305481
0:09:13.79-0:09:14.30: shift: -10.0647500000, diff: 0.0235852730
0:09:14.77-0:09:15.87: shift: -10.0647500000, diff: 0.0023932576
0:09:16.98-0:09:21.23: shift: -10.0646666667, diff: 0.0230068676
0:09:22.38-0:09:22.78: shift: -10.0645000000, diff: 0.0030883974
0:09:24.99-0:09:25.87: shift: -10.0645000000, diff: 0.0212193318
0:09:27.14-0:09:29.86: shift: -10.0644166667, diff: 0.0175663177
0:09:30.88-0:09:31.97: shift: -10.0643333333, diff: 0.0406519696
0:09:32.60-0:09:36.06: shift: -10.0641666667, diff: 0.0122156227
0:09:36.79-0:09:39.05: shift: -10.0640833333, diff: 0.0087634698
0:09:39.72-0:09:41.10: shift: -10.0640000000, diff: 0.0123981312
0:09:41.61-0:09:43.37: shift: -10.0639166667, diff: 0.0072004301
0:09:44.07-0:09:45.35: shift: -10.0638333333, diff: 0.0306197759
0:09:48.41-0:09:50.36: shift: -10.0637500000, diff: 0.0145830279
0:10:04.39-0:10:05.59: shift: -10.0632500000, diff: 0.0061610774
0:10:05.95-0:10:08.10: shift: -10.0632500000, diff: 0.0319278054
0:10:08.10-0:10:10.77: shift: -10.0631666667, diff: 0.0069306106
0:10:12.63-0:10:13.50: shift: -10.0630833333, diff: 0.1404766291
0:10:17.10-0:10:18.59: shift: -10.0301666667, diff: 0.0167840216
0:10:20.22-0:10:22.87: shift: -10.0300833333, diff: 0.0101267258
0:10:23.64-0:10:26.58: shift: -10.0300000000, diff: 0.0170741789
0:10:26.96-0:10:30.35: shift: -10.0298333333, diff: 0.0204968285
0:10:30.98-0:10:33.21: shift: -10.0297500000, diff: 0.0326728150
0:10:37.02-0:10:38.15: shift: -10.0295833333, diff: 0.0061703664
0:10:39.13-0:10:41.31: shift: -10.0295000000, diff: 0.0289994106
0:10:43.35-0:10:44.81: shift: -10.0294166667, diff: 0.0273348819
0:10:46.27-0:10:48.88: shift: -10.0292500000, diff: 0.0163622070
0:10:51.93-0:10:52.93: shift: -10.0292500000, diff: 0.0471388176
0:10:53.31-0:10:55.48: shift: -10.0291666667, diff: 0.0531878211
0:10:55.93-0:10:57.57: shift: -10.0290833333, diff: 0.0109349312
0:10:59.88-0:11:00.94: shift: -10.0289166667, diff: 0.0382964611
0:11:01.43-0:11:02.75: shift: -10.0289166667, diff: 0.0363764241
0:11:03.91-0:11:06.78: shift: -10.0287500000, diff: 0.0186582394
0:11:07.95-0:11:08.79: shift: -10.0286666667, diff: 0.0295403507
0:11:09.40-0:11:09.95: shift: -10.0286666667, diff: 0.0082643619
0:11:12.90-0:11:15.16: shift: -10.0285000000, diff: 0.0094746649
0:11:15.32-0:11:17.97: shift: -10.0284166667, diff: 0.0080757150
0:11:18.44-0:11:19.27: shift: -10.0283333333, diff: 0.0051666670
0:11:19.69-0:11:20.78: shift: -10.0283333333, diff: 0.0223919097
0:11:21.67-0:11:22.50: shift: -10.0283333333, diff: 0.0150837600
0:11:23.15-0:11:24.61: shift: -10.0281666667, diff: 0.0224381350
0:11:26.43-0:11:27.25: shift: -10.0281666667, diff: 0.0033538193
0:11:27.95-0:11:31.57: shift: -10.0280000000, diff: 0.0340700001
0:11:32.49-0:11:33.30: shift: -10.0279166667, diff: 0.0014359219
0:11:34.82-0:11:36.09: shift: -10.0278333333, diff: 0.0372635908
0:11:37.08-0:11:42.39: shift: -10.0276666667, diff: 0.0590576008
0:11:43.31-0:11:44.62: shift: -10.0275833333, diff: 0.0120725846
0:11:45.00-0:11:45.40: shift: -10.0275833333, diff: 0.0042488542
0:11:46.83-0:11:48.40: shift: -10.0275000000, diff: 0.0092206355
0:11:49.03-0:11:50.87: shift: -10.0274166667, diff: 0.0124267805
0:11:59.07-0:12:01.46: shift: -10.0270833333, diff: 0.0082237208
0:12:02.40-0:12:05.58: shift: -10.0270000000, diff: 0.0082223089
0:12:06.92-0:12:09.17: shift: -10.0269166667, diff: 0.0361199118
0:12:10.39-0:12:12.67: shift: -10.0267500000, diff: 0.0112945940
0:12:13.42-0:12:14.64: shift: -10.0266666667, diff: 0.0456666611
0:12:19.77-0:12:20.57: shift: -10.0265000000, diff: 0.0338494740
0:12:23.34-0:12:24.68: shift: -10.0264166667, diff: 0.0430998318
0:12:26.39-0:12:27.14: shift: -10.0263333333, diff: 0.0176890586
0:12:27.97-0:12:28.83: shift: -10.0262500000, diff: 0.0049278289
0:12:29.64-0:12:30.49: shift: -10.0262500000, diff: 0.0619982556
0:12:43.41-0:12:46.12: shift: -10.0257500000, diff: 0.0191510413
0:12:46.88-0:12:49.69: shift: -10.0256666667, diff: 0.0210833773
0:12:50.64-0:12:51.56: shift: -10.0255833333, diff: 0.0083223199
0:12:51.86-0:12:53.52: shift: -10.0255000000, diff: 0.0170511771
0:12:59.85-0:13:01.37: shift: -10.0253333333, diff: 0.0394835435
0:13:02.11-0:13:04.57: shift: -10.0252500000, diff: 0.0323240347
0:13:04.91-0:13:07.08: shift: -10.0251666667, diff: 0.0209357440
0:13:07.82-0:13:11.68: shift: -10.0250000000, diff: 0.0464235023
0:13:12.22-0:13:13.59: shift: -10.0249166667, diff: 0.0055773454
0:13:16.69-0:13:22.76: shift: -10.0247500000, diff: 0.0222932827
0:13:30.04-0:13:30.88: shift: -10.0244166667, diff: 0.0043760557
0:13:31.67-0:13:32.93: shift: -10.0243333333, diff: 0.0176947359
0:13:50.90-0:13:52.41: shift: -10.0238333333, diff: 0.0907037109
0:13:59.69-0:14:00.15: shift: -10.0235833333, diff: 0.0259278230
0:14:02.55-0:14:03.44: shift: -10.0234166667, diff: 0.0051996699
0:14:04.84-0:14:06.93: shift: -10.0233333333, diff: 0.0207122229
0:14:08.80-0:14:12.76: shift: -10.0231666667, diff: 0.0046508517
0:14:15.68-0:14:17.31: shift: -10.0230000000, diff: 0.0148351220
0:14:17.82-0:14:22.02: shift: -10.0229166667, diff: 0.0120503586
0:14:22.52-0:14:24.53: shift: -10.0228333333, diff: 0.0048430841
0:14:24.86-0:14:27.71: shift: -10.0227500000, diff: 0.0504534170
0:14:28.41-0:14:30.13: shift: -10.0226666667, diff: 0.0159432665
0:14:30.97-0:14:32.09: shift: -10.0225833333, diff: 0.0086063212
0:14:32.38-0:14:33.24: shift: -10.0225833333, diff: 0.0450677201
0:14:36.56-0:14:37.92: shift: -10.0224166667, diff: 0.0022939085
0:14:38.60-0:14:39.43: shift: -10.0223333333, diff: 0.0167754367
0:14:39.91-0:14:42.09: shift: -10.0223333333, diff: 0.0095676258
0:14:42.84-0:14:44.67: shift: -10.0222500000, diff: 0.0110375900
0:14:51.20-0:14:53.45: shift: -10.0220000000, diff: 0.0164526850
0:14:53.92-0:14:54.60: shift: -10.0219166667, diff: 0.0208934750
0:14:54.80-0:14:56.33: shift: -10.0218333333, diff: 0.0458712205
0:14:56.54-0:14:57.57: shift: -10.0218333333, diff: 0.0138326194
0:15:00.06-0:15:01.19: shift: -10.0216666667, diff: 0.0615870468
0:15:02.23-0:15:03.57: shift: -10.0216666667, diff: 0.0044092205
0:15:10.01-0:15:11.07: shift: -10.0214166667, diff: 0.0122596761
0:15:35.35-0:15:35.84: shift: -10.0206666667, diff: 0.0013511137
0:15:36.48-0:15:37.56: shift: -10.0205833333, diff: 0.0063833492
0:15:38.50-0:15:39.77: shift: -10.0205833333, diff: 0.0019257308
0:15:40.53-0:15:44.65: shift: -10.0205000000, diff: 0.0109840520
0:15:45.50-0:15:48.06: shift: -10.0203333333, diff: 0.0037477119
0:15:48.71-0:15:49.15: shift: -10.0202500000, diff: 0.0022769524
0:15:49.98-0:15:50.51: shift: -10.0201666667, diff: 0.0036433937
0:15:51.13-0:15:52.05: shift: -10.0201666667, diff: 0.0047909189
0:15:52.39-0:15:53.29: shift: -10.0201666667, diff: 0.0030971407
0:15:53.97-0:15:57.05: shift: -10.0200833333, diff: 0.0099693118
0:15:57.68-0:15:58.16: shift: -10.0200000000, diff: 0.0042637298
0:15:58.49-0:15:59.34: shift: -10.0200000000, diff: 0.0313785188
0:16:24.23-0:16:24.73: shift: -10.0191666667, diff: 0.0601536594
0:16:29.36-0:16:29.86: shift: -10.0190833333, diff: 0.0066142781
0:16:30.62-0:16:31.54: shift: -10.0190000000, diff: 0.0039308965
0:16:32.09-0:16:34.21: shift: -10.0189166667, diff: 0.0406668968
0:16:36.85-0:16:37.26: shift: -10.0188333333, diff: 0.0281934440
0:16:37.97-0:16:38.46: shift: -10.0187500000, diff: 0.0404690020
0:16:46.33-0:16:50.66: shift: -10.0185000000, diff: 0.0113125285
0:16:51.40-0:16:52.09: shift: -10.0184166667, diff: 0.0157139264
0:17:03.84-0:17:05.60: shift: -10.0180000000, diff: 0.0060796263
0:17:06.15-0:17:08.46: shift: -10.0179166667, diff: 0.0084399357
0:17:12.60-0:17:18.95: shift: -10.0177500000, diff: 0.0311517902
0:17:19.84-0:17:22.26: shift: -10.0175833333, diff: 0.0028558623
0:17:22.26-0:17:26.32: shift: -10.0174166667, diff: 0.0245140456
0:17:26.83-0:17:29.49: shift: -10.0172500000, diff: 0.0253188629
0:17:30.43-0:17:32.93: shift: -10.0171666667, diff: 0.0329287946
0:17:33.70-0:17:34.46: shift: -10.0170833333, diff: 0.0072095753
0:17:36.76-0:17:38.15: shift: -10.0170000000, diff: 0.0119782500
0:17:39.01-0:17:40.60: shift: -10.0169166667, diff: 0.0165681001
0:17:42.01-0:17:43.69: shift: -10.0168333333, diff: 0.0097501129
0:17:44.21-0:17:45.79: shift: -10.0168333333, diff: 0.0232035201
0:17:46.01-0:17:46.90: shift: -10.0167500000, diff: 0.0035492375
0:17:48.65-0:17:55.69: shift: -10.0165833333, diff: 0.0390662253
0:17:56.78-0:18:01.44: shift: -10.0163333333, diff: 0.0372110419
0:18:10.97-0:18:12.22: shift: -10.0160000000, diff: 0.0070439945
0:18:12.74-0:18:14.50: shift: -10.0159166667, diff: 0.0187203251
0:18:15.36-0:18:17.52: shift: -10.0159166667, diff: 0.0143903149
0:18:18.13-0:18:18.75: shift: -10.0157500000, diff: 0.0042749494
0:18:21.28-0:18:24.36: shift: -10.0156666667, diff: 0.0127525590
0:18:25.12-0:18:26.73: shift: -10.0156666667, diff: 0.0139098661
0:18:27.44-0:18:30.09: shift: -10.0155000000, diff: 0.0148578677
0:18:34.98-0:18:35.59: shift: -10.0152500000, diff: 0.0164480489
0:18:35.96-0:18:38.95: shift: -10.0152500000, diff: 0.0224525891
0:18:42.67-0:18:43.95: shift: -10.0150833333, diff: 0.0141381361
0:18:47.84-0:18:48.41: shift: -10.0150000000, diff: 0.0188393481
0:18:50.17-0:18:51.63: shift: -10.0148333333, diff: 0.0173311215
0:18:52.02-0:18:52.88: shift: -10.0147500000, diff: 0.0131832510
0:18:53.16-0:18:55.21: shift: -10.0147500000, diff: 0.0160423443
0:19:04.46-0:19:05.29: shift: -10.0144166667, diff: 0.0055841194
0:19:05.90-0:19:09.73: shift: -10.0143333333, diff: 0.0177655090
0:19:10.26-0:19:11.24: shift: -10.0142500000, diff: 0.0128555167
0:19:11.71-0:19:12.47: shift: -10.0141666667, diff: 0.0115818093
0:19:12.98-0:19:14.15: shift: -10.0141666667, diff: 0.0034496426
0:19:15.68-0:19:18.03: shift: -10.0140833333, diff: 0.0252406541
0:19:19.68-0:19:22.67: shift: -10.0140000000, diff: 0.0367866270
0:19:24.22-0:19:28.56: shift: -10.0137500000, diff: 0.0403634533
0:19:30.62-0:19:33.29: shift: -10.0136666667, diff: 0.0130973579
0:19:34.28-0:19:37.25: shift: -10.0135000000, diff: 0.0063206381
0:19:38.06-0:19:39.13: shift: -10.0134166667, diff: 0.0102590453
0:19:39.85-0:19:41.95: shift: -10.0134166667, diff: 0.0065645664
0:20:40.88-0:20:42.26: shift: -10.0115000000, diff: 0.0197005048
0:20:50.70-0:20:52.12: shift: -10.0112500000, diff: 0.0035662001
0:20:52.66-0:20:55.89: shift: -10.0111666667, diff: 0.0194858536
0:20:56.60-0:20:57.12: shift: -10.0111666667, diff: 0.0720029324
0:21:03.59-0:21:04.11: shift: -10.0109166667, diff: 0.0062783645
Chapter start points: [u'0:00:00.00', u'0:01:27.00', u'0:02:56.97', u'0:10:15.12', u'0:21:24.99', u'0:22:54.92']
Fixing 1 border events right before 0:10:13.50
Group (start: 0:00:05.00, end: 0:10:13.50, lines: 121), shifts (start: -10.0749166667, end: -10.0631666667, average: -10.0690672356)
Group (start: 0:10:17.10, end: 0:21:04.11, lines: 144), shifts (start: -10.0301666667, end: -10.0109166667, average: -10.0216322788)
Snapping 0:03:31.32 to keyframes, start time by 0, end: 0.0420672355732
Snapping 0:03:55.66 to keyframes, start time by 0, end: 0.0670672355732
Snapping 0:04:44.54 to keyframes, start time by 0, end: 0.0590672355732
Snapping 0:05:10.64 to keyframes, start time by 0, end: 0.0760672355732
Snapping 0:05:29.39 to keyframes, start time by 0, end: 0.0430672355732
Snapping 0:06:19.60 to keyframes, start time by 0, end: 0.0340672355732
Snapping 0:06:37.27 to keyframes, start time by 0, end: 0.0250672355731
Snapping 0:06:44.47 to keyframes, start time by 0, end: 0.0510672355732
Snapping 0:06:52.35 to keyframes, start time by 0, end: 0.0500672355732
Snapping 0:07:00.82 to keyframes, start time by 0, end: 0.0500672355732
Snapping 0:07:06.62 to keyframes, start time by 0, end: 0.0340672355732
Snapping 0:07:29.75 to keyframes, start time by 0, end: 0.0510672355732
Snapping 0:07:40.18 to keyframes, start time by 0, end: 0.0340672355732
Snapping 0:07:47.41 to keyframes, start time by 0, end: 0.0340672355732
Snapping 0:07:54.53 to keyframes, start time by 0, end: 0.0340672355732
Snapping 0:08:01.09 to keyframes, start time by 0, end: 0.0170672355732
Snapping 0:08:02.60 to keyframes, start time by 0, end: 0.0170672355732
Snapping 0:09:11.08 to keyframes, start time by 0, end: 0.0170672355732
Snapping 0:09:27.14 to keyframes, start time by 0, end: 0.0250672355733
Snapping 0:09:41.61 to keyframes, start time by 0, end: 0.0260672355732
Snapping 0:10:05.95 to keyframes, start time by 0, end: 0.0180672355732
Snapping 0:10:08.10 to keyframes, start time by 0.0180672355732, end: 0
Snapping 0:10:20.22 to keyframes, start time by 0, end: 0.0366322787817
Snapping 0:10:23.64 to keyframes, start time by 0, end: 0.0206322787817
Snapping 0:10:53.31 to keyframes, start time by 0, end: 0.0206322787817
Snapping 0:11:49.03 to keyframes, start time by 0, end: -0.0303677212183
Snapping 0:12:27.97 to keyframes, start time by 0, end: -0.0133677212184
Snapping 0:12:50.64 to keyframes, start time by 0, end: -0.0133677212183
Snapping 0:13:02.11 to keyframes, start time by 0, end: -0.0133677212184
Snapping 0:13:16.69 to keyframes, start time by 0, end: -0.0133677212183
Snapping 0:14:08.80 to keyframes, start time by -0.0133677212184, end: -0.0133677212183
Snapping 0:14:15.68 to keyframes, start time by 0, end: -0.0133677212184
Snapping 0:14:17.82 to keyframes, start time by 0, end: -0.0303677212183
Snapping 0:14:22.52 to keyframes, start time by 0, end: 0.0116322787817
Snapping 0:14:42.84 to keyframes, start time by 0, end: -0.0213677212183
Snapping 0:15:00.06 to keyframes, start time by 0, end: -0.0133677212183
Snapping 0:15:02.23 to keyframes, start time by 0, end: -0.0213677212183
Snapping 0:16:24.23 to keyframes, start time by 0, end: -0.0133677212183
Snapping 0:16:51.40 to keyframes, start time by 0, end: -0.0133677212183
Snapping 0:17:22.26 to keyframes, start time by 0, end: -0.0133677212184
Snapping 0:18:35.96 to keyframes, start time by 0, end: -0.0303677212182
Snapping 0:18:42.67 to keyframes, start time by 0, end: -0.0303677212185
Snapping 0:19:12.98 to keyframes, start time by 0, end: -0.0213677212184
Snapping 0:20:50.70 to keyframes, start time by 0, end: -0.0383677212183
Done in 5.77015280724s

I'm working around the issue by using sed to remove all comments but it would be nice if Sushi would not output negative timestamps.
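
One possible way to avoid negative timestamps is to clamp shifted times at zero before they are written out. The sketch below assumes times are held in seconds and written in the ASS H:MM:SS.cc format; `shift_event` and `format_ass_time` are hypothetical helpers for illustration, not functions from Sushi.

```python
def shift_event(start, end, shift):
    """Apply a shift (in seconds) to an event, never going below zero."""
    new_start = max(0.0, start + shift)
    # Keep the end at or after the start even if both were clamped.
    new_end = max(new_start, end + shift)
    return new_start, new_end


def format_ass_time(seconds):
    """Format seconds as H:MM:SS.cc, clamping negative input to zero."""
    centis = max(0, int(round(seconds * 100)))
    hours, rem = divmod(centis, 360000)
    minutes, rem = divmod(rem, 6000)
    secs, cs = divmod(rem, 100)
    return "%d:%02d:%02d.%02d" % (hours, minutes, secs, cs)


# An event at 5-8 s shifted by about -10 s would otherwise become negative.
start, end = shift_event(5.0, 8.0, -10.0)
print(format_ass_time(start), format_ass_time(end))
```

Clamping keeps the output valid for players and editors, at the cost of collapsing events that fall entirely before the start of the destination file to zero length.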

Problem in Sushi normalization/clipping

I have an audio file for which Sushi's normalization/clipping does not work.
The problem is that np.median (code) returns zero for this audio.

Graphs of the audio show why the medians for both max_value and min_value are 0:

[image: max_value]
[image: min_value]

When max_value and min_value are both zero, the clipping zeroes out the entire audio, and Sushi does not work properly.
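
The failure can be reproduced with a small sketch. It assumes the clipping bounds are a multiple of the median of the non-negative and non-positive samples; the factor of 3 and both function names are assumptions made for illustration, not a copy of Sushi's code. The guarded variant ignores exact zeros when computing the medians and skips clipping if no usable bound remains.

```python
import numpy as np


def clip_naive(data, factor=3.0):
    """Median-based clipping that collapses when most samples are zero."""
    max_value = np.median(data[data >= 0]) * factor
    min_value = np.median(data[data <= 0]) * factor
    return np.clip(data, min_value, max_value)


def clip_guarded(data, factor=3.0):
    """Same idea, but silence (exact zeros) does not drag the medians to 0."""
    positive = data[data > 0]
    negative = data[data < 0]
    max_value = np.median(positive) * factor if positive.size else 0.0
    min_value = np.median(negative) * factor if negative.size else 0.0
    if max_value == 0.0 and min_value == 0.0:
        return data.copy()  # nothing sensible to clip against
    return np.clip(data, min_value, max_value)


# Mostly silent audio with a few short bursts.
audio = np.zeros(1000, dtype=np.float32)
audio[::100] = 0.5
audio[50::100] = -0.5

print(np.count_nonzero(clip_naive(audio)))
print(np.count_nonzero(clip_guarded(audio)))
```

With this input the naive version clips everything to the range [0, 0], while the guarded version leaves the bursts intact.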

auto/make option for timecodes?

As far as I can tell, timecodes only have --dst-timecodes and --src-timecodes, with no auto/make option. As a result, the timecodes must be extracted again every time Sushi processes the same file, unless the path is specified.

Sushi changing SRT File Encoding

When I ran Sushi with SRT subtitles that were encoded as UTF-8-BOM, Sushi saved the output subtitles as plain UTF-8. When I tried to open this file in Aegisub, it gave the following error:
[image: Aegisub error message]
This is easily fixed by changing the encoding of the file. Oddly enough, this only seems to happen with SRT subtitles: when I ran Sushi on .ass subtitles that used UTF-8-BOM, it did not change the encoding.
Here's the original and sushi output if needed:
sushi.zip
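
A minimal sketch of how the original encoding could be preserved, assuming the script is read and written as raw bytes. `read_subtitles` and `write_subtitles` are hypothetical helper names and the code targets Python 3; it is not how Sushi currently handles files.

```python
import codecs


def read_subtitles(path):
    """Return (text, had_bom) for a UTF-8 file with or without a BOM."""
    with open(path, "rb") as f:
        raw = f.read()
    had_bom = raw.startswith(codecs.BOM_UTF8)
    # "utf-8-sig" strips a leading BOM if present and is harmless otherwise.
    return raw.decode("utf-8-sig"), had_bom


def write_subtitles(path, text, with_bom):
    """Write text as UTF-8, adding a BOM only when requested."""
    data = text.encode("utf-8")
    if with_bom:
        data = codecs.BOM_UTF8 + data
    with open(path, "wb") as f:
        f.write(data)
```

Passing the `had_bom` flag from the reader to the writer keeps the output in the same encoding as the input.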

Consider multiple candidates when searching for audio substream

Right now when Sushi searches for audio substream in the destination audio, it only considers the best match, even though OpenCV calculates the diff value for every possible candidate. There is no reason why we can't use this info for more accurate postprocessing.

The idea is to remember multiple best candidates so that during postprocessing we could check if replacing the selected shift with some of the other candidates would make the value more similar to its surroundings.

Working implementation below.

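import numpy as np

# `result` is assumed to be the diff matrix returned by the OpenCV search;
# row 0 holds one diff value per candidate position.
splits = np.array_split(result[0], 50)
len_so_far = 0
candidates = []
for split in splits:
    min_index = np.argmin(split)  # best match within this range
    # Store (absolute position, diff value).
    candidates.append((min_index + len_so_far, split[min_index]))
    len_so_far += len(split)
# Keep the 10 best of the 50 per-range minima, lowest diff first.
candidates.sort(key=lambda x: x[1])
candidates = candidates[:10]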

We split the entire diff array into 50 ranges, find the best match in all of them and then select 10 best matches from those 50. The best match in the entire array is stored in candidates[0].

While this does find the correct shift for some of the tests, it still fails on many problematic cases with a lot of silence in the audio stream. Other ways of improving search accuracy might be preferable.
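
The postprocessing step described above is not shown in the issue. One way it might look is sketched below: for each line, the candidate shift closest to the median of the neighbouring lines' shifts is selected. The function name, the window size and the median rule are all assumptions for illustration.

```python
from statistics import median


def pick_smoothest(candidate_shifts, window=2):
    """candidate_shifts[i] lists the shifts for line i, best diff first."""
    chosen = [options[0] for options in candidate_shifts]
    for i, options in enumerate(candidate_shifts):
        lo = max(0, i - window)
        hi = min(len(chosen), i + window + 1)
        neighbours = [chosen[j] for j in range(lo, hi) if j != i]
        if not neighbours:
            continue
        target = median(neighbours)
        # Prefer the candidate that agrees best with the surrounding lines.
        chosen[i] = min(options, key=lambda shift: abs(shift - target))
    return chosen


# The third line's best match is an outlier, but its second candidate fits.
shifts = [[-10.02], [-10.02, 3.0], [5.7, -10.02], [-10.02], [-10.02]]
print(pick_smoothest(shifts))
```

Because the current choice is always among the options, a line only changes when another candidate is closer to its surroundings.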

[Q] --test-shift-plot or graph?

I've read some of Sushi's code in order to understand how it works. One thing I wonder about is the "--test-shift-plot" argument in the parser. How can I generate the graph?

ERROR: Source file doesn't exist

Hi, when I run the command with the arguments --src 12.ac3 --dst 1.ac3 --script 1.ass, I get this error:
Sushi's running with arguments: --src 12.ac3 --dst 1.ac3 --script 1.ass
ERROR: Source file doesn't exist

thanks

Better logging

Add verbose logging with all relevant and irrelevant info that might help finding issues.
Move 0:20:44.38-0:20:45.55: shift: -16.157750000000, diff: 0.500385522842 lines to the verbose log.

Script encoded as UTF-8 without BOM

Currently, Sushi produces scripts encoded as UTF-8 without BOM. This affects characters such as '—' (U+2014). When you drag the script into a player, it still displays correctly, but after the script is muxed with the video, the character no longer displays correctly. Encoding the script as UTF-8 with BOM will solve the problem.

Allow multiple subtitles to be processed

Currently I'm running Sushi for every subtitle separately, which wastes time redoing most of the processing. Allowing multiple subtitles to be processed in one run would improve performance when working with several subtitles timed to the same source.

Iterating through every input script is one way of implementing it, but I think it's better to split shift calculation and event remapping into two separate functions/commands: the first working only with the audio files, and the second taking the result of the first function and a script, then doing the event remapping. I think it would make the code easier to understand and maintain because, currently, run handles a lot of things.
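
A rough sketch of the proposed split, assuming the first stage can be reduced to a table of (start time, shift) pairs that the second stage applies to any number of scripts. Sushi currently calculates shifts per subtitle line, so this only illustrates the interface; all names are hypothetical, and the table values are borrowed from the two groups in the log earlier on this page.

```python
from bisect import bisect_right

# Stage 1 output: (group start in seconds, average shift in seconds).
SHIFT_TABLE = [(5.00, -10.0690672356), (617.10, -10.0216322788)]


def lookup_shift(table, time):
    """Return the shift of the last group starting at or before `time`."""
    starts = [start for start, _ in table]
    index = max(0, bisect_right(starts, time) - 1)
    return table[index][1]


def remap_events(events, table):
    """Stage 2: shift (start, end) pairs using a precomputed table."""
    remapped = []
    for start, end in events:
        shift = lookup_shift(table, start)
        remapped.append((start + shift, end + shift))
    return remapped


# Two scripts timed to the same source reuse the same table.
script_a = [(100.0, 102.0)]
script_b = [(700.0, 703.5)]
print(remap_events(script_a, SHIFT_TABLE))
print(remap_events(script_b, SHIFT_TABLE))
```

The expensive audio comparison would then run once, and only the cheap remapping step would repeat per script.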

Python 3 migration

Python 2 is no longer supported in many places, and it is getting harder to satisfy its dependencies.

specify start for combined episodes

Hi,
Thanks a lot for this great tool!

I have a use case that does not seem to work. Assume I have:

  • Episode 1.mp4 (duration 12 minutes: opening + episode content + ending) + episode 1.srt
  • Episode 2.mp4 (duration 12 minutes: opening + episode content + ending) + episode 2.srt

If I switch to a combined raw that runs ~23 minutes (opening + episode 1 content + episode 2 content + ending), the tool doesn't seem to be able to sync the SRT file of episode 2.
I assumed it would sync the opening translation, find no matches for episode 2's audio during the first ~10 minutes, and then start matching in the second half of the video. It seems that such behavior, with a "blank" stretch in the middle, is not supported. Or is there a flag that tells Sushi to start matching from, for example, minute 11?

Thanks

WAV Loading Fails

I'm trying to sync the subs from one encode to a different encode using Sushi. After loading both of the MKV files and extracting the audio as WAVs, Sushi simply closes with a critical error that it couldn't load the source WAV. The full log is here. The WAV files that ffmpeg extracted (converted to FLAC), as well as the script, are available here. I have also tried the latest version (instead of 0.4), but it doesn't make a difference; I still get the same error.
