subtitlesparser's People

Contributors

alexpoint, anduin2017, gldraphael, grumpybear57, guidupuy, lafe, ontorder, oodelally, rellikjaeger

subtitlesparser's Issues

Parsing as srt returned no srt part

Hello developer, I got the following error when I parsed the SRT file:

System.FormatException
Message : Parsing as srt returned no srt part.
StackTrace :    at SubtitlesParser.Classes.Parsers.SrtParser.ParseStream(Stream srtStream, Encoding encoding)

I guessed the SRT file was invalid or corrupt, but when I play the video in VLC, VLC still shows the subtitles from this SRT file. So I think it's a valid SRT file.

The parser worked with the other SRT files I have, but not with this one. I have attached the video and its SRT file for your reference. I also tried converting it with another package, the Node library srt-to-vtt, but I got a blank VTT file.

FYI, I used a tool called SubtitleEdit to create the SRT file for this video. Thanks in advance if there's any solution or workaround for this issue.

Archive.zip

MicroDvd recognized as SRT

Thanks for your great work!
I am using the universal parser, which is supposed to recognize the format.

var parser = new SubtitlesParser.Classes.Parsers.SubParser();
using (var fileStream = File.OpenRead(pathToSrtFile)){
	var items = parser.ParseStream(fileStream);
}

As a test, I use a MicroDvd file:
https://www.opensubtitles.org/fr/subtitleserve/sub/117068

However, it is recognized as an SRT file, and therefore I get the following error:
Stream is not a valid Srt format

My current workaround is to pass the filename to GetMostLikelyFormat, which correctly guesses MicroDvd. However, since many formats use the .sub extension, I am not sure how robust this approach is.

I would be happy with any help :-)

Is NuGet package assembly signed?

Has the assembly published to NuGet been signed with snk? I see commits in the history indicating that an snk ref was added at one point...

Stream is not in a valid Youtube XML format

YtXmlFormatParser.Parse causes System.ArgumentException: 'Stream is not in a valid Youtube XML format'

The code is as follows

List<SubtitlesParser.Classes.SubtitleItem> subtitleItems;
var ytSubtitlesParser = new SubtitlesParser.Classes.Parsers.YtXmlFormatParser();

using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(subtitles)))
{
     subtitleItems = ytSubtitlesParser.ParseStream(stream, Encoding.UTF8);
}

YouTube captions attached
yt-video-oPnDOxMXlUc.zip

How can I get seconds instead of milliseconds?

Hi there, could you tell me if there's a way to get the result in seconds instead of milliseconds? For example, it converts 00:00:06,848 into 6848. I'd like to see 6,848. Is that possible? Thank you.
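
One possible approach, sketched here with plain C# rather than anything from the library: divide the millisecond value by 1000 and format the result yourself. The variable names are made up for illustration, the value is assumed to be an integer number of milliseconds as in the example above, and the snippet uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Globalization;

// 6848 is the millisecond value quoted above for the timecode 00:00:06,848.
int startTimeMs = 6848;

// Dividing by 1000.0 (a double) keeps the fractional part.
double seconds = startTimeMs / 1000.0;

// Format with exactly three decimals, then swap the invariant-culture '.'
// for the ',' separator used in SRT timecodes.
string display = seconds
    .ToString("0.000", CultureInfo.InvariantCulture)
    .Replace('.', ',');

Console.WriteLine(display);
```

If you only need the numeric value, `seconds` can be used directly and the formatting step skipped.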

Test could not pass

I tried to run the Test project but the files are missing.

Manually copying the files from Test/Content to Test/Content/bin solves the issue.

But this step should be done automatically.

`SrtWriter` should write `StartTime` and `EndTime` with 3 fractional digits instead of 2

I found that the SrtWriter class writes StartTime and EndTime with only 2 fractional digits. As a result, each subtitle appears and disappears slightly early during playback. It should write 3 digits, as most subtitle files do.
For example, the original subtitle is:

00:00:06.704 --> 00:00:10.538
The ravenous swarm stretches
as far as the eye can see.

After using SrtWriter it becomes:

1
00:00:06,70 --> 00:00:10,53
The ravenous swarm stretches
as far as the eye can see.

I traced the issue to the following code in
SubtitlesParser/Classes/Writers/SrtWriter.cs

...
string formatTimecodeLine()
{
    TimeSpan start = TimeSpan.FromMilliseconds(subtitleItem.StartTime);
    TimeSpan end = TimeSpan.FromMilliseconds(subtitleItem.EndTime);
    return $"{start:hh\\:mm\\:ss\\,ff} --> {end:hh\\:mm\\:ss\\,ff}";
}
...

I fixed it by replacing return $"{start:hh\\:mm\\:ss\\,ff} --> {end:hh\\:mm\\:ss\\,ff}"; with return $"{start:hh\\:mm\\:ss\\,fff} --> {end:hh\\:mm\\:ss\\,fff}";.
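
The difference between the two format strings can be checked in isolation. This is a minimal standalone sketch, not the library's code: it builds the 00:00:06.704 start time from the example above and formats it both ways. In a custom TimeSpan format string, "ff" emits hundredths of a second and "fff" emits milliseconds; the extra digit is dropped, not rounded.

```csharp
using System;

// 6 seconds and 704 milliseconds, the start time from the example above.
// Constructor arguments: days, hours, minutes, seconds, milliseconds.
TimeSpan start = new TimeSpan(0, 0, 0, 6, 704);

// "ff" keeps two fractional digits (hundredths of a second).
string twoDigits = start.ToString(@"hh\:mm\:ss\,ff");

// "fff" keeps three fractional digits (milliseconds), as SRT expects.
string threeDigits = start.ToString(@"hh\:mm\:ss\,fff");

Console.WriteLine(twoDigits);   // 00:00:06,70
Console.WriteLine(threeDigits); // 00:00:06,704
```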

Please update it if you agree with the fix.
Thanks for this awesome plugin, and please keep evolving it. You are doing an amazing job. Cheers.

Updated NuGet

Hey,

Are you going to publish the latest release to NuGet?
The current one is out of date (it does not have the Writer classes).

Also, there is a null reference exception in the SRT parser at line 84.
It needs item.PlaintextLines ??= new List<string>();

P.S. Great library.

Nuget Upgrade from 1.4.8 to 1.5.1 Failed

There is a problem upgrading the NuGet package:

Could not install package 'SubtitlesParser 1.5.1'. You are trying to install this package into a project that targets '.NETFramework,Version=v4.8', but the package does not contain any assembly references or content files that are compatible with that framework. For more information, contact the package author.

How to convert SRT to VTT or vice versa?

Thanks for this NuGet package, it is very useful to me. I am wondering whether it can convert an SRT file to a VTT file or vice versa. For example:

var parser = new SubtitlesParser.Classes.Parsers.SubParser();
List<SubtitlesParser.Classes.SubtitleItem> subtitleItems = null;
using (var fileStream = File.OpenRead(pathToSrtFile))
{
	subtitleItems = parser.ParseStream(fileStream);
}

How can I then write subtitleItems out to SRT or VTT files such as srtFormat.srt or vttFormat.vtt?

At the moment I'm using the Node library named SrtToVtt.js to convert SRT to VTT, but DotNet Core 3 marks it as Obsolete. So I'm looking for another approach to convert it. Thank you.
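
As a rough illustration of what such a conversion involves, here is a hand-rolled sketch that turns parsed timings and text into WebVTT. It does not use the library at all: the tuple shape (StartMs, EndMs, Text) and the helper names FormatVttTime and ToVtt are hypothetical stand-ins for whatever the parsed items expose. WebVTT needs a "WEBVTT" header line and uses '.' rather than ',' before the milliseconds.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: formats a millisecond offset as a WebVTT timestamp.
string FormatVttTime(int milliseconds) =>
    TimeSpan.FromTicks(milliseconds * TimeSpan.TicksPerMillisecond)
            .ToString(@"hh\:mm\:ss\.fff");

// Hypothetical helper: builds a WebVTT document from (start, end, text) cues.
string ToVtt(IEnumerable<(int StartMs, int EndMs, string Text)> cues)
{
    var sb = new StringBuilder("WEBVTT\n\n"); // mandatory header plus blank line
    foreach (var cue in cues)
    {
        sb.Append(FormatVttTime(cue.StartMs))
          .Append(" --> ")
          .Append(FormatVttTime(cue.EndMs))
          .Append('\n');
        sb.Append(cue.Text).Append("\n\n"); // blank line ends the cue
    }
    return sb.ToString();
}

var cues = new List<(int StartMs, int EndMs, string Text)>
{
    (6704, 10538, "The ravenous swarm stretches\nas far as the eye can see."),
};

string vtt = ToVtt(cues);
Console.Write(vtt);
```

The "hh" specifier wraps at 24 hours, which should not matter for typical subtitle files.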

Tests fail on master

The tests seem to fail with the following output:

Parsing of file 20140120-074450_invalidSub: FAILURE
System.FormatException: Failed to parse as SubtitlesParser.Classes.SubtitlesFormat ---> System.ArgumentException: Stream is not in a valid Srt format
   at SubtitlesParser.Classes.Parsers.SrtParser.ParseStream(Stream srtStream, Encoding encoding) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SrtParser.cs:line 104
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, Dictionary`2 subFormatDictionary) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 120
   --- End of inner exception stack trace ---
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, Dictionary`2 subFormatDictionary) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 125
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, SubtitlesFormat subFormat) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 85
   at Test.Program.Main(String[] args) in C:\Users\galdin\source\repos\SubtitlesParser\Test\Program.cs:line 26
----------------------
Parsing of file 4989_ES.srt: SUCCESS (1092 items - 0% corrupted)
----------------
----------------------
Parsing of file ccSubs_com_pope-francis-speaks-about-religious-liberty_en.srt: SUCCESS (25 items - 8% corrupted)
----------------
----------------------
Parsing of file Children.of.Men.2006.DVD5.720p.HDDVD.x264-REVEiLLE.srt: SUCCESS (985 items - 0% corrupted)
----------------
----------------------
Parsing of file cloudy.with.a.risk.of.meatballs.ttml: SUCCESS (1109 items - 0% corrupted)
----------------
----------------------
Parsing of file Example With Comments.vtt: SUCCESS (2 items - 0% corrupted)
----------------
----------------------
Parsing of file Fight Club_eng.sub: FAILURE
System.FormatException: Failed to parse as SubtitlesParser.Classes.SubtitlesFormat ---> System.ArgumentException: Stream is not in a valid MicroDVD format
   at SubtitlesParser.Classes.Parsers.MicroDvdParser.ParseStream(Stream subStream, Encoding encoding) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\MicroDvdParser.cs:line 111
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, Dictionary`2 subFormatDictionary) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 120
   --- End of inner exception stack trace ---
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, Dictionary`2 subFormatDictionary) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 125
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, SubtitlesFormat subFormat) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 85
   at Test.Program.Main(String[] args) in C:\Users\galdin\source\repos\SubtitlesParser\Test\Program.cs:line 26
----------------------
Parsing of file Game of Thrones - 03x05 - Kissed by Fire.2HD.English.HI.C.orig.Addic7ed.com.srt: FAILURE
System.FormatException: Failed to parse as SubtitlesParser.Classes.SubtitlesFormat ---> System.ArgumentException: Stream is not in a valid Srt format
   at SubtitlesParser.Classes.Parsers.SrtParser.ParseStream(Stream srtStream, Encoding encoding) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SrtParser.cs:line 104
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, Dictionary`2 subFormatDictionary) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 120
   --- End of inner exception stack trace ---
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, Dictionary`2 subFormatDictionary) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 125
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, SubtitlesFormat subFormat) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 85
   at Test.Program.Main(String[] args) in C:\Users\galdin\source\repos\SubtitlesParser\Test\Program.cs:line 26
----------------------
Parsing of file Gangs-of-New-York-DVD-Deadman-2.sub: SUCCESS (748 items - 0% corrupted)
----------------
----------------------
Parsing of file kYB8IZa5AuE.en.ttml: SUCCESS (169 items - 0% corrupted)
----------------
----------------------
Parsing of file No Captions - With comment block.vtt: SUCCESS (No items found!)
----------------------
Parsing of file Orange.is.the.new.black.s01e01.ttml: SUCCESS (813 items - 0% corrupted)
----------------
----------------------
Parsing of file Salvage.SRT: SUCCESS (474 items - 0% corrupted)
----------------
----------------------
Parsing of file The Mentalist - 3x11 - Episode 11.fr.srt: SUCCESS (786 items - 0% corrupted)
----------------
----------------------
Parsing of file timedtext.xml: FAILURE
System.FormatException: Failed to parse as SubtitlesParser.Classes.SubtitlesFormat ---> System.ArgumentException: Stream is not in a valid Srt format
   at SubtitlesParser.Classes.Parsers.SrtParser.ParseStream(Stream srtStream, Encoding encoding) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SrtParser.cs:line 104
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, Dictionary`2 subFormatDictionary) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 120
   --- End of inner exception stack trace ---
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, Dictionary`2 subFormatDictionary) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 125
   at SubtitlesParser.Classes.Parsers.SubParser.ParseStream(Stream stream, Encoding encoding, SubtitlesFormat subFormat) in C:\Users\galdin\source\repos\SubtitlesParser\SubtitlesParser\Classes\Parsers\SubParser.cs:line 85
   at Test.Program.Main(String[] args) in C:\Users\galdin\source\repos\SubtitlesParser\Test\Program.cs:line 26
----------------------
Parsing of file Wiedzmin.s01e01.DVDRip.DreamLair.srt: SUCCESS (233 items - 0% corrupted)
----------------
----------------------

Compatibility error

Hello,
I used this parser in ASP.NET Core 2.1, but it doesn't work. I get an error on this line:
"using (var fileStream = File.OpenRead(pathToSrtFile))"
The error message is: IFormFile doesn't have a definition for "OpenRead".
I replaced OpenRead with OpenReadStream, but it still doesn't work.
Do you have any suggestions?

License?

Hi,

We would like to fork and extend this parser to include WebVTT parsing for a project at our company. However, without a selected license for this project, we're not able to do so. What is your intended license for this parser?

Thanks

The TextWriter writer should not be closed in the WriteStream method

The WriteStream method in SrtWriter might be called multiple times. I think the TextWriter writer should not be closed early in this method, because disposing it also closes the underlying stream.

public void WriteStream(Stream stream, IEnumerable<SubtitleItem> subtitleItems, bool includeFormatting = true)
{
    using TextWriter writer = new StreamWriter(stream);
    List<SubtitleItem> items = subtitleItems.ToList(); // avoid multiple enumeration since we're using a for instead of foreach
    for (int i = 0; i < items.Count; i++)
    {
        SubtitleItem subtitleItem = items[i];
        IEnumerable<string> lines = SubtitleItemToSubtitleEntry(subtitleItem, i + 1, includeFormatting); // add one because subtitle entry numbers start at 1 instead of 0
        foreach (string line in lines)
            writer.WriteLine(line);
        writer.WriteLine(); // empty line between subtitle entries
    }
}
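
One way to keep the stream usable between calls, sketched here outside the library, is the StreamWriter constructor overload that takes a leaveOpen flag. WriteLines is a hypothetical helper, not the library's SrtWriter; it only demonstrates that the stream survives disposal of the writer.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper standing in for a WriteStream-style method.
void WriteLines(Stream target, string[] lines)
{
    // leaveOpen: true means disposing the writer flushes its buffer
    // but does not close the underlying stream.
    using (var writer = new StreamWriter(target, new UTF8Encoding(false), 1024, leaveOpen: true))
    {
        foreach (string line in lines)
            writer.WriteLine(line);
    }
}

using var stream = new MemoryStream();

// Two consecutive calls on the same stream; the second would throw
// if the first call had closed the stream.
WriteLines(stream, new[] { "1", "00:00:06,704 --> 00:00:10,538", "first entry", "" });
WriteLines(stream, new[] { "2", "00:00:11,000 --> 00:00:12,000", "second entry", "" });

Console.WriteLine(stream.CanWrite); // True
```

With this design the caller owns the stream and decides when to dispose it.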

Dual license?

Any chance you'd consider dual-licensing under something more permissive, such as an MIT license? I came across this library and think it may be an ideal fit for a current need. Unfortunately, given the gray area around dynamic linkage of GPL libraries (IANAL) I'm afraid it is not something we'll be able to use considering our current license model.

Will certainly fork/contribute any modifications/extensions we distribute if this is something you're willing to do.

TTML Files from Youtube fail

Run this (with http://rg3.github.io/youtube-dl/) to get a TTML file that fails with a null reference.

youtube-dl.exe --write-info-json --write-auto-sub --sub-lang en --sub-format ttml -o %(id)s.mp4 -f mp4 https://www.youtube.com/watch?v=kYB8IZa5AuE

System.ArgumentNullException was unhandled by user code
HResult=-2147467261
Message=Value cannot be null. Parameter name: ns
ParamName=ns
Source=System.Xml.Linq
StackTrace:
   at System.Xml.Linq.XNamespace.op_Addition(XNamespace ns, String localName)
   at SubtitlesParser.Classes.Parsers.TtmlParser.ParseStream(Stream xmlStream, Encoding encoding)

Might want a way for SubParser to report which subtitle type it found

I am using subliminal to download subtitles, and it seems to default to naming them .en.srt, but some of them are NOT SRT. I used your code to read them and it worked great, but I can't tell which files were not SRT to start with, so I have to read and rewrite them all instead of rewriting only the few 'wrong ones'.
