
Comments (8)

m1k1o commented on May 30, 2024

There is actually no limitation on the number of concurrent transcoding processes. Transcoding is started depending on requested segments. For typical playback it should not be more than one process. But from your logs I see that transcoding jobs were started twice (-ss 120.000000 and -ss 128.000000), so there might be some race condition.

After that, go-transcode runs into some kind of hiccup where from now on until a restart, it only produces timeouts:

This looks like the segments were already marked as being transcoded, but since ffmpeg never returned any valid segment, they never get a response from the transcoding process. As you correctly pointed out, this is a bug: it doesn't handle the ffmpeg failure well.

segmentBufferMin has nothing to do with this, yes. When the buffer size reaches this threshold, a new transcoding process is spawned with segmentBufferMax segments. So the buffer size after the currently playing segment should always be within this range.

It would be a nice addition to control number of concurrent transcoding processes.

from go-transcode.

hheimbuerger commented on May 30, 2024

There is actually no limitation on the number of concurrent transcoding processes. Transcoding is started depending on requested segments.

I see, then I misunderstood the comment // maximum segments to be transcoded at once. This sounds like simultaneous transcoding processes to me, but you're saying that's not what it means?

Would it be more accurate to call this // maximum number of segments to be kept on disk before oldest is deleted?

For typical playback it should not be more than one process. But from your logs I see that transcoding jobs were started twice (-ss 120.000000 and -ss 128.000000), so there might be some race condition.

In my use case, I do indeed have multiple clients requesting segments in parallel! (I did increase segmentLength to 8, so that's why 120 is followed by 128.)

So you're saying this isn't currently supported, and I would need to implement some kind of locking mechanism to guard against this request pattern/behavior?

After that, go-transcode runs into some kind of hiccup where from now on until a restart, it only produces timeouts:

This looks like the segments were already marked as being transcoded, but since ffmpeg never returned any valid segment, they never get a response from the transcoding process. As you correctly pointed out, this is a bug: it doesn't handle the ffmpeg failure well.

Gotcha. Probably quite easily fixable, but won't solve the root cause of these segments failing in the first place for me.

segmentBufferMin has nothing to do with this, yes. When the buffer size reaches this threshold, a new transcoding process is spawned with segmentBufferMax segments. So the buffer size after the currently playing segment should always be within this range.

I'm not sure I fully understand. All of this is in transcodeFromSegment() in hlsvod/manager.go, correct?

I think where I struggle is that I don't understand what 'buffer' refers to in this context. Is buffer the number of segments cached on disk, before go-transcode starts deleting them?
Because that would have been my next question: I really want to turn that off. Is there a way to configure the system so that it reuses the same generated segments across multiple invocations? (Right now, it seems to generate a new subfolder in transcode_dir every time the application is restarted.) I'll want to reuse these segments after a restart.
Also, I'd prefer to do the segment cleanup externally, and let go-transcode just keep all segments it has ever generated.

Can I accomplish the latter by simply setting segmentBufferMax to a very high number? And segmentBufferMin is how many segments the system will transcode in expectation of future requests for them (essentially 'prefetching'), so that it then ideally can just grab them from disk if the request actually does come in?


m1k1o commented on May 30, 2024

I see, then I misunderstood the comment // maximum segments to be transcoded at once. This sounds like simultaneous transcoding processes to me, but you're saying that's not what it means?

It was meant to be read as "at once" -> "in one job", not as "at once" -> "simultaneously". So it could say: // maximum segments to be transcoded in one transcoding job.

I think where I struggle is that I don't understand what 'buffer' refers to in this context. Is buffer the number of segments cached on disk, before go-transcode starts deleting them?

You can think of one VOD that is split into N segments. Each segment is identified by an exact time offset within that media file, and a duration. The sum of the durations of all segments is the duration of the whole media file. Those segments are numbered 0..N and either exist (have already been transcoded) or do not exist (have not yet been transcoded).

One VOD media file can have X clients. Every client can start playing the VOD at any segment (or even seek forwards/backwards). Depending on where a client's current position is, we need to ensure those segments will be available to them. That is what the buffer refers to in this context.

Let's say segmentBufferMax=3 and segmentBufferMin=1:
So when you start watching the VOD from the beginning, you request the first segment, 0, so we want to call transcodeFromSegment. We check that no segments are transcoded yet, so we create a new transcoding job that should transcode segmentBufferMax segments (so that our buffer is full, that will be 3 segments). Ideally, the transcoding job returns those segments and we save them to disk. When you request the next segment, with ID 1, we should have this segment already available. Calling transcodeFromSegment then results in the next transcoding job, because we only have 1 segment in the buffer (the current index is 1, and we transcoded 0, 1, 2). So now we transcode the following 3 segments (3, 4, 5).
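The walkthrough above can be sketched roughly as follows. This is illustrative only, not the actual hlsvod/manager.go code; the function name nextJob and its parameters are assumptions:

```go
package main

import "fmt"

// nextJob sketches the buffer logic described above (illustrative only,
// not the actual hlsvod/manager.go code). transcoded marks segment IDs
// that already exist on disk; total is the number of segments in the VOD.
func nextJob(requested int, transcoded map[int]bool, bufferMin, bufferMax, total int) []int {
	// Find the first segment at or after the requested one that is
	// not yet transcoded.
	start := requested
	for start < total && transcoded[start] {
		start++
	}
	// Count the segments buffered ahead of the currently playing one.
	buffered := start - requested - 1
	if buffered < 0 {
		buffered = 0
	}
	if start >= total || buffered > bufferMin {
		return nil // buffer is still healthy, no new job needed
	}
	// Hand the next bufferMax segments to one transcoding job.
	job := make([]int, 0, bufferMax)
	for i := start; i < total && len(job) < bufferMax; i++ {
		job = append(job, i)
	}
	return job
}

func main() {
	// Playing segment 1 with 0, 1, 2 already transcoded: only one segment
	// is buffered ahead, so a job for the next three is spawned.
	transcoded := map[int]bool{0: true, 1: true, 2: true}
	fmt.Println(nextJob(1, transcoded, 1, 3, 100)) // prints [3 4 5]
}
```

With segmentBufferMax=3 and segmentBufferMin=1, requesting segment 0 of a fresh VOD yields the job [0 1 2], and requesting segment 1 afterwards yields [3 4 5], matching the walkthrough.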

So you're saying this isn't currently supported, and I would need to implement some kind of locking mechanism to guard against this request pattern/behavior?

All already-transcoded segments can be reused by other clients and should not be transcoded twice. So you should not need to guard against that; if that's not currently the case, this issue should be fixed.

Can I accomplish the latter by simply setting segmentBufferMax to a very high number?

That would mean the first transcoding job would essentially transcode the whole video in just one job. But other clients that want to start from a seeked position would have those segments available after that transcoding finishes.

I really want to turn that off. Is there a way to configure the system so that it reuses the same generated segments across multiple invocations

It is configured to clear all segments when closing the app, so we could add a config value that conditionally calls clearAllSegments.
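Such a flag could look roughly like this. The field name KeepSegments, the Manager stub, and the Stop hook are assumptions for illustration, not go-transcode's actual API:

```go
package main

import "fmt"

// Config sketches the proposed option; KeepSegments is a hypothetical
// field name, not an existing go-transcode setting.
type Config struct {
	KeepSegments bool // when true, skip clearAllSegments on shutdown
}

// Manager is a minimal stub standing in for the hlsvod manager.
type Manager struct {
	segments []string // paths of generated segments
}

func (m *Manager) clearAllSegments() {
	m.segments = nil // the real code would delete files in transcode_dir
}

// Stop wipes the segment cache unless the config asks to keep it,
// which would let segments be reused across restarts.
func (m *Manager) Stop(cfg Config) {
	if !cfg.KeepSegments {
		m.clearAllSegments()
	}
}

func main() {
	m := &Manager{segments: []string{"seg-000.ts", "seg-001.ts"}}
	m.Stop(Config{KeepSegments: true})
	fmt.Println(len(m.segments)) // prints 2: segments survive shutdown
}
```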


hheimbuerger commented on May 30, 2024

Thank you so much for taking the time to explain all of this. Super helpful to me! Maybe this could be a nice baseline for some internal developer documentation?…

One more basic question: what do you call a "job" in this context?

I'm currently reading hlsvod.transcoding.TranscodeSegments() and I suspect a "job" is one invocation of that function. Is that accurate?
So a job is a single invocation of ffmpeg, yielding up to segmentBufferMax new segments in one go. (And presumably only ever using one 'GPU job' in the process, because it can be assumed that ffmpeg will not parallelize segment generation on its own.)
Does go-transcode learn about the individual segments as they get completed, or can it only add them to its own repository once ffmpeg and the job have finished?

I don't know how you and others are using go-transcode, but I'm streaming the video from a web browser (using hls.js for HLS support, because Chrome doesn't have that built-in).
One potential problem I see is that the video player itself is also prefetching segments. The very second the HTML5 video player loads, it goes ahead and requests the first 10 segments or so simultaneously.

What are your thoughts on that? Will that immediately create ten jobs (segments 0–2, 1–3, 2–4, 3–5, etc.)? I could imagine that this might cause a lot of unnecessary transcoding, because presumably the "0–2" job, after having transcoded its first segment, will proceed to segment 1, unaware that the "1–3" job has already transcoded that.

I suspect that is not what's going on, because then I can think of at least three different ways my entire application should immediately fail hard and not work as 'mostly fine' as it currently does. 😉 But I'll need to take a few hours and really analyze browser and transcoder logs to understand why it isn't happening.


m1k1o commented on May 30, 2024

a "job" is one invocation of hlsvod.transcoding.TranscodeSegments()

Yes, that's correct.

Does go-transcode learn about the individual segments as they get completed, or can it only add them to its own repository once ffmpeg and the job have finished?

Yes, it does learn about the individual segments as they get completed, by reading them from a channel in real time.
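A minimal sketch of that pattern (hypothetical shape; the real TranscodeSegments signature differs): the job sends each finished segment path on a channel as soon as ffmpeg produces it, so the manager can serve it without waiting for the whole job to finish:

```go
package main

import "fmt"

// transcodeJob sketches streaming completed segments over a channel.
// The function name and signature are illustrative, not go-transcode's
// actual TranscodeSegments API.
func transcodeJob(segments []int) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, id := range segments {
			// The real code waits for ffmpeg to emit this segment file,
			// then forwards its path immediately.
			out <- fmt.Sprintf("segment-%03d.ts", id)
		}
	}()
	return out
}

func main() {
	for path := range transcodeJob([]int{3, 4, 5}) {
		fmt.Println("ready:", path) // each segment is usable right away
	}
}
```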

One potential problem I see is that the video player will itself is also prefetching segments.

That might cause problems, but the player's buffer (the actual prefetching) should match go-transcode's buffer. So when you know your player will be prefetching 10 segments, segmentBufferMax could be set to 10 to have them available. But it is not a requirement.

Will that immediately create ten jobs (segments 0–2, 1–3, 2–4, 3–5, etc.)?

No, it should only spawn the jobs 0-2, 3-5, 6-8, and 9-11, because that function stores enqueued segments and does not process them twice. But after further inspection I now found that the function transcodeFromSegment is not atomic. Meaning that if those requests happen at exactly the same time, there is a race condition between multiple invocations of this function, specifically between waitForSegment and isSegmentTranscoded, which are called there. It leads to a single segment being processed multiple times. Fixing this issue should leave you with only a predictable number of transcoding jobs.


hheimbuerger commented on May 30, 2024

Gotcha.

The HTTP server component is really multi-threaded, not coroutine-based or similar?

My use case is synchronizing video playback between multiple browsers. So yeah, I'm definitely going to cause these simultaneous requests a lot. 😀


m1k1o commented on May 30, 2024

It is multi-threaded, but that is all expected. Simultaneous requests should be fully supported by this program.

I tried to add this mutex, and it looks like it fixed the issue with processing a single request multiple times. Could you try it out?


hheimbuerger commented on May 30, 2024

Thank you for all your changes! I'll give it a try next week.

