Comments (7)

MuhammadWaseem-DevOps commented on August 23, 2024

Anyone have any idea about this issue, or a possible fix?

mrousavy commented on August 23, 2024

Hey!

I just spent a few days thinking about a bulletproof timestamp-synchronization solution, and I came up with a great idea.
I built a TrackTimeline helper class which represents a video or audio track - it can be started & stopped, paused & resumed, and even supports nested pauses without issues (a minimal sketch follows the list below).

  • The total duration of the video is computed as the difference between the first and the last actually written timestamps, minus the total duration of all pauses within the video. No more incorrect video.duration! 🥳
  • Whereas before I had a flat 4-second timeout when no frames arrived, I now wait at most twice the frame latency (a few milliseconds) to ensure no frames are left out! 🎉
  • A video can be stopped while it is paused without any issues, as a pause call is taken into consideration before stopping 💪
  • A video file's session now starts exactly at the start() timestamp and ends at the exact timestamp of the last video frame - this ensures there can never be any blank frames in the video, even if the audio track is longer 🤩
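
A minimal sketch of the idea (simplified and hypothetical - not the actual VisionCamera implementation):

```swift
import CoreMedia

// Hypothetical, simplified sketch of the TrackTimeline idea described above.
// It records the first and last written timestamps and subtracts all pauses,
// supporting nested pause() calls.
final class TrackTimeline {
  private var firstTimestamp: CMTime?
  private var lastTimestamp: CMTime?
  private var totalPaused: CMTime = .zero
  private var pauseStart: CMTime?
  private var pauseDepth = 0 // > 0 while paused; supports nesting

  // Called for every buffer that is actually written to the track.
  func track(_ timestamp: CMTime) {
    guard pauseDepth == 0 else { return } // ignore buffers while paused
    if firstTimestamp == nil { firstTimestamp = timestamp }
    lastTimestamp = timestamp
  }

  func pause(at timestamp: CMTime) {
    if pauseDepth == 0 { pauseStart = timestamp }
    pauseDepth += 1
  }

  func resume(at timestamp: CMTime) {
    pauseDepth = max(0, pauseDepth - 1)
    if pauseDepth == 0, let start = pauseStart {
      totalPaused = totalPaused + (timestamp - start)
      pauseStart = nil
    }
  }

  // duration = (last written timestamp - first written timestamp) - total pauses
  var duration: CMTime {
    guard let first = firstTimestamp, let last = lastTimestamp else { return .zero }
    return (last - first) - totalPaused
  }
}
```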

This was really complex to build, as I had to synchronize timestamps between capture sessions, and the entire thing is a producer model - a video buffer can arrive a second or so later than the audio buffer, but I need to make sure the video track starts before the audio track starts and ends after the audio track ends - that's a huge brainf*ck! 🤯😅

There are also no helper APIs for this on iOS, and it looks like no other Camera framework (not even native Swift/ObjC iOS Camera libraries) supports this - they all break when timestamps have a delay (e.g. with video stabilization enabled), or don't support delays at all; so I had to build the thing myself.

Check out this PR and see if it fixes the issue for you: #2948

Thanks! ❤️

mrousavy commented on August 23, 2024

I just re-read what you said, and it actually sounds intentional - there are situations where a few more frames are encoded into the video.

This is to ensure the video is longer than the audio, but the video metadata has a flag that specifies the actual duration of the track session - this might cut off a few frames at the start or end.

See AVAssetWriter.startSession(atSourceTime:) / AVAssetWriter.endSession(atSourceTime:)
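
A rough sketch of how these two calls bound the track session (illustrative only; the function and parameter names here are assumptions, not VisionCamera's code):

```swift
import AVFoundation
import CoreMedia

// Sketch: samples appended outside the session's time range are still encoded
// in the file, but are trimmed on playback by the session metadata.
func writeSession(writer: AVAssetWriter, input: AVAssetWriterInput,
                  firstFrameTime: CMTime, lastFrameTime: CMTime) {
  guard writer.startWriting() else { return }
  // The track session starts here; earlier samples get cut off by metadata.
  writer.startSession(atSourceTime: firstFrameTime)
  // ... input.append(sampleBuffer) calls happen here ...
  // The track session ends here; later samples get cut off by metadata.
  writer.endSession(atSourceTime: lastFrameTime)
  input.markAsFinished()
  writer.finishWriting { /* handle completion */ }
}
```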

mrousavy commented on August 23, 2024

I think this is either fixed in 4.2.0 or intentional (see the comment above): https://github.com/mrousavy/react-native-vision-camera/releases/tag/v4.2.0

MuhammadWaseem-DevOps commented on August 23, 2024

Thank you @mrousavy for the detailed comments and the new logic for accurate duration calculation. However, I'm still encountering the issue I mentioned earlier. Let me provide a more detailed explanation:

  1. Video recording starts.
  2. Each call to let successful = assetWriterInput.append(buffer) returns true, and I count the frames written to the file at this point.
  3. Video recording stops.

For example, the count is 184 because assetWriterInput.append(buffer) was called successfully 184 times. The video file's metadata also reflects 184 frames. However, when I decode the recorded file using the Python script I mentioned in my first comment, it shows 183 frames or sometimes 182 frames. The decoded frame count is always less than the number of frames actually written to the file.

Could you suggest a way to fix this discrepancy? I have even tried excluding frames that are before the video starting timestamp by returning false in the start case of the events (if timestamp < event.timestamp).

I need the metadata file frame count to match the decoded frame count because I am recording the timestamp of each frame for later video analysis. The timestamps are recorded in a separate JSON file. So, when 184 buffers are appended, the timestamp count is also 184. But with only 183 or 182 decoded frames, there is a mismatch, and it's unclear which frame was dropped or skipped (whether at the start, middle, or end).
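
A simplified sketch of this bookkeeping (illustrative names, not the exact code from the report):

```swift
import AVFoundation
import CoreMedia

// Simplified sketch of the bookkeeping described above: count every successful
// append and record its presentation timestamp so it can later be matched
// against the decoded frames.
var appendedTimestamps: [Double] = [] // e.g. 184 entries after 184 appends

func appendAndRecord(_ buffer: CMSampleBuffer, to input: AVAssetWriterInput) {
  guard input.isReadyForMoreMediaData else { return }
  let successful = input.append(buffer)
  if successful {
    let pts = CMSampleBufferGetPresentationTimeStamp(buffer)
    appendedTimestamps.append(CMTimeGetSeconds(pts)) // serialized to JSON later
  }
}
```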

Any assistance to resolve this issue would be greatly appreciated. Thanks!

mrousavy commented on August 23, 2024

I think this shouldn't be changed in VisionCamera, but rather in your Python script.

VisionCamera does add a few frames before or after the video to make sure there are no blanks (because if an audio sample came after the last video sample, the end of the resulting video would be blank).

So I guess you just need to make sure you decode only the frames that are actually within the time range of the track duration.
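
For illustration, here is a sketch of that filtering with AVFoundation (an assumption about the approach, not the actual Python script - the equivalent time-range check would go there):

```swift
import AVFoundation

// Sketch: decode only the samples inside the track session's time range,
// as read from the file's metadata.
func countFramesWithinTrackDuration(of url: URL) throws -> Int {
  let asset = AVURLAsset(url: url)
  guard let track = asset.tracks(withMediaType: .video).first else { return 0 }
  let reader = try AVAssetReader(asset: asset)
  reader.timeRange = track.timeRange // restrict decoding to the session range
  let output = AVAssetReaderTrackOutput(track: track, outputSettings: nil)
  reader.add(output)
  guard reader.startReading() else { return 0 }
  var frameCount = 0
  while output.copyNextSampleBuffer() != nil { frameCount += 1 }
  return frameCount
}
```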

MuhammadWaseem-DevOps commented on August 23, 2024

The decoded frames are always fewer than the frames added by the Vision Camera, never more. I have tried using the ffmpeg -i video.mp4 thumb%04d.jpg -hide_banner command, and the result is the same as with the Python script.

Additionally, the video recorded by Expo Camera does not exhibit any frame discrepancies. I have also tested some random recorded videos from other sources, and none of them show any frame differences.

Do you think this issue can be fixed on the Vision Camera side? Any help would be greatly appreciated.
