Deepfake Detection with SyncNet using the VidTIMIT Dataset
Methods for manipulating videos have advanced significantly in recent years, and creating altered videos that can fool the human eye has become much simpler. Such material often contributes to the spread of fake news and disinformation.
As part of my study, I tried to determine whether a given video has been tampered with, that is, whether it is authentic. The approach follows the SyncNet paper, whose authors concentrated on measuring the audio-visual synchronisation between the movement of the speaker's lips and the words being spoken in the video. Their motivating application was audio-video synchronisation in TV broadcasting. They developed a solution to the lip-sync problem that is language-independent as well as speaker-independent, and they did it without using any labelled data. It is an impressive piece of research.
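To make the synchronisation idea concrete, here is a minimal sketch of how a lip-sync offset could be estimated once per-frame embeddings are available for both streams. It assumes the video and audio embeddings are already computed at the same frame rate and live in a shared space where in-sync pairs are close; the function name `sync_offset`, the `max_shift` parameter, and the array shapes are my own illustrative choices, not code from the paper or the repository. The confidence score (median distance minus minimum distance) follows my reading of the paper.

```python
import numpy as np


def sync_offset(video_emb, audio_emb, max_shift=15):
    """Estimate the audio-video offset (in frames) from two embedding sequences.

    video_emb, audio_emb: arrays of shape (T, D), one embedding per frame.
    Returns (offset, confidence). An offset of s means video frame t is
    best matched by audio frame t + s.
    """
    video_emb = np.asarray(video_emb, dtype=float)
    audio_emb = np.asarray(audio_emb, dtype=float)
    n = min(len(video_emb), len(audio_emb))
    if n <= max_shift:
        raise ValueError("sequences must be longer than max_shift")

    shifts = np.arange(-max_shift, max_shift + 1)
    mean_dists = []
    for s in shifts:
        # Pair video frame t with audio frame t + s over the overlapping range.
        if s >= 0:
            v, a = video_emb[:n - s], audio_emb[s:n]
        else:
            v, a = video_emb[-s:n], audio_emb[:n + s]
        # Mean Euclidean distance between the paired embeddings.
        mean_dists.append(np.linalg.norm(v - a, axis=1).mean())
    mean_dists = np.array(mean_dists)

    best = int(np.argmin(mean_dists))
    # A clear dip below the median distance indicates a confident match.
    confidence = float(np.median(mean_dists) - mean_dists[best])
    return int(shifts[best]), confidence
```

A video whose estimated offset is large, or whose confidence is low, would be a candidate for having been tampered with. Any threshold for that decision would have to be tuned on real data and is not something this sketch provides.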
I referred to the research paper at "https://www.robots.ox.ac.uk/~vgg/publications/2016/Chung16a/chung16a.pdf" and, for the modelling and processing functions, to the repository at "https://github.com/voletiv/syncnet-in-keras". For this part of my study, I used the VidTIMIT dataset, which can be found at http://conradsanderson.id.au/vidtimit/.