Audio diarization is the process of identifying who is speaking at each point in a recording. The main objective here was to identify the person speaking, together with the timestamp and a transcription labeled with who is speaking.
We used presidential election data. The data consists of an .mp3 audio file and a .csv file. We converted the .mp3 audio file to .wav format and trimmed it to a shorter length to get a better understanding of the data. We then created a time series from the audio and applied PCA to find the speaker at each point in time. Next, we trained and tested a CNN model to predict the speaker and to see how accurately it performed. To evaluate it, we calculated the confusion matrix as well as the classification report.
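The trimming step described above can be sketched with Python's standard wave module. This is a minimal illustration under stated assumptions, not the project's actual code: the function names write_tone and trim_wav, the file names, and the 0.5 to 1.5 second window are all hypothetical, and the generated sine tone only stands in for the converted recording. Decoding the .mp3 itself needs an external decoder and is not shown here.

```python
import math
import os
import struct
import tempfile
import wave


def write_tone(path, seconds=2.0, rate=8000, freq=440.0):
    """Write a mono 16-bit sine tone (hypothetical stand-in for the real audio)."""
    n = int(seconds * rate)
    samples = (int(12000 * math.sin(2 * math.pi * freq * i / rate)) for i in range(n))
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        out.setframerate(rate)
        out.writeframes(b"".join(struct.pack("<h", s) for s in samples))


def trim_wav(src_path, dst_path, start_s, end_s):
    """Copy the [start_s, end_s) slice of a PCM WAV file; return frames kept."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp the window to the file length so out-of-range times are safe.
        start = min(int(start_s * rate), total)
        end = min(int(end_s * rate), total)
        kept = max(end - start, 0)
        src.setpos(start)
        frames = src.readframes(kept)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)  # frame count in the header is fixed on close
        dst.writeframes(frames)
    return kept


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    full = os.path.join(workdir, "full.wav")
    clip = os.path.join(workdir, "clip.wav")
    write_tone(full)
    print(trim_wav(full, clip, 0.5, 1.5), "frames kept")
```

Working on a short clip like this keeps plots and intermediate features small enough to inspect by eye before running the full recording.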
Using a deep learning model (a CNN), we trained and tested on this data and obtained an accuracy of 61%.
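To make the evaluation concrete, the sketch below shows how accuracy, a confusion matrix, and per-speaker precision and recall relate to each other. It is a dependency-free illustration, not the project's evaluation code; the source does not say which library was used. The speaker labels and the five predictions are invented for the example and are not the project's results.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true speakers, columns are predicted speakers."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(y_true, y_pred):
    """Fraction of segments whose predicted speaker matches the true one."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_report(matrix, labels):
    """Precision and recall per speaker, derived from the confusion matrix."""
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        report[label] = {
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / actual if actual else 0.0,
        }
    return report


if __name__ == "__main__":
    # Hypothetical labels and predictions, for illustration only.
    labels = ["speaker_a", "speaker_b"]
    y_true = ["speaker_a", "speaker_a", "speaker_a", "speaker_b", "speaker_b"]
    y_pred = ["speaker_a", "speaker_a", "speaker_b", "speaker_b", "speaker_a"]
    matrix = confusion_matrix(y_true, y_pred, labels)
    print(matrix)
    print(accuracy(y_true, y_pred))
    print(per_class_report(matrix, labels))
```

A single accuracy figure can hide which speakers get confused with each other, which is why the confusion matrix and the per-class report are worth reading alongside it.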