Transcribe video with NVIDIA NeMo ASR
In this tutorial we will see how to transcribe a video file with NVIDIA NeMo ASR (speech to text).
NVIDIA has released several open source AI models. One of them is NeMo ASR, and with it we can transcribe a video file directly, without first converting it to an audio file.
NeMo ASR only accepts files with mono (single-channel) audio, so convert your video's audio track to mono first.
If you get a memory error, try splitting your file into 2-minute chunks.
Nemo GitHub link — https://github.com/nvidia/nemo
Nemo version — 1.23.0
python version — 3.10.12
sample video file — https://hostfiles.co/f/f5c7496e
Colab Notebook- https://colab.research.google.com/drive/1RAvqLqqNtcaf3vLgOM4L3Szbk51iQ6WT?usp=sharing
Let’s get started by installing NeMo.
## Install NeMo
pip uninstall -y pyarrow
python -m pip install "git+https://github.com/NVIDIA/NeMo.git@main#egg=nemo_toolkit[all]"
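Because the command installs from the main branch, the installed version may differ from the 1.23.0 listed earlier. Here is a minimal sketch for checking what was installed, using only the Python standard library. The helper name `installed_version` is made up for this example and is not part of NeMo.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# "nemo_toolkit" is the distribution name used in the pip command above.
print("nemo_toolkit:", installed_version("nemo_toolkit"))
```

If this prints `None`, the install step did not complete in the current environment.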
Now we will initialise NeMo ASR and download the Parakeet model for transcription.
import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.EncDecRNNTBPEModel.from_pretrained(model_name="nvidia/parakeet-rnnt-0.6b")
Now that you have installed NeMo and loaded the Parakeet model, we will download a video file and transcribe it.
Video file download link — https://hostfiles.co/f/f5c7496e
ls "output_first_2_minutes.mp4"
After downloading the video file, we will transcribe it.
transcriptions = asr_model.transcribe(['output_first_2_minutes.mp4'])
print(transcriptions)
The asr_model.transcribe() method transcribes your video file.
You should see output like the following. The result is a tuple of two lists, which is why the same text appears twice.
(["let's talk about what you do do you work or are you a student yes i am both actually i work part time as well as a student what job do you do i'm an payroll controller i control the payroll in a property management company do you meet interesting people in your job not many but yes sometimes i have to visit the banks and the sites so yes i do how long have you been doing this sort of work for a year and a half now let's go on to talk about friends now are your friends mostly your age or different ages no many different ages why it ranges it's basically because i am not from this place i'm from back nepal and when i came here i met people from my country as well as people from different places at work and at different places so they're all over the place do you usually see your friends during the week or at weekends it depends i see them all over the week i think work friends at work after work drinks and things like that so yeah all throughout the week the last time you saw your friends what did you do together we went out for a movie last saturday there was this interesting movie that we heard was out on due date so we all decided to go for it and it was pretty nice in what ways are your friends important to you oh they are very important to me because i hate to be lonely i need to talk to someone every now and then because i can't just sit not to say do nothing but i need someone to confront and talk to so i hold them i give them a great importance in my life"], ["let's talk about what you do do you work or are you a student yes i am both actually i work part time as well as a student what job do you do i'm an payroll controller i control the payroll in a property management company do you meet interesting people in your job not many but yes sometimes i have to visit the banks and the sites so yes i do how long have you been doing this sort of work for a year and a half now let's go on to talk about friends now are your friends mostly your age or different 
ages no many different ages why it ranges it's basically because i am not from this place i'm from back nepal and when i came here i met people from my country as well as people from different places at work and at different places so they're all over the place do you usually see your friends during the week or at weekends it depends i see them all over the week i think work friends at work after work drinks and things like that so yeah all throughout the week the last time you saw your friends what did you do together we went out for a movie last saturday there was this interesting movie that we heard was out on due date so we all decided to go for it and it was pretty nice in what ways are your friends important to you oh they are very important to me because i hate to be lonely i need to talk to someone every now and then because i can't just sit not to say do nothing but i need someone to confront and talk to so i hold them i give them a great importance in my life"])
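Since the output above is a tuple of two lists holding the same text, you usually only need the first one. The sketch below pulls out plain strings from such a result. `best_texts` is a hypothetical helper, and it assumes the result is either a list of transcripts or a tuple whose first element is that list. Other NeMo versions may return a different shape.

```python
def best_texts(result):
    """Return a list of plain transcript strings from a transcribe() result.

    Assumes `result` is either a list of transcripts or a tuple whose
    first element is that list, as in the output shown above.
    """
    hypotheses = result[0] if isinstance(result, tuple) else result
    # Items may be plain strings or objects carrying a `.text` attribute.
    return [h if isinstance(h, str) else h.text for h in hypotheses]


# Stand-in data shaped like the output above (not a real model call).
example = (["hello world"], ["hello world"])
print(best_texts(example)[0])
```

With the real model you would call `best_texts(transcriptions)[0]` to get the transcript of the first file.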
FFmpeg command to convert a video's stereo audio to mono
ffmpeg -i input_video.mp4 -c:v copy -af "pan=mono|c0=c0+c1" output_video.mp4
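If you are not sure whether a file is already mono, you can check before converting. The sketch below assumes `ffprobe` and `ffmpeg` are on your PATH. The function names are made up for this example, and the conversion reuses the same filter as the command above.

```python
import subprocess


def audio_channels(path: str) -> int:
    """Ask ffprobe how many channels the first audio stream has."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "a:0",
         "-show_entries", "stream=channels",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip())


def mono_command(src: str, dst: str) -> list:
    """Build the same ffmpeg call as above as an argument list."""
    return ["ffmpeg", "-i", src, "-c:v", "copy",
            "-af", "pan=mono|c0=c0+c1", dst]


def ensure_mono(src: str, dst: str) -> str:
    """Return a path to a mono version of `src`, converting only if needed."""
    if audio_channels(src) == 1:
        return src
    subprocess.run(mono_command(src, dst), check=True)
    return dst
```

For example, `ensure_mono("input_video.mp4", "output_video.mp4")` returns the path you can pass to `asr_model.transcribe()`.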
FFmpeg command to extract the first 2 minutes of a video file
ffmpeg -i input_video.mp4 -t 120 -c:v copy -c:a copy output_first_2_minutes.mp4
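The command above keeps only the first 2 minutes. To cover a whole file, one option is to generate one such command per consecutive 2-minute chunk. This is a rough sketch: the total duration is something you supply yourself, and the `chunk_NNN.mp4` naming and the function name are assumptions made for illustration.

```python
import math


def chunk_commands(src: str, duration_s: float, chunk_s: int = 120) -> list:
    """Build one ffmpeg argument list per consecutive chunk of `src`."""
    commands = []
    for i in range(math.ceil(duration_s / chunk_s)):
        start = i * chunk_s
        out = f"chunk_{i:03d}.mp4"  # hypothetical output naming scheme
        commands.append(["ffmpeg", "-ss", str(start), "-i", src,
                         "-t", str(chunk_s), "-c:v", "copy", "-c:a", "copy", out])
    return commands


# A 5-minute file yields three chunks starting at 0 s, 120 s and 240 s.
for cmd in chunk_commands("input_video.mp4", duration_s=300):
    print(" ".join(cmd))
```

Each list can be run with `subprocess.run(cmd, check=True)`. Because the streams are copied rather than re-encoded, cuts land on keyframes, so chunk boundaries may be off by a moment.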