You are right. In my opinion, the best way to do it is to create images from the canvas's data and then compile all those images into a video with a module (fluent-ffmpeg, for example, which is a Node.js package).

It's pretty easy, but be careful with the FPS (frame rate). If you create the images as fast as you can, you may change the frame rate of your video: for example, if you call requestAnimationFrame() recursively, you will capture at the display's refresh rate, which is usually 60 fps. So, instead of playing the HTML5 video, set its currentTime yourself in steps of 1/30 s (if you want a 30 fps video, for example) and create an image at each step until the end of the video.

And if you have more than one canvas, for example because you apply animations on your video through multiple canvases, you can create a new empty canvas and draw every canvas's data onto it, so that you get a single image per frame instead of one image per canvas.
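
Here is a minimal sketch of the stepping and compositing idea, not a definitive implementation. The names frameTimes, seekTo, captureFrames and the renderAt callback are hypothetical helpers made up for illustration. It assumes the video metadata is already loaded and that all overlay canvases have the same size as the video.

```javascript
// Timestamps (in seconds) at which to grab a frame: 0, 1/fps, 2/fps, ... < duration.
function frameTimes(duration, fps) {
  const count = Math.floor(duration * fps);
  const times = [];
  for (let i = 0; i < count; i++) {
    // Derive each time from the index so rounding errors do not accumulate.
    times.push(i / fps);
  }
  return times;
}

// Browser only: move the video to `time` and wait for the 'seeked' event.
function seekTo(video, time) {
  return new Promise((resolve) => {
    video.addEventListener('seeked', resolve, { once: true });
    video.currentTime = time;
  });
}

// Browser only: returns one PNG data URL per frame.
// `overlays` is an array of canvases drawn on top of the video.
// `renderAt` (hypothetical, optional) redraws your animations for time t.
async function captureFrames(video, overlays, fps, renderAt) {
  const out = document.createElement('canvas'); // the new empty canvas
  out.width = video.videoWidth;
  out.height = video.videoHeight;
  const ctx = out.getContext('2d');
  const frames = [];

  for (const t of frameTimes(video.duration, fps)) {
    await seekTo(video, t);
    if (renderAt) renderAt(t);

    // Flatten the video frame and every overlay canvas into one image.
    ctx.clearRect(0, 0, out.width, out.height);
    ctx.drawImage(video, 0, 0, out.width, out.height);
    for (const canvas of overlays) {
      ctx.drawImage(canvas, 0, 0, out.width, out.height);
    }
    frames.push(out.toDataURL('image/png'));
  }
  return frames;
}
```

You would call it with something like captureFrames(video, [canvasA, canvasB], 30), send the images to your server, and save them as numbered files. When you compile them, tell ffmpeg the same frame rate you used for capturing (30 here), otherwise the video will play too fast or too slow.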