Question
I need to merge multiple video files (each with its own audio) into a single video. I noticed that the xfade filter was recently released and used it, but I am running into an audio sync issue.
All videos share the same format, resolution, frame rate, bitrate, etc., for both video and audio.
Here is what I am using to merge 5 videos of various durations with 0.5-second crossfade transitions:
ffmpeg \
-i v0.mp4 \
-i v1.mp4 \
-i v2.mp4 \
-i v3.mp4 \
-i v4.mp4 \
-filter_complex \
"[0][1]xfade=transition=fade:duration=0.5:offset=3.5[V01]; \
[V01][2]xfade=transition=fade:duration=0.5:offset=32.75[V02]; \
[V02][3]xfade=transition=fade:duration=0.5:offset=67.75[V03]; \
[V03][4]xfade=transition=fade:duration=0.5:offset=98.75[video]; \
[0:a][1:a]acrossfade=d=0.5:c1=tri:c2=tri[A01]; \
[A01][2:a]acrossfade=d=0.5:c1=tri:c2=tri[A02]; \
[A02][3:a]acrossfade=d=0.5:c1=tri:c2=tri[A03]; \
[A03][4:a]acrossfade=d=0.5:c1=tri:c2=tri[audio]" \
-vsync 0 -map "[video]" -map "[audio]" out.mp4
The command above generates a video with audio. The first and second segments are aligned with their audio, but starting with the second transition the sound is misaligned.
Answer 1:
Your offsets are incorrect. Try:
ffmpeg -i v0.mp4 -i v1.mp4 -i v2.mp4 -i v3.mp4 -i v4.mp4 -filter_complex \
"[0][1]xfade=transition=fade:duration=0.5:offset=3.5[V01]; \
[V01][2]xfade=transition=fade:duration=0.5:offset=12.1[V02]; \
[V02][3]xfade=transition=fade:duration=0.5:offset=15.1[V03]; \
[V03][4]xfade=transition=fade:duration=0.5:offset=22.59,format=yuv420p[video]; \
[0:a][1:a]acrossfade=d=0.5:c1=tri:c2=tri[A01]; \
[A01][2:a]acrossfade=d=0.5:c1=tri:c2=tri[A02]; \
[A02][3:a]acrossfade=d=0.5:c1=tri:c2=tri[A03]; \
[A03][4:a]acrossfade=d=0.5:c1=tri:c2=tri[audio]" \
-map "[video]" -map "[audio]" -movflags +faststart out.mp4
How to get the xfade offset values:

| input | input duration | + | previous xfade offset | - | xfade duration | = |
|---|---|---|---|---|---|---|
| v0.mp4 | 4.00 | + | 0 | - | 0.5 | 3.5 |
| v1.mp4 | 9.19 | + | 3.5 | - | 0.5 | 12.1 |
| v2.mp4 | 3.41 | + | 12.1 | - | 0.5 | 15.1 |
| v3.mp4 | 7.99 | + | 15.1 | - | 0.5 | 22.59 |
See xfade and acrossfade filter documentation for more info.
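The per-transition arithmetic in the table above can be scripted. A minimal sketch, assuming the clip durations are already known (the function name and the 5.0-second duration of the hypothetical last clip are illustrative; the last clip needs no offset of its own):

```python
def xfade_offsets(durations, fade=0.5):
    """Compute xfade offsets: offset_i = duration_i + previous offset - fade duration."""
    offsets = []
    prev = 0.0
    for d in durations[:-1]:  # the last clip has no transition after it
        prev = round(d + prev - fade, 2)
        offsets.append(prev)
    return offsets

print(xfade_offsets([4.00, 9.19, 3.41, 7.99, 5.0]))
# -> [3.5, 12.19, 15.1, 22.59]
```

Note the full-precision second offset is 12.19; the table rounds it to 12.1.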
Answer 2:
Automating the process helps avoid errors in calculating the offsets. I created a Python script that computes the offsets and builds the filter graph for an input list of any number of videos:
https://gist.github.com/royshil/369e175960718b5a03e40f279b131788
It will check the lengths of the video files (with ffprobe) to figure out the right offsets.
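Reading the durations can be done by invoking ffprobe from Python. A minimal sketch of that step (the helper names here are illustrative, not the gist's actual API):

```python
import json
import subprocess

def parse_duration(ffprobe_json):
    """Extract the container duration (in seconds) from ffprobe's JSON output."""
    return float(json.loads(ffprobe_json)["format"]["duration"])

def probe_duration(path):
    """Ask ffprobe for a media file's duration."""
    out = subprocess.check_output([
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "json", path,
    ])
    return parse_duration(out)
```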
The crux of the matter is building the filter graph and calculating the offsets:
# Prepare the filter graph
video_fades = ""
audio_fades = ""
last_fade_output = "0:v"
last_audio_output = "0:a"
video_length = 0
for i in range(len(segments) - 1):
    # Video graph: chain the xfade operators together
    video_length += file_lengths[i]
    next_fade_output = "v%d%d" % (i, i + 1)
    video_fades += "[%s][%d:v]xfade=duration=0.5:offset=%.3f[%s]; " % \
        (last_fade_output, i + 1, video_length - 1, next_fade_output)
    last_fade_output = next_fade_output
    # Audio graph: chain the acrossfade operators together
    next_audio_output = "a%d%d" % (i, i + 1)
    audio_fades += "[%s][%d:a]acrossfade=d=1[%s]%s " % \
        (last_audio_output, i + 1, next_audio_output, ";" if (i + 1) < len(segments) - 1 else "")
    last_audio_output = next_audio_output
It may produce a filter graph such as
[0:v][1:v]xfade=duration=0.5:offset=42.511[v01];
[v01][2:v]xfade=duration=0.5:offset=908.517[v12];
[v12][3:v]xfade=duration=0.5:offset=1098.523[v23];
[v23][4:v]xfade=duration=0.5:offset=1234.523[v34];
[v34][5:v]xfade=duration=0.5:offset=2375.523[v45];
[v45][6:v]xfade=duration=0.5:offset=2472.526[v56];
[v56][7:v]xfade=duration=0.5:offset=2659.693[v67];
[0:a][1:a]acrossfade=d=1[a01];
[a01][2:a]acrossfade=d=1[a12];
[a12][3:a]acrossfade=d=1[a23];
[a23][4:a]acrossfade=d=1[a34];
[a34][5:a]acrossfade=d=1[a45];
[a45][6:a]acrossfade=d=1[a56];
[a56][7:a]acrossfade=d=1[a67]
Source: https://stackoverflow.com/questions/63553906/merging-multiple-video-files-with-ffmpeg-and-xfade-filter