Question
I am using the last command from this Stack Overflow answer: https://stackoverflow.com/a/54818581/80353
cap()(cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub "$1";sed '1,/^$/d' *.vtt|sed 's/<[^>]*>//g'|awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'|tee cap)
What this command currently does
- This command will download captions for a youtube video as a .vtt file and
- then print out on the terminal the simplified version of the .vtt file
This command works as described.
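For reference, the two text-processing stages of the pipeline can be tried on their own, without downloading anything. This is only a sketch: the function names strip_header_and_tags and pick_time_and_text are made up for illustration, and any input fed to them by hand is synthetic, not real youtube-dl output.

```shell
# Stage 1 (hypothetical name): delete everything up to and including the
# first blank line (the WEBVTT header), then strip inline tags such as <c>.
strip_header_and_tags() {
  sed '1,/^$/d' | sed 's/<[^>]*>//g'
}

# Stage 2 (hypothetical name): treat the stream as groups of 8 lines.
# Line 1 of each group is split on "." and its first field (the start time
# without milliseconds) is printed without a newline; line 3 of each group
# is printed as is, so both end up on the same output line.
pick_time_and_text() {
  awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'
}
```

The awk stage relies on the cleaned captions arriving in regular 8-line groups; if the .vtt layout differs, the line numbers in NR%8==1 and NR%8==3 would need adjusting.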
How to use this command
I run the above command once in the terminal to define the function, and then run cap $youtube_url
What I would like to have
I would like to modify the original cap() function so that the original behavior remains, with one change:
- This command will download captions for a youtube video as a .vtt file (unchanged)
- then print out the simplified version of the .vtt file into another file that's stated as parameter $2 (changed)
How I expect to call the new command
Originally, I would call the original command as
cap $youtube_url
Now I would like to call it like this
cap $youtube_url $relative_or_absolute_path_of_text_or_markdown_file
How do I modify the original cap command to achieve the outcome I want?
Answer 1:
Since you want to see the output on screen and also save it to an output file, could you please try the following:
cap()(cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub "$1";sed '1,/^$/d' *.vtt|sed 's/<[^>]*>//g'|awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'|tee -a "$2")
Or, split across several lines:
cap()(cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub "$1";\
sed '1,/^$/d' *.vtt|sed 's/<[^>]*>//g'|awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'\
|tee -a "$2")
Please make sure that the variable holds a complete (absolute) path, for example relative_or_absolute_path_of_text_or_markdown_file="/full/path/output_file.txt". The function runs cd /tmp first, so a relative path would be resolved against /tmp rather than the directory you called it from.
I couldn't test this, since I don't have the tooling for .vtt files on my machine.
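Note that tee always prints to the screen as well; the -a flag only makes it append to the file instead of overwriting it. As @oguz ismail's comment points out, use tee "$2" rather than tee -a "$2" if the file should be overwritten on each run. If you don't want anything printed on screen and only want to save the output, redirect with > "$2" instead of using tee.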
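If relative paths should also work, one option is to resolve the output path before the cd /tmp happens. The following is a minimal, untested-against-YouTube sketch: abs_path is a hypothetical helper introduced here, and falling back to /tmp/cap when no second argument is given is an assumption chosen to mirror the original tee cap.

```shell
# Hypothetical helper: make a path absolute, relative to the directory
# the caller is in at the moment it is invoked.
abs_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;            # already absolute, keep as is
    *)  printf '%s/%s\n' "$PWD" "$1" ;;  # prefix the current directory
  esac
}

# Same pipeline as above; only the output handling differs.
# The body runs in a subshell, so the cd does not affect the caller.
cap() (
  out=$(abs_path "${2:-/tmp/cap}")   # resolved before changing directory
  cd /tmp || exit 1
  rm -f ./*.vtt
  youtube-dl --skip-download --write-auto-sub "$1"
  sed '1,/^$/d' ./*.vtt | sed 's/<[^>]*>//g' |
    awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3' | tee "$out"
)
```

With this version, cap "$youtube_url" notes.md would write to notes.md in the directory the command was called from, while still printing to the screen. It overwrites the file on each run; change tee to tee -a to append instead.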
Answer 2:
Thank you @KimStacks @RavinderSingh13 @Oguz-Ismail for posting these solutions here and in the previous post.
I managed to get results in the .vtt file with youtube-dl --skip-download --write-auto-sub $youtube_url
However, the format of the output is not ideal for my purpose. I have to go through it line by line to remove the timestamps and the \n newlines, so I would like to customize the command to fit my requirements.
NOTE: I am not sure whether this counts as a new question, so I will post it here for now.
I have tried all the steps suggested in the previous post and here, but I still cannot understand how to insert "$youtube_url" into the code below:
cap()(cd /tmp;rm -f *.vtt;youtube-dl --skip-download --write-auto-sub "$1";sed '1,/^$/d' *.vtt|sed 's/<[^>]*>//g'|awk -F. 'NR%8==1{printf"%s ",$1}NR%8==3'|tee -a "$2")
I tried changing the numbers on both ends of
'NR%8==1{printf"%s ",$1}NR%8==3'
to values from 0 to 3 and to -1, but did not get the right format in the output. Is it possible to:
- have the transcribed text printed continuously as sentences, rather than each subtitle on a new line?
- remove the printout of the start time?
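On the first point, the URL does not go inside the function body at all: "$1" stands for the first argument, so the function is called as cap "$youtube_url" "$output_file". For the two formatting requests, one possible approach is to replace the awk stage with one that drops the NR%8==1 part (which prints the start time) and joins the text lines with spaces. This is only a sketch: vtt_text_only is a hypothetical name, and it assumes the same 8-line grouping that the original awk program relies on.

```shell
# Hypothetical filter: reads the already-cleaned caption stream on stdin
# (header removed, tags stripped) and keeps only line 3 of every 8-line
# group, joined by single spaces into one continuous line.
vtt_text_only() {
  awk 'NR%8==3 { printf "%s%s", sep, $0; sep = " " } END { print "" }'
}
```

To use it, swap it in for the awk stage of the pipeline, for example: sed '1,/^$/d' *.vtt | sed 's/<[^>]*>//g' | vtt_text_only | tee "$2"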
Source: https://stackoverflow.com/questions/59244045/how-to-modify-this-sed-awk-command-so-that-the-output-goes-to-a-file-of-choice