Parsing LIUM Speaker Diarization Output

Posted by 空扰寡人 on 2019-12-23 17:15:22

Question


How can I find out how long each speaker spoke, using the LIUM Speaker Diarization toolkit?

For example, this is my .seg file.

;; cluster S0 [ score:FS = -33.93166562542459 ] [ score:FT = -34.24966646974656 ] [ score:MS = -34.05223781565528 ] [ score:MT = -34.32834794609819 ]
Seq06 1 0 237 F S U S0
Seq06 1 2960 278 F S U S0
;; cluster S1 [ score:FS = -33.33289449700619 ] [ score:FT = -33.64489165914674 ] [ score:MS = -32.71833169822944 ] [ score:MT = -33.380835069917275 ]
Seq06 1 238 594 M S U S1
Seq06 1 1327 415 M S U S1
Seq06 1 2311 649 M S U S1
;; cluster S2 [ score:FS = -33.354874450638064 ] [ score:FT = -33.46618707052516 ] [ score:MS = -32.70702429201772 ] [ score:MT = -33.042146088874844 ]
Seq06 1 832 495 M S U S2
Seq06 1 1742 569 M S U S2

How can I extract the times from this file?


Answer 1:


In this line

Seq06 1 2960 278 F S U S0

You have

field 1: Seq06 = the show name
field 2: 1 = the channel number
field 3: 2960 = the start of the segment (in features)
field 4: 278 = the length of the segment (in features)
field 5: F = the speaker gender (U=unknown, F=female, M=male)
field 6: S = the type of band (T=telephone, S=studio)
field 7: U = the type of environment (music, speech only, …)
field 8: S0 = the speaker label
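
As a minimal sketch of the field layout above, a segment line can be split on whitespace and mapped onto named fields. The `Segment` class and `parse_seg_line` function are hypothetical names chosen for illustration, not part of LIUM; the sketch assumes that every segment line has exactly eight whitespace-separated fields and that lines starting with `;;` are comments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    show: str         # field 1: show name
    channel: int      # field 2: channel number
    start: int        # field 3: segment start, in features
    length: int       # field 4: segment length, in features
    gender: str       # field 5: U, F or M
    band: str         # field 6: T or S
    environment: str  # field 7: type of environment
    speaker: str      # field 8: speaker label


def parse_seg_line(line: str) -> Optional[Segment]:
    """Return a Segment, or None for blank, comment, or malformed lines."""
    line = line.strip()
    if not line or line.startswith(";;"):
        return None
    fields = line.split()
    if len(fields) != 8:
        # Not a segment line (for example, a wrapped piece of a ';;' comment).
        return None
    show, channel, start, length, gender, band, env, speaker = fields
    return Segment(show, int(channel), int(start), int(length),
                   gender, band, env, speaker)


print(parse_seg_line("Seq06 1 2960 278 F S U S0"))
```

Keeping start and length as integers in features avoids rounding until the values are actually converted to seconds.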

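Times are in features, so 2960 is 29.60 seconds (divide by 100 to convert from features to seconds). Length is also in features, so this segment is 2.78 seconds long. To get the total time for a speaker, add up the lengths of all segments with that speaker's label.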

This is documented in the LIUM wiki.
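
To answer the original question directly, here is a rough sketch that sums the segment lengths per speaker label and converts the totals to seconds. It relies only on the field positions and the divide-by-100 rule described above; `speaking_time` and `FEATURES_PER_SECOND` are names made up for this example, and the sample data is the segment lines from the question.

```python
from collections import defaultdict

FEATURES_PER_SECOND = 100  # per the answer: divide by 100 to get seconds


def speaking_time(lines):
    """Return {speaker label: total speaking time in seconds}."""
    totals = defaultdict(int)  # accumulated length per speaker, in features
    for line in lines:
        fields = line.split()
        # Skip blank lines, ';;' comment lines, and anything that does not
        # have the eight fields of a segment line.
        if len(fields) != 8 or fields[0].startswith(";;"):
            continue
        length = int(fields[3])  # field 4: length in features
        speaker = fields[7]      # field 8: speaker label
        totals[speaker] += length
    return {spk: n / FEATURES_PER_SECOND for spk, n in totals.items()}


sample = """\
;; cluster S0 [ score:FS = -33.93166562542459 ]
Seq06 1 0 237 F S U S0
Seq06 1 2960 278 F S U S0
Seq06 1 238 594 M S U S1
Seq06 1 1327 415 M S U S1
Seq06 1 2311 649 M S U S1
Seq06 1 832 495 M S U S2
Seq06 1 1742 569 M S U S2
"""

for speaker, seconds in sorted(speaking_time(sample.splitlines()).items()):
    print(f"{speaker}: {seconds:.2f} s")
```

For the file in the question this gives 5.15 s for S0, 16.58 s for S1 and 10.64 s for S2. To run it on a real file, pass an open file object instead of `sample.splitlines()`, since the function accepts any iterable of lines.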



Source: https://stackoverflow.com/questions/45309366/parsing-lium-speaker-diarization-output
