What does the Brown clustering algorithm output mean?

暗喜 2020-12-25 15:05

I've run the Brown clustering algorithm from https://github.com/percyliang/brown-cluster and also a Python implementation, https://github.com/mheilman/tan-clustering. And th

5 answers
  •  清酒与你
    2020-12-25 15:21

    If I understand correctly, the algorithm gives you a tree, and you need to truncate it at some level to get clusters. With these bit strings, that means taking just the first L characters of each one.

    For example, cutting at the second character gives you two clusters:

    10           chased     
    
    11           dog        
    11           mouse      
    11           cat        
    

    At the third character you get:

    110           dog        
    
    111           mouse      
    111           cat        
    

    The cutting strategy is a different subject though.
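
The truncation described above can be sketched in a few lines of Python. This is only an illustration, not code from either repository: the function name `cut_clusters` is hypothetical, and the full bit strings in `paths` are assumed values chosen to be consistent with the prefixes shown in the example.

```python
from collections import defaultdict

# Assumed sample data: (bit string, word) pairs whose prefixes match
# the example above. The full bit strings are an assumption.
paths = [
    ("10", "chased"),
    ("110", "dog"),
    ("1110", "mouse"),
    ("1111", "cat"),
]


def cut_clusters(paths, length):
    """Group words by the first `length` characters of their bit string.

    A bit string shorter than `length` is kept whole, so that word
    stays in the cluster named by its full path.
    """
    clusters = defaultdict(list)
    for bits, word in paths:
        clusters[bits[:length]].append(word)
    return dict(clusters)


for prefix, words in sorted(cut_clusters(paths, 2).items()):
    print(prefix, words)
```

Calling `cut_clusters(paths, 3)` instead splits `dog` (`110`) from `mouse` and `cat` (`111`), while `chased` keeps its shorter path `10`. A larger L gives more, finer clusters; a smaller L merges them.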
