How do I count unique grapheme clusters in a string in Rust?

Submitted by 做~自己de王妃 on 2020-01-14 09:32:16

Question


For example, for

let n = count_unique_grapheme_clusters("🇧🇷 🇷🇺 🇧🇷 🇺🇸 🇧🇷");
println!("{}", n);

the expected output is (space and three flags: " ", "🇧🇷", "🇷🇺", "🇺🇸"):

4

Answer 1:


We can use the graphemes method from the unicode-segmentation crate to iterate over the grapheme clusters and collect them into a HashSet<&str>, which discards the duplicates. The .len() of that set is then the number of unique grapheme clusters.

extern crate unicode_segmentation; // 1.2.1

use std::collections::HashSet;

use unicode_segmentation::UnicodeSegmentation;

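fn count_unique_grapheme_clusters(s: &str) -> usize {
    // `true` selects extended grapheme clusters rather than legacy ones.
    let is_extended = true;
    // Each cluster is a &str slice of `s`; the HashSet keeps one copy of each.
    s.graphemes(is_extended).collect::<HashSet<_>>().len()
}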

fn main() {
    assert_eq!(count_unique_grapheme_clusters(""), 0);
    assert_eq!(count_unique_grapheme_clusters("a"), 1);
    assert_eq!(count_unique_grapheme_clusters("🇺🇸"), 1);
    assert_eq!(count_unique_grapheme_clusters("🇷🇺é"), 2);
    assert_eq!(count_unique_grapheme_clusters("🇧🇷🇷🇺🇧🇷🇺🇸🇧🇷"), 3);
}
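
For contrast, here is a minimal std-only sketch of why a segmentation crate is needed at all. The helper name count_unique_chars is hypothetical and not part of the answer above; it counts distinct Unicode scalar values (char) instead of grapheme clusters. Each flag is made of two regional-indicator chars, so the flags get split apart and the count no longer matches the expected 4.

```rust
use std::collections::HashSet;

/// Hypothetical helper for illustration: counts distinct `char`s
/// (Unicode scalar values), NOT grapheme clusters.
fn count_unique_chars(s: &str) -> usize {
    s.chars().collect::<HashSet<char>>().len()
}

fn main() {
    let s = "🇧🇷 🇷🇺 🇧🇷 🇺🇸 🇧🇷";
    // The distinct chars are the regional indicators B, R, U, S plus the
    // space, so this prints 5 rather than the 4 unique grapheme clusters.
    println!("{}", count_unique_chars(s));

    // "e" followed by a combining acute accent is one grapheme cluster
    // but two chars.
    println!("{}", count_unique_chars("e\u{301}"));
}
```

The std library has no grapheme-cluster iterator, which is why the answer relies on unicode-segmentation for this step.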




Source: https://stackoverflow.com/questions/51818497/how-do-i-count-unique-grapheme-clusters-in-a-string-in-rust
