Is it possible to populate a large set at compile time?

-上瘾入骨i 2020-12-20 19:56

We have a 'delete all my data' feature. I'd like to delete a set of IPs from many, many web log files.

Currently at runtime I open a CSV with the IP addresses to d

3 answers
  • 2020-12-20 20:11

    have only a single static binary to deploy

    Inline your entire CSV file using include_bytes! or include_str! and then go about the rest of your program as usual.

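    use csv; // 1.0.5

    // The file's bytes are embedded into the binary at compile time.
    // /etc/hosts is only a stand-in for the real CSV file.
    static CSV_FILE: &[u8] = include_bytes!("/etc/hosts");

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        // Read the embedded bytes as tab-delimited records.
        let mut rdr = csv::ReaderBuilder::new()
            .delimiter(b'\t')
            .from_reader(CSV_FILE);

        // By default the first record is treated as a header row,
        // so records() starts at the second one.
        for result in rdr.records() {
            let record = result?;
            println!("{:?}", record);
        }

        Ok(())
    }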
    

    See also:

    • Is there a good way to include external resource data into Rust source code?
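
    As a rough sketch of "the rest of your program", the embedded text could be parsed into a set once at startup. The string constant below stands in for include_str! on a real file, and the one-address-per-line layout, the name EMBEDDED_IPS and the helper parse_ip_set are assumptions made for this example, not part of the original answer.

```rust
use std::collections::HashSet;
use std::net::IpAddr;

// Stand-in for `include_str!("ips.csv")`; the file name and the
// one-address-per-line layout are assumptions made for this sketch.
static EMBEDDED_IPS: &str = "127.0.0.1\n::1\n\n10.0.0.7\n";

/// Parses one address per line, skipping blank lines.
/// Fails on the first line that is not a valid IPv4 or IPv6 address.
fn parse_ip_set(text: &str) -> Result<HashSet<IpAddr>, std::net::AddrParseError> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(|line| line.parse::<IpAddr>())
        .collect()
}

fn main() {
    let ips = parse_ip_set(EMBEDDED_IPS).expect("embedded list should contain valid addresses");
    let probe: IpAddr = "10.0.0.7".parse().unwrap();
    println!("{}", ips.contains(&probe));
}
```

    The data still lives inside the binary, but the set itself is built at runtime, so this trades a small startup cost for not needing any extra crate.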
  • 2020-12-20 20:21

    The Rust-PHF crate provides compile-time data structures, including (ordered) maps and sets.

    Unfortunately, to date, it does not support initialization of a set of std::net::IpAddr, but can be used with static strings:

    use phf::phf_set; // requires the phf crate's "macros" feature

    static IP_SET: phf::Set<&'static str> = phf_set! {
        "127.0.0.1",
        "::1",
    };
    
  • 2020-12-20 20:22

    I would recommend simply using a build script to read the CSV and produce a source file containing the initialization of a standard HashSet with a custom hasher (FxHash, for example).

    This would let you keep the convenience of editing a CSV file, while still baking all the data into a binary. It would require some initialization time (unlike PHF), but the ability to specify a custom hash is quite beneficial.
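
    A minimal sketch of such a build script might look like the following. The file names ips.csv and ip_set.rs, the helper generate_source and the "IP in the first column" layout are all assumptions for illustration. The generated code uses the default hasher to stay dependency-free; swapping in a custom hasher would mean changing the emitted type and constructor.

```rust
// build.rs (sketch)
use std::env;
use std::fmt::Write as _;
use std::fs;
use std::path::Path;

/// Turns CSV text (IP assumed to be in the first column) into Rust
/// source that defines `ip_set()`. Hypothetical helper for this sketch.
fn generate_source(csv: &str) -> String {
    let ips: Vec<&str> = csv
        .lines()
        .filter_map(|line| line.split(',').next())
        .map(str::trim)
        .filter(|ip| !ip.is_empty())
        .collect();

    let mut src = String::new();
    src.push_str("pub fn ip_set() -> std::collections::HashSet<&'static str> {\n");
    writeln!(
        src,
        "    let mut set = std::collections::HashSet::with_capacity({});",
        ips.len()
    )
    .unwrap();
    for ip in &ips {
        // {:?} emits the value as a quoted, escaped string literal.
        writeln!(src, "    set.insert({:?});", ip).unwrap();
    }
    src.push_str("    set\n}\n");
    src
}

fn main() -> std::io::Result<()> {
    // Cargo sets OUT_DIR when it runs a build script; do nothing otherwise.
    if let Ok(out_dir) = env::var("OUT_DIR") {
        // Re-run the script only when the (assumed) CSV file changes.
        println!("cargo:rerun-if-changed=ips.csv");
        let csv = fs::read_to_string("ips.csv")?;
        fs::write(Path::new(&out_dir).join("ip_set.rs"), generate_source(&csv))?;
    }
    Ok(())
}
```

    The main crate would then typically pull the generated file in with include!(concat!(env!("OUT_DIR"), "/ip_set.rs")); and call ip_set() once at startup.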

    Also, depending on the format of IPs in the logs, you may want to store either &'static str or u32; the latter is more efficient (search-wise), but the gain may be negated if a conversion is required.
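
    To make the trade-off concrete, here is a small sketch of the u32 variant using only the standard library. The helper name ipv4_key and the sample log line are made up for this example. Note that a u32 can only represent IPv4 addresses; IPv6 entries would need a wider key such as u128.

```rust
use std::collections::HashSet;
use std::net::Ipv4Addr;

/// Converts dotted-quad text to its 32-bit value.
/// Returns None for anything that is not a valid IPv4 address.
fn ipv4_key(text: &str) -> Option<u32> {
    text.trim().parse::<Ipv4Addr>().ok().map(u32::from)
}

fn main() {
    // Set of addresses to delete, stored as integers.
    let to_delete: HashSet<u32> = ["127.0.0.1", "10.0.0.7"]
        .iter()
        .filter_map(|s| ipv4_key(s))
        .collect();

    // Hypothetical log line; the IP is assumed to be the first field.
    let log_line = "10.0.0.7 - - \"GET / HTTP/1.1\" 200";
    let first_field = log_line.split_whitespace().next().unwrap_or("");

    // This parse is the per-line conversion cost that may cancel out
    // the cheaper integer lookup.
    let hit = ipv4_key(first_field).map_or(false, |key| to_delete.contains(&key));
    println!("{}", hit);
}
```

    With &'static str keys the lookup could use the log field directly, without the parse step, which is why the better choice depends on how the logs are formatted.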
