Optimize large dictionary for reading in .NET [closed]

耗尽温柔 submitted on 2019-12-13 21:30:23

Question


I have a list of about 22k number-string pairs (it is a MAC address vendor list).

In my code, I look up the vendor name by the first three bytes of the MAC address.

I know I can use a dictionary, or even an array, but then the dictionary has to be initialised every time the program runs. The program only uses a small number of translations (under one percent of the items in the dictionary), and initialising the dictionary takes a significant share of the program's run time.

Can you think of any other method? In old VB6 it was possible to open a binary file and seek to individual records, which would be good enough for me, because I would load only the values that I actually need.

I would prefer an in-project solution, so that there is no external data file. At the moment I am trying to use code like this:

Vendors.add("00125A","Microsoft Corporation") 
'... this in another 22000 times '
Vendors.add("00124E","XAC AUTOMATION CORP.")

Answer 1:


I'm not sure what the best course of action is for you, or whether this will actually help, but..

You seem to be looking for a way to seek and read a certain record from a structured file.

For this you can define a class that encapsulates the record fields and also the access methods.

Here is an example. On my machine it creates and stores 22k+ records and seeks a few of them, all in around 20 ms. On the other hand, doing 100 random seeks takes 3.5 seconds, because every seek starts again at the beginning of the file. A sequential search is rather fast again.

Of course the total time will depend on your machine and on how many records you seek and read.

Here is a record class that holds a byte, a long and a string:

class aRecord
{
    byte aByte { get; set; }
    long aLong { get; set; }
    string aString { get; set; }

    public aRecord() { }

    public aRecord(byte b_, long l_, string s_)
    { aByte = b_; aLong = l_; aString = s_; }

    public void writeToStream(BinaryWriter bw )
    {
        bw.Write(aByte);
        bw.Write(aLong);
        bw.Write(aString);
    }

    public void readFromStream(BinaryReader br)
    {
        aByte = br.ReadByte();
        aLong = br.ReadInt64();
        aString = br.ReadString();
    }

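    // Reads the record with the given zero-based index. The records have
    // variable length, so the scan always restarts at the beginning of the stream.
    static public aRecord readFromStream(BinaryReader br, int record)
    {
        int r = 0;
        aRecord  rec = new aRecord();
        br.BaseStream.Position = 0;
        // Compare positions instead of calling PeekChar(), which can throw
        // on binary data that is not valid text in the reader's encoding.
        while (br.BaseStream.Position < br.BaseStream.Length && r <= record)
        {
            rec.readFromStream(br);
            r++;
        }
        return rec;
    }

    // Returns the next record whose string contains the search text, starting
    // at the current stream position, or null when the end is reached.
    static public aRecord readFromStream(BinaryReader br, string search)
    {
        aRecord rec = new aRecord();
        while (br.BaseStream.Position < br.BaseStream.Length)
        {
            rec.readFromStream(br);
            if (rec.aString.Contains(search)) return rec;
        }
        return null;
    }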

}

I tested like this:

Console.WriteLine(DateTime.Now.ToString("ss,ffff") + "  init ");

List<aRecord> data = new List<aRecord>();

Random rnd = new Random(9);

int count = 23000;
for (int i = 1000; i < count; i++ )
{
    data.Add(new aRecord((byte)(i%128), i, "X" + rnd.Next(13456).ToString()));
}

Console.WriteLine(DateTime.Now.ToString("ss,ffff") + "  write ");

string fileName = "D:\\_DataStream.dat";

FileStream sw = new FileStream(fileName, FileMode.Create);
BinaryWriter bw = new BinaryWriter(sw);

foreach(aRecord r in data)
{
    r.writeToStream(bw);

}
bw.Flush();
bw.Close();   // closes the writer first; this also closes the underlying FileStream
sw.Close();

FileStream sr = new FileStream(fileName, FileMode.Open);
BinaryReader br = new BinaryReader(sr);

List<aRecord> data2 = new List<aRecord>();
Console.WriteLine(DateTime.Now.ToString("ss,ffff") + "  begin search");
for (int i = 0; i < 100; i++)
{
    aRecord  rec = aRecord.readFromStream(br, "911");
    if (rec != null) data2.Add(rec);
}
Console.WriteLine(DateTime.Now.ToString("ss,ffff") + "  done. found " + data2.Count);


Console.WriteLine(DateTime.Now.ToString("ss,ffff") + "  seek ");

aRecord ar = aRecord.readFromStream(br, 0);
Console.WriteLine(DateTime.Now.ToString("ss,ffff") + " 0 ");

aRecord ar1 = aRecord.readFromStream(br, 1);
Console.WriteLine(DateTime.Now.ToString("ss,ffff") + " 1 ");

aRecord ar2 = aRecord.readFromStream(br, 13000);
Console.WriteLine(DateTime.Now.ToString("ss,ffff") + " 13000 ");

aRecord ar3 = aRecord.readFromStream(br, 23000-1);
Console.WriteLine(DateTime.Now.ToString("ss,ffff") + " 23000 end ");

br.Close();
sr.Close();

Your title is concerned with optimizing a Dictionary. This depends on what the main use will be: reading or writing? A normal Dictionary already gives close to O(1) lookups and insertions, so it is usually the better choice for both; a SortedDictionary has O(log n) lookups and insertions and is mainly worth it if you also need the keys in sorted order.

..and there are even more collection classes, but the first thing to do is to find out what the true bottleneck is. The seek-and-read routine above does not waste time inserting the data into a Dictionary; it simply discards records until the right one is found. I have also added a search method which continues from the current position after each hit. Expanding the class to suit your own needs is rather simple.

27,2208 init
27,2297 write
27,2438 seek
27,2438 begin search
27,3097 done. found 38
27,3097 0 end
27,3097 1 end
27,3457 13000 end
27,4037 23000 end
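
The random seeks above are slow because the records have variable length, so every lookup has to scan from the start. If you are willing to pad each vendor name to a fixed size and store the records sorted by key, you can jump straight to record n and do a binary search on the stream, much like the old VB6 random-access files. This is only a sketch under those assumptions; the class name FixedVendorFile and the 64-byte name limit are made up for illustration and are not part of the code above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: fixed-width records sorted by key, so record n
// always starts at byte offset n * RecordSize.
static class FixedVendorFile
{
    const int KeySize = 3;    // first three bytes of the MAC address
    const int NameSize = 64;  // assumed maximum name length in UTF-8 bytes
    const int RecordSize = KeySize + NameSize;

    // Writes the records. The caller must pass them sorted by key (ascending).
    // Names longer than NameSize bytes are truncated; shorter ones are zero-padded.
    public static void Write(Stream output, (int Key, string Name)[] sortedRecords)
    {
        var buffer = new byte[RecordSize];
        foreach (var (key, name) in sortedRecords)
        {
            Array.Clear(buffer, 0, buffer.Length);
            buffer[0] = (byte)(key >> 16);
            buffer[1] = (byte)(key >> 8);
            buffer[2] = (byte)key;
            byte[] nameBytes = Encoding.UTF8.GetBytes(name);
            Array.Copy(nameBytes, 0, buffer, KeySize, Math.Min(nameBytes.Length, NameSize));
            output.Write(buffer, 0, buffer.Length);
        }
    }

    // Binary search on a seekable stream: about log2(n) record reads per lookup.
    // Returns the vendor name, or null if the key is not present.
    public static string Find(Stream input, int key)
    {
        var buffer = new byte[RecordSize];
        long lo = 0, hi = input.Length / RecordSize - 1;
        while (lo <= hi)
        {
            long mid = lo + (hi - lo) / 2;
            input.Position = mid * RecordSize;

            // Stream.Read may return fewer bytes than requested, so loop.
            int read = 0;
            while (read < RecordSize)
            {
                int n = input.Read(buffer, read, RecordSize - read);
                if (n == 0) return null; // truncated file
                read += n;
            }

            int found = (buffer[0] << 16) | (buffer[1] << 8) | buffer[2];
            if (found == key)
                return Encoding.UTF8.GetString(buffer, KeySize, NameSize).TrimEnd('\0');
            if (found < key) lo = mid + 1; else hi = mid - 1;
        }
        return null;
    }
}
```

With 22k records that is roughly 15 reads per lookup and no up-front initialisation. Find works on any seekable stream, so it should also work on a data file embedded in the assembly and opened with Assembly.GetManifestResourceStream, which would keep everything in-project; I have not timed that variant.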




Answer 2:


You can embed the data as a resource and then use an instance of ResourceManager to retrieve the value.

var rm = new ResourceManager(baseName, assembly);
string vendor = rm.GetString(macAddress);

To create the resource file (without having to key it into Visual Studio) you can make an executable that reads your source file and creates a .resources file from it:

string path = Path.GetFullPath(Path.Combine(outputPath, "..\\MyData.resources"));

using (IResourceWriter rsxw = new ResourceWriter(path))
{
    foreach (var x ...)
    {
       rsxw.AddResource(x.name, x.value);
    }
    rsxw.Close();
}

Include this MyData.resources file in your project and it will be compiled in as a resource.
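
For the rm.GetString(macAddress) call to find anything, the string you pass in has to match the resource name exactly (ResourceManager lookups are case-sensitive by default). Assuming the resources are keyed by six upper-case hex digits, as in the Vendors.add lines of the question, a small helper along these lines could turn a full MAC address into that key. The class and method names are hypothetical, and which separators to accept is my assumption.

```csharp
using System;
using System.Text;

// Hypothetical helper: reduces a MAC address to its first three bytes,
// written as six upper-case hex digits (the assumed resource key format).
static class MacKey
{
    // "00:12:5a:01:02:03" -> "00125A"
    // Returns null if the input is null, contains unexpected characters,
    // or has fewer than six hex digits.
    public static string VendorPrefix(string mac)
    {
        if (mac == null) return null;
        var sb = new StringBuilder(6);
        foreach (char c in mac)
        {
            if (Uri.IsHexDigit(c))
                sb.Append(char.ToUpperInvariant(c));
            else if (c != ':' && c != '-' && c != '.')
                return null; // not a separator we recognise

            if (sb.Length == 6) return sb.ToString();
        }
        return null;
    }
}
```

You would then check the result for null and call rm.GetString(key) with it; GetString itself returns null when no resource with that name exists.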



Source: https://stackoverflow.com/questions/39315287/optimize-large-dictionary-for-reading-in-net
