How can I quickly encode and then compress a short string containing numbers in C#?

粉色の甜心 2020-12-19 10:09

I have strings that look like this:

000101456890
348324000433
888000033380

They are strings that are all the same length and they contain o

2 answers
  • 2020-12-19 10:21

    The simplest approach is to treat each string as a long (plenty of room there) and hex-encode it; that gives you:

    60c1bfa
    5119ba72b1
    cec0ed3264
    

    Base-64 would be shorter, but you need to serialize the value as big-endian bytes (note that most .NET platforms are little-endian) and drop the leading zero bytes. That gives you:

    Bgwb+g==
    URm6crE=
    zsDtMmQ=
    

    For example:

        static void Main()
        {
            long x = 000101456890L, y = 348324000433L, z = 888000033380L;
    
            Console.WriteLine(Convert.ToString(x, 16));
            Console.WriteLine(Convert.ToString(y, 16));
            Console.WriteLine(Convert.ToString(z, 16));
    
            Console.WriteLine(Pack(x));
            Console.WriteLine(Pack(y));
            Console.WriteLine(Pack(z));
    
            Console.WriteLine(Convert.ToInt64("60c1bfa", 16).ToString().PadLeft(12, '0'));
            Console.WriteLine(Convert.ToInt64("5119ba72b1", 16).ToString().PadLeft(12, '0'));
            Console.WriteLine(Convert.ToInt64("cec0ed3264", 16).ToString().PadLeft(12, '0'));
    
            Console.WriteLine(Unpack("Bgwb+g==").ToString().PadLeft(12, '0'));
            Console.WriteLine(Unpack("URm6crE=").ToString().PadLeft(12, '0'));
            Console.WriteLine(Unpack("zsDtMmQ=").ToString().PadLeft(12, '0'));
    
        }
        static string Pack(long value)
        {
            ulong a = (ulong)value; // make shift easy
            List<byte> bytes = new List<byte>(8);
            while (a != 0)
            {
                bytes.Add((byte)a);
                a >>= 8;
            }
            bytes.Reverse();
            var chunk = bytes.ToArray();
            return Convert.ToBase64String(chunk);
        }
        static long Unpack(string value)
        {
            var chunk = Convert.FromBase64String(value);
            ulong a = 0;
            for (int i = 0; i < chunk.Length; i++)
            {
                a <<= 8;
                a |= chunk[i];
            }
            return (long)a;
        }
    
  • 2020-12-19 10:29

    Standard Base 64 is not URL-safe: its alphabet includes '+' and '/', and it pads with '=' (so the Pack function in the selected answer can yield strings that are not URL-safe).

    Consider replacing the '+' and '/' symbols with something more URL-friendly, or use another base. Base 62 works here, for instance.

    Here is generic code that converts back and forth between decimal and any numeral base <= 64 (it's probably faster than converting to bytes and then using Convert.ToBase64String()):

    static void Main()
    {
        Console.WriteLine(Decode("101456890", 10));
        Console.WriteLine(Encode(101456890, 62));
        Console.WriteLine(Decode("6rhZS", 62));
        //Result:
        //101456890
        //6rhZS
        //101456890
    }
    
    public static long Decode(string str, int baze)
    {
        long result = 0;
        long place = 1; // long, not int: baze^i overflows a 32-bit int on longer inputs
        for (int i = 0; i < str.Length; ++i)
        {
            result += Value(str[str.Length - 1 - i]) * place;
            place *= baze;
        }
    
        return result;
    }
    
    public static string Encode(long val, int baze)
    {
        var buffer = new char[64];
        int place = 0;
        long q = val;
        do
        {
            buffer[place++] = Symbol(q % baze);
            q = q / baze;
        }
        while (q > 0);
    
        Array.Reverse(buffer, 0, place);
        return new string(buffer, 0, place);
    }
    
    public static long Value(char c)
    {
        if (c == '+') return 62;
        if (c == '/') return 63;
        if (c < '0') throw new ArgumentOutOfRangeException("c");
        if (c < ':') return c - '0';
        if (c < 'A') throw new ArgumentOutOfRangeException("c");
        if (c < '[') return c - 'A' + 10;
        if (c < 'a') throw new ArgumentOutOfRangeException("c");
        if (c < '{') return c - 'a' + 36;
        throw new ArgumentOutOfRangeException("c");
    }
    
    public static char Symbol(long i)
    {
        if (i < 0) throw new ArgumentOutOfRangeException("i");
        if (i < 10) return (char)('0' + i);
        if (i < 36) return (char)('A' + i - 10);
        if (i < 62) return (char)('a' + i - 36);
        if (i == 62) return '+';
        if (i == 63) return '/';
        throw new ArgumentOutOfRangeException("i");
    }
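
The URL-safety concern raised in this answer can also be handled without leaving Base 64. The following is a minimal sketch, not part of either answer: it assumes the common substitution of '-' for '+' and '_' for '/' with the '=' padding removed (the "base64url" alphabet of RFC 4648). The names UrlSafeBase64, ToUrlSafe and FromUrlSafe are hypothetical.

```csharp
using System;

// Hypothetical helper: converts between standard Base64 text and a
// URL-safe form ('-' for '+', '_' for '/', no '=' padding).
static class UrlSafeBase64
{
    // Drop the '=' padding, then swap the two URL-unsafe symbols.
    public static string ToUrlSafe(string base64)
    {
        return base64.TrimEnd('=').Replace('+', '-').Replace('/', '_');
    }

    // Restore the standard symbols and pad back to a multiple of 4 characters.
    public static string FromUrlSafe(string urlSafe)
    {
        string s = urlSafe.Replace('-', '+').Replace('_', '/');
        int pad = (4 - s.Length % 4) % 4;
        return s + new string('=', pad);
    }
}

static class Program
{
    static void Main()
    {
        // "Bgwb+g==" is the Base64 value shown for 000101456890 in the first answer.
        string safe = UrlSafeBase64.ToUrlSafe("Bgwb+g==");
        Console.WriteLine(safe);                             // Bgwb-g
        Console.WriteLine(UrlSafeBase64.FromUrlSafe(safe));  // Bgwb+g==
    }
}
```

The string returned by FromUrlSafe can be passed to Convert.FromBase64String (or to the first answer's Unpack) unchanged.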
    
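
Because the question's strings all have the same length, a fixed-width base-62 token may be convenient. The sketch below is an illustration under stated assumptions, not the answer author's code: inputs are exactly 12 decimal digits and non-negative, the digit order matches the Symbol/Value helpers above (0-9, A-Z, a-z), and the class name Base62 and the Width constant are hypothetical. Since 62^7 is roughly 3.5e12, seven symbols are enough for any 12-digit value.

```csharp
using System;

// Hypothetical helper: fixed-width base-62 tokens for 12-digit decimal strings.
static class Base62
{
    // Same digit order as the answer's helpers: 0-9, then A-Z, then a-z.
    const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // 62^7 > 10^12, so 7 symbols cover every 12-digit value.
    const int Width = 7;

    // Assumes 'digits' holds a non-negative decimal number below 62^7.
    public static string Encode(string digits)
    {
        long value = long.Parse(digits);
        var buffer = new char[Width];
        // Fill from the least significant symbol; unused leading slots become '0'.
        for (int i = Width - 1; i >= 0; i--)
        {
            buffer[i] = Alphabet[(int)(value % 62)];
            value /= 62;
        }
        return new string(buffer);
    }

    public static string Decode(string token)
    {
        long value = 0;
        foreach (char c in token)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0) throw new FormatException("Not a base-62 symbol: " + c);
            value = value * 62 + digit;
        }
        // Restore the leading zeros of the original 12-character string.
        return value.ToString().PadLeft(12, '0');
    }
}

static class Program
{
    static void Main()
    {
        string token = Base62.Encode("000101456890");
        Console.WriteLine(token);                 // 006rhZS
        Console.WriteLine(Base62.Decode(token));  // 000101456890
    }
}
```

A fixed width costs up to two extra characters on small values but keeps every token the same length, mirroring the fixed-length input.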