How to generate an 8-byte unique ID from a GUID?

长情又很酷 2020-12-04 01:21

I am trying to use a long as a unique ID for our events within our C# application (not global, only for one session). Do you know if the following will generate a unique long ID?

10 Answers
  • 2020-12-04 01:59

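    Generates an 8-character Ascii85 identifier based on the current timestamp in seconds. IDs generated in different seconds are guaranteed to differ. Within the same second, uniqueness depends on the 2 random bytes (65,536 possible values): 5 IDs generated in the same second have roughly a 99.98% chance of not colliding.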

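    // Requires: using System;
    // Ascii85 is not part of the .NET base class library; an external implementation with a static Encode(byte[]) method is assumed.
    private static readonly Random Random = new Random(); // not thread-safe; lock around it if called from several threads
    public static string GenerateIdentifier()
    {
        // Seconds since the Unix epoch; UtcNow avoids repeated values when local time shifts (e.g. DST).
        var seconds = (int) DateTime.UtcNow.Subtract(new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc)).TotalSeconds;
        var timeBytes = BitConverter.GetBytes(seconds); // 4 bytes
        var randomBytes = new byte[2]; // 2 random bytes distinguish IDs created within the same second
        Random.NextBytes(randomBytes);
        // Concatenate timestamp and random bytes (6 bytes in total) before encoding.
        var bytes = new byte[timeBytes.Length + randomBytes.Length];
        System.Buffer.BlockCopy(timeBytes, 0, bytes, 0, timeBytes.Length);
        System.Buffer.BlockCopy(randomBytes, 0, bytes, timeBytes.Length, randomBytes.Length);
        return Ascii85.Encode(bytes);
    }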
    
  • 2020-12-04 02:02

    No, it won't. A GUID is 128 bits long and a long is only 64 bits, so you lose 64 bits of information, which allows two different GUIDs to produce the same long representation. The chance is pretty slim, but it is there.

  • 2020-12-04 02:11

    You cannot distill a 16-byte value down to an 8-byte value while still retaining the same degree of uniqueness. If uniqueness is critical, don't "roll your own" anything. Stick with GUIDs unless you really know what you're doing.

    If a relatively naive implementation of uniqueness is sufficient then it's still better to generate your own IDs rather than derive them from GUIDs. The following code snippet is extracted from a "Locally Unique Identifier" class I find myself using fairly often. It makes it easy to define both the length and the range of characters output.

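    using System.Security.Cryptography;
    using System.Text;
    
    public class LUID
    {
        // Note: RNGCryptoServiceProvider is marked obsolete from .NET 6 onward; RandomNumberGenerator.Create() is the usual replacement.
        private static readonly RNGCryptoServiceProvider RandomGenerator = new RNGCryptoServiceProvider();
        // 32 characters; 1, I, 0 and O are left out.
        private static readonly char[] ValidCharacters = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789".ToCharArray();
        public const int DefaultLength = 6;
        // Running index into ValidCharacters, shared across calls (not thread-safe).
        private static int counter = 0;
    
        public static string Generate(int length = DefaultLength)
        {
            var randomData = new byte[length];
            RandomGenerator.GetNonZeroBytes(randomData);
    
            // Size the builder for the requested length, not the default.
            var result = new StringBuilder(length);
            foreach (var value in randomData)
            {
                // Take the modulo of the full array length so the last character ('9') can be selected too.
                counter = (counter + value) % ValidCharacters.Length;
                result.Append(ValidCharacters[counter]);
            }
            return result.ToString();
        }
    }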
    

    In this instance it excludes 1 (one), I (i), 0 (zero) and O (o) for the sake of unambiguous human-readable output.

    To determine just how effectively 'unique' your particular combination of valid characters and ID length is, the math is simple enough, but it's still nice to have a 'code proof' of sorts (xUnit):

        // Requires: using System.Collections.Generic; using Xunit;
        [Fact]
        public void Does_not_generate_collisions_within_reasonable_number_of_iterations()
        {
            var ids = new HashSet<string>();
            var minimumAcceptableIterations = 10000;
            for (int i = 0; i < minimumAcceptableIterations; i++)
            {
                var result = LUID.Generate();
                // HashSet.Add returns false when the value is already present.
                Assert.True(ids.Add(result), $"Collision on run {i} with ID '{result}'");
            }
        }
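
    The "simple enough" math mentioned above can be sketched with the usual birthday-bound approximation. This is only an estimate, and it assumes the IDs are uniformly distributed over the whole ID space; the class and method names are made up for illustration.

```csharp
using System;

public static class CollisionEstimate
{
    // Birthday-bound approximation: p ≈ 1 - exp(-n(n-1) / (2N)),
    // where N = alphabetSize^length is the number of possible IDs
    // and n = idCount is the number of IDs generated.
    public static double Probability(int alphabetSize, int length, long idCount)
    {
        double space = Math.Pow(alphabetSize, length);
        double pairs = idCount * (idCount - 1) / 2.0;
        return 1.0 - Math.Exp(-pairs / space);
    }
}
```

    For a 32-character alphabet, a length of 6 and 10,000 IDs, `CollisionEstimate.Probability(32, 6, 10000)` comes out at roughly 0.045. Under that uniformity assumption, a test that generates 10,000 six-character IDs would be expected to hit a collision in a few percent of runs, so a longer ID may be needed if that is not acceptable.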
    
  • 2020-12-04 02:12

    As already said in most of the other answers: no, you cannot just take a part of a GUID without losing the uniqueness.

    If you need something that's shorter and still unique, read this blog post by Jeff Atwood:
    Equipping our ASCII Armor

    He shows multiple ways to shorten a GUID without losing information. The shortest is 20 characters (with ASCII85 encoding).

    Yes, this is much longer than the 8 bytes you wanted, but it's a "real" unique GUID...while all attempts to cram something into 8 bytes most likely won't be truly unique.
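
    To illustrate where the 20 characters come from, here is a minimal sketch of Ascii85-encoding the 16 bytes of a GUID: each group of 4 bytes becomes 5 characters. This is not the code from the blog post; `GuidAscii85` is a hypothetical helper, and it deliberately leaves out the `<~ ~>` delimiters and the `z` shortcut for all-zero groups used by some Ascii85 variants.

```csharp
using System;
using System.Text;

public static class GuidAscii85
{
    // Encodes the 16 bytes of a GUID as 20 characters in the range '!'..'u'.
    public static string Encode(Guid guid)
    {
        byte[] bytes = guid.ToByteArray(); // always 16 bytes
        var result = new StringBuilder(20);
        var group = new char[5];
        for (int offset = 0; offset < bytes.Length; offset += 4)
        {
            // Pack 4 bytes into one big-endian 32-bit value.
            uint value = ((uint)bytes[offset] << 24)
                       | ((uint)bytes[offset + 1] << 16)
                       | ((uint)bytes[offset + 2] << 8)
                       | bytes[offset + 3];
            // Emit 5 base-85 digits, most significant digit first.
            for (int i = 4; i >= 0; i--)
            {
                group[i] = (char)('!' + (int)(value % 85));
                value /= 85;
            }
            result.Append(group);
        }
        return result.ToString();
    }
}
```

    `GuidAscii85.Encode(Guid.NewGuid())` always returns a 20-character string, and since all 128 bits are kept, the encoding loses no information.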

  • 2020-12-04 02:16
    var s = Guid.NewGuid().ToString();
    var h1 = s.Substring(0, s.Length / 2).GetHashCode(); // hash of the first half of the GUID string
    var h2 = s.Substring(s.Length / 2).GetHashCode(); // hash of the second half of the GUID string
    var result = (uint) h1 | ((ulong) h2 << 32); // 8-byte value built from both hashes; collisions are still possible
    var bytes = BitConverter.GetBytes(result);
    

    P.S. It's great that you are all chatting with the topic starter here, but what about answers for other users who need them, like me?

  • 2020-12-04 02:17

    No, it won't. As highlighted many times on Raymond Chen's blog, the GUID is designed to be unique as a whole; if you cut out just a piece of it (e.g. taking only 64 bits out of its 128), it loses its (pseudo-)uniqueness guarantees.


    Here it is:

    A customer needed to generate an 8-byte unique value, and their initial idea was to generate a GUID and throw away the second half, keeping the first eight bytes. They wanted to know if this was a good idea.

    No, it's not a good idea. (...) Once you see how it all works, it's clear that you can't just throw away part of the GUID since all the parts (well, except for the fixed parts) work together to establish the uniqueness. If you take any of the three parts away, the algorithm falls apart. In particular, keeping just the first eight bytes (64 bits) gives you the timestamp and four constant bits; in other words, all you have is a timestamp, not a GUID.

    Since it's just a timestamp, you can have collisions. If two computers generate one of these "truncated GUIDs" at the same time, they will generate the same result. Or if the system clock goes backward in time due to a clock reset, you'll start regenerating GUIDs that you had generated the first time it was that time.


    I try to use long as unique id within our C# application (not global, and only for one session.) for our events. do you know the following will generate an unique long id?

    Why don't you just use a counter?
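
    A minimal sketch of such a counter, assuming the IDs only need to be unique within a single process for the lifetime of one session (`EventIdGenerator` is a made-up name):

```csharp
using System.Threading;

public static class EventIdGenerator
{
    // Last ID handed out in this process; starts at 0, so the first ID is 1.
    private static long lastId;

    // Returns a new ID that is unique within the current process.
    // Interlocked.Increment makes this safe to call from multiple threads.
    public static long Next() => Interlocked.Increment(ref lastId);
}
```

    Each call to `EventIdGenerator.Next()` returns the next value in sequence, so no two events in the same session get the same long, and no GUID is involved at all.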
