Creating your own TinyURL


I have just found this great tutorial as it is something that I need.

However, after having a look, it seems that this might be inefficient. The way it works is, fir

13 Answers
  • 2020-12-13 11:59

    In the database table, there is an index on the unique_chars field, so I don't see why that would be slow or inefficient.

    UNIQUE KEY `unique_chars` (`unique_chars`)
    

    Don't rush to do premature optimization on something that you think might be slow.

    Also, there may be some benefit in a URL-shortening service that generates random URLs instead of sequential ones.
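
    A minimal sketch of the "generate a random key and let the unique index reject duplicates" approach. It uses Python's built-in sqlite3 in place of MySQL so that it runs on its own; only the unique_chars column comes from the tutorial, while the urls table, the long_url column, and the function names are assumptions made for illustration.

```python
import secrets
import sqlite3
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters


def random_key(length=5):
    """Return a random key drawn from the 62-character alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def create_table(conn):
    # unique_chars mirrors the indexed column; the rest of the schema is assumed.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS urls ("
        " unique_chars TEXT NOT NULL UNIQUE,"
        " long_url TEXT NOT NULL)"
    )


def shorten(conn, long_url, length=5, max_attempts=10, keygen=random_key):
    """Insert long_url under a fresh key, retrying if the key is already taken."""
    for _ in range(max_attempts):
        key = keygen(length)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO urls (unique_chars, long_url) VALUES (?, ?)",
                    (key, long_url),
                )
            return key
        except sqlite3.IntegrityError:
            continue  # the unique index rejected a duplicate; try another key
    raise RuntimeError("could not find a free key")
```

    The index does the uniqueness check as part of the insert, so the common case is a single round trip and a retry only happens on a collision.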

  • 2020-12-13 11:59

    I have also created a small TinyURL service.

    I wrote a Python script that generates keys and stores them in a MySQL table named tokens with status U (unused).

    But I do this offline. I have a cron job on my VPS that runs the script every 10 minutes. The script checks whether there are fewer than 1000 keys in the table; if so, it keeps generating keys and inserting those that do not already exist in the table until the count reaches 1000.

    For my service, 1000 keys every 10 minutes is more than enough; you can set the interval or the number of keys generated according to your needs.

    Now, when a tiny URL needs to be created on my website, my PHP script just fetches any unused key from the table and marks its status as T (taken). The PHP script does not have to worry about uniqueness, because my Python script has already populated the table with unique keys only.
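
    A rough sketch of the pre-generated key pool described above, again using sqlite3 instead of MySQL so that it is self-contained. The tokens table and the U/T statuses come from this answer; the token and status column names, the key length, and the function names are assumptions.

```python
import secrets
import sqlite3
import string

ALPHABET = string.ascii_letters + string.digits


def init_pool(conn):
    # status is 'U' (unused) or 'T' (taken), as in the answer.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tokens ("
        " token TEXT PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'U')"
    )


def refill_pool(conn, target=1000, length=6):
    """Offline job: top the pool up until it holds `target` unused keys."""
    while True:
        (unused,) = conn.execute(
            "SELECT COUNT(*) FROM tokens WHERE status = 'U'"
        ).fetchone()
        if unused >= target:
            return
        key = "".join(secrets.choice(ALPHABET) for _ in range(length))
        with conn:
            # INSERT OR IGNORE silently skips a key that already exists.
            conn.execute("INSERT OR IGNORE INTO tokens (token) VALUES (?)", (key,))


def take_key(conn):
    """Request path: claim one unused key, or return None if the pool is empty."""
    with conn:
        row = conn.execute(
            "SELECT token FROM tokens WHERE status = 'U' LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE tokens SET status = 'T' WHERE token = ?", (row[0],))
        return row[0]
```

    The cron job would call refill_pool and the web request would call take_key. With several web workers, the select-then-update step would need to be made atomic (for example with a row lock); this sketch does not attempt that.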

  • 2020-12-13 12:05

    That might work, but the easiest way to solve the problem would probably be hashing. Theoretically speaking, hashing runs in O(1) time: you only have to compute the hash and then make a single hit to the database to retrieve the value. You would then have the added complication of checking for hash collisions, but this is probably what most TinyURL-style providers do. And a good hash function isn't terribly hard to write.
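
    One way the hash-plus-collision-check idea could look, as a sketch only. It uses SHA-256 from Python's hashlib rather than a hand-written hash function, a plain dict stands in for the database table, and the retry-with-a-counter scheme and the short_key name are assumptions, not something any particular provider is known to do.

```python
import hashlib


def short_key(long_url, store, length=6):
    """Derive a key from a hash of the URL, re-hashing with a counter on collision."""
    attempt = 0
    while True:
        data = f"{long_url}#{attempt}".encode("utf-8")
        key = hashlib.sha256(data).hexdigest()[:length]
        existing = store.get(key)
        if existing is None:
            store[key] = long_url  # free slot: claim it
            return key
        if existing == long_url:
            return key             # the same URL was shortened before
        attempt += 1               # a different URL owns this key: try again
```

    Because the key is derived from the URL itself, shortening the same URL twice returns the same key, and a lookup by key is still a single hit to the store.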

  • 2020-12-13 12:10

    Perhaps this is a bit off-topic, but my general rule for creating unique keys is simple: md5( time() * 100 + rand( 0, 100 ) ). Because rand( 0, 100 ) has only 101 possible values, two people using the service in the same second have roughly a 1 in 101 chance of getting the same result, so use a much wider random range if you want a collision to be nigh impossible.

    That said, md5( rand( 0, n ) ) works too.

  • 2020-12-13 12:11

    I don't know why you'd bother. The premise of the tutorial is to create a "random" URL. If the random space is large enough, then you can simply rely on pure, dumb luck. If your random character space is 62 characters (A-Za-z0-9), then with the 4 characters they use, and given a reasonable random number generator, the chance of two keys matching is 1 in 62^4, which is 1 in 14,776,336. With five characters it is 1 in 916,132,832. So a conflict is, almost literally, "1 in a billion".

    Obviously, as the table fills with documents, the chance of a collision increases.

    With 10,000 documents and 5-character keys, it's 1 in 91,613, almost 1 in 100,000 (for round numbers).

    That means, for every new document, you have a 1 in 91,613 chance of hitting the DB again for another pull on the slot machine.

    It is not deterministic. It's random. It's luck. In theory, you can hit a string of really, really, bad luck and just get collision after collision after collision. Also, it WILL, eventually, fill up. How many URLs do you plan on hashing?

    But if 1 in 91,613 odds isn't good enough, boosting it to 6 chars makes it more than 1 in 5M for 10,000 documents. We're talking almost LOTTO odds here.

    Simply put, make the key big enough (7 characters? 8?) and the problem pretty much "wishes" itself out of existence.
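
    The arithmetic above can be reproduced with a few lines of Python. The function names are made up for this sketch, and the estimate is the simple one used in this answer: keyspace size divided by the number of keys already stored.

```python
def keyspace(length, alphabet_size=62):
    """Number of distinct keys of the given length."""
    return alphabet_size ** length


def one_in(length, stored, alphabet_size=62):
    """A new random key hits one of `stored` existing keys about 1 time in this many."""
    return keyspace(length, alphabet_size) // stored


for length in (4, 5, 6):
    print(length, keyspace(length), one_in(length, 10_000))
# 4 14776336 1477
# 5 916132832 91613
# 6 56800235584 5680023
```

    Each extra character multiplies the keyspace by 62, which is why going from 5 to 6 characters moves the odds from about 1 in 91,613 to about 1 in 5.7 million for the same 10,000 documents.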

  • 2020-12-13 12:13

    Couldn't you just trim the hash to the length you wish?

    $tinyURL = substr(md5($longURL . time()),0,4);
    

    Granted, this may not provide as much pseudo-randomness as using the entire hash. But if you hash the long URL concatenated with time(), wouldn't this be sufficient? Any thoughts on using this method? Thanks!
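
    For comparison, here is a Python rendering of the PHP one-liner above, written as a sketch. The tiny_key name and the now parameter (added so the result can be reproduced) are assumptions. One thing it makes visible is that an MD5 hex digest only uses the 16 characters 0-9 and a-f, so a 4-character prefix has far fewer possible values than 4 characters drawn from A-Za-z0-9.

```python
import hashlib
import time


def tiny_key(long_url, length=4, now=None):
    """Python equivalent of substr(md5($longURL . time()), 0, 4)."""
    if now is None:
        now = int(time.time())
    digest = hashlib.md5(f"{long_url}{now}".encode("utf-8")).hexdigest()
    return digest[:length]


# Each hex character carries 4 bits, so the keyspace is 16^length.
print(16 ** 4)  # 65536 possible 4-character hex keys
print(62 ** 4)  # 14776336 for the A-Za-z0-9 alphabet, for comparison
```

    With only 65,536 possible keys, a truncated key like this would still need a uniqueness check against the table before it is stored.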
