How to shorten a URL with a mathematical approach

Submitted by 夙愿已清 on 2019-12-01 21:09:55

The answer, as always, is "it depends". There's a mathematical theory that talks about the "information content" of a bunch of data. If your data is originally strings like this:

lleAgByD2rREjzqj85g68207NsjspdINfPRNvU9udgWw7y4qXh0EQLSy0yEi2

then the information content is much greater than if your strings look like this:

one zero one one zero one zero zero one zero one one zero one

even though the strings are actually the same length. Using compression, you can reduce the number of bits necessary to express the same meaning, but only down to a point. That point depends on the information content of the original message.
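As a rough illustration of that difference, the sketch below measures both example strings in two ways: the empirical Shannon entropy of their character distribution, and the size after zlib compression. The function name entropy_bits_per_char is made up for this example, and zlib is only one convenient compressor, not a precise measure of information content.

```python
import math
import zlib
from collections import Counter


def entropy_bits_per_char(text: str) -> float:
    """Empirical Shannon entropy of the character distribution, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# The two example strings from the text above; both are 61 characters long.
random_looking = "lleAgByD2rREjzqj85g68207NsjspdINfPRNvU9udgWw7y4qXh0EQLSy0yEi2"
repetitive = "one zero one one zero one zero zero one zero one one zero one"

for label, text in (("random-looking", random_looking), ("repetitive", repetitive)):
    packed = zlib.compress(text.encode("ascii"), 9)
    print(
        f"{label}: {len(text)} chars, "
        f"{entropy_bits_per_char(text):.2f} bits/char, "
        f"{len(packed)} bytes after zlib"
    )
```

The repetitive string should come out with far fewer bits per character and a smaller compressed size, while the random-looking one barely shrinks (and may even grow slightly because of zlib's header overhead).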

It seems unlikely to me that your string of 150 to 250 characters has so little information content that it could be effectively compressed down to 12 characters. You may have to store the longer data in the database and assign a shorter "key" to each data item.
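To put a number on it: 12 characters drawn from a 62-symbol alphabet (digits plus upper- and lower-case letters) can carry at most 12 * log2(62), or about 71 bits, which is far less than an arbitrary 150-character string can hold. The "store it and hand out a key" alternative could look roughly like the sketch below. The KeyStore class, its in-memory dict, and the base-62 alphabet are all assumptions made for illustration; a real system would use an actual database table with an auto-incrementing id.

```python
import string

# 62 symbols: 0-9, a-z, A-Z (an assumed alphabet; any URL-safe set would do).
ALPHABET = string.digits + string.ascii_letters


def encode_base62(number: int) -> str:
    """Write a non-negative integer in base 62 using ALPHABET."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


class KeyStore:
    """Hypothetical stand-in for a database table mapping short keys to long data."""

    def __init__(self) -> None:
        self._by_key: dict[str, str] = {}
        self._next_id = 0

    def put(self, long_text: str) -> str:
        """Store the long text and return a newly assigned short key."""
        key = encode_base62(self._next_id)
        self._by_key[key] = long_text
        self._next_id += 1
        return key

    def get(self, key: str) -> str:
        """Look up the original text; nothing is 'decompressed' from the key."""
        return self._by_key[key]


store = KeyStore()
key = store.put("https://example.com/" + "a-long-path-segment/" * 10)
print(key, "->", len(store.get(key)), "characters")
```

With sequential ids, a 12-character key covers 62**12 (roughly 3.2 * 10**21) entries, so the key length is limited by how many items you store, not by how long each item is.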

For further reading, one place to start is the Wikipedia article on Information theory.

Is your goal primarily to shorten or to encrypt? You could probably specialize a compression algorithm enough to shrink a URL with a known character set dramatically, but that would not be effective for encryption purposes. I strongly doubt you could get a cryptographically sound encryption algorithm to achieve the specified level of compression, and you have not said what key lengths are allowable, which might be tied into your scheme.

You did not mention whether you only want to reduce the URL length or want to use the shortened URL for something (as tinyurl does). Is that your intent? If so, you can compute a hash of the URL and use that hash internally to map back to the actual URL. The length of the short URL then depends on the hashing algorithm you pick. Based on your intent, you can choose one of the options suggested in the replies.
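A minimal sketch of the hash-and-map idea, assuming SHA-256 truncated to 8 URL-safe characters and a plain dict in place of real storage. The names short_code, shorten, expand and url_by_code are invented for this example. Because the hash is truncated, two different URLs can collide, so the sketch checks for that instead of silently overwriting.

```python
import base64
import hashlib

# Hypothetical in-memory mapping; a real service would persist this.
url_by_code: dict[str, str] = {}


def short_code(url: str, length: int = 8) -> str:
    """Derive a short, URL-safe code from the SHA-256 digest of the URL."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]


def shorten(url: str) -> str:
    """Register the URL under its short code and return the code."""
    code = short_code(url)
    existing = url_by_code.setdefault(code, url)
    if existing != url:
        # A truncated hash is not unique; a real system needs a fallback here.
        raise ValueError(f"hash collision on code {code!r}")
    return code


def expand(code: str) -> str:
    """Map a short code back to the original URL via the stored table."""
    return url_by_code[code]


long_url = "https://example.com/some/very/long/path?with=many&query=parameters"
code = shorten(long_url)
print(code, "->", expand(code))
```

Note that the hash cannot be reversed: the original URL comes back only because the mapping was stored.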

Sorry, but there is no way to do that. Here, shortening means losing the information that makes each string unique. You could generate a unique key (hash) for every string, but a key alone will not let you recover the original data unless a dictionary (static, shared information) is also available.

Look at how ZIP or RAR works, for example.
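The role of a shared dictionary can be seen with zlib, which implements DEFLATE, the same compression method ZIP commonly uses. The sketch below assumes both sides agree in advance on SHARED_DICTIONARY, a made-up common URL prefix. The helper names are also invented for this example, while zlib.compressobj and zlib.decompressobj with the zdict argument are standard-library calls.

```python
import zlib

# Assumed static information known to both compressor and decompressor.
SHARED_DICTIONARY = b"https://www.example.com/products/category/item?id="


def compress_with_dictionary(data: bytes) -> bytes:
    """Compress data, letting zlib reference substrings of the shared dictionary."""
    compressor = zlib.compressobj(level=9, zdict=SHARED_DICTIONARY)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dictionary(blob: bytes) -> bytes:
    """Decompress data; this fails without the same dictionary."""
    decompressor = zlib.decompressobj(zdict=SHARED_DICTIONARY)
    return decompressor.decompress(blob) + decompressor.flush()


url = b"https://www.example.com/products/category/item?id=12345"
plain = zlib.compress(url, 9)
with_dict = compress_with_dictionary(url)

print(len(url), "bytes original")
print(len(plain), "bytes with plain zlib")
print(len(with_dict), "bytes with the shared dictionary")
assert decompress_with_dictionary(with_dict) == url
```

The saving comes entirely from information that was moved into the dictionary. The parts of the URL that the dictionary does not predict still have to be transmitted.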
