Shortest possible encoded string with decode possibility (shorten url) using only PHP

甜味超标 2020-12-28 19:14

I'm looking for a method that encodes a string to the shortest possible length and lets it be decoded again (pure PHP, no SQL). I have working sc

13 answers
  •  萌比男神i
    2020-12-28 19:52

    A lot has been said about how encoding doesn't help security, so I am just concentrating on the shortening and aesthetics.

    Rather than treating the input as one string, you could consider it as three individual components. If you then limit the code space for each component, you can pack them together much more tightly.

    E.g.

    • path - only the 26 letters a-z plus the characters / - . (variable length)
    • width - integer, 0 to 65535 (fixed length, 16 bits)
    • height - integer, 0 to 65535 (fixed length, 16 bits)

    I'm limiting the path alphabet to a maximum of 31 distinct characters so that, together with a null code, every character fits in a 5-bit group (2^5 = 32 codes).

    Pack your fixed length dimensions first, and append each path character as 5 bits. It might also be necessary to add a special null character to fill up the end byte. Obviously you need to use the same dictionary string for encoding and decoding.
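
    To get a feel for the savings, the token length can be estimated up front. This is only a rough sketch: estimateEncodedLength() is a hypothetical helper, not part of the code in this answer, and it assumes the layout described above (4 bytes of dimensions, 5 bits per path character rounded up to whole bytes, then Base64 with a trailing "==" collapsed into one character as the helpers in this answer do).

    ```php
    <?php
    // Hypothetical helper: predicts the token length for a path of $pathLength
    // characters, assuming 16-bit width/height and 5 bits per path character.
    function estimateEncodedLength(int $pathLength): int
    {
        // 5 bits per character, rounded up to whole bytes
        $pathBytes = intdiv(5 * $pathLength + 7, 8);
        // width and height occupy the first 4 bytes
        $totalBytes = 4 + $pathBytes;
        // length of ordinary, padded Base64
        $base64Length = 4 * intdiv($totalBytes + 2, 3);
        // "==" padding (totalBytes % 3 === 1) is collapsed into one character
        return $totalBytes % 3 === 1 ? $base64Length - 1 : $base64Length;
    }

    echo estimateEncodedLength(23), "\n"; // 27
    echo estimateEncodedLength(26), "\n"; // 28
    ```

    For comparison, "/dir/dir/hi-res-img.jpg" is 23 characters on its own, before width and height are added.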

    See the encodeImageAndDimensions() and decodeImageAndDimensions() code below.

    This shows that by limiting what you encode and how much you can encode, you can get a shorter string. You could make it even shorter by using only 12-bit dimension integers (max 4095), or even by removing parts of the path that are already known, such as a base path or file extension (see the last example).
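
    The 12-bit idea could look roughly like the following. This is a minimal sketch and not part of the main code in this answer: packDimensions12() and unpackDimensions12() are hypothetical names, and the layout (width in the high 12 bits, height in the low 12 bits of a 24-bit value) is an assumption chosen for illustration.

    ```php
    <?php
    // Hypothetical helpers: store two 12-bit integers (0 to 4095 each)
    // in 3 bytes instead of the 4 bytes that pack("nn", ...) needs.
    function packDimensions12(int $width, int $height): string
    {
        if ($width < 0 || $width > 0xFFF || $height < 0 || $height > 0xFFF) {
            throw new InvalidArgumentException("Dimensions must be between 0 and 4095");
        }
        // 24 bits in total: width in the high 12 bits, height in the low 12 bits
        $combined = ($width << 12) | $height;
        return chr(($combined >> 16) & 0xFF)
             . chr(($combined >> 8) & 0xFF)
             . chr($combined & 0xFF);
    }

    function unpackDimensions12(string $bytes): array
    {
        if (strlen($bytes) < 3) {
            throw new InvalidArgumentException("Need at least 3 bytes");
        }
        $combined = (ord($bytes[0]) << 16) | (ord($bytes[1]) << 8) | ord($bytes[2]);
        return array('width' => $combined >> 12, 'height' => $combined & 0xFFF);
    }

    $packed = packDimensions12(700, 500);
    var_dump(strlen($packed));             // int(3)
    var_dump(unpackDimensions12($packed)); // width 700, height 500
    ```

    Saving one byte before Base64 shortens the final token by one or two characters, depending on where the byte count falls relative to a multiple of three.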

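    <?php

    function encodeImageAndDimensions($path, $width, $height) {
        //Must be the same dictionary that decodeImageAndDimensions() uses
        $dictionary = str_split("abcdefghijklmnopqrstuvwxyz/-.");

        if ($width >= pow(2,16)) {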
            throw new Exception("Width value is too high to encode with 16 bits");
        }
        if ($height >= pow(2,16)) {
            throw new Exception("Height value is too high to encode with 16 bits");
        }
    
        //Pack width, then height first
        $packed = pack("nn", $width, $height);
    
        $path_bits = "";
        foreach (str_split($path) as $ch) {
            $index = array_search($ch, $dictionary, true);
            if ($index === false) {
                throw new Exception("Cannot encode character outside of the allowed dictionary");
            }
    
            $index++; //Add 1 due to index 0 meaning NULL rather than a.
    
            //Work with a bit string here rather than using complicated binary bit shift operators.
            $path_bits .=  str_pad(base_convert($index, 10, 2), 5, "0", STR_PAD_LEFT);
        }
    
        //Remaining space left?
        $modulo = (8 - (strlen($path_bits) % 8)) %8;
    
        if ($modulo >=5) {
            //There is space for a null character to fill up to the next byte
            $path_bits .= "00000";
            $modulo -= 5;
        }
    
        //Pad with zeros
        $path_bits .= str_repeat("0", $modulo);
    
        //Split into nibbles (4 bits each), convert each nibble to a hex digit and pack the hex string into bytes
        $path_bits = str_split($path_bits, 4);
        $hex_string = implode("", array_map(function($bit_string) {
            return base_convert($bit_string, 2, 16);
        }, $path_bits));
        $packed .= pack('H*', $hex_string);
    
        return base64_url_encode($packed);
    }
    
    function decodeImageAndDimensions($str) {
        $dictionary = str_split("abcdefghijklmnopqrstuvwxyz/-.");
    
        $data = base64_url_decode($str);
    
        $decoded = unpack("nwidth/nheight/H*path", $data);
    
        $path_bit_stream = implode("", array_map(function($nibble) {
            return str_pad(base_convert($nibble, 16, 2), 4, "0", STR_PAD_LEFT);
        }, str_split($decoded['path'])));
    
        $five_pieces = str_split($path_bit_stream, 5);
    
        $real_path_indexes = array_map(function($code) {
            return base_convert($code, 2, 10) - 1;
        }, $five_pieces);
    
        $real_path = "";
        foreach ($real_path_indexes as $index) {
            if ($index == -1) {
                break;
            }
            $real_path .= $dictionary[$index];
        }
    
        $decoded['path'] = $real_path;
    
        return $decoded;
    }
    
    //These replace the trailing "==" padding with a single character (which can save one byte) and remap a few characters to obfuscate the Base64 a bit.
    function base64_url_encode($input) {
        $trans = array('+' => '-', '/' => ':', '*' => '$', '=' => 'B', 'B' => '!');
        return strtr(str_replace('==', '*', base64_encode($input)), $trans);
    }
    function base64_url_decode($input) {
        $trans = array('-' => '+', ':' => '/', '$' => '*', 'B' => '=', '!' => 'B');
        return base64_decode(str_replace('*', '==',strtr($input,$trans)));
    }
    
    //Example usage
    
    $encoded = encodeImageAndDimensions("/dir/dir/hi-res-img.jpg", 700, 500);
    var_dump($encoded); // string(27) "Arw!9NkTLZEy2hPJFnxLT9VA4A$"
    $decoded = decodeImageAndDimensions($encoded);
    var_dump($decoded); // array(3) { ["width"]=> int(700) ["height"]=> int(500) ["path"]=> string(23) "/dir/dir/hi-res-img.jpg" } 
    
    $encoded = encodeImageAndDimensions("/another/example/image.png", 4500, 2500);
    var_dump($encoded); // string(28) "EZQJxNhc-iCy2XAWwYXaWhOXsHHA"
    $decoded = decodeImageAndDimensions($encoded);
    var_dump($decoded); // array(3) { ["width"]=> int(4500) ["height"]=> int(2500) ["path"]=> string(26) "/another/example/image.png" }
    
    $encoded = encodeImageAndDimensions("/short/eg.png", 300, 200);
    var_dump($encoded); // string(19) "ASwAyNzQ-VNlP2DjgA$"
    $decoded = decodeImageAndDimensions($encoded);
    var_dump($decoded); // array(3) { ["width"]=> int(300) ["height"]=> int(200) ["path"]=> string(13) "/short/eg.png" }
    
    $encoded = encodeImageAndDimensions("/very/very/very/very/very-hyper/long/example.png", 300, 200);
    var_dump($encoded); // string(47) "ASwAyN2LLO7FlndiyzuxZZ3Yss8Rm!ZbY9x9lwFsGF7!xw$"
    $decoded = decodeImageAndDimensions($encoded);
    var_dump($decoded); // array(3) { ["width"]=> int(300) ["height"]=> int(200) ["path"]=> string(48) "/very/very/very/very/very-hyper/long/example.png" } 
    
    $encoded = encodeImageAndDimensions("only-file-name", 300, 200);
    var_dump($encoded); // string(19) "ASwAyHuZnhksLxwWlA$"
    $decoded = decodeImageAndDimensions($encoded);
    var_dump($decoded); // array(3) { ["width"]=> int(300) ["height"]=> int(200) ["path"]=> string(14) "only-file-name" }
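
    The base64_url_encode() helper above uses its own character mapping. If a more conventional alphabet is preferred, one alternative is the URL-safe Base64 alphabet from RFC 4648 ("-" and "_" instead of "+" and "/", with the padding dropped). The sketch below is a swapped-in variant, not the mapping used in this answer, and the function names are hypothetical.

    ```php
    <?php
    // Hypothetical alternative helpers using the RFC 4648 "base64url" alphabet.
    function base64url_encode_rfc4648(string $input): string
    {
        // Swap the two URL-unsafe characters and drop the "=" padding
        return rtrim(strtr(base64_encode($input), '+/', '-_'), '=');
    }

    function base64url_decode_rfc4648(string $input): string
    {
        // Restore the padding that was dropped on encode
        $padded = $input . str_repeat('=', (4 - strlen($input) % 4) % 4);
        $decoded = base64_decode(strtr($padded, '-_', '+/'), true);
        if ($decoded === false) {
            throw new InvalidArgumentException("Input is not valid URL-safe Base64");
        }
        return $decoded;
    }

    $packed = pack("nn", 700, 500);             // width and height only
    $token  = base64url_encode_rfc4648($packed);
    var_dump($token);                           // string(6) "ArwB9A"
    var_dump(base64url_decode_rfc4648($token) === $packed); // bool(true)
    ```

    Because the padding is dropped entirely, this variant is one character shorter than the custom mapping whenever padding would be present, at the cost of losing the light obfuscation.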
    
