How to compress URL parameters

孤城傲影 2020-12-04 11:16

Say I have a single-page application that uses a third party API for content. The app’s logic is in-browser only, and there is no backend I can write to.

To allow de

12 answers
  • 2020-12-04 11:34

    Why not use Protocol Buffers?

    Protocol buffers are a flexible, efficient, automated mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages. You can even update your data structure without breaking deployed programs that are compiled against the "old" format.

    ProtoBuf.js converts objects to protocol buffer messages and vice versa.

    The following object converts to: CgFhCgFiCgFjEgFkEgFlEgFmGgFnGgFoGgFpIgNqZ2I=

    {
        repos : ['a', 'b', 'c'],
        labels: ['d', 'e', 'f'],
        milestones : ['g', 'h', 'i'],
        username : 'jgb'
    }
    

    Example

    The following example is built using require.js. Give it a try on this jsfiddle.

    require.config({
        paths : {
            'Math/Long'  : '//rawgithub.com/dcodeIO/Long.js/master/Long.min',
            'ByteBuffer' : '//rawgithub.com/dcodeIO/ByteBuffer.js/master/ByteBuffer.min',
            'ProtoBuf'   : '//rawgithub.com/dcodeIO/ProtoBuf.js/master/ProtoBuf.min'
        }
    })
    
    require(['message'], function(message) {
        var data = {
            repos : ['a', 'b', 'c'],
            labels: ['d', 'e', 'f'],
            milestones : ['g', 'h', 'i'],
            username : 'jgb'
        }
    
        var request = new message.arguments(data);
    
        // Convert request data to base64
        var base64String = request.toBase64();
        console.log(base64String);
    
        // Convert base64 back
        var decodedRequest = message.arguments.decode64(base64String);
        console.log(decodedRequest);
    });
    
    // Protobuf message definition
    // Message definition could also be stored in a .proto definition file
    // See: https://github.com/dcodeIO/ProtoBuf.js/wiki
    define('message', ['ProtoBuf'], function(ProtoBuf) {
        var proto = {
            package : 'message',
            messages : [
                {
                    name : 'arguments',
                    fields : [
                        {
                            rule : 'repeated',
                            type : 'string',
                            name : 'repos',
                            id : 1
                        },
                        {
                            rule : 'repeated',
                            type : 'string',
                            name : 'labels',
                            id : 2
                        },
                        {
                            rule : 'repeated',
                            type : 'string',
                            name : 'milestones',
                            id : 3
                        },
                        {
                            rule : 'required',
                            type : 'string',
                            name : 'username',
                            id : 4
                        },
                        {
                            rule : 'optional',
                            type : 'bool',
                            name : 'with_comments',
                            id : 5
                        },
                        {
                            rule : 'optional',
                            type : 'bool',
                            name : 'without_comments',
                            id : 6
                        }
                    ],
                }
            ]
        };
    
        return ProtoBuf.loadJson(proto).build('message')
    });
    
  • 2020-12-04 11:35

    Short

    Use a URL packing scheme such as my own, starting only from the params section of your URL.

    Longer

    As others here have pointed out, typical compression systems don't work well for short strings. But it's important to recognise that URLs and params are a serialization format of a data model: a human-readable text format with specific sections. We know that the scheme comes first, the host follows directly after, the port is implied but can be overridden, and so on.

    With the underlying conceptual data model in hand, one can serialize it with a more bit-efficient scheme. In fact, I have created such a serialization myself, which achieves around 50% compression: see http://blog.alivate.com.au/packed-url/

    My scheme was written with the conceptual data model in mind, but it doesn't deserialize the URL into that model as a distinct step. That is possible, however, and the more formal approach might yield greater efficiencies, since the bits would no longer need to be in the same order as the parts of the string URL.
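To make the "URL as a serialized data model" idea concrete, here is a minimal sketch that parses a URL string into an explicit model using the standard `URL` class. The function name `toModel` and its field names are made up for illustration, and this is not the packed-URL scheme from the linked post; it only shows the distinct deserialization step that a bit-efficient encoder could start from.

```javascript
// Hypothetical helper: break a URL string into its conceptual parts.
function toModel(urlString) {
  const url = new URL(urlString);
  return {
    // "https:" -> "https"
    scheme: url.protocol.replace(/:$/, ''),
    host: url.hostname,
    // An empty port string means the scheme's default port is implied.
    port: url.port === '' ? null : Number(url.port),
    path: url.pathname,
    // Keep params as ordered pairs, since keys may repeat.
    params: Array.from(url.searchParams.entries()),
  };
}

const model = toModel('https://example.com/search?q=a&q=b');
console.log(model.port);   // null (implied by the scheme)
console.log(model.params); // [ [ 'q', 'a' ], [ 'q', 'b' ] ]
```

Once the parts are separated like this, each one can be encoded with whatever representation suits it best, for instance a few bits for the scheme and nothing at all for an implied port.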

  • 2020-12-04 11:40

    Perhaps you can find a URL shortener with a JSONP API; that way you could make all the URLs really short automatically.

    http://yourls.org/ even has JSONP support.

  • 2020-12-04 11:45

    Update: I released an NPM package with some more optimizations, see https://www.npmjs.com/package/@yaska-eu/jsurl2

    Some more tips:

    • Base64 encodes with a..zA..Z0..9+/=, and un-encoded URI characters are a..zA..Z0..9-_.~. So Base64 results only need to swap +/= for -_. and it won't expand URIs.
    • You could keep an array of key names, so that objects could be represented with the first character being the offset in the array, e.g. {foo:3,bar:{g:'hi'}} becomes a3,b{c'hi'} given key array ['foo','bar','g']
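Both tips can be sketched in a few lines. This is an illustration under assumptions, not what any of the libraries below actually do: the helper names are invented, the key dictionary is limited to 26 entries (one lowercase letter each), and the key-shortening step returns a plain object with renamed keys rather than the compact `a3,b{c'hi'}` text form shown above.

```javascript
// Tip 1: swap the three Base64 characters that would be percent-encoded.
function toUrlSafe(base64) {
  return base64.replace(/\+/g, '-').replace(/\//g, '_').replace(/=/g, '.');
}

function fromUrlSafe(safe) {
  return safe.replace(/-/g, '+').replace(/_/g, '/').replace(/\./g, '=');
}

// Tip 2: replace known key names by a letter giving their offset in `keys`.
function shortenKeys(value, keys) {
  if (Array.isArray(value)) return value.map((v) => shortenKeys(v, keys));
  if (value === null || typeof value !== 'object') return value;
  const out = {};
  for (const [k, v] of Object.entries(value)) {
    const i = keys.indexOf(k);
    // Unknown keys are rejected so that decoding stays unambiguous.
    if (i === -1 || i >= 26) throw new Error('key not in dictionary: ' + k);
    out[String.fromCharCode(97 + i)] = shortenKeys(v, keys);
  }
  return out;
}

// Inverse of shortenKeys: map each letter back to its key name.
function restoreKeys(value, keys) {
  if (Array.isArray(value)) return value.map((v) => restoreKeys(v, keys));
  if (value === null || typeof value !== 'object') return value;
  const out = {};
  for (const [k, v] of Object.entries(value)) {
    const i = k.charCodeAt(0) - 97;
    if (k.length !== 1 || i < 0 || i >= keys.length) {
      throw new Error('unknown short key: ' + k);
    }
    out[keys[i]] = restoreKeys(v, keys);
  }
  return out;
}

const keys = ['foo', 'bar', 'g'];
const short = shortenKeys({ foo: 3, bar: { g: 'hi' } }, keys);
console.log(JSON.stringify(short)); // {"a":3,"b":{"c":"hi"}}
console.log(toUrlSafe('a+b/c=='));  // a-b_c..
```

The key array has to be shipped with the application and kept stable, since old URLs can only be decoded against the dictionary they were encoded with.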

    Interesting libraries:

    • JSUrl specifically encodes JSON so it can be put in a URL without changes, even though it uses more characters than specified in the RFC. {"name":"John Doe","age":42,"children":["Mary","Bill"]} becomes ~(name~'John*20Doe~age~42~children~(~'Mary~'Bill)) and with a key dictionary ['name','age','children'] that could be ~(0~'John*20Doe~1~42~2~(~'Mary~'Bill)), thus going from 101 bytes URI encoded to 38.
      • Small footprint, fast, reasonable compression.
    • lz-string uses an LZW-based algorithm to compress strings to UTF-16 for storing in localStorage. It also has a compressToEncodedURIComponent() function to produce URI-safe output.
      • Still only a few KB of code, pretty fast, good/great compression.

    So basically I'd recommend picking one of these two libraries and considering the problem solved.

  • 2020-12-04 11:47

    It looks like the GitHub APIs have numeric IDs for many things under the covers (repos and users seem to have them, but labels don't). It might be possible to use those numbers instead of names wherever advantageous. You then have to figure out how best to encode those in something that'll survive in a query string, e.g. something like base64(url).

    For example, your hoodie.js repository has ID 4780572.

    Packing that into a big-endian unsigned int (as many bytes as we need) gets us \x00H\xf2\x1c.

    We'll just toss the leading zero, since we can always restore it later. Now we have H\xf2\x1c.

    Encode as URL-safe base64, and you have SPIc (toss any padding you might get).

    Going from hoodiehq/hoodie.js to SPIc seems like a good-sized win!
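The steps above can be sketched as follows. The function names `packId` and `unpackId` are invented for this example, and the code assumes IDs are non-negative integers within JavaScript's safe integer range and that `btoa`/`atob` are available (as they are in browsers).

```javascript
// Pack a numeric ID into the shortest big-endian byte string,
// then encode it as unpadded URL-safe base64.
function packId(id) {
  const bytes = [];
  // Collect bytes from least to most significant; stopping at zero
  // means no leading zero bytes are ever emitted.
  for (let n = id; n > 0; n = Math.floor(n / 256)) {
    bytes.unshift(n % 256);
  }
  if (bytes.length === 0) bytes.push(0); // represent the ID 0 as one byte
  const base64 = btoa(String.fromCharCode(...bytes));
  return base64.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// Reverse the steps: restore the alphabet and padding, then rebuild the number.
function unpackId(packed) {
  const base64 = packed.replace(/-/g, '+').replace(/_/g, '/');
  const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4);
  const binary = atob(padded);
  let id = 0;
  for (let i = 0; i < binary.length; i++) {
    id = id * 256 + binary.charCodeAt(i);
  }
  return id;
}

console.log(packId(4780572)); // SPIc
console.log(unpackId('SPIc')); // 4780572
```

Because the decoder rebuilds the number byte by byte, dropped leading zeros need no special handling on the way back.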

    More generally, if you're willing to invest the time, you can try to exploit a bunch of redundancies in your query strings. Other ideas are along the lines of packing the two boolean params into a single character, possibly along with other state (like which fields are included). If you use base64 encoding (which seems the best option here due to the URL-safe version; I looked at base85, but it has a bunch of characters that won't survive in a URL), that gets you 6 bits of entropy per character... there's a lot you can do with that.
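As a rough sketch of the boolean-packing idea, up to six flags fit into a single character of the URL-safe base64 alphabet. The names `packFlags` and `unpackFlags` are hypothetical, and the bit order (first flag in the lowest bit) is an arbitrary choice for this example.

```javascript
// URL-safe base64 alphabet: 64 symbols, so each character carries 6 bits.
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';

// Pack up to six booleans into one character.
function packFlags(flags) {
  if (flags.length > 6) {
    throw new RangeError('at most 6 flags fit in one character');
  }
  let bits = 0;
  flags.forEach((flag, i) => {
    if (flag) bits |= 1 << i; // flag i goes into bit i
  });
  return ALPHABET[bits];
}

// Recover `count` booleans from a packed character.
function unpackFlags(ch, count) {
  const bits = ALPHABET.indexOf(ch);
  if (bits === -1) throw new Error('not a valid flag character: ' + ch);
  return Array.from({ length: count }, (_, i) => (bits & (1 << i)) !== 0);
}

// e.g. with_comments = true, without_comments = false
console.log(packFlags([true, false])); // B
console.log(unpackFlags('B', 2));      // [ true, false ]
```

With only two flags in use, the remaining four bits of the character are free for other state, such as which fields are included.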

    To add to Thomas Fuchs' note: yes, if there's some kind of inherent, immutable ordering in some of the things you're encoding, then that would obviously also help. However, that seems hard for both the labels and the milestones.

  • 2020-12-04 11:49

    Maybe a simple JS minifier will help you. You'd only need to integrate it at the serialization and deserialization points. I think it'd be the easiest solution.
