How to compress URL parameters

后端未结

关注

 12  1489

Say I have a single-page application that uses a third party API for content. The app’s logic is in-browser only, and there is no backend I can write to.

To allow de

Example

The following example is built using require.js. Give it a try on this jsfiddle.

require.config({
    paths : {
        'Math/Long'  : '//rawgithub.com/dcodeIO/Long.js/master/Long.min',
        'ByteBuffer' : '//rawgithub.com/dcodeIO/ByteBuffer.js/master/ByteBuffer.min',
        'ProtoBuf'   : '//rawgithub.com/dcodeIO/ProtoBuf.js/master/ProtoBuf.min'
    }
})

require(['message'], function(message) {
    var data = {
        repos : ['a', 'b', 'c'],
        labels: ['d', 'e', 'f'],
        milestones : ['g', 'h', 'i'],
        username : 'jgb'
    }

    var request = new message.arguments(data);

    // Convert request data to base64
    var base64String = request.toBase64();
    console.log(base64String);

    // Convert base64 back
    var decodedRequest = message.arguments.decode64(base64String);
    console.log(decodedRequest);
});

// Protobuf message definition
// Message definition could also be stored in a .proto definition file
// See: https://github.com/dcodeIO/ProtoBuf.js/wiki
define('message', ['ProtoBuf'], function(ProtoBuf) {
    var proto = {
        package : 'message',
        messages : [
            {
                name : 'arguments',
                fields : [
                    {
                        rule : 'repeated',
                        type : 'string',
                        name : 'repos',
                        id : 1
                    },
                    {
                        rule : 'repeated',
                        type : 'string',
                        name : 'labels',
                        id : 2
                    },
                    {
                        rule : 'repeated',
                        type : 'string',
                        name : 'milestones',
                        id : 3
                    },
                    {
                        rule : 'required',
                        type : 'string',
                        name : 'username',
                        id : 4
                    },
                    {
                        rule : 'optional',
                        type : 'bool',
                        name : 'with_comments',
                        id : 5
                    },
                    {
                        rule : 'optional',
                        type : 'bool',
                        name : 'without_comments',
                        id : 6
                    }
                ],
            }
        ]
    };

    return ProtoBuf.loadJson(proto).build('message')
});

0 讨论(0)

谎友^

2020-12-04 11:35

Short

Use a URL packing scheme such as my own, starting only from the params section of your URL.

Longer

As other's here have pointed out, typical compression systems don't work for short strings. But, it's important to recognise that URLs and Params are a serialization format of a data model: a text human-readable format with specific sections - we know that the scheme is first, the host is found directly after, the port is implied but can be overridden, etc...

With the underlying conceptual data model, one can serialize with a more bit-efficient serialization scheme. In fact, I have created such a serialization myself which archives around 50% compression: see http://blog.alivate.com.au/packed-url/

Conceptually, my scheme was written with the conceptual data model in mind, it doesn't deserialize the URL into that conceptual model as a distinct step. However, that's possible, and that formal approach might yield greater efficiencies, where the bits don't need to be in the same order as what a string URL might be.

0 讨论(0)
发布评论:

提交评论
- 加载中...
名媛妹妹

2020-12-04 11:40

Perhaps you can find a url shortener with a jsonp API, that way you could make all the URLs really short automatically.

http://yourls.org/ even has jsonp support.

0 讨论(0)
发布评论:

提交评论
- 加载中...
太阳男子

2020-12-04 11:45
Update: I released an NPM package with some more optimizations, see https://www.npmjs.com/package/@yaska-eu/jsurl2

Some more tips:
- Base64 encodes with a..zA..Z0..9+/=, and un-encoded URI characters are a..zA..Z0..9-_.~. So Base64 results only need to swap +/= for -_. and it won't expand URIs.
- You could keep an array of key names, so that objects could be represented with the first character being the offset in the array, e.g. {foo:3,bar:{g:'hi'}} becomes a3,b{c'hi'} given key array ['foo','bar','g']
Interesting libraries:
- JSUrl specifically encodes JSON so it can be put in a URL without changes, even though it uses more characters than specified in the RFC. {"name":"John Doe","age":42,"children":["Mary","Bill"]} becomes ~(name~'John*20Doe~age~42~children~(~'Mary~'Bill)) and with a key dictionary ['name','age','children'] that could be ~(0~'John*20Doe~1~42~2~(~'Mary~'Bill)), thus going from 101 bytes URI encoded to 38.
  - Small footprint, fast, reasonable compression.
- lz-string uses an LZW-based algorithm to compress strings to UTF16 for storing in localStorage. It also has a compressToEncodedURIComponent() function to produce URI-safe output.
  - Still only a few KB of code, pretty fast, good/great compression.
So basically I'd recommend picking one of these two libraries and consider the problem solved.
0 讨论(0)
发布评论:

提交评论
- 加载中...
难免孤独

2020-12-04 11:47

It looks like the Github APIs have numeric IDs for many things (looks like repos and users have them, but labels don't) under the covers. It might be possible to use those numbers instead of names wherever advantageous. You then have to figure out how to best encode those in something that'll survive in a query string, e.g. something like base64(url).

For example, your hoodie.js repository has ID 4780572.

Packing that into a big-endian unsigned int (as many bytes as we need) gets us \x00H\xf2\x1c.

We'll just toss the leading zero, we can always restore that later, now we have H\xf2\x1c.

Encode as URL-safe base64, and you have SPIc (toss any padding you might get).

Going from hoodiehq/hoodie.js to SPIc seems like a good-sized win!

More generally, if you're willing to invest the time, you can try to exploit a bunch of redudancies in your query strings. Other ideas are along the lines of packing the two boolean params into a single character, possibly along with other state (like what fields are included). If you use base64-encoding (which seems the best option here due to the URL-safe version -- I looked at base85, but it has a bunch of characters that won't survive in a URL), that gets you 6 bits of entropy per character... there's a lot you can do with that.

To add to Thomas Fuchs' note, yes, if there's some kind of inherent, immutable ordering in some of things you're encoding, than that would obviously also help. However, that seems hard for both the labels and the milestones.

0 讨论(0)
发布评论:

提交评论
- 加载中...
终归单人心

2020-12-04 11:49

Maybe any simple JS minifier will help you. You'll need only to integrate it on serialization and deserialization points only. I think it'd be the easiest solution.

0 讨论(0)
发布评论:

提交评论
- 加载中...

1 2 下一页