Binary Data in JSON String. Something better than Base64

后端 未结 15 1675
一向
一向 2020-11-21 23:03

The JSON format natively doesn\'t support binary data. The binary data has to be escaped so that it can be placed into a string element (i.e. zero or more Unicode chars in d

15条回答
  •  清歌不尽
    2020-11-21 23:44

    I dig a little bit more (during implementation of base128), and expose that when we send characters which ascii codes are bigger than 128 then browser (chrome) in fact send TWO characters (bytes) instead one :(. The reason is that JSON by defaul use utf8 characters for which characters with ascii codes above 127 are coded by two bytes what was mention by chmike answer. I made test in this way: type in chrome url bar chrome://net-export/ , select "Include raw bytes", start capturing, send POST requests (using snippet at the bottom), stop capturing and save json file with raw requests data. Then we look inside that json file:

    • We can find our base64 request by finding string 4142434445464748494a4b4c4d4e this is hex coding of ABCDEFGHIJKLMN and we will see that "byte_count": 639 for it.
    • We can find our above127 request by finding string C2BCC2BDC380C381C382C383C384C385C386C387C388C389C38AC38B this are request-hex utf8 codes of characters ¼½ÀÁÂÃÄÅÆÇÈÉÊË (however the ascii hex codes of this characters are c1c2c3c4c5c6c7c8c9cacbcccdce). The "byte_count": 703 so it is 64bytes longer than base64 request because characters with ascii codes above 127 are code by 2 bytes in request :(

    So in fact we don't have profit with sending characters with codes >127 :( . For base64 strings we not observe such negative behaviour (probably for base85 too - I don check it) - however may be some solution for this problem will be sending data in binary part of POST multipart/form-data described in Ælex answer (however usually in this case we don't need to use any base coding at all...).

    The alternative approach may rely on mapping two bytes data portion into one valid utf8 character by code it using something like base65280 / base65k but probably it would be less effective than base64 due to utf8 specification ...

    function postBase64() {
      let formData = new FormData();
      let req = new XMLHttpRequest();
    
      formData.append("base64ch", "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/");
      req.open("POST", '/testBase64ch');
      req.send(formData);
    }
    
    
    function postAbove127() {
      let formData = new FormData();
      let req = new XMLHttpRequest();
    
      formData.append("above127", "¼½ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖרÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüý");
      req.open("POST", '/testAbove127');
      req.send(formData);
    }
    
    

提交回复
热议问题