Why do URI-encoded ('#') anchors cause a 404, and how do you deal with it in JS?

梦毁少年i 2021-01-12 05:15

prettyPhoto relies on URL hash fragments ("#..."), but if the "#" gets encoded (to %23), most browsers will end up showing a 404 error. This has been discussed before:

You get a 40

2 answers
  •  予麋鹿 (OP)
    2021-01-12 05:58

    To answer #1)

    It would become part of the URL itself, because an escaped "#" is no longer a delimiter that the browser, server, etc. know how to parse out.

    What I mean is that "?" plays a significant role in URLs -- the server knows to separate what's before it from what's after it. The browser doesn't need to care about what is or isn't dynamic in the URI - it's all significant (though JavaScript exposes the pieces separately on the location object, as pathname, search, and hash).

    The browser won't send "#......" to the server, because the hash has a special meaning for the browser: everything after it is a fragment that stays on the client.

    However, if you escape that hash in JavaScript, the browser won't hesitate to send that escaped string to the server as a literal value.

    Why wouldn't it? If the browser refused, and your data legitimately required a hash character (say you make a POST request to a Facebook wall and you're submitting a phone number), you'd be screwed. The same goes for a GET-based search for some number on 411.com or whatever, where they haven't really thought their application through.

    The problem is that the server isn't going to understand that the escaped value is meant to be held separately from the URL if it occurs in the actual path.

    It has to accept escaped characters; otherwise spaces (%20) and other everyday characters that are valid in filenames, paths, queries, and values would pose problems.

    So if you're looking for:

    //mysite.gov.on.ca/path/to/file.extension%23action%3Dfullscreen
    

    verily, you shall surely 404.
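
    To make the difference concrete, here is a minimal sketch using the standard URL constructor (available in browsers and Node). It only illustrates the parsing; the "http:" scheme is an assumption added so the example URL is absolute.

```javascript
// Assumption: an "http:" scheme is prepended to the example URL, because the
// URL constructor needs an absolute URL when no base is supplied.
const exampleUrl = "http://mysite.gov.on.ca/path/to/file.extension";

// A literal "#" is a delimiter: the fragment is split off and stays client-side.
const plain = new URL(exampleUrl + "#action=fullscreen");
console.log(plain.pathname); // "/path/to/file.extension"
console.log(plain.hash);     // "#action=fullscreen"

// encodeURIComponent escapes both "#" and "=".
const escapedFragment = encodeURIComponent("#action=fullscreen");
console.log(escapedFragment); // "%23action%3Dfullscreen"

// Once escaped, the same text is ordinary path data that the server must resolve.
const escaped = new URL(exampleUrl + escapedFragment);
console.log(escaped.pathname); // "/path/to/file.extension%23action%3Dfullscreen"
console.log(escaped.hash);     // ""
```

    In the second case the server is asked for a file literally named "file.extension%23action%3Dfullscreen", which is why it answers with a 404.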

    There are a few things that you could do, I'm certain. The first would be in Apache, or whatever you're serving from: you could write a RegEx which matches any URL up to the first "%23", assuming that there is no "?" beforehand.
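
    The pattern itself might look something like the sketch below. It is written in JavaScript rather than as an Apache rule, and restoreEscapedHash is a hypothetical helper name, not part of prettyPhoto or any server.

```javascript
// Hypothetical helper: turn the first "%23" back into a real "#",
// but only when no "?" (or literal "#") appears before it.
function restoreEscapedHash(url) {
  // [^?#]*? lazily consumes characters that are neither "?" nor "#",
  // so the match stops at the first "%23" and fails if a "?" comes first.
  const match = /^([^?#]*?)%23(.*)$/.exec(url);
  if (!match) {
    return url; // nothing to repair
  }
  let fragment;
  try {
    fragment = decodeURIComponent(match[2]); // e.g. "%3D" back to "="
  } catch {
    fragment = match[2]; // malformed escape sequence: keep the text as-is
  }
  return match[1] + "#" + fragment;
}

console.log(restoreEscapedHash("//mysite.gov.on.ca/path/to/file.extension%23action%3Dfullscreen"));
// "//mysite.gov.on.ca/path/to/file.extension#action=fullscreen"

console.log(restoreEscapedHash("/search?q=%23123"));
// "/search?q=%23123" (left alone, because the "%23" sits after a "?")
```

    A server-side rewrite would use the same idea, though the exact directive depends on the server and is not shown here.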

    Less soul-rending implementations might involve figuring out whether there's a way to escape the "#" that is plug-in friendly.

    Google, for instance, uses a "hash-bang" strategy ("#!"), where it asks that URLs be submitted that way so that it knows whether or not to encode.

    Another option might be to check for a "#" character using url.indexOf("#"), split the URL at the hash, and submit only the valid portion.
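
    A rough sketch of that split, assuming the URL is available as a plain string before anything encodes it (splitAtHash is a hypothetical name):

```javascript
// Hypothetical helper: separate a URL string into the part that is safe to
// send or encode and the fragment that should stay on the client.
function splitAtHash(url) {
  const hashIndex = url.indexOf("#");
  if (hashIndex === -1) {
    return { base: url, fragment: "" }; // no hash present
  }
  return {
    base: url.slice(0, hashIndex),      // everything before the first "#"
    fragment: url.slice(hashIndex + 1), // everything after it
  };
}

const parts = splitAtHash("//mysite.gov.on.ca/path/to/file.extension#action=fullscreen");
console.log(parts.base);     // "//mysite.gov.on.ca/path/to/file.extension"
console.log(parts.fragment); // "action=fullscreen"
```

    Only parts.base would then be handed to whatever does the encoding or the request, and the fragment could be re-attached with a literal "#" afterwards.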

    It really all comes down to what you're trying to accomplish -- I can point at why it's an issue, but how best to make it a non-issue depends on what you're trying to do, how you're trying to do it, and what's allowed in the context you're working in.
