OK to skip slash before query string?

清酒与你 2020-12-02 11:20

Is it safe to always skip the trailing slash when appending a query string?

That is, can I use

http://example.com?querystring

inste

4 answers
  •  谎友^ (original poster)
    2020-12-02 11:48

    As a matter of modern spec, yes, it is permissible to skip the slash, contrary to what the accepted answer here claims.

    Although the accepted answer correctly quotes RFC 1738 (released over 20 years ago!), it wrongly claims that RFC 2396 (released in 1998) requires the slash. It also neglects that both of those specs have in turn been obsoleted by RFC 3986, released in 2005 (still several years before the accepted answer was written), and more recently by the WHATWG URL Standard. Both of these newer specs allow the slash to be omitted.

    Let's consider each of these specs in turn, from earliest to latest:


    RFC 1738: Uniform Resource Locators (URL) (released in 1994)

    Implicitly requires the slash whenever a query string is present: it says the slash may be omitted only if the URL contains neither a path nor a query string (called a searchpart here). The relevant sentence is the last one in the quotation below:

    An HTTP URL takes the form:

    http://<host>:<port>/<path>?<searchpart>

    where <host> and <port> are as described in Section 3.1. If :<port> is omitted, the port defaults to 80. No user name or password is allowed. <path> is an HTTP selector, and <searchpart> is a query string. The <path> is optional, as is the <searchpart> and its preceding "?". If neither <path> nor <searchpart> is present, the "/" may also be omitted.


    RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax (released in 1998; "updates" RFC 1738)

    Here it is acceptable to omit the slash. This RFC legalises some weird URL syntaxes that don't have a double-slash after the scheme, but if we ignore those (they're the ones with an opaque_part in the spec's BNF) and stick to URLs that contain a host, then we find that an absoluteURI is defined like this...

    absoluteURI   = scheme ":" ( hier_part | opaque_part )
    

    and that a hier_part looks like this:

    hier_part     = ( net_path | abs_path ) [ "?" query ]
    

    and that a net_path looks like this:

    net_path      = "//" authority [ abs_path ]
    

    where an abs_path is in turn defined to start with a slash. Note that the abs_path is optional in the grammar above - that means that a URL of the form scheme://authority?query is completely legal.
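
    To make the optional abs_path concrete, here is a rough Python sketch that mirrors the three productions quoted above as a regular expression. It is not a validator for RFC 2396: the character classes for authority, abs_path and query are loose approximations, and the names NET_PATH_URI and match_net_path_uri are invented for this illustration.

```python
import re

# Loose stand-ins for the RFC 2396 productions (assumptions, not the
# spec's real character sets):
AUTHORITY = r"(?P<authority>[^/?#]*)"   # authority: anything up to "/", "?" or "#"
ABS_PATH = r"(?P<abs_path>/[^?#]*)"     # abs_path: must start with a slash
QUERY = r"(?P<query>[^#]*)"             # query: anything up to "#"

# absoluteURI = scheme ":" hier_part
# hier_part   = net_path [ "?" query ]
# net_path    = "//" authority [ abs_path ]
NET_PATH_URI = re.compile(
    r"(?P<scheme>[A-Za-z][A-Za-z0-9+.-]*):"
    + r"//" + AUTHORITY
    + r"(?:" + ABS_PATH + r")?"      # the optional [ abs_path ]
    + r"(?:\?" + QUERY + r")?"       # the optional [ "?" query ]
)


def match_net_path_uri(uri):
    """Return the matched components as a dict, or None if there is no match."""
    m = NET_PATH_URI.fullmatch(uri)
    return m.groupdict() if m else None


no_slash = match_net_path_uri("http://example.com?querystring")
with_slash = match_net_path_uri("http://example.com/?querystring")
```

    Both URLs match. In no_slash the abs_path group is None, which corresponds to skipping the optional [ abs_path ] in net_path; in with_slash it is "/".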

    The motivation for this change is hinted at in appendix G.2. Modifications from both RFC 1738 and RFC 1808:

    The question-mark "?" character was removed from the set of allowed characters for the userinfo in the authority component, since testing showed that many applications treat it as reserved for separating the query component from the rest of the URI.

    In other words - code in the real world was assuming that the first question mark in a URL, anywhere, marked the beginning of a query string, and so the spec was pragmatically updated to align with reality.


    RFC 3986: Uniform Resource Identifier (URI): Generic Syntax (released in 2005; "obsoletes" RFC 2396)

    Again, it is permissible to omit the slash. The spec expresses this by saying that a "path" is required in every URI that contains an authority (host), and that path must either begin with a slash or consist of no characters:

    3. Syntax Components

    The generic URI syntax consists of a hierarchical sequence of components referred to as the scheme, authority, path, query, and fragment.

    URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
    
    hier-part   = "//" authority path-abempty
                / path-absolute
                / path-rootless
                / path-empty
    

    The scheme and path components are required, though the path may be empty (no characters). When authority is present, the path must either be empty or begin with a slash ("/") character.

    For completeness, note that path-abempty is later defined thus:

    path-abempty  = *( "/" segment )
    

    This does indeed permit it to contain no characters.
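
    RFC 3986 also gives, in its Appendix B, a regular expression for breaking a URI reference into components. As a quick check, here is a small Python sketch that applies that expression to the URL from the question; the function name split_uri is my own, and the regex only splits, it does not validate.

```python
import re

# The regular expression from RFC 3986, Appendix B.
# Groups: 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri):
    """Split a URI reference into its five generic components."""
    # Every part of the pattern is optional, so it matches any string.
    m = URI_REFERENCE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


parts = split_uri("http://example.com?querystring")
```

    Here parts["path"] is the empty string and parts["query"] is "querystring": the path component is present but empty, exactly as path-abempty allows.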


    URL Standard by WHATWG (a living standard under active maintenance, first created in 2012, with the goal of obsoleting RFC 3986)

    Again, omitting the slash is acceptable, although this time we have no BNF to look at but instead need to read lots of prose.

    Section 4.3 tells us:

    An absolute-URL string must be one of the following

    • a URL-scheme string that is an ASCII case-insensitive match for a special scheme and not an ASCII case-insensitive match for "file", followed by ":" and a scheme-relative-special-URL string
    • a URL-scheme string that is not an ASCII case-insensitive match for a special scheme, followed by ":" and a relative-URL string
    • a URL-scheme string that is an ASCII case-insensitive match for "file", followed by ":" and a scheme-relative-file-URL string

    any optionally followed by "?" and a URL-query string.

    Since HTTP and HTTPS are special schemes, any HTTP or HTTPS URL must satisfy the first of those three options - that is, http: or https: followed by a scheme-relative-special-URL string, which:

    must be "//", followed by a valid host string, optionally followed by ":" and a URL-port string, optionally followed by a path-absolute-URL string.

    A path-absolute-URL string is defined to start with a slash, but it is explicitly optional in the definition of a scheme-relative-special-URL string above; thus, it is permissible to go straight from the host to the "?" and query string, and so URLs like http://example.com?query are legal.


    Of course, none of this provides a cast-iron guarantee that every web server or HTTP library will accept such URLs, nor that they will treat them as semantically equivalent to a URL that contains the slash. But as far as spec goes, skipping the slash is completely legal.
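
    If you would rather not depend on how a particular server or library handles the slashless form, one defensive option is to insert the slash yourself before sending the URL anywhere. RFC 3986 (section 6.2.3) says an empty path should be normalized to "/" for schemes like http. Below is a minimal Python sketch using the standard urllib.parse module; ensure_root_path is a hypothetical helper name, not something from any of the specs.

```python
from urllib.parse import urlsplit, urlunsplit


def ensure_root_path(url):
    """Return url with an empty path replaced by "/" when a host is present."""
    parts = urlsplit(url)
    # Only touch URLs that have an authority (host) and no path at all.
    if parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)


normalized = ensure_root_path("http://example.com?querystring")
unchanged = ensure_root_path("http://example.com/a?b")
```

    With this, "http://example.com?querystring" becomes "http://example.com/?querystring", while URLs that already have a path are returned as they were. Note that urlsplit itself accepts the slashless form and reports an empty path for it.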
