Classic ASP's Request.Form is dropping an 8-bit character — is there a simple way to prevent this?

Question

A client of mine is using a Classic ASP script to process a form from a third-party payment processor (this is the last step in a credit-card-transaction sequence that starts at the client's website, goes to the third-party site, and then returns to the client's site).

The client is in Austria and when one of the fields includes an 8-bit character (e.g., when the field value is Österreich), the Ö is simply dropped when I retrieve the value of the field in the standard way; e.g.:

fieldval = Request.Form("country")
If fieldval = "sterreich" Then
    ' Code here will execute
End If

The literal value that the third-party page is POSTing is %D6sterreich, which I think suggests that the POST is being encoded in UTF-8.
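For comparison, here is a small Python sketch (not part of the original ASP code; the variable names are illustrative) of how the same string is percent-encoded under UTF-8 and under a single-byte Western European charset. It suggests that %D6 is what a Windows-1252 or ISO-8859-1 sender would produce, while a UTF-8 sender would emit two bytes for the Ö.

```python
from urllib.parse import quote

# "Österreich", written with an escape so the source file stays ASCII-only.
country = "\u00d6sterreich"

# Percent-encode the same text under different character encodings.
as_utf8 = quote(country, encoding="utf-8")
as_cp1252 = quote(country, encoding="cp1252")
as_latin1 = quote(country, encoding="iso-8859-1")

print(as_utf8)    # two escaped bytes for the first letter
print(as_cp1252)  # one escaped byte for the first letter
print(as_latin1)  # same single byte as Windows-1252 for this character
```

Only the single-byte encodings reproduce the literal value seen in the POST body.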

The POST request has the following possibly-relevant headers:

Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Content-Type: application/x-www-form-urlencoded

I'm by no means a character-encoding expert and this is the first time I've really done anything with Classic ASP, so I'm kind of flummoxed.

From some Googling and searching SO, I've added the following to the page that processes the POST:

<%@ Codepage=65001 %>
<%
Response.CharSet = "UTF-8"
Response.Codepage = 65001
%>

But it doesn't make any difference -- I still lose that initial 8-bit character. Is there something really simple that I'm just not aware of?

Answer 1:

Try adding the following to the top of the page:

<%
Response.CharSet = "utf-8"
Session.CodePage = 65001
%>

Answer 2:

Turns out I was going the wrong direction with this. The ASP file in question was itself encoded in UTF-8, which was implicitly setting Response.CodePage to 65001 -- in other words, explicitly adding a CODEPAGE directive made no difference -- and in fact the UTF-8 encoding was the source of the problem.

When I re-encoded the file to Windows-1252, the problem disappeared. I'm pretty ignorant of character encodings in general, but I think in retrospect the %D6 in the POST should have been my clue -- if I'm starting to understand things rightly, the single byte 0xD6 is not a valid UTF-8 character. Maybe someone more familiar with these things could confirm or deny this.
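The suspicion about 0xD6 can be checked outside ASP. The following is a minimal Python sketch (the helper name is_valid_utf8 is made up for illustration). It percent-decodes the posted value to raw bytes and tries both interpretations. In UTF-8, 0xD6 is the lead byte of a two-byte sequence and must be followed by a continuation byte, which the letter "s" (0x73) is not, so the bytes are not valid UTF-8. Read as Windows-1252, the same byte is Ö.

```python
from urllib.parse import unquote_to_bytes

raw = "%D6sterreich"  # literal value from the POST body in the question

# Percent-decoding by itself yields bytes, with no charset assumed yet.
data = unquote_to_bytes(raw)


def is_valid_utf8(b: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        b.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


utf8_ok = is_valid_utf8(data)
as_cp1252 = data.decode("cp1252")

print(data)                             # raw bytes, starting with 0xD6
print(utf8_ok)                          # False: 0xD6 lacks a continuation byte
print(as_cp1252 == "\u00d6sterreich")   # True: Windows-1252 reading is intact
```

This is consistent with the behaviour described above: a decoder that expects UTF-8 has no valid character to produce for the first byte, so it may drop or replace it.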

Answer 3:

What about the ASCII character 0 (NUL) in the query string, encoded as %00? Can I retrieve the whole value without it being cut off at the first NUL?

http://localhost/Test_Authentication.asp?token=%13%23%02%00%01%01%00%01%01%05%02%02%03%00%02%02%0A%0A%0A%0A%0A%0A048


Response.CharSet = "utf-8";
Session.CodePage=65001;

var strToken = (Request.QueryString("token").Count > 0)?Request.QueryString("token")(1):"";

Answer 4:

@Ben Dunlap: Try this at the top of the page --

<%@LANGUAGE="VBSCRIPT" CODEPAGE="65001"%>

Update
If you do a Response.Write Request.Form("country"), what does it display?

Answer 5:

The two simple steps I used were:

1. Add at the top of EVERY ASP file:

Response.CharSet = "utf-8"
Response.CodePage = 65001

2. Save every ASP text file in "ANSI" encoding (NOT UTF-8!). This option is usually found in the "Save" window of advanced text editors.

If you save in UTF-8 encoding, or if you don't add the two lines specified at the top of your code, this will never work as you intended.

Source: https://stackoverflow.com/questions/5747478/classic-asps-request-form-is-dropping-an-8-bit-character-is-there-a-simple-w

Tags: http, utf-8, asp-classic