Question
We take a (non-corrupted) .docx file from our server and post it via an HTTP request to an API. When we download it from the API, it comes out corrupted. I'm 99% sure that this is down to the code that posts the file, not the API.
It turns out the corrupted file had some extra characters in the binary - I thought it would be pretty easy to find out where they came from and remove them. Boy was I wrong.
I've since realised that every time we post the file, the binary ending is slightly different. We're using the exact same file, using the exact same code.
What could account for this difference?
Example Binary Endings
0015 e88a 5060 0700 00da 3b00 000f 0000
0000 0000 0000 0000 0000 0060 1d00 0077
6f72 642f 7374 796c 6573 2e78 6d6c 504b
0506 0000 0000 0b00 0b00
30 seconds later:
0015 e88a 5060 0700 00da 3b00 000f 0000
0000 0000 0000 0000 0000 0060 1d00 0077
6f72 642f 7374 796c 6573 2e78 6d6c 504b
0506 0000 0000 0b00 0b00 c102 00
Another 30 seconds later:
0015 e88a 5060 0700 00da 3b00 000f 0000
0000 0000 0000 0000 0000 0060 1d00 0077
6f72 642f 7374 796c 6573 2e78 6d6c 504b
0506 0000 0000 0b00 0b00 c102 0000 ed24
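One way to read these endings: a .docx is a ZIP archive, and a ZIP archive ends with an end-of-central-directory record, which is the signature 50 4b 05 06 followed by 18 more bytes (22 bytes in total, plus an optional comment). The Python sketch below is only a rough check, assuming the three dumps really show the last bytes of each downloaded file and were not cut short by the hex viewer. It counts how many bytes of that record each dump contains. The names EOCD_SIG, DUMP_TAILS and eocd_bytes_present are made up for illustration.

```python
EOCD_SIG = bytes.fromhex("504b0506")  # ZIP end-of-central-directory signature
EOCD_MIN_LEN = 22                     # fixed part of the record, no comment

# Last two lines of each dump, copied from the question (assumed to be the
# true end of each downloaded file).
DUMP_TAILS = {
    "first": "6f72 642f 7374 796c 6573 2e78 6d6c 504b 0506 0000 0000 0b00 0b00",
    "30 seconds later": "6f72 642f 7374 796c 6573 2e78 6d6c 504b 0506 0000 0000 0b00 0b00 c102 00",
    "another 30 seconds later": "6f72 642f 7374 796c 6573 2e78 6d6c 504b 0506 0000 0000 0b00 0b00 c102 0000 ed24",
}


def eocd_bytes_present(tail_hex):
    """Return how many bytes exist from the EOCD signature to the end."""
    data = bytes.fromhex(tail_hex)  # whitespace between bytes is ignored
    start = data.rfind(EOCD_SIG)
    if start < 0:
        raise ValueError("no end-of-central-directory signature found")
    return len(data) - start


for label, tail in DUMP_TAILS.items():
    present = eocd_bytes_present(tail)
    print(f"{label}: {present} of {EOCD_MIN_LEN} record bytes present")
```

If the assumption holds, the counts come out as 12, 15 and 18, all short of 22. That would suggest these particular files end at different points inside the record, rather than sharing a complete ending with something extra appended.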
Posting Code
Sub PostTheFile(CVFile, fullFilePath, PostToURL)
    strBoundary = "---------------------------9849436581144108930470211272"
    strRequestStart = "--" & strBoundary & vbCrLf & _
        "Content-Disposition: attachment; name=""file""; filename=""" & CVFile & """" & vbCrLf & vbCrLf
    strRequestEnd = vbCrLf & "--" & strBoundary & "--"

    ' Assemble the multipart body: part header, file bytes, closing boundary
    Set stream = Server.CreateObject("ADODB.Stream")
    stream.Type = adTypeBinary
    stream.Mode = adModeReadWrite
    stream.Open
    stream.Write StringToBinary(strRequestStart)
    stream.Write ReadBinaryFile(fullFilePath)
    stream.Write StringToBinary(strRequestEnd)
    stream.Position = 0
    BINARYPOST = stream.Read
    stream.Close
    Set stream = Nothing

    ' Send the body synchronously (third argument False)
    Set httpRequest = Server.CreateObject("MSXML2.ServerXMLHTTP.6.0")
    httpRequest.Open "PATCH", PostToURL, False, "username", "pw"
    httpRequest.setRequestHeader "Content-Type", "multipart/form-data; boundary=""" & strBoundary & """"
    httpRequest.Send BINARYPOST
    Response.Write "httpRequest.status: " & httpRequest.status
    Set httpRequest = Nothing
End Sub
Function StringToBinary(input)
    Dim stream
    Set stream = Server.CreateObject("ADODB.Stream")
    stream.Charset = "UTF-8"
    stream.Type = adTypeText
    stream.Mode = adModeReadWrite
    stream.Open
    stream.WriteText input
    ' Rewind, then switch the same stream to binary to read the encoded bytes
    stream.Position = 0
    stream.Type = adTypeBinary
    StringToBinary = stream.Read
    stream.Close
    Set stream = Nothing
End Function
Function ReadBinaryFile(fullFilePath)
    Dim stream
    Set stream = Server.CreateObject("ADODB.Stream")
    stream.Type = 1 ' 1 = adTypeBinary
    stream.Open
    stream.LoadFromFile fullFilePath
    ReadBinaryFile = stream.Read
    stream.Close
    Set stream = Nothing
End Function
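To have something to diff against, the same request body can be rebuilt outside ASP. This is a rough Python sketch, not the ASP code itself: build_multipart_body is a hypothetical helper that mirrors the header, boundary and closing delimiter used in PostTheFile, and the payload here is a stand-in for the real .docx bytes. The expected body length is simply header + file + closing delimiter, so any extra or missing bytes on the wire should show up as a length mismatch.

```python
BOUNDARY = "---------------------------9849436581144108930470211272"


def build_multipart_body(filename, payload, boundary=BOUNDARY):
    """Build head + payload + tail the same way PostTheFile concatenates them."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: attachment; name="file"; filename="{filename}"\r\n'
        "\r\n"
    ).encode("utf-8")  # plain UTF-8, no byte order mark
    tail = f"\r\n--{boundary}--".encode("utf-8")
    return head, payload, tail


# Stand-in bytes; in practice read the real file with open(path, "rb").
payload = b"PK\x03\x04 stand-in for docx bytes \x00\x00"
head, body_payload, tail = build_multipart_body("cv.docx", payload)
body = head + body_payload + tail

expected_length = len(head) + len(payload) + len(tail)
print("expected body length:", expected_length)
print("actual body length:  ", len(body))
```

Comparing expected_length with the Content-Length the API actually receives (or with the size of BINARYPOST on the ASP side) would show whether the extra bytes exist before the request leaves the server.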
Update
We played with a few different boundaries and Charsets.
With UTF-8 there were also extra byte order mark (BOM) bytes ending up in the output:
http://wikipedia.org/wiki/Byte_order_mark
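For reference, the UTF-8 BOM is the three bytes EF BB BF. A minimal Python sketch of spotting and removing it from a chunk of bytes follows; strip_utf8_bom is a hypothetical helper name, and the "utf-8-sig" codec is used here only to produce a sample chunk that starts with a BOM.

```python
import codecs


def strip_utf8_bom(chunk):
    """Return chunk without a leading UTF-8 BOM (EF BB BF), if it has one."""
    if chunk.startswith(codecs.BOM_UTF8):
        return chunk[len(codecs.BOM_UTF8):]
    return chunk


# Sample: a boundary line encoded with a BOM in front, and one without.
with_bom = "--boundary\r\n".encode("utf-8-sig")
without_bom = "--boundary\r\n".encode("utf-8")

print(with_bom[:3].hex())                      # first three bytes of the BOM sample
print(strip_utf8_bom(with_bom) == without_bom)
```

A BOM in front of the first boundary or in front of the closing boundary adds three bytes each time the text is converted, so it is worth checking both strRequestStart and strRequestEnd after conversion.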
Now the issue is clearly the addition of (a seemingly random amount of) NULL / zero padding.
E.g. the first time it adds 13 "00" bytes. Hit refresh and the second time it adds 8. A third time it adds 7. Each time with the exact same file and code.
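Counting the padding by eye is error-prone, because a valid ZIP already ends in a few zero bytes (the comment-length field, for one). A more reliable count uses the end-of-central-directory record: it starts with 50 4b 05 06, is 22 bytes long, and its last two bytes give the length of an optional comment, so everything after that point is extra. The Python sketch below assumes the downloaded file is otherwise intact and that the padding itself does not contain the signature; split_after_zip_end is a hypothetical helper, and the archive is generated in memory as a stand-in for the real .docx.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"  # end-of-central-directory signature
EOCD_MIN_LEN = 22         # fixed part of the record


def split_after_zip_end(data):
    """Split data into (archive, extra) at the end of the ZIP structure."""
    start = data.rfind(EOCD_SIG)
    if start < 0 or len(data) - start < EOCD_MIN_LEN:
        raise ValueError("end-of-central-directory record missing or truncated")
    # The last two bytes of the fixed record hold the comment length.
    (comment_len,) = struct.unpack_from("<H", data, start + 20)
    end = start + EOCD_MIN_LEN + comment_len
    if end > len(data):
        raise ValueError("archive comment is truncated")
    return data[:end], data[end:]


# Stand-in archive with 13 zero bytes appended, as in the first attempt above.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/styles.xml", "<styles/>")
clean = buf.getvalue()

archive, extra = split_after_zip_end(clean + b"\x00" * 13)
print("extra bytes:", len(extra), "all zero:", set(extra) <= {0})
```

Running the same function on each download would give an exact count of the added bytes and confirm whether they are all zero.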
Suggestion - How Likely is This?
The destination URL for the post is https - so a friend suggested that our server may have recognised this and added random padding as part of the encryption. This sounds kind of unlikely to me, but I don't have any better suggestions.
Answer 1:
I have found a similar question:
Error in downloaded pdf file - ASP classic
Here are some tips that come from there:
- Set the Stream's .Mode property to 3 (adModeReadWrite).
- Set Response.ContentType to "xxx/xxx" (i.e. the appropriate MIME type).
- Before you start adding response headers, call Response.Clear, just to be sure you're not sending extra markup. (This one seems very similar to your problem.)
Hope this helps :-)
Source: https://stackoverflow.com/questions/18341803/exact-same-file-and-code-so-why-does-the-binary-of-my-docx-file-always-end-diff