Can I export excel data with UTF-8 without BOM?

慢半拍i 2020-12-01 04:00

I export Microsoft Excel data with an Excel macro (VBScript). Because the output file is a Lua script, I export it as UTF-8. The only way I can produce UTF-8 in Excel is by using adodb.stream like t

6 answers
  •  囚心锁ツ
    2020-12-01 04:28

    Edit

    A comment from rellampec alerted me to a better way of dropping the LF I had discovered was added to the end of the file by user272735's method. I have added a new version of my routine at the end.

    Original post

    I had been using user272735's method successfully for a year when I discovered it added a LF at the end of the file. I did not notice this extra LF until I did some very detailed testing, so it is not an important error. However, my latest version discards that LF in case it ever becomes important.

    Public Sub PutTextFileUtf8(ByVal PathFileName As String, ByVal FileBody As String)
    
      ' Outputs FileBody as a text file (UTF-8 encoding without leading BOM)
      ' named PathFileName
    
      ' Needs reference to "Microsoft ActiveX Data Objects n.n Library"
      ' Addition to original code says version 2.5. Tested with version 6.1.
    
      '  1Nov16  Copied from http://stackoverflow.com/a/4461250/973283
      '          but replaced literals with parameters.
      ' 15Aug17  Discovered routine was adding an LF to the end of the file.
      '          Added code to discard that LF.
    
      ' References: http://stackoverflow.com/a/4461250/973283
      '             https://www.w3schools.com/asp/ado_ref_stream.asp
    
      Dim BinaryStream As Object
      Dim UTFStream As Object
    
      Set UTFStream = CreateObject("adodb.stream")
    
      UTFStream.Type = adTypeText
      UTFStream.Mode = adModeReadWrite
      UTFStream.Charset = "UTF-8"
      ' The LineSeparator will be added to the end of FileBody. It is possible
      ' to select a different value for LineSeparator but I can find nothing to
      ' suggest it is possible to not add anything to the end of FileBody
      UTFStream.LineSeparator = adLF
      UTFStream.Open
      UTFStream.WriteText FileBody, adWriteLine
    
      UTFStream.Position = 3 'skip BOM
    
      Set BinaryStream = CreateObject("adodb.stream")
      BinaryStream.Type = adTypeBinary
      BinaryStream.Mode = adModeReadWrite
      BinaryStream.Open
    
      UTFStream.CopyTo BinaryStream
    
      ' Originally I planned to use "CopyTo Dest, NumChars" to not copy the last
      ' byte.  However, NumChars is described as an integer whereas Position is
      ' described as Long. I was concerned that by "integer" they mean 16 bits.
      'Debug.Print BinaryStream.Position
      BinaryStream.Position = BinaryStream.Position - 1
      BinaryStream.SetEOS
      'Debug.Print BinaryStream.Position
    
      UTFStream.Flush
      UTFStream.Close
      Set UTFStream = Nothing
    
      BinaryStream.SaveToFile PathFileName, adSaveCreateOverWrite
      BinaryStream.Flush
      BinaryStream.Close
      Set BinaryStream = Nothing
    
    End Sub
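
    The routine above creates its streams with CreateObject, yet it still depends on the ad... constants that come from the ADO library reference. Without that reference, and without Option Explicit, VBA treats those names as empty variables and the stream settings fail. As a sketch for the case where the reference cannot be added, the constants can be declared by hand. The values below are the ones I understand the ADO enumerations to use; they are not part of the original answer, so treat them as assumptions to check against the ADO documentation.

    ```vba
    ' Assumed stand-ins for the ADO library constants, for use only when the
    ' "Microsoft ActiveX Data Objects n.n Library" reference is NOT set.
    ' Place these at the top of the module, above any procedures.
    Private Const adTypeBinary As Long = 1           ' StreamTypeEnum
    Private Const adTypeText As Long = 2             ' StreamTypeEnum
    Private Const adModeReadWrite As Long = 3        ' ConnectModeEnum
    Private Const adSaveCreateOverWrite As Long = 2  ' SaveOptionsEnum
    Private Const adWriteLine As Long = 1            ' StreamWriteEnum
    Private Const adLF As Long = 10                  ' LineSeparatorEnum
    ```

    With these in place the routine body needs no changes, since it already declares both streams As Object.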
    

    New version of routine

    This version omits the code to discard the unwanted LF added at the end because it avoids adding the LF in the first place. I have retained the original version in case anyone is interested in the technique for deleting trailing characters.

    Public Sub PutTextFileUtf8NoBOM(ByVal PathFileName As String, ByVal FileBody As String)
    
      ' Outputs FileBody as a text file named PathFileName using
      ' UTF-8 encoding without leading BOM
    
      ' Needs reference to "Microsoft ActiveX Data Objects n.n Library"
      ' Addition to original code says version 2.5. Tested with version 6.1.
    
      '  1Nov16  Copied from http://stackoverflow.com/a/4461250/973283
      '          but replaced literals with parameters.
      ' 15Aug17  Discovered routine was adding an LF to the end of the file.
      '          Added code to discard that LF.
      ' 11Oct17  Posted to StackOverflow
      '  9Aug18  Comment from rellampec suggested removal of adWriteLine from
      '          WriteText statement would avoid adding LF.
      ' 30Sep18  Amended routine to remove adWriteLine from WriteText statement
      '          and code to remove LF from file. Successfully tested new version.
    
      ' References: http://stackoverflow.com/a/4461250/973283
      '             https://www.w3schools.com/asp/ado_ref_stream.asp
    
      Dim BinaryStream As Object
      Dim UTFStream As Object
    
      Set UTFStream = CreateObject("adodb.stream")
    
      UTFStream.Type = adTypeText
      UTFStream.Mode = adModeReadWrite
      UTFStream.Charset = "UTF-8"
      UTFStream.Open
      UTFStream.WriteText FileBody
    
      UTFStream.Position = 3 'skip BOM
    
      Set BinaryStream = CreateObject("adodb.stream")
      BinaryStream.Type = adTypeBinary
      BinaryStream.Mode = adModeReadWrite
      BinaryStream.Open
    
      UTFStream.CopyTo BinaryStream
    
      UTFStream.Flush
      UTFStream.Close
      Set UTFStream = Nothing
    
      BinaryStream.SaveToFile PathFileName, adSaveCreateOverWrite
      BinaryStream.Flush
      BinaryStream.Close
      Set BinaryStream = Nothing
    
    End Sub
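
    To show how the routine might be called for the Lua export in the question, here is a minimal sketch. ExportColumnAAsLua, FileStartsWithUtf8Bom, the data.lua file name and the single-column layout are all hypothetical and not from the original post. The helper function reads the first three bytes of the written file so you can check for yourself that no BOM (EF BB BF) is present.

    ```vba
    ' Hypothetical caller: writes column A of the active sheet as a Lua table.
    Public Sub ExportColumnAAsLua()

      Dim Body As String
      Dim LastRow As Long
      Dim Row As Long
      Dim OutPath As String

      ' Assumed output location; ThisWorkbook.Path is empty for an unsaved workbook
      OutPath = ThisWorkbook.Path & "\data.lua"
      LastRow = ActiveSheet.Cells(ActiveSheet.Rows.Count, 1).End(xlUp).Row

      Body = "return {" & vbLf
      For Row = 1 To LastRow
        ' Quotes and backslashes in cell values are not escaped in this sketch
        Body = Body & "  """ & CStr(ActiveSheet.Cells(Row, 1).Value) & """," & vbLf
      Next Row
      Body = Body & "}" & vbLf

      PutTextFileUtf8NoBOM OutPath, Body
      Debug.Print "Starts with BOM: "; FileStartsWithUtf8Bom(OutPath)

    End Sub

    ' Returns True if the first three bytes of the file are EF BB BF.
    Public Function FileStartsWithUtf8Bom(ByVal PathFileName As String) As Boolean

      Dim FileNum As Integer
      Dim Header(0 To 2) As Byte

      FileNum = FreeFile
      Open PathFileName For Binary Access Read As #FileNum
      If LOF(FileNum) >= 3 Then
        Get #FileNum, 1, Header
        FileStartsWithUtf8Bom = (Header(0) = &HEF And Header(1) = &HBB And Header(2) = &HBF)
      End If
      Close #FileNum

    End Function
    ```

    Building the whole body in one string keeps the call simple, which suits small sheets; for large exports it may be worth collecting the lines in an array and using Join instead.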
    
