Problems reading doubles from CSV - VBA

Submitted by 无人久伴 on 2019-12-08 12:52:51

Question


I want to read a CSV file from Excel VBA, but I have a problem with double values. For example, the value 125.5 in the CSV is read without the dot, so I get 1255. My code:

Dim rs As New ADODB.Recordset
strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & myDir & ";" & "Extended Properties=""text;HDR=Yes;FMT=Delimited()"";"
strSQL = "SELECT * FROM " & myFileName
rs.Open strSQL, strCon, 3, 3
IBH = rs("IBH")

How can I solve this?

Update: I tried @Siddharth Rout's solution, but I still have the same problem. My code now:

Dim conn As New ADODB.Connection
Dim rs As New ADODB.Recordset
Dim myDate, myTime, IBH, IBL
Dim myDir As String, myFileName As String
Dim strSQL As String

myDir = Trim(shParams.Range("fp_path"))
myFileName = Trim(shParams.Range("fp_filename"))

With conn
 .Provider = "Microsoft.ACE.OLEDB.12.0"
 .ConnectionString = "Data Source=" & myDir & ";Extended Properties='text'"
 .Open
End With

strSQL = "SELECT * FROM " & myFileName
rs.Open strSQL, conn, 3, 3
rs.MoveLast

myDate = rs("Date")
myTime = rs("Time")
IBH = rs("IBH")
IBL = rs("IBL")

Debug.Print myDate, myTime, IBH, IBL

rs.Close
Set rs = Nothing

This is the result: (image not preserved)

This is my CSV: (image not preserved)


Answer 1:


This issue stems from how the ACE engine determines the type of an ADODB field. The driver scans a set number of rows to decide what the type should be for the entire column.
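To see which type the driver actually inferred for each column, one option is to print the ADO type code of every field after opening the recordset. This is a minimal sketch, assuming a reference to the Microsoft ActiveX Data Objects library is set; the procedure name and its parameters are hypothetical and not part of the original question.

```vba
Option Explicit

' Hypothetical helper: prints the name and ADO type code of every field
' that the text driver inferred for one CSV file.
Public Sub PrintInferredTypes(ByVal folderPath As String, ByVal csvName As String)
    Dim conn As New ADODB.Connection
    Dim rs As New ADODB.Recordset
    Dim fld As ADODB.Field

    With conn
        .Provider = "Microsoft.ACE.OLEDB.12.0"
        .ConnectionString = "Data Source=" & folderPath & ";Extended Properties='text'"
        .Open
    End With

    rs.Open "SELECT * FROM [" & csvName & "]", conn, adOpenStatic, adLockReadOnly

    For Each fld In rs.Fields
        ' In the ADO DataTypeEnum, adInteger is 3 and adDouble is 5
        Debug.Print fld.Name, fld.Type
    Next fld

    rs.Close
    conn.Close
End Sub
```

If a column you expect to be a Double shows up with the code for adInteger, the type inference described above is the likely culprit.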

Changing the Connection String


One quick thing you can try is setting MaxScanRows to 0 in the connection string. A value of 0 makes the driver scan all rows to determine the type. Keep in mind that this may have a performance impact, depending on how large your data set is.

";Extended Properties='text;MaxScanRows=0;IMEX=0'"
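For context, the fragment above could be placed into a full connection string roughly like this. It is a sketch only: the function name is hypothetical, the provider is the one used in the question's updated code, and whether the driver honors these properties on a given machine is worth verifying.

```vba
Option Explicit

' Hypothetical helper: builds a text-driver connection string for the folder
' that contains the CSV file, with the extended properties from the answer.
Public Function BuildTextConnString(ByVal folderPath As String) As String
    BuildTextConnString = "Provider=Microsoft.ACE.OLEDB.12.0;" & _
                          "Data Source=" & folderPath & ";" & _
                          "Extended Properties='text;MaxScanRows=0;IMEX=0'"
End Function
```

It could then be used as conn.Open BuildTextConnString(myDir) in place of the With conn block from the question.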

This won't always give you the desired result. Say we have a data set like this:

+--------------------------+
|       DoubleField        |
+--------------------------+
| 1                        |
| 2                        |
| 3                        |
| ...(996 more records...) |
| 1000.01                  |
+--------------------------+

The driver will see 999 records that look like an Integer and 1 record that looks like a Double. Based on the majority type, it will decide that this field is an Integer, not a Double. To be honest, I'm not entirely sure exactly how this type determination is done, but it is something along these lines. I've also seen instances where simply changing the top record to the type you want will work. E.g.

+--------------------------+
|       DoubleField        |
+--------------------------+
| 1.00                     |
| 2                        |
| 3                        |
| ...(996 more records...) |
| 1000.01                  |
+--------------------------+

So another approach could be to format the source file to include decimal places upfront. This should be easy enough to do if you control the source file, but this isn’t always the case.

Use a Schema INI File


If changing MaxScanRows doesn't work, a surefire way to get the type you expect for each column is to use a Schema.ini file, as Comintern pointed out.

Here is a link that goes over this.

The gist: make a file that explicitly defines the type of each column. For our contrived table above, this becomes:

[MyFileNameGoesHere.csv]
ColNameHeader = True
Format = CSVDelimited
Col1=DoubleField Double

You would then save this file as Schema.ini and place it in the same directory as the file you want to import. The nice thing about this approach is that it only requires creating a text file, which you could even do in VBA without too much trouble. A downside is that if you have lots of files to import, it can be hard to manage all the Schema.ini files.
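Generating the file from VBA could look something like the sketch below. The helper names, parameters, and the choice to overwrite any existing Schema.ini are assumptions made for illustration; the section layout follows the example above.

```vba
Option Explicit

' Hypothetical helper: builds the Schema.ini section for a single CSV file.
' columnDefs holds entries such as "DoubleField Double", in column order.
Public Function BuildSchemaIni(ByVal csvName As String, ByVal columnDefs As Variant) As String
    Dim lines As String
    Dim i As Long

    lines = "[" & csvName & "]" & vbCrLf & _
            "ColNameHeader=True" & vbCrLf & _
            "Format=CSVDelimited" & vbCrLf

    For i = LBound(columnDefs) To UBound(columnDefs)
        ' Schema.ini column numbering starts at 1
        lines = lines & "Col" & (i - LBound(columnDefs) + 1) & "=" & columnDefs(i) & vbCrLf
    Next i

    BuildSchemaIni = lines
End Function

' Hypothetical helper: writes Schema.ini into the folder that holds the CSV.
' Note: this replaces any existing Schema.ini in that folder.
Public Sub WriteSchemaIni(ByVal folderPath As String, ByVal csvName As String, ByVal columnDefs As Variant)
    Dim fileNumber As Integer

    fileNumber = FreeFile()
    Open folderPath & "\Schema.ini" For Output As #fileNumber
    Print #fileNumber, BuildSchemaIni(csvName, columnDefs);
    Close #fileNumber
End Sub
```

For the contrived table, a call such as WriteSchemaIni "C:\Data", "MyFileNameGoesHere.csv", Array("DoubleField Double") would produce the section shown earlier (the folder path here is made up). If several CSV files share a folder, their sections would need to be combined into one file rather than overwritten.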

A Purely VBA approach


You can create an in-memory recordset in ADODB and populate it with the data from the CSV file. Here is a little code sample to get you started.

Option Explicit

Private Function getTypedRS() As ADODB.Recordset
    Set getTypedRS = New ADODB.Recordset

    With getTypedRS
        'Add your other fields here
        .Fields.Append "DoubleField", adDouble
    End With
End Function

Public Sub CSVToADODB()
    Dim myTimer         As Double
    Dim FileNumber      As Long
    Dim FilePath        As String
    Dim FileData        As String
    Dim CSVArray        As Variant
    Dim i               As Long
    Dim rs              As ADODB.Recordset

    myTimer = Timer
    Set rs = getTypedRS()
    FilePath = "C:\Users\Ryan\Desktop\Example.csv"

    'Get the CSV
    FileNumber = FreeFile()
    Open FilePath For Binary Access Read As FileNumber
    FileData = Space$(LOF(FileNumber)) 'Create a buffer first, then assign
    Get FileNumber, , FileData
    Close FileNumber

    'My CSV is just a list of Doubles, should be relatively easy to swap out to process with ','
    CSVArray = Split(FileData, vbCrLf)

    'Add data
    rs.Open
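    For i = LBound(CSVArray) + 1 To UBound(CSVArray) '+1 to skip header
        If Len(Trim$(CSVArray(i))) > 0 Then 'Skip blank lines, e.g. after a trailing line break
            rs.AddNew
            'Val always treats "." as the decimal separator, regardless of regional settings
            rs.Fields("DoubleField").Value = Val(CSVArray(i))
        End If
    Next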
    rs.UpdateBatch
    rs.MoveLast

    Debug.Print rs.Fields("DoubleField").Value, "Processed 1000 records in: " & Timer - myTimer
End Sub

The good part of this approach is that it is still quite fast. I was able to load 1000 doubles in ~0.03 seconds, since most of the work is done in memory. It also avoids the need for a Schema.ini file. However, there is more code to maintain, so it is a trade-off.
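For a CSV with several numeric columns, each line could be split on "," before the values are assigned to the recordset. The following is a sketch under stated assumptions: the function name is hypothetical, every field is a plain number, and there is no quoting or embedded comma. It uses Val, which always treats "." as the decimal separator.

```vba
Option Explicit

' Hypothetical helper: splits one CSV line on "," and converts each piece
' to a Double. Assumes a non-empty line of plain numeric fields; callers
' should skip blank lines before calling this.
Public Function ParseDoubles(ByVal csvLine As String) As Double()
    Dim parts() As String
    Dim result() As Double
    Dim i As Long

    parts = Split(csvLine, ",")
    ReDim result(LBound(parts) To UBound(parts))

    For i = LBound(parts) To UBound(parts)
        ' Val reads "125.5" as 125.5 whatever the regional settings are
        result(i) = Val(parts(i))
    Next i

    ParseDoubles = result
End Function
```

Inside the loop above, the returned array could then be assigned field by field, with one .Fields.Append call per column in getTypedRS.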

Recommendation


I would try changing MaxScanRows first; if that doesn't work, create a Schema.ini file.




Answer 2:


Try this

Sub Sample()
    Dim conn As New ADODB.Connection
    Dim RS As New ADODB.Recordset

    Dim FilePath As String, Filename As String, strSQL As String

    '~~> Replace this with relevant values
    FilePath = "C:\Users\routs\Desktop"
    Filename = "Sample.Csv"

    With conn
       .Provider = "Microsoft.ACE.OLEDB.12.0"
       .ConnectionString = "Data Source=" & FilePath & ";Extended Properties='text'"
       .Open
    End With

    strSQL = "select * from " & Filename

    RS.Open strSQL, conn

    '~~> Replace this with relevant field
    Debug.Print RS("Sale")
End Sub



Source: https://stackoverflow.com/questions/54173345/problems-reading-doubles-from-csv-vba
