Passing a .NET Datatable to MATLAB

Submitted by 耗尽温柔 on 2020-01-23 17:30:10

Question


I'm building an interface layer for a Matlab component which is used to analyse data maintained by a separate .NET application which I am also building. I'm trying to serialise a .NET datatable as a numeric array to be passed to the MATLAB component (as part of a more generalised serialisation routine).

So far, I've been reasonably successful with passing tables of numeric data but I've hit a snag when trying to add a column of datatype DateTime. What I've been doing up to now is stuffing the values from the DataTable into a double array, because MATLAB only really cares about doubles, and then doing a straight cast to a MWNumericArray, which is essentially a matrix.

Here's the current code:

else if (sourceType == typeof(DataTable))
{
    DataTable dtSource = source as DataTable;
    var rowIdentifiers = new string[dtSource.Rows.Count];               
    // I know this looks silly but we need the index of each item
    // in the string array as the actual value in the array as well
    for (int i = 0; i < dtSource.Rows.Count; i++)
    {
        rowIdentifiers[i] = i.ToString();
    }
    // convenience vars
    int rowCount = dtSource.Rows.Count;
    int colCount = dtSource.Columns.Count;
    double[,] values = new double[rowCount, colCount];

    // For each row 
    for (int rownum = 0; rownum < rowCount; rownum++)
    {
        // for each column
        for (int colnum = 0; colnum < colCount; colnum++)
        {
            // ASSUMPTION. value is a double
            values[rownum, colnum] = Conversion.ConvertToDouble(dtSource.Rows[rownum][colnum]);
        }
    }
    return (MWNumericArray)values;
}

Conversion.ConvertToDouble is my own routine, which handles null and DBNull values by returning double.NaN, again because MATLAB treats all nulls as NaNs.
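The routine itself is not shown in the question. For readers following along, a helper of this kind might look roughly like the sketch below. This is an assumption about its shape, not the asker's actual code: the class and method names are taken from the call above, and the body simply maps null and DBNull to double.NaN and defers to Convert.ToDouble for everything else.

```csharp
using System;
using System.Globalization;

// Hypothetical reconstruction of the helper named in the question.
public static class Conversion
{
    public static double ConvertToDouble(object value)
    {
        // Missing values become NaN, which MATLAB treats as "no data".
        if (value == null || value == DBNull.Value)
        {
            return double.NaN;
        }

        // Numeric types and numeric strings convert here. A DateTime would
        // throw InvalidCastException, which is the snag described above.
        return Convert.ToDouble(value, CultureInfo.InvariantCulture);
    }
}
```

With a helper like this, a DateTime column has to be converted to a number before it reaches the double array, which is what the answers below deal with.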

So here's the thing: does anyone know of a MATLAB datatype that would allow me to pass in a contiguous array with multiple datatypes? The only workaround I can conceive of involves using a MWStructArray of MWStructArrays, but that seems hacky and I'm not sure how well it would work in the MATLAB code, so I'd like to find a more elegant solution if I can. I've had a look at using an MWCellArray, but it gives me a compile error when I try to instantiate it.

I'd like to be able to do something like:

object[,] values = new object[rowCount, colCount];
// fill loosely-typed object array
return (MWCellArray)values;

But as I said, this gives me a compile error, as does passing an object array to the constructor.

Apologies if I have missed anything silly. I've done some Googling, but information on MATLAB to .NET interfaces seems a little light, which is why I posted here.

Thanks in advance.

[EDIT]

Thanks to everyone for the suggestions.

It turns out that the quickest and most efficient way for our specific implementation was to convert the DateTime to an int in the SQL code.

However, of the other approaches, I would recommend the MWCellArray approach. It involves the least fuss, and it turns out I was just doing it wrong: you can't treat it like the other MWArray types. Because it is designed to hold multiple datatypes, you need to iterate over it, sticking in MWNumericArrays or whatever takes your fancy as you go. One thing to be aware of is that MWArrays are 1-based, not 0-based. That one keeps catching me out.

I'll go into a more detailed discussion later today when I have the time, but right now I don't. Thanks everyone once more for your help.


Answer 1:


As @Matt suggested in the comments, if you want to store different datatypes (numeric, strings, structs, etc...), you should use the equivalent of cell-arrays exposed by this managed API, namely the MWCellArray class.

To illustrate, I implemented a simple .NET assembly. It exposes a MATLAB function that receives a cell array (records from a database table) and simply prints them. This function is called from our C# application, which generates a sample DataTable and converts it into a MWCellArray (filling the table entries cell by cell).

The trick is to map the objects contained in the DataTable to the types supported by the MWArray-derived classes. Here are the ones I used (check the documentation for a complete list):

.NET native type          MWArray classes
------------------------------------------
double,float,int,..       MWNumericArray
string                    MWCharArray
DateTime                  MWNumericArray       (using Ticks property)

A note about the date/time data: in .NET, the System.DateTime expresses date and time as:

the number of 100-nanosecond intervals that have elapsed since January 1, 0001 at 00:00:00.000

while in MATLAB, this is what the DATENUM function has to say:

A serial date number represents the whole and fractional number of days from a specific date and time, where datenum('Jan-1-0000 00:00:00') returns the number 1

For this reason, I wrote two helper functions in the C# application to convert the DateTime "ticks" to match the MATLAB definition of serial date numbers.


First, consider this simple MATLAB function. It expects to receive a numRows-by-numCols cell array containing the table data. In my example, the columns are: Name (string), Price (double), Date (DateTime)

function [] = my_cell_function(C)
    names = C(:,1);
    price = cell2mat(C(:,2));
    dt = datevec( cell2mat(C(:,3)) );

    disp(names)
    disp(price)
    disp(dt)
end

Using deploytool from MATLAB Builder NE, we build the above as a .NET assembly. Next, we create a C# console application, then add a reference to the MWArray.dll assembly, in addition to the above generated one. This is the program I am using:

using System;
using System.Data;
using MathWorks.MATLAB.NET.Utility;  // MWArray.dll
using MathWorks.MATLAB.NET.Arrays;   // MWArray.dll
using CellExample;                   // CellExample.dll assembly created

namespace CellExampleTest
{
    class Program
    {
        static void Main(string[] args)
        {
            // get data table
            DataTable table = getData();

            // create the MWCellArray
            int numRows = table.Rows.Count;
            int numCols = table.Columns.Count;
            MWCellArray cell = new MWCellArray(numRows, numCols);   // one-based indices

            // fill it cell-by-cell
            for (int r = 0; r < numRows; r++)
            {
                for (int c = 0; c < numCols; c++)
                {
                    // fill based on type
                    Type t = table.Columns[c].DataType;
                    if (t == typeof(DateTime))
                    {
                        //cell[r+1,c+1] = new MWNumericArray( convertToMATLABDateNum((DateTime)table.Rows[r][c]) );
                        cell[r + 1, c + 1] = convertToMATLABDateNum((DateTime)table.Rows[r][c]);
                    }
                    else if (t == typeof(string))
                    {
                        //cell[r+1,c+1] = new MWCharArray( (string)table.Rows[r][c] );
                        cell[r + 1, c + 1] = (string)table.Rows[r][c];
                    }
                    else
                    {
                        //cell[r+1,c+1] = new MWNumericArray( (double)table.Rows[r][c] );
                        cell[r + 1, c + 1] = (double)table.Rows[r][c];
                    }
                }
            }

            // call MATLAB function
            CellClass obj = new CellClass();
            obj.my_cell_function(cell);

            // Wait for user to exit application
            Console.ReadKey();
        }

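        // DateTime <-> datenum helper functions
        // NOTE: AddYears(1) adds 365 or 366 days depending on the date, so these
        // helpers are off by one day for some dates (see the later answers,
        // which use a fixed 367-day offset instead).
        static double convertToMATLABDateNum(DateTime dt)
        {
            return (double)dt.AddYears(1).AddDays(1).Ticks / (10000000L * 3600L * 24L);
        }
        static DateTime convertFromMATLABDateNum(double datenum)
        {
            DateTime dt = new DateTime((long)(datenum * (10000000L * 3600L * 24L)));
            return dt.AddYears(-1).AddDays(-1);
        }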

        // return DataTable data
        static DataTable getData()
        {
            DataTable table = new DataTable();
            table.Columns.Add("Name", typeof(string));
            table.Columns.Add("Price", typeof(double));
            table.Columns.Add("Date", typeof(DateTime));

            table.Rows.Add("Amro", 25, DateTime.Now);
            table.Rows.Add("Bob", 10, DateTime.Now.AddDays(1));
            table.Rows.Add("Alice", 50, DateTime.Now.AddDays(2));

            return table;
        }
    }
}

The output of this C# program as returned by the compiled MATLAB function:

'Amro'
'Bob'
'Alice'

25
10
50

     2011            9           26           20           13       8.3906
     2011            9           27           20           13       8.3906
     2011            9           28           20           13       8.3906



Answer 2:


One option is to load your .NET code directly from MATLAB and have MATLAB query the database through your .NET interface, instead of going through the serialization process you describe. I have done this repeatedly in our environment with great success. In such an endeavor, NET.addAssembly is your best friend.

Details are here: http://www.mathworks.com/help/matlab/ref/net.addassembly.html

A second option would be to go with MATLAB cell arrays. You can set it up so that the columns are different data types, each column forming a cell. That is a trick MATLAB itself uses in the textscan function. I'd recommend reading the documentation for that function here: http://www.mathworks.com/help/techdoc/ref/textscan.html

A third option is to rely on textscan entirely. Write a text file out from your .NET code and let textscan handle the parsing. textscan is a very powerful mechanism for getting this kind of data into MATLAB. You can point it at a file or at a bunch of strings.
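For the third option, the .NET side could be as small as the sketch below. It is only an illustration under stated assumptions: TableExport and WriteTabDelimited are hypothetical names, the tab delimiter and the "yyyy-MM-dd HH:mm:ss" date format are arbitrary choices, missing values are written as NaN, and tabs or newlines inside string values are not escaped.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.IO;

// Hypothetical helper: dumps a DataTable as tab-delimited text for textscan.
public static class TableExport
{
    public static void WriteTabDelimited(DataTable table, TextWriter writer)
    {
        foreach (DataRow row in table.Rows)
        {
            var fields = new string[table.Columns.Count];
            for (int c = 0; c < fields.Length; c++)
            {
                object value = row[c];
                if (value == DBNull.Value)
                {
                    // Assumption: missing values are written as NaN.
                    fields[c] = "NaN";
                }
                else if (value is DateTime dt)
                {
                    // Fixed, culture-independent format so parsing is predictable.
                    fields[c] = dt.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture);
                }
                else
                {
                    fields[c] = Convert.ToString(value, CultureInfo.InvariantCulture);
                }
            }
            writer.WriteLine(string.Join("\t", fields));
        }
    }
}
```

Passing a StreamWriter opened on a file gives textscan something to read. On the MATLAB side, a format string with one specifier per column and a tab delimiter (for example '%s %f %s' for a Name, Price, Date table) would be the natural counterpart, though the exact format depends on your columns.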




Answer 3:


I have tried the functions written by @Amro, but the results for certain dates are not correct.

What I tried was:

  1. Create a date in C#
  2. Use the function supplied by @Amro to convert it to a MATLAB date number
  3. Use that number in MATLAB to check its correctness

It seems to have problems with dates at 1 Jan 00:00:00 for some years, e.g. 2014 and 2015. For example,

DateTime dt = new DateTime(2014, 1, 1, 0, 0, 0);
double dtmat = convertToMATLABDateNum(dt);

I got dtmat = 735599.0 from this. I used it in MATLAB as follows:

datestr(datenum(735599.0))

I got this in return:

ans = 31-Dec-2013

When I tried 1 Jan 2012 it was OK. Any suggestions, or an explanation of why this happens?




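Answer 4:


I had the same issue as @Johan. The problem is related to leap years: AddYears(1) adds 365 or 366 days depending on the date, whereas the offset between the .NET and MATLAB origins is a constant 367 days.

To fix it, I changed the code that converts the DateTime to the following: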

// Number of 100-nanosecond ticks in one day
private const long MatlabDateConversionFactor = 10000000L * 3600L * 24L;
// Offset in days (not ticks): DateTime's origin, 0001-01-01, is datenum 367
private const long dayDifference = 367;

public static double convertToMATLABDateNum(DateTime dt) {
    var converted = (double)dt.Ticks / (double)MatlabDateConversionFactor;
    return converted + dayDifference;
}

public static DateTime convertFromMATLABDateNum(double datenum) {
    var ticks = (long)((datenum - dayDifference) * MatlabDateConversionFactor);
    return new DateTime(ticks, DateTimeKind.Utc);
}
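To see where the two conversions diverge, the small check below compares them on the dates mentioned in the previous answer. It is a sketch for illustration only: DateNumCheck, ViaAddYears and ViaOffset are hypothetical names, and the two methods just restate the AddYears(1).AddDays(1) formula from the first answer and the 367-day offset from this one.

```csharp
using System;

// Hypothetical check comparing the two DateTime -> datenum conversions.
public static class DateNumCheck
{
    const double TicksPerDay = TimeSpan.TicksPerDay; // 100 ns ticks in one day
    const double EpochOffsetDays = 367;              // 0001-01-01 is datenum 367

    // Formula from the first answer: shifts the date by a calendar year plus a day.
    public static double ViaAddYears(DateTime dt) =>
        dt.AddYears(1).AddDays(1).Ticks / TicksPerDay;

    // Formula from this answer: adds a constant number of days.
    public static double ViaOffset(DateTime dt) =>
        dt.Ticks / TicksPerDay + EpochOffsetDays;

    public static void Main()
    {
        var d2012 = new DateTime(2012, 1, 1);
        var d2014 = new DateTime(2014, 1, 1);

        // 2012-01-01 plus one year spans 29 Feb 2012 (366 days), so both agree.
        Console.WriteLine(ViaAddYears(d2012) - ViaOffset(d2012)); // prints 0

        // 2014-01-01 plus one year spans only 365 days, so the result is a day short.
        Console.WriteLine(ViaAddYears(d2014) - ViaOffset(d2014)); // prints -1
    }
}
```

The AddYears version is correct only when the year it adds happens to contain a 29 February, which is why 1 Jan 2012 looked fine and 1 Jan 2014 came out as 31-Dec-2013.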


Source: https://stackoverflow.com/questions/7525649/passing-a-net-datatable-to-matlab
