U-SQL How can I get the current filename being processed to add to my extract output?

Submitted by 你说的曾经没有我的故事 on 2019-12-07 05:45:10

Question


I need to add metadata about the row being processed. Specifically, I need the filename to be added as a column. I looked at the ambulance demos in the Git repo, but can't figure out how to implement this.


Answer 1:


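You can use two U-SQL features called 'file sets' and 'virtual columns'. In my simple example, I have two files in my input directory. The path in the FROM clause is a file set pattern, and the {filename} and {extension} placeholders in it become virtual columns, which I declare in the EXTRACT statement alongside the data column, e.g.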

// Filesets, file set with virtual column
@q =
    EXTRACT rowId int,
            filename string,
            extension string
    FROM "/input/filesets example/{filename}.{extension}"
    USING Extractors.Tsv();


@output =
    SELECT filename,
           extension,
           COUNT( * ) AS records
    FROM @q
    GROUP BY filename,
             extension;


OUTPUT @output TO "/output/output.csv"
USING Outputters.Csv();

My results: [screenshot not included]

Read more about both features here:

https://msdn.microsoft.com/en-us/library/azure/mt621320.aspx
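
If the goal is to attach the filename to every extracted row rather than to count rows per file, the virtual columns can be selected like any other column. The following is a minimal sketch, assuming the same input folder and single-column TSV files as in the example above; the rowset name @withSource, the column name sourceFile, and the output path are hypothetical and chosen only for illustration.

```usql
// Extract the data column plus the two virtual columns taken from the file path.
@rows =
    EXTRACT rowId int,
            filename string,
            extension string
    FROM "/input/filesets example/{filename}.{extension}"
    USING Extractors.Tsv();

// Keep one output row per input row and append the name of the file it came from.
// sourceFile is a hypothetical column name.
@withSource =
    SELECT rowId,
           filename + "." + extension AS sourceFile
    FROM @rows;

// Hypothetical output path.
OUTPUT @withSource
TO "/output/rows_with_source.csv"
USING Outputters.Csv();
```

Because the virtual columns behave like ordinary columns, they can also be used in a WHERE clause to restrict which files in the set are read.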



Source: https://stackoverflow.com/questions/40998910/u-sql-how-can-i-get-the-current-filename-being-processed-to-add-to-my-extract-ou
