U-SQL

Azure Data Lake Analytics: Combine overlapping time duration using U-SQL

不羁岁月 submitted on 2019-12-02 04:31:32
Question: I want to remove overlapping time durations from CSV data stored in Azure Data Lake Store using U-SQL and combine those rows. The data set contains a start time and an end time, along with several other attributes, for each record. Here is an example:

Start Time - End Time - User Name
5:00 AM - 6:00 AM - ABC
5:00 AM - 6:00 AM - XYZ
8:00 AM - 9:00 AM - ABC
8:00 AM - 10:00 AM - ABC
10:00 AM - 2:00 PM - ABC
7:00 AM - 11:00 AM - ABC
9:00 AM - 11:00 AM - ABC
11:00 AM - 11:30 AM - ABC

After removing overlap, output
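Merging overlapping intervals per user is most naturally expressed as a custom reducer: pre-sort each user's rows by start time and fold overlapping rows together in code-behind. The sketch below is a minimal illustration of that approach; the class name Demo.IntervalMerger, the column names, and the input/output paths are assumptions for the example, not part of the original question.

```usql
// Minimal sketch: read the raw rows, then merge overlapping intervals per user
// with a code-behind reducer (Demo.IntervalMerger, shown below).
@rows =
    EXTRACT StartTime DateTime,
            EndTime   DateTime,
            UserName  string
    FROM "/input/usage.csv"
    USING Extractors.Csv(skipFirstNRows: 1);

@merged =
    REDUCE @rows
    PRESORT StartTime
    ON UserName
    PRODUCE StartTime DateTime,
            EndTime   DateTime,
            UserName  string
    USING new Demo.IntervalMerger();

OUTPUT @merged
TO "/output/merged.csv"
USING Outputters.Csv(outputHeader: true);
```

```csharp
// Code-behind sketch for the reducer: rows arrive sorted by StartTime (PRESORT),
// so overlapping intervals can be folded in a single pass per user.
using System;
using System.Collections.Generic;
using Microsoft.Analytics.Interfaces;

namespace Demo
{
    [SqlUserDefinedReducer]
    public class IntervalMerger : IReducer
    {
        public override IEnumerable<IRow> Reduce(IRowset input, IUpdatableRow output)
        {
            DateTime? curStart = null, curEnd = null;
            string user = null;

            foreach (var row in input.Rows)
            {
                var start = row.Get<DateTime>("StartTime");
                var end   = row.Get<DateTime>("EndTime");
                user      = row.Get<string>("UserName");

                if (curStart == null)                 // first interval in this group
                {
                    curStart = start; curEnd = end;
                }
                else if (start <= curEnd.Value)       // overlap: extend the current interval
                {
                    if (end > curEnd.Value) curEnd = end;
                }
                else                                  // gap: emit the merged interval, start a new one
                {
                    output.Set("StartTime", curStart.Value);
                    output.Set("EndTime", curEnd.Value);
                    output.Set("UserName", user);
                    yield return output.AsReadOnly();
                    curStart = start; curEnd = end;
                }
            }

            if (curStart != null)                     // emit the last open interval
            {
                output.Set("StartTime", curStart.Value);
                output.Set("EndTime", curEnd.Value);
                output.Set("UserName", user);
                yield return output.AsReadOnly();
            }
        }
    }
}
```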

U-SQL Error while using REFERENCE ASSEMBLY

不想你离开。 submitted on 2019-12-02 03:51:33
Question: I created a U-SQL library using Azure APIs and registered the assembly in the Azure cloud with all of its dependencies. I added this library to my U-SQL project and added the lines below to my U-SQL script:

USE master;
REFERENCE ASSEMBLY [AzureLibrary];

When using the functions or methods I created in the library, I get the error message below:

Inner exception from user expression: Could not load file or assembly 'Microsoft.Azure.Management.DataLake.Store, Version=1.0.0.0, Culture=neutral, PublicKeyToken
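The "Could not load file or assembly" message usually means a dependency of the registered library is not available to the script. One common fix is to register the dependent DLLs in the same U-SQL database and reference them explicitly next to the library. A minimal sketch, assuming the DLLs were uploaded to an /Assemblies folder in the store (the paths and the dependency list are illustrative):

```usql
// One-time registration (for example from a separate job, or via
// Visual Studio's "Register Assembly" with managed dependencies included).
USE DATABASE master;

CREATE ASSEMBLY IF NOT EXISTS [Microsoft.Azure.Management.DataLake.Store]
FROM "/Assemblies/Microsoft.Azure.Management.DataLake.Store.dll";

CREATE ASSEMBLY IF NOT EXISTS [AzureLibrary]
FROM "/Assemblies/AzureLibrary.dll";
```

The consuming script then references the dependency alongside the library:

```usql
USE DATABASE master;
REFERENCE ASSEMBLY [Microsoft.Azure.Management.DataLake.Store];
REFERENCE ASSEMBLY [AzureLibrary];
```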

How do I partition a large file into files/directories using only U-SQL and certain fields in the file?

三世轮回 submitted on 2019-12-02 00:09:13
Question: I have an extremely large CSV in which each row contains customer and store IDs, along with transaction information. The current test file is around 40 GB (about two days' worth), so partitioning is an absolute must for any reasonable return time on select queries. My question is this: when we receive a file, it contains multiple stores' data. I would like to use the "virtual column" functionality to separate this file into the respective directory structure. That structure is "/Data/{CustomerId}/
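On the output side, {column} placeholders in the path were exposed as a partitioned-output preview feature at the time, so a single OUTPUT statement can fan rows out into per-customer, per-store files. A minimal sketch, assuming that preview is available on the account and using illustrative column names and paths:

```usql
// Assumption: partitioned output was a preview feature and may require this flag.
SET @@FeaturePreviews = "DataPartitionedOutput:on";

@txns =
    EXTRACT CustomerId string,
            StoreId    string,
            TxnId      string,
            Amount     decimal
    FROM "/raw/transactions.csv"
    USING Extractors.Csv(skipFirstNRows: 1);

// Each distinct (CustomerId, StoreId) pair gets its own file under /Data.
OUTPUT @txns
TO "/Data/{CustomerId}/{StoreId}/transactions.csv"
USING Outputters.Csv(outputHeader: true);
```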

Debugging U-SQL Jobs

谁说我不能喝 submitted on 2019-11-30 21:22:37
Question: I would like to know if there are any tips and tricks for finding errors in Data Lake Analytics jobs. The error messages are, most of the time, not very detailed. When trying to extract from a CSV file I often get an error like this:

Vertex failure triggered quick job abort. Vertex failed: SV1_Extract[0] with error: Vertex user code error. Vertex failed with a fail-fast error

It seems that these errors occur when trying to convert the columns to the specified types. The technique I found is to extract
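One pattern that makes these conversion failures easier to pin down is to extract every column as string first and convert in a separate SELECT, so the failure (or a null) shows up at a step where the offending value can be inspected; the built-in extractors also accept a silent parameter that skips rows that do not fit the schema. A minimal sketch with illustrative column names and paths:

```usql
// Pull everything in as strings so the EXTRACT vertex cannot fail on a bad value.
@raw =
    EXTRACT Id      string,
            Amount  string,
            Created string
    FROM "/input/data.csv"
    USING Extractors.Csv(skipFirstNRows: 1);   // silent: true would instead drop malformed rows

// Convert in a second step; empty fields become nulls instead of hard failures.
@typed =
    SELECT Id,
           (string.IsNullOrEmpty(Amount)  ? (decimal?)  null : decimal.Parse(Amount))   AS AmountTyped,
           (string.IsNullOrEmpty(Created) ? (DateTime?) null : DateTime.Parse(Created)) AS CreatedTyped
    FROM @raw;

OUTPUT @typed
TO "/output/typed.csv"
USING Outputters.Csv(outputHeader: true);
```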

How to preprocess and decompress .gz file on Azure Data Lake store?

允我心安 submitted on 2019-11-30 09:40:28
Question: Does U-SQL support compressing and decompressing a file? I would like to decompress a compressed file to perform some validations and, once they pass, compress the data into a new file.

Answer 1: In addition, automatic compression on OUTPUT is on the roadmap. Please add your vote to https://feedback.azure.com/forums/327234-data-lake/suggestions/13418367-support-gzip-on-output-as-well

Answer 2: According to the main EXTRACT article, the U-SQL EXTRACT method automatically recognises the GZip format, so you don't need to do anything special. Extraction from compressed data: in general, the files are passed
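In practice that means a .gz file can be read like any other file; the built-in extractors pick the decompression based on the file extension. A minimal sketch with an illustrative schema and paths:

```usql
// The .gz extension is enough for EXTRACT to decompress the input transparently.
@rows =
    EXTRACT UserId string,
            Query  string
    FROM "/input/SearchLog.tsv.gz"
    USING Extractors.Tsv();

// Output is written uncompressed; gzip on OUTPUT was still a feedback item (see the link above).
OUTPUT @rows
TO "/output/SearchLog.csv"
USING Outputters.Csv(outputHeader: true);
```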

Error while running U-SQL Activity in Pipeline in Azure Data Factory

断了今生、忘了曾经 submitted on 2019-11-29 08:06:29
I am getting the following error while running a U-SQL activity in a pipeline in Azure Data Factory:

Error in Activity: {"errorId":"E_CSC_USER_SYNTAXERROR","severity":"Error","component":"CSC","source":"USER","message":"syntax error. Final statement did not end with a semicolon","details":"at token 'txt', line 3\r\nnear the ###:\r\n**************\r\nDECLARE @in string = \"/demo/SearchLog.txt\";\nDECLARE @out string = \"/scripts/Result.txt\";\nSearchLogProcessing.txt ### \n","description":"Invalid syntax found in the script.","resolution":"Correct the script syntax, using expected token(s) as a guide.",
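Decoding the details field shows what was actually submitted: two parameter DECLARE statements followed by the literal text SearchLogProcessing.txt, which the compiler reads as a third, unterminated statement (hence "at token 'txt', line 3"). For comparison, a well-formed submission would look roughly like the sketch below; the rowset, schema, and outputter are illustrative assumptions, not the real contents of SearchLogProcessing.txt:

```usql
DECLARE @in string = "/demo/SearchLog.txt";
DECLARE @out string = "/scripts/Result.txt";

// The body of the script file itself, with every statement ending in a semicolon.
@searchlog =
    EXTRACT UserId int,
            Query  string
    FROM @in
    USING Extractors.Tsv();

OUTPUT @searchlog
TO @out
USING Outputters.Csv();
```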

U-SQL: Unable to extract data from JSON file

我与影子孤独终老i submitted on 2019-11-28 08:03:48
Question: I was trying to extract data from a JSON file using U-SQL. Either the query runs successfully without producing any output data, or it results in a "vertex failed fast" error. The JSON file looks like:

{
  "results": [
    {
      "name": "Sales/Account",
      "id": "7367e3f2-e1a5-11e5-80e8-0933ecd4cd8c",
      "deviceName": "HP",
      "deviceModel": "g6-pavilion",
      "clientip": "0.41.4.1"
    },
    {
      "name": "Sales/Account",
      "id": "c01efba0-e0d5-11e5-ae20-af6dc1f2c036",
      "deviceName": "acer",
      "deviceModel": "veriton",
      "clientip": "10
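With the sample JSON assemblies from the U-SQL examples repository (Newtonsoft.Json and Microsoft.Analytics.Samples.Formats) registered in the target database, a JsonExtractor whose row path points at the results array yields one row per element. A minimal sketch; the registered assembly names, the row path, and the paths are assumptions based on that sample library:

```usql
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

USING Microsoft.Analytics.Samples.Formats.Json;

// One row per element of the "results" array.
@rows =
    EXTRACT name        string,
            id          string,
            deviceName  string,
            deviceModel string,
            clientip    string
    FROM "/input/devices.json"
    USING new JsonExtractor("results[*]");

OUTPUT @rows
TO "/output/devices.csv"
USING Outputters.Csv(outputHeader: true);
```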

U-SQL - Extract data from JSON array

守給你的承諾、 submitted on 2019-11-27 21:42:00
I already tried the suggested JSONPath option, but it seems the JSONExtractor only recognizes the root level. In my case I have to deal with a nested JSON structure that also contains an array (see the example below). Are there any options for extracting this without multiple intermediate files?

"relation": {
  "relationid": "123456",
  "name": "relation1",
  "addresses": {
    "address": [{
      "addressid": "1",
      "street": "Street 1",
      "postcode": "1234 AB",
      "city": "City 1"
    }, {
      "addressid": "2",
      "street": "Street 2",
      "postcode": "5678 CD",
      "city": "City 2"
    }]
  }
}

SELECT relationid, addressid, street, postcode, city ?

Michael Rys
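One approach worth trying (a sketch, not a confirmed answer to this question) is to point the JsonExtractor's row path at the nested array itself, which turns each address into a row; pulling the parent relationid into the same rows is the harder part and may need JsonFunctions.JsonTuple or a custom extractor. The assembly names and paths below follow the U-SQL sample formats library and are assumptions:

```usql
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

USING Microsoft.Analytics.Samples.Formats.Json;

// One row per element of the nested address array.
@addresses =
    EXTRACT addressid string,
            street    string,
            postcode  string,
            city      string
    FROM "/input/relations.json"
    USING new JsonExtractor("relation.addresses.address[*]");

OUTPUT @addresses
TO "/output/addresses.csv"
USING Outputters.Csv(outputHeader: true);
```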

Does U-SQL allow custom code to call external services

◇◆丶佛笑我妖孽 submitted on 2019-11-27 15:43:23
In U-SQL custom code (code-behind or assemblies), can external services be called, e.g. Bing search or maps? Thanks, Nasir

This is currently not supported, for the following reason: imagine that you write a UDF or UDO (e.g., an extractor) that calls a REST endpoint of a service that is used to receiving a few calls per minute from the same originating IP address. But now you execute this user code in a U-SQL job that is scaled out over millions of rows, running possibly on hundreds of vertices concurrently. This is a - hopefully unintended - distributed denial-of-service attack against that service. And