azure-data-lake

Data Lake Analytics join

自古美人都是妖i, posted on 2019-12-12 04:35:44
Question: I have two tables. I want to classify the URLs in table [Activite_Site] by category. I tried the query below, but it doesn't work. Does anyone have an idea? Thank you in advance.
Table [Categorie]
    URL                            CAT
    http//www.site.com/business    B2B
    http//www.site.com/office      B2B
    http//www.site.com/job         B2B
    http//www.site.com/home        B2C
Table [Activite_Site]
    URL
    http//www.site.com/business/page2/test.html
    http//www.site.com/business/page3/pagetest/tot.html
    http//www.site.com/office/all/tot.html
    http//www.site.com/home/holiday
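The classification described above amounts to a prefix join: each activity URL takes the category of the [Categorie] row whose URL is a prefix of it. The following is a minimal Python sketch of that logic only, not U-SQL and not the asker's query (which is not shown in the excerpt). The names CATEGORIES, ACTIVITY_URLS and classify are hypothetical, and the data is copied from the two tables in the question.

```python
# Hypothetical in-memory copies of the two tables from the question.
CATEGORIES = [
    ("http//www.site.com/business", "B2B"),
    ("http//www.site.com/office", "B2B"),
    ("http//www.site.com/job", "B2B"),
    ("http//www.site.com/home", "B2C"),
]

ACTIVITY_URLS = [
    "http//www.site.com/business/page2/test.html",
    "http//www.site.com/business/page3/pagetest/tot.html",
    "http//www.site.com/office/all/tot.html",
    "http//www.site.com/home/holiday",
]


def classify(url, categories):
    """Return the category of the longest matching prefix, or None."""
    best = None
    for prefix, cat in categories:
        # Accept the prefix itself or the prefix followed by "/",
        # so ".../job" does not accidentally match ".../jobs".
        if url == prefix or url.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, cat)
    return best[1] if best else None


# Pair every activity URL with its category (None when nothing matches).
classified = [(url, classify(url, CATEGORIES)) for url in ACTIVITY_URLS]
for url, cat in classified:
    print(cat, url)
```

Choosing the longest matching prefix is a design assumption: it keeps the result unambiguous if one category URL is nested under another.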

How to copy azure blob files to azure data lake analytics

孤街醉人, posted on 2019-12-12 04:27:02
Question: Is there a way to create a job or service on Azure to move (cut) Azure blob files to Azure Data Lake Store? Answer 1: I would say that Azure Data Factory is a good fit for this. It supports scheduling (https://docs.microsoft.com/en-us/azure/data-factory/data-factory-scheduling-and-execution) and it supports the transfer of data from Blob storage to Azure Data Lake. See this example: https://docs.microsoft.com/en-us/azure/data-factory/data-factory-azure-datalake-connector From the website: Data

Azure .NET SDK error: FsOpenStream failed with error 0x83090aa2

一个人想着一个人, posted on 2019-12-12 04:16:39
Question: We are trying to download a file that is stored in Data Lake Store. I have been following the tutorial below, which uses the Azure .NET SDK. https://azure.microsoft.com/en-us/documentation/articles/data-lake-analytics-get-started-net-sdk/ As the file is already present in Azure Data Lake Store, I just added the code to download it: FileCreateOpenAndAppendResponse beginOpenResponse = _dataLakeStoreFileSystemClient.FileSystem.BeginOpen("/XXXX/XXXX/test.csv", DataLakeStoreAccountName, new

Run U-SQL Script from C# code with Azure Data Factory

霸气de小男生, posted on 2019-12-12 03:56:47
Question: I am trying to run a U-SQL script on Azure from C# code. After the code executes, everything is created on Azure (ADF, linked services, pipelines, datasets), but the U-SQL script is not executed by ADF. I think there is an issue with the startTime and endTime configured in the pipeline code. I followed this article to complete the console application: Create, monitor, and manage Azure data factories using Data Factory .NET SDK. Here is the URL of my complete C# code project for download. https://1drv

Are Guids unique when using a U-SQL Extractor?

陌路散爱, posted on 2019-12-11 17:48:53
Question: As these questions point out, Guid.NewGuid will return the same value for all rows because of the enforced deterministic nature of U-SQL, i.e. if the job is scaled out and an element (vertex) needs retrying, it should return the same value: "Guid.NewGuid() always return same Guid for all rows" and "auto_increment in U-SQL". However, the code example in the official documentation for a user-defined extractor purposefully uses Guid.NewGuid(). I'm not querying the validity of the answers for the questions

Azure U-SQL Continuous deployment using VSTS PowerShell task

北战南征, posted on 2019-12-11 17:46:54
Question: I'm building CI/CD for my Azure Data Lake Analytics U-SQL code and I get the error below while deploying my release using the VSTS PowerShell task. "Access from 'example-app1' is denied. Please grant the user with necessary roles on Azure portal. Trace: 03e7229d-e7ca-43d5-a7be-6e0a3a3b9317" I configured Azure AD following this link - https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-create-service-principal-portal and created a service endpoint. I also gave access to

USQL JsonTextWriter.Writevalue is throwing error “The type 'Uri' is defined in an assembly that is not referenced”

我只是一个虾纸丫, posted on 2019-12-11 16:17:33
Question: I have a custom outputter for my U-SQL job that writes JSON output to a file using JsonTextWriter. I get the following error when I try to compile: "The type 'Uri' is defined in an assembly that is not referenced. You must add a reference to assembly 'System, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089'." at line: 60, column 20. Line 60 is writer.WriteValue("Test"); I get this error for every line where I call WriteValue. Here is my

How do we generate a random id for each record?

时光毁灭记忆、已成空白, posted on 2019-12-11 15:08:37
Question: How do we generate a separate file for every record, each with a unique name? I would like every row in my dataset to have a unique identifier, preferably a GUID, but it could be anything:
    @file =
        EXTRACT col1 string,
                col2 string,
                col3 string
        FROM @file1
        USING Extractors.Csv(silent : true);
    @output =
        SELECT *,
               Guid.NewGuid().ToString() AS [myId]
        FROM @file;
I would then create a separate file for each record:
    OUTPUT @output
    TO "/myFirstFunction_{myId}.txt"
    USING Outputters.Tsv();
The files
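One general way to get a per-row identifier that stays stable when a computation is retried is to derive it from the row itself instead of calling a random generator. The Python sketch below illustrates that idea only; it is not U-SQL and not an answer taken from the original thread. The namespace string, the row_id helper and the sample rows are hypothetical, and it assumes a stable row number is available for each record.

```python
import uuid

# Hypothetical fixed namespace; any constant UUID works as long as it never changes.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "example/myFirstFunction")


def row_id(row_number, row):
    """Derive a name-based (version 5) UUID from the row position and values."""
    # Including the row number keeps duplicate rows from sharing an ID.
    key = f"{row_number}|" + "|".join(row)
    return str(uuid.uuid5(NAMESPACE, key))


# Hypothetical sample data with a deliberate duplicate row.
rows = [("a", "b", "c"), ("a", "b", "c"), ("d", "e", "f")]
ids = [row_id(i, r) for i, r in enumerate(rows, start=1)]

for i, identifier in enumerate(ids, start=1):
    print(f"/myFirstFunction_{identifier}.txt  <- row {i}")
```

Because the ID is a pure function of its inputs, recomputing it for the same row yields the same value, while different rows (including exact duplicates at different positions) get different values.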

Execute U-SQL script in ADL storage from Data Factory in Azure

℡╲_俬逩灬., posted on 2019-12-11 08:31:33
Question: I have a U-SQL script stored in my ADL store and I am trying to execute it. The script file is quite big, about 250 MB. So far I have a Data Factory, I have created a linked service, and I am trying to create a Data Lake Analytics U-SQL activity. The code for my U-SQL activity looks like this: { "name": "RunUSQLScript1", "properties": { "description": "Runs the USQL Script", "activities": [ { "name": "DataLakeAnalyticsUSqlActivityTemplate", "type": "DataLakeAnalyticsU-SQL", "linkedServiceName":

How to unit test U-SQL scripts?

烈酒焚心, posted on 2019-12-11 07:35:23
Question: I currently have a U-SQL project with a set of different scripts, and I am trying to create unit tests for them. I can run the scripts locally using the Azure Data Lake tools with a set of test data and generate the expected outputs. The scripts are pure U-SQL data manipulation/transformation, so because there are no methods to call, I am not sure what the correct approach to testing them is. If anyone has experience or ideas on how this should be done, or any documentation, please feel free to help. Thank you
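Since the scripts can already be run locally against test data, one possible approach is a golden-file check: run the script, then compare its output with a stored expected output. The Python sketch below shows only the comparison step and assumes both outputs are CSV text that has already been read into strings; read_rows and outputs_match are hypothetical helper names, and running the U-SQL script itself is outside the sketch.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text into a sorted list of row tuples."""
    # Sorting makes the comparison independent of output row order,
    # which is an assumption that suits scripts without an ORDER BY.
    return sorted(tuple(row) for row in csv.reader(io.StringIO(text)))


def outputs_match(actual_text, expected_text):
    """Return True when both CSV texts contain the same set of rows."""
    return read_rows(actual_text) == read_rows(expected_text)


# Hypothetical sample outputs: same rows, different order.
expected = "a,1\nb,2\n"
actual = "b,2\na,1\n"
print(outputs_match(actual, expected))
```

If row order is part of the script's contract, dropping the sort in read_rows turns this into a strict comparison.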