Question
This question relates to how to propagate a data factory through CI (in VSTS) when there is a self-hosted Integration Runtime defined in the Data Factory.
I have 3 environments set up - Dev / UAT / Prod - each with its own data factory.
Dev hosts the master collaboration branch. I am using VSTS to retrieve the artifacts from the adf_publish branch and deploy the template to UAT (Prod will be done later). I followed much of what is in this guide here.
When deploying to a blank UAT factory with a self-hosted integration runtime (IR), the IR deployed in UAT is a copy of the shared IR from Dev (not a linked type), and this causes an error since the credentials used by the IR will not be correct. I expected this, since we are really just deploying an exact copy of the Resource Group template with only the factory name overridden; however, the IR will not work without being re-credentialed against the self-hosted IR VMs.
If I pre-register a linked IR in the UAT environment (linked to the Dev IR), the deployment fails with a conflict because an IR in the resource group template has the same name as the one I just created in UAT. If it has a different name there is no conflict, but the linked services will point to the template's IR and not the one I created for UAT.
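For illustration, the pre-registration step can be scripted with the Python management SDK. This is a minimal sketch, assuming RBAC authorization for the linked IR; the subscription, resource group, factory, and IR names are all placeholders:

```python
# pip install azure-identity azure-mgmt-datafactory
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    IntegrationRuntimeResource,
    SelfHostedIntegrationRuntime,
    LinkedIntegrationRuntimeRbacAuthorization,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Resource ID of the *shared* self-hosted IR in the Dev factory (placeholder path).
dev_ir_id = (
    "/subscriptions/<subscription-id>/resourceGroups/rg-dev"
    "/providers/Microsoft.DataFactory/factories/adf-dev"
    "/integrationruntimes/SelfHostedIR"
)

# Create a linked IR in the UAT factory that points back at the Dev IR.
# If its name matches an IR inside the ARM template, the subsequent template
# deployment conflicts -- which is exactly the problem described above.
client.integration_runtimes.create_or_update(
    resource_group_name="rg-uat",
    factory_name="adf-uat",
    integration_runtime_name="SelfHostedIR",
    integration_runtime=IntegrationRuntimeResource(
        properties=SelfHostedIntegrationRuntime(
            linked_info=LinkedIntegrationRuntimeRbacAuthorization(
                resource_id=dev_ir_id
            )
        )
    ),
)
```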
The docs have a note saying the integration runtime should be the same across all the environments, but I do not think this can be true - one of them (presumably the source/dev one) must be a shared type and the others linked and authorized.
One option I could see (untested) is to have each environment's IR reference be a separate connection to an actual IR, but then there needs to be some way of overriding the linked services to point to the current environment's IR reference (by template parameter override?). In this scenario, we would also need to block the template's IR from being deployed, as it won't be needed and won't work.
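The parameter-override part of that idea might look like the following sketch, which deploys the adf_publish ARM template programmatically. The `factoryName` parameter is standard in ADF-generated templates; the IR-name parameter is hypothetical and would only work if the template actually exposes one:

```python
# pip install azure-identity azure-mgmt-resource
import json

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.resource.resources.models import Deployment, DeploymentProperties

client = ResourceManagementClient(DefaultAzureCredential(), "<subscription-id>")

# The ARM template artifact retrieved from the adf_publish branch.
with open("arm_template.json") as f:
    template = json.load(f)

parameters = {
    "factoryName": {"value": "adf-uat"},
    # Hypothetical override so linked services point at the UAT IR reference;
    # the template would have to parameterize this for it to work.
    # "integrationRuntimeName": {"value": "SelfHostedIR-UAT"},
}

poller = client.deployments.begin_create_or_update(
    resource_group_name="rg-uat",
    deployment_name="adf-uat-release",
    parameters=Deployment(
        properties=DeploymentProperties(
            mode="Incremental",
            template=template,
            parameters=parameters,
        )
    ),
)
poller.wait()
```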
Has anyone had success in getting CI working in this situation? My sense is that the doc was written with a globally shared IR in mind. Either that, or I need to better understand the aim of the Auto Integration setting in the linked services definition.
Many thanks. Mark.
Answer 1:
Update: I think there are a couple of bugs in the service, so I'm not expecting an answer. I'll post updates here if I see a resolution to the bug report I have posted here for the dev group.
In a nutshell, this only affects you if:
- you have a self-hosted integration runtime (IR),
- you are trying to deploy a template to a new data factory from an existing data factory (as you would in Dev -> UAT -> Prod), and
- you have a Data Lake (ADL) linked service defined that uses the self-hosted IR.
If you have a self-hosted IR in the template, the newly deployed copy will not be registered with any server (whether linked or unique to the new ADF), because the template only records the IR; it does not instantiate one.
While this can be fixed with post-deployment config or scripting, what it can't fix is the dependency in ADL. This is because the ADL linked service wants to encrypt the service principal with the IR... but the IR does not exist at the time of template deployment (i.e. it is not configured on a server and not active).
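As an aside, the post-deployment scripting mentioned above could look like this sketch: it fetches the auth key of the freshly deployed, still-unregistered IR so that the IR host VM can be re-credentialed against the new factory (names are placeholders, and this does nothing for the ADL encryption problem):

```python
# pip install azure-identity azure-mgmt-datafactory
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Retrieve the auth key of the deployed (but unregistered) self-hosted IR.
keys = client.integration_runtimes.list_auth_keys(
    resource_group_name="rg-uat",
    factory_name="adf-uat",
    integration_runtime_name="SelfHostedIR",
)
print(keys.auth_key1)  # feed this to the IR host's registration tool
```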
It is no better if you select Managed Service Identity as the auth method on the ADL linked service instead of a service principal: then the template fails to deploy because there are no credentials to encrypt, and it looks like the resource is expecting to encrypt something.
The workaround right now is to use the Azure-hosted IR for Data Lake connections. Unfortunately for us this causes a security problem, because shared IRs cannot be whitelisted in our ADL Gen 1.
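For completeness, that workaround amounts to defining the ADL linked service without a self-hosted IR reference. A minimal sketch, assuming service principal auth; omitting `connect_via` makes the linked service fall back to the default Azure-hosted IR, and every value below is a placeholder:

```python
# pip install azure-identity azure-mgmt-datafactory
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    LinkedServiceResource,
    AzureDataLakeStoreLinkedService,
    SecureString,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# No connect_via: the linked service uses the default Azure-hosted IR, so no
# self-hosted IR needs to exist (or encrypt anything) at deployment time.
adl = AzureDataLakeStoreLinkedService(
    data_lake_store_uri="https://<account>.azuredatalakestore.net/webhdfs/v1",
    service_principal_id="<sp-app-id>",
    service_principal_key=SecureString(value="<sp-secret>"),
    tenant="<tenant-id>",
)

client.linked_services.create_or_update(
    resource_group_name="rg-uat",
    factory_name="adf-uat",
    linked_service_name="AzureDataLakeStore1",
    linked_service=LinkedServiceResource(properties=adl),
)
```

The security caveat above still applies: this only helps if the Azure-hosted IR's addresses can be allowed through to the data lake.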
I'll keep you posted.
Source: https://stackoverflow.com/questions/53072270/release-pipeline-conflict-with-integration-runtime