SSIS Sequential Processing

Submitted by Deadly on 2019-12-10 20:39:59

Question


I have 5 independent data flows in the same Data Flow Task, each having a source and destination. How can I make them run sequentially? They seem to run in parallel. I could put them in different Data Flow Tasks, but how can I do it within a single Data Flow Task?


Answer 1:


Don't have independent data flows in the same task. I know the Import/Export wizard will do that, but just because a team at Microsoft does something doesn't make it a best practice. The Data Flow gets its power and performance through "free" parallelization. If you don't want that, then please, for the sake of those who will maintain your code, create 4 additional Data Flow Tasks and copy/paste one flow into each. There is absolutely no harm in doing this.

For the sake of actually answering the above question, you will have to introduce a dependency of some sort. In the pantheon of horrible ideas, the following is near the top.

I assume your data flow with multiple independent flows within it looks something like a Source (the type doesn't matter) feeding an OLE DB Destination. Modify your source query, or add a Derived Column after it, to create a column of type int (DT_I4). Call it something unique, e.g. HackedSortKey, and assign a value of 1 to it.
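As a sketch, in the Derived Column editor the new column would look like this (the column name HackedSortKey is just the example name used above; the cast forces the DT_I4 type):

```
Derived Column Name: HackedSortKey
Derived Column:      <add as new column>
Expression:          (DT_I4)1
```

Alternatively, if you modify the source query instead, a literal `1 AS HackedSortKey` in the SELECT list achieves the same thing.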

Remove the existing OLE DB Destination on all but one of the flows and replace each with an OLE DB Command instead. The value of using an OLE DB Command is that it allows rows to pass through; as the name implies, an OLE DB Destination is only a sink for data, and the only output column from it is an Error one. Write your INSERT query for each. That's the design-time pain of the Command object, but you'll also experience its run-time pain, as it performs singleton operations against the database: "Oh, I have a row to insert. One moment while I issue the command. Oh, I have another row to insert. One moment please." Every single row gets this treatment.
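A minimal sketch of such an INSERT query, assuming a hypothetical target table dbo.TargetA with two columns (the table and column names are illustrative, not from the question); the `?` placeholders are bound to input columns on the OLE DB Command's Column Mappings tab:

```sql
-- Hypothetical target table; map Param_0 and Param_1 to the
-- corresponding input columns in the Column Mappings tab.
INSERT INTO dbo.TargetA (Col1, Col2)
VALUES (?, ?);
```

Note that this per-row statement execution is exactly the performance penalty described above, compared to the batched inserts an OLE DB Destination would issue.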

Take your first Source-to-Command stream and attach a fully blocking component to it: a Sort. Order by the HackedSortKey column, remove rows with duplicate sort values, and allow no other columns through. The point of this is to force a wait: only once all of the data has passed through the OLE DB Command above will the Sort release rows downstream (because it won't know the sort order until it has seen every row). Because duplicates are removed and every row carries the same value, the Sort reduces the original rows to a single row.

Logjam in stream A, meet stream B. Stream B now looks like "Source B" -> "Sort B" -> "Merge Join AB" -> "OLE DB Command B" -> "Sort on HackedSortKey". The "Sort B" is needed because Merge Join requires sorted input. There will be a match, as the same value (1) is used in our fake join columns on both sides. However, you will need to make sure it's a LEFT OUTER JOIN and not an INNER join.

Lather, rinse, repeat this process for the remaining data flows. But really, you want to use separate Data Flow Tasks and let precedence constraints manage the execution order.




Answer 2:


String the Data Flow Tasks together sequentially using the Completion constraint instead of the Success constraint. That way, each will run independently of the others' success or failure, but they will run one at a time.

To set the value of the constraint, double-click the line going from one task to another and change the value from Success to Completion.



Source: https://stackoverflow.com/questions/17903692/ssis-sequential-processing
