Question
I need to read data from an Excel workbook, where data is stored in this manner:
Company Accounts
Company1 (#3000...#3999)
Company2 (#4000..4019)+(#4021..4024)
where the expected output, using an OLE DB Destination in SSIS, would be:
Company Accounts
Company1 3000
Company1 3001
Company1 3002
. .
. .
. .
Company1 3999
Company2 4000
Company2 4001
. .
. .
. .
Company2 4019
Company2 4021
. .
. .
Company2 4024
This has me perplexed; I don't know how to even begin to approach this problem.
Does someone have any insight into this?
Answer 1:
First, load your data into a temp table (there are several ways to do that). Then run this query:
with cte as (
    -- strip '(', ')' and '+', then append a trailing '#' so every number is followed by a delimiter
    select
        company, replace(replace(replace(accounts,'(',''),')',''),'+','')+'#' accounts
    from
        (values ('company 1','#3000#3999'),('company 2','(#4000#4019)+(#4021#4024)')) data(company, accounts)
)
, rcte as (
    -- anchor: cut the first '#nnnn' token off the string; acc keeps the remainder
    select
        company, stuff(accounts, ind1, ind2 - ind1, '') acc, substring(accounts, ind1 + 1, ind2 - ind1 - 1) accounts
    from
        cte
        cross apply (select charindex('#', accounts) ind1) ca
        cross apply (select charindex('#', accounts, ind1 + 1) ind2) cb
    union all
    -- recursion: repeat on the remainder until only the trailing '#' is left
    select
        company, stuff(acc, ind1, ind2 - ind1, ''), substring(acc, ind1 + 1, ind2 - ind1 - 1)
    from
        rcte
        cross apply (select charindex('#', acc) ind1) ca
        cross apply (select charindex('#', acc, ind1 + 1) ind2) cb
    where
        len(acc)>1
)
-- one row per number found in the string, i.e. the boundaries of each range
select company, accounts from rcte
order by company, accounts
option (maxrecursion 0)
Answer 2:
You can add a script component to achieve this:
- Add a Script Component
- Make the Output buffer asynchronous with the input.
- For each row, split the Accounts value and retrieve the minimum and maximum value of each range.
- Then use a For loop to loop over values between the minimum and maximum values retrieved.
- Create Output rows inside the Loop
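The steps above can be sketched outside SSIS to check the logic before porting it to a Script Component. This is a minimal illustration in Python, not the Script Component code itself; the function name expand_accounts and the regular expression are assumptions made for this example, based only on the two sample rows in the question.

```python
import re
from typing import Iterator, Tuple

# Matches one range such as "#3000...#3999" or "#4000..4019":
# a number, two or more dots, then a second number ('#' is optional).
# Assumption: every range in the data looks like one of the question's samples.
RANGE_RE = re.compile(r"#?(\d+)\.{2,}#?(\d+)")


def expand_accounts(company: str, accounts: str) -> Iterator[Tuple[str, int]]:
    """Yield one (company, account) pair per account number in every range."""
    for low, high in RANGE_RE.findall(accounts):
        # Loop from the minimum to the maximum; the upper bound is inclusive.
        for account in range(int(low), int(high) + 1):
            yield company, account  # corresponds to adding one output row


rows = list(expand_accounts("Company2", "(#4000..4019)+(#4021..4024)"))
```

Each yielded pair plays the role of one row added to the asynchronous output buffer; because each range is expanded separately, the gap at 4020 is preserved.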
Answer 3:
Assuming two cells per row in the worksheet, the general logic that occurs to me is to explode the second cell of each row twice. The first pass splits the string using + as the delimiter and returns one or more rows per company. The second pass repeats that logic using .. as the delimiter, but returns two columns per row (the start and end of the range). With that, you can loop or use a table of numbers to generate the desired set. How best to do that in SSIS is a question I can't answer, since that is not an area I have experience in. The numbers table approach is relatively simple and common.
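A rough sketch of that two-pass split may make the idea concrete. It is written in Python purely for illustration; the helper name split_ranges and the size of the stand-in numbers table are assumptions, and in practice the final step would be a join against a real numbers table in the database.

```python
from typing import List, Tuple


def split_ranges(company: str, accounts: str) -> List[Tuple[str, int, int]]:
    """Pass 1 splits on '+', pass 2 splits each piece on the dots."""
    result = []
    for piece in accounts.split("+"):  # pass 1: one piece per range
        # "(#4000..4019)" -> "4000..4019"
        cleaned = piece.strip("() ").replace("#", "")
        # pass 2: split on '.', dropping the empty strings between the dots
        bounds = [part for part in cleaned.split(".") if part]
        result.append((company, int(bounds[0]), int(bounds[-1])))
    return result


# Stand-in for a numbers (tally) table; the upper limit is an arbitrary assumption.
numbers = range(0, 10000)

ranges = split_ranges("Company2", "(#4000..4019)+(#4021..4024)")
# Equivalent of joining the numbers table on "number BETWEEN low AND high".
rows = [(company, n) for company, low, high in ranges
        for n in numbers if low <= n <= high]
```

Splitting on + first keeps each range separate, so numbers that fall between two ranges (4020 here) are never generated.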
Answer 4:
First, you need a string-split function; depending on your data, you may need a custom one. My example uses dbo.DelimitedSplit8K.
As I said, after analysing the actual data from Excel, a custom TVF may be a better fit.
Second, you need a numbers table, which you can build with your own logic. It is created and populated only once:
CREATE TABLE tblnumber (number INT PRIMARY KEY)

INSERT INTO tblnumber
SELECT ROW_NUMBER() OVER (ORDER BY a.number)
FROM master..spt_values a
CROSS JOIN master..spt_values b
This is just a concept based on your current dataset.
You need to pull all the Excel data into a staging table:
create table #staging(Company varchar(50), Accounts varchar(50))

insert into #staging values
 ('Company1', '(#3000...#3999)')
,('Company2', '(#4000..4019)+(#4021..4024)')
Then,
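;with CTE as
(
    -- one row per company and range: split on '+' first, then on '.',
    -- so that gaps between ranges (e.g. 4020) are not filled in
    select tbl.Company
        ,r.ItemNumber RangeNo
        ,min(cast(ca.Item as int)) MinAccount
        ,max(cast(ca.Item as int)) MaxAccount
    from
    (
        -- remove '#', '(' and ')' so only digits, dots and '+' remain
        select Company
            ,replace(replace(replace(Accounts,'#',''),')',''),'(','') Accounts
        from #staging
    ) tbl
    cross apply dbo.DelimitedSplit8K(tbl.Accounts,'+') r
    cross apply dbo.DelimitedSplit8K(r.Item,'.') ca
    where ca.Item<>''
    group by tbl.Company, r.ItemNumber
)
select c.Company, n.number as Account
from tblnumber n
inner join CTE c on n.number>=c.MinAccount and n.number<=c.MaxAccount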
The CTE is used here only as an example to illustrate the idea. How you clean up the Accounts column depends on your actual data.
Source: https://stackoverflow.com/questions/47952949/create-n-new-rows-from-raw-data-such-as-1000-1000n