Question
I'm trying to figure out a way to split the first 100,000 records of a table that has 1 million+ records into five chunks of 20,000 records each, with each chunk going into a file. Maybe some SQL could get the min and max rowid or primary id for each of the five chunks, so I can put the min and max values into variables, pass them into the SQL, and use BETWEEN in the WHERE clause.
Can this be done?
I'm on an Oracle 11g database.
Thanks in advance.
Answer 1:
If you just want to assign the values 1-5 to basically equal-sized groups, then use ntile():
select t.*, ntile(5) over (order by NULL) as num
from (select t.*
from t
where rownum <= 100000
) t;
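If you want to insert into 5 different tables, then use insert all. Note that without an explicit values list, each target table has to accept every column of the select, including num: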
insert all
when num = 1 then into t1
when num = 2 then into t2
when num = 3 then into t3
when num = 4 then into t4
when num = 5 then into t5
select t.*, ntile(5) over (order by NULL) as num
from (select t.*
from t
where rownum <= 100000
) t;
Answer 2:
Thanks so much to Gordon Linoff for giving me a starting point for the code.
Just an update on how to get the min and max values for the 5 chunks:
select num, min(cre_surr_id), max(cre_surr_id)
from
(select p.cre_surr_id, ntile(5) over (order by NULL) as num
from (select p.*
from productions p
where rownum <= 100000
) p )
group by num
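One caveat for the BETWEEN approach: with ntile(5) over (order by NULL) the rows land in chunks in no particular key order, so the min/max ranges of different chunks can overlap. A minimal sketch of one way around this, assuming cre_surr_id is unique and not null (for example the primary key), is to number the rows in key order and cut them into blocks of 20,000. The names rn, chunk_no, min_id, max_id and row_count are made up for illustration.

```sql
-- Assumption: cre_surr_id is unique and NOT NULL in productions.
-- Number the rows in key order, keep the first 100,000,
-- and report the key range of each block of 20,000 rows.
select chunk_no,
       min(cre_surr_id) as min_id,
       max(cre_surr_id) as max_id,
       count(*)         as row_count
from (select cre_surr_id,
             ceil(rn / 20000) as chunk_no   -- rn 1..20000 -> 1, 20001..40000 -> 2, ...
      from (select cre_surr_id,
                   row_number() over (order by cre_surr_id) as rn
            from productions)
      where rn <= 100000)
group by chunk_no
order by chunk_no;

-- Each chunk can then be fetched with bind variables holding one row of the result above:
select p.*
from productions p
where p.cre_surr_id between :min_id and :max_id;
```

Because the chunks follow the key order, the ranges do not overlap, and each BETWEEN query returns exactly the rows of its chunk as long as the table does not change between the two queries.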
Answer 3:
You can also try simple aggregation on rownum (this example groups a 100-row test table into chunks of 2 rows):
create table test_chunk(val) as
(
select floor(dbms_random.value(1, level * 10)) from dual
connect by level <= 100
);
select min(val), max(val), floor((num+1)/2)
from (select rownum as num, val from test_chunk)
group by floor((num+1)/2)
Answer 4:
A bit harsh, downvoting another fair question.
Anyway, NTILE is new to me, so I wouldn't have discovered it were it not for your question.
My way of doing this, the old-school way, would have been to MOD the rownum to get the group number, e.g.:
select t.*, mod(rn,5) as num
from (select t.*, rownum rn
from t
) t;
This solves the SQL part, or rather how to group rows into equal chunks, but that is only half your question. The other half is how to write these to 5 separate files.
You can either have 5 separate queries, each spooling to a separate file, e.g.:
spool f1.dat
select t.*
from (select t.*, rownum rn
from t
) t
where mod(t.rn,5) = 0;
spool off
spool f2.dat
select t.*
from (select t.*, rownum rn
from t
) t
where mod(t.rn,5) = 1;
spool off
etc.
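When spooling from SQL*Plus, the file normally also picks up column headings, page breaks and the "rows selected" feedback line. A rough sketch of the settings commonly used to suppress them is shown here for the first file only. The columns col1 and col2 and the comma separator are placeholders, not taken from the question, and the inner query is limited to the first 100,000 rows as the question asks.

```sql
-- SQL*Plus settings to keep the spooled file free of headings and feedback
set heading off
set feedback off
set pagesize 0
set trimspool on
set linesize 1000

spool f1.dat
-- col1 and col2 are hypothetical column names; list the real columns here
select t.col1 || ',' || t.col2
from (select t.*, rownum rn
      from t
      where rownum <= 100000
     ) t
where mod(t.rn, 5) = 0;
spool off
```

Note that MOD spreads neighbouring rows across the five files (every fifth row goes to the same file), so these chunks are interleaved rather than contiguous.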
Or you could use UTL_FILE. You could try something clever with a single query and an array of UTL_FILE file handles where the array index matches MOD(rn,5); then you wouldn't need logic like "IF num = 0 THEN UTL_FILE.PUT_LINE(f0, ...".
So, something like this (not tested, just a rough form for guidance; I have never tried this myself):
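DECLARE
  -- Collection types: one file name and one file handle per chunk
  TYPE fname_t IS VARRAY(5) OF VARCHAR2(100);
  TYPE fh_t IS VARRAY(5) OF UTL_FILE.FILE_TYPE;
  fname fname_t := fname_t();
  fh fh_t := fh_t();
  CURSOR c1 IS
    select t.*, mod(rn,5) as num
    from (select t.*, rownum rn
          from t
         ) t;
BEGIN
  -- Allocate the 5 elements before assigning to them
  fname.EXTEND(5);
  fh.EXTEND(5);
  FOR idx IN 1..5 LOOP
    fname(idx) := 'data_' || idx || '.dat';
    -- 'THE_DIR' must be an Oracle directory object the session can write to
    fh(idx) := UTL_FILE.FOPEN('THE_DIR', fname(idx), 'w');
  END LOOP;
  FOR r1 IN c1 LOOP
    -- num is 0..4 but the array index is 1..5, hence the +1;
    -- replace the placeholder with the column(s) to write
    UTL_FILE.PUT_LINE(fh(r1.num + 1), r1.{column value from C1});
  END LOOP;
  FOR idx IN 1..5 LOOP
    UTL_FILE.FCLOSE(fh(idx));
  END LOOP;
END;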
Source: https://stackoverflow.com/questions/36335406/sql-how-would-you-split-a-100-000-records-from-a-oracle-table-into-5-chunks