Question
I have the data shown below. I want the minimum start-message timestamp and the corresponding minimum success-message timestamp. If there is no start or success message, the result should show null.
Start Message Table:
ID1   Timestamp_start_msg_recieved  date       jobid  message  time in seconds
1234  5/14/2014 10:02:29            5/14/2014  abc    start    262
1234  5/14/2014 10:02:31            5/14/2014  abc    start    264
1234  5/14/2014 10:02:45            5/14/2014  abc    start    278
1234  5/14/2014 10:02:50            5/14/2014  abc    start    285
1234  5/14/2014 10:09:04            5/14/2014  abc    start    165
1234  5/14/2014 10:09:06            5/14/2014  abc    start    2167
1234  5/14/2014 10:09:16            5/14/2014  abc    start    2180
1234  5/14/2014 10:09:26            5/14/2014  abc    start    2190
1234  5/14/2014 11:45:11            5/14/2014  abc    start    8767
1234  5/14/2014 16:48:20            5/14/2014  abc    start    878
1234  5/14/2014 19:02:52            5/14/2014  abc    start    687
5678  5/14/2014 22:02:52            5/14/2014  pqr    start    501
5678  5/14/2014 23:10:40            5/14/2014  abcd   start    200
Success Message Table:
ID1   Timestamp_success_msg_recieved  date       jobid  message     time in seconds
1234  5/14/2014 10:02:52              5/14/2014  abc    successful  290
1234  5/14/2014 10:09:32              5/14/2014  abc    successful  4280
1234  5/14/2014 11:45:15              5/14/2014  abc    successful  8774
1234  5/14/2014 11:45:18              5/14/2014  abc    successful  8777
1234  5/14/2014 11:45:19              5/14/2014  abc    successful  8778
1234  5/14/2014 11:45:25              5/14/2014  abc    successful  8784
1234  5/14/2014 16:48:22              5/14/2014  abc    successful  880
1234  5/14/2014 19:03:00              5/14/2014  abc    successful  699
5678  5/14/2014 22:03:00              5/14/2014  pqr    successful  250
5678  5/19/2014 14:00:16              5/19/2014  pqr    successful  400
Expected Result:
ID1   TIMESTAMP_for_start_message  TIMESTAMP_for_success_message  Date       Jobid  msg    msg      start_secs  success_secs
1234  5/14/2014 10:02:29           5/14/2014 10:02:52             5/14/2014  abc    start  success  262         290
1234  5/14/2014 10:09:04           5/14/2014 10:09:32             5/14/2014  abc    start  success  165         4280
1234  5/14/2014 11:45:11           5/14/2014 11:45:25             5/14/2014  abc    start  success  8767        8784
1234  5/14/2014 16:48:20           5/14/2014 16:48:22             5/14/2014  abc    start  success  878         880
1234  5/14/2014 19:02:52           5/14/2014 19:03:00             5/14/2014  abc    start  success  687         699
5678  5/14/2014 22:02:52           5/14/2014 22:03:00             5/14/2014  pqr    start  success  501         699
5678  5/14/2014 23:10:40           null                           5/14/2014  abcd   start  success  250         null
5678  null                         5/19/2014 14:00:16             5/19/2014  pqr    null   success  null        400
I am trying to get the minimum start_timestamp paired with the very next minimum success_timestamp for the same id1 and jobid. If a given id1 and jobid has start messages but no success message, the success side should show NULL, and vice versa. I tried a temporary result set using a WITH clause and also a self join. My query is below, but the WITH clause query returns the MIN over all the data in the table.
NOTE: The TIME IN SECONDS column contains random values, not actual data.
Query Used:
WITH DATA AS
(SELECT MIN(smt.column13) timestamp_for_success_message
FROM success_table1 smt, start_table2 b
WHERE
(SMT.id1 = b.id1)
AND (SMT.jobid = b.jobid)
AND (SMT.timestamp_for_success_message_recieved >= b.timestamp_for_start_message_recieved)
)
SELECT distinct a.timestamp_for_success_message_recieved,
b.timestamp_for_start_message_recieved,
b.id1,
b.jobid
FROM data a,
start_table2 b
order by b.timestamp_start_message_recieved, a.timestamp_for_success_message_recieved, b.jobid, b.id1;
Answer 1:
select nvl(a.ID1,b.ID1) ID1 , start_timestamp , success_timestamp
from
(select ID1 , min(timestamp) start_timestamp
from Start_Message_Table
group by ID1) a
full outer join
(select ID1 , min(timestamp) success_timestamp
from Success_Message_Table
group by ID1) b
on a.ID1 = b.ID1;
I hope I have understood the problem correctly. Try the query above, and add any extra columns you need to the inner queries.
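To make the shape of that result concrete, here is a minimal Python sketch of the same idea: take the earliest timestamp per ID1 from each table, then combine the two with full-outer-join semantics. The function names are illustrative, the rows are a small subset of the sample data, and ID1 9999 is a hypothetical row added only to show the NULL side. Note that this produces one row per ID1, not one row per start/success run.

```python
from datetime import datetime

def earliest_per_id(rows):
    """Return {id1: earliest timestamp}, like MIN(timestamp) ... GROUP BY ID1."""
    result = {}
    for id1, ts in rows:
        if id1 not in result or ts < result[id1]:
            result[id1] = ts
    return result

def full_outer_join(start_min, success_min):
    """Keep every ID1 seen on either side; a missing side becomes None (NULL)."""
    ids = sorted(set(start_min) | set(success_min))
    return [(i, start_min.get(i), success_min.get(i)) for i in ids]

T = datetime.fromisoformat

# Subset of the sample data: (ID1, timestamp)
starts = [
    (1234, T("2014-05-14 10:02:29")),
    (1234, T("2014-05-14 10:09:04")),
    (5678, T("2014-05-14 22:02:52")),
]
successes = [
    (1234, T("2014-05-14 10:02:52")),
    (1234, T("2014-05-14 10:09:32")),
    (5678, T("2014-05-14 22:03:00")),
    (9999, T("2014-05-19 14:00:16")),  # hypothetical ID1 with no start message
]

joined = full_outer_join(earliest_per_id(starts), earliest_per_id(successes))
for row in joined:
    print(row)
```

With this input the sketch yields three rows: one each for 1234 and 5678 with both timestamps, and one for 9999 with no start timestamp.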
Answer 2:
This solution is not a single query; it requires creating a table for the results and running a procedure. I think this problem could be solved with a recursive query, but I didn't manage to write one.
The procedure is not optimised and is probably slow on big data sets, but it works.
One more thing: I'm not sure what "TIME IN SECONDS has random values and not actual data" means. Your results suggest that you ignore seconds in the calculations but still want them in the output, so I wrote the code that way (which probably slows things down because of all the trunc calls). It would also be easier if you added a primary key to the tables.
Code to create table, run procedure and get results:
create table table_pairs as
select ts.ID1, ts.TIME_START, te.TIME_END, ts.tdate,
ts.JOBID, ts.MSG msg_start, te.MSG msg_end,
cast(null as timestamp) time_start_max
from table_start ts, table_end te
where 1=0;
begin p_pairs; end;
select id1, to_char(time_start, 'yyyy-MM-dd HH24:mi:ss') time_start,
to_char(time_end, 'yyyy-MM-dd HH24:mi:ss') time_end,
tdate, jobid, msg_start, msg_end
--, to_char(time_start_max, 'yyyy-MM-dd HH24:mi:ss') time_start_max
from table_pairs
order by id1, time_start, time_end, jobid;
Results:
id1   time_start           time_end             tdate       jobid  msg_start   msg_end
----  -------------------  -------------------  ----------  -----  ----------  ---------------
1234  2014-05-14 10:02:29  2014-05-14 10:02:52  2014-05-14  abc    start 262   successful 290
1234  2014-05-14 10:09:04  2014-05-14 10:09:32  2014-05-14  abc    start 165   successful 4280
1234  2014-05-14 11:45:11  2014-05-14 11:45:25  2014-05-14  abc    start 8767  successful 8784
1234  2014-05-14 16:48:20  2014-05-14 16:48:22  2014-05-14  abc    start 878   successful 880
1234  2014-05-14 19:02:52  2014-05-14 19:03:00  2014-05-14  abc    start 687   successful 699
5678  2014-05-14 22:02:52  2014-05-14 22:03:00  2014-05-14  pqr    start 501   successful 250
5678  2014-05-14 23:10:40                       2014-05-14  abcd   start 200
5678                       2014-05-19 14:00:16  2014-05-14  pqr                successful 400
Procedure:
create or replace procedure p_pairs is
r_start table_start%rowtype;
r_pair table_pairs%rowtype;
v_start_min table_pairs.time_start%type;
v_start_max table_pairs.time_start_max%type;
cursor c_success is
select * from table_pairs order by id1, jobid, time_end for update;
begin
begin -- delete everything from table_pairs and insert all ended processes
delete from table_pairs;
--simple version with proper seconds handling
--insert into table_pairs (id1, jobid, tdate, time_end, msg_end)
-- (select id1, jobid, tdate, time_end, msg from table_end);
-- complicated version for seconds ignored
insert into table_pairs (id1, jobid, tdate, time_end, msg_end)
(select id1, jobid, tdate, max(time_end), max(msg)
from (
select id1, jobid, tdate, time_end,
last_value(msg) over (partition by id1, jobid, tdate, trunc(time_end, 'mi')
order by null rows between unbounded preceding and unbounded following) msg
from table_end)
group by id1, jobid, tdate, trunc(time_end, 'mi')
);
end;
for r_pair in c_success
loop
begin -- find matching starting process
select min(time_start), max(time_start) into v_start_min, v_start_max
from (
select * from table_start ts1
where ts1.id1 = r_pair.id1 and ts1.jobid = r_pair.jobid
and trunc(ts1.time_start, 'mi') <= trunc(r_pair.time_end, 'mi')
minus -- eliminate already "used" processes
select * from table_start ts2
where ts2.jobid = r_pair.jobid
and trunc(ts2.time_start, 'mi') <= (
select trunc(max(time_start_max), 'mi') from table_pairs
where table_pairs.jobid = r_pair.jobid and table_pairs.id1=r_pair.id1
)
);
select * into r_start
from (
select * from table_start ts
where ts.jobid = r_pair.jobid and ts.id1 = r_pair.id1
and trunc(time_start,'mi') <= trunc(r_pair.time_end, 'mi')
and trunc(ts.time_start, 'mi') = trunc(v_start_min, 'mi')
order by time_start
)
where rownum = 1;
update table_pairs set
tdate = r_start.tdate,
time_start = v_start_min,
time_start_max = v_start_max,
msg_start = r_start.msg
where current of c_success;
exception when no_data_found then
null; -- no matching starting process
end;
end loop;
begin -- add started and not finished processes
insert into table_pairs (id1, jobid, time_start, tdate, msg_start)
select id1, jobid, time_start, tdate, msg
from (
select * from table_start
minus
select ts.*
from table_start ts
join table_pairs tp
on ts.jobid = tp.jobid and ts.id1=tp.id1
and trunc(ts.time_start, 'mi')
between trunc(tp.time_start, 'mi') and trunc(tp.time_start_max, 'mi')
);
end;
end p_pairs;
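The flow of the procedure may be easier to follow in a compact form. Below is a minimal Python sketch of the same three steps, not a replacement for the PL/SQL: collapse success messages that fall in the same minute, walk them in time order while claiming the start messages not yet used, then append the starts that nothing claimed. The names minute and pair_messages are illustrative, the sketch ignores the tdate and msg columns for brevity, and the rows are a subset of the sample data.

```python
from datetime import datetime

def minute(ts):
    """Mimic Oracle's trunc(ts, 'mi'): drop seconds and microseconds."""
    return ts.replace(second=0, microsecond=0)

def pair_messages(starts, ends):
    """starts/ends: lists of (id1, jobid, timestamp).
    Returns (id1, jobid, time_start, time_end) rows; None marks a missing side."""
    # Step 1: collapse successes in the same minute, keeping the latest one.
    collapsed = {}
    for id1, jobid, ts in ends:
        key = (id1, jobid, minute(ts))
        if key not in collapsed or ts > collapsed[key]:
            collapsed[key] = ts

    pairs = []
    last_used = {}    # (id1, jobid) -> minute of the latest start already claimed
    consumed = set()  # start rows claimed by some success

    # Step 2: visit successes in time order and claim the unused starts
    # whose minute is not later than the success minute.
    for (id1, jobid, _), time_end in sorted(collapsed.items()):
        used = last_used.get((id1, jobid))
        candidates = [
            ts for i, j, ts in starts
            if (i, j) == (id1, jobid)
            and minute(ts) <= minute(time_end)
            and (used is None or minute(ts) > used)
        ]
        if candidates:
            pairs.append((id1, jobid, min(candidates), time_end))
            last_used[(id1, jobid)] = minute(max(candidates))
            consumed.update((id1, jobid, ts) for ts in candidates)
        else:
            pairs.append((id1, jobid, None, time_end))  # success without a start

    # Step 3: add started processes that no success claimed.
    for id1, jobid, ts in starts:
        if (id1, jobid, ts) not in consumed:
            pairs.append((id1, jobid, ts, None))
    return pairs

T = datetime.fromisoformat
starts = [
    (1234, "abc", T("2014-05-14 10:02:29")),
    (1234, "abc", T("2014-05-14 10:02:31")),
    (1234, "abc", T("2014-05-14 11:45:11")),
    (5678, "pqr", T("2014-05-14 22:02:52")),
    (5678, "abcd", T("2014-05-14 23:10:40")),
]
ends = [
    (1234, "abc", T("2014-05-14 10:02:52")),
    (1234, "abc", T("2014-05-14 11:45:15")),
    (1234, "abc", T("2014-05-14 11:45:25")),
    (5678, "pqr", T("2014-05-14 22:03:00")),
    (5678, "pqr", T("2014-05-19 14:00:16")),
]
pairs = pair_messages(starts, ends)
for row in pairs:
    print(row)
```

On this subset the sketch pairs 10:02:29 with 10:02:52 and 11:45:11 with 11:45:25, leaves the 5/19 success without a start, and leaves the abcd start without a success, which is the same pattern as the results shown above.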
Input data preparation:
create table TABLE_START
(
ID1 NUMBER,
TIME_START TIMESTAMP(6),
TDATE DATE,
JOBID VARCHAR2(10),
MSG VARCHAR2(20)
);
insert into table_start
select 1234, to_date('05/14/2014 10:02:29', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'start 262' from dual
union all select 1234, to_date('05/14/2014 10:02:31', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'start 264' from dual
union all select 1234, to_date('05/14/2014 10:02:45', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'start 278' from dual
union all select 1234, to_date('05/14/2014 10:02:50', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'start 285' from dual
union all select 1234, to_date('05/14/2014 10:09:04', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'start 165' from dual
union all select 1234, to_date('05/14/2014 10:09:06', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'start 2167' from dual
union all select 1234, to_date('05/14/2014 10:09:16', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'start 2180' from dual
union all select 1234, to_date('05/14/2014 10:09:26', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'start 2190' from dual
union all select 1234, to_date('05/14/2014 11:45:11', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'start 8767' from dual
union all select 1234, to_date('05/14/2014 16:48:20', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'start 878' from dual
union all select 1234, to_date('05/14/2014 19:02:52', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'start 687' from dual
union all select 5678, to_date('05/14/2014 22:02:52', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'pqr', 'start 501' from dual
union all select 5678, to_date('05/14/2014 23:10:40', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abcd', 'start 200' from dual;
create table TABLE_END
(
ID1 NUMBER,
TIME_END TIMESTAMP(6),
TDATE DATE,
JOBID VARCHAR2(10),
MSG VARCHAR2(20)
);
insert into table_end
select 1234, to_date('05/14/2014 10:02:52', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'successful 290' from dual
union all select 1234, to_date('05/14/2014 10:09:32', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'successful 4280' from dual
union all select 1234, to_date('05/14/2014 11:45:15', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'successful 8774' from dual
union all select 1234, to_date('05/14/2014 11:45:18', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'successful 8777' from dual
union all select 1234, to_date('05/14/2014 11:45:19', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'successful 8778' from dual
union all select 1234, to_date('05/14/2014 11:45:25', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'successful 8784' from dual
union all select 1234, to_date('05/14/2014 16:48:22', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'successful 880' from dual
union all select 1234, to_date('05/14/2014 19:03:00', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'abc', 'successful 699' from dual
union all select 5678, to_date('05/14/2014 22:03:00', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'pqr', 'successful 250' from dual
union all select 5678, to_date('05/19/2014 14:00:16', 'MM/DD/YYYY HH24:mi:ss'), to_date('05/14/2014', 'MM/DD/YYYY'), 'pqr', 'successful 400' from dual;
Answer 3:
My understanding of the issue is that each row in the start table represents a job of some kind starting, and each row in the success table represents that job finishing. To find out when a job finished, you need to find the row in the success table that matches the id1 and jobid columns and has the lowest timestamp greater than or equal to the start row's timestamp, unless an earlier row in the start table already matches that success row.
For example, the first row in the start table matches the first row in the success table, but the second row in the start table has no match in the success table.
To resolve this I've used nested sub-queries to construct each piece of data needed.
SELECT start.id1, start.jobid, start.TIMESTAMP_START_MSG_RECIEVED AS start, table2.end
FROM start
LEFT OUTER JOIN (
SELECT table1.id1, table1.jobid, MIN(table1.start) AS start, table1.end
FROM (
SELECT s.id1, s.jobid, s.TIMESTAMP_START_MSG_RECIEVED AS start, MIN(t.TIMESTAMP_SUCCESS_MSG_RECIEVED) AS end
FROM start AS s
LEFT OUTER JOIN success AS t ON t.id1 = s.id1 AND t.jobid = s.jobid AND t.TIMESTAMP_SUCCESS_MSG_RECIEVED >= s.TIMESTAMP_START_MSG_RECIEVED
GROUP BY s.id1, s.TIMESTAMP_START_MSG_RECIEVED, s.jobid, s.time
ORDER BY start) AS table1
GROUP BY table1.id1, table1.jobid, table1.end
ORDER BY table1.end) AS table2 ON table2.id1 = start.id1 AND table2.jobid = start.jobid AND table2.start = start.TIMESTAMP_START_MSG_RECIEVED
ORDER BY start
The innermost select gets each start row and the lowest end time from the success table.
The next select then keeps, for each end time, the row with the lowest start time from table1. The outer select then joins the start table with table2 so that jobs that have not finished are included as well.
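The three levels described above can be sketched in Python as follows. This is a minimal illustration of the matching rule under the assumptions stated in this answer, not a translation tested against a particular database; the function name match_jobs is illustrative and the rows are a subset of the sample data. As with the query, every start row appears in the output, and success rows with no start are not listed.

```python
from datetime import datetime

def match_jobs(starts, successes):
    """starts/successes: lists of (id1, jobid, timestamp).
    Returns (id1, jobid, start, end) for every start row; end is None when
    the start is not the earliest one leading up to a success."""
    # Innermost select: for each start, the earliest success at or after it.
    first_end = {}
    for id1, jobid, start in starts:
        later = [t for i, j, t in successes
                 if (i, j) == (id1, jobid) and t >= start]
        first_end[(id1, jobid, start)] = min(later) if later else None

    # Middle select: for each success time, keep only the earliest start.
    earliest_start = {}
    for (id1, jobid, start), end in first_end.items():
        if end is None:
            continue
        key = (id1, jobid, end)
        if key not in earliest_start or start < earliest_start[key]:
            earliest_start[key] = start

    # Outer select: left join every start row against the kept pairs.
    rows = []
    for id1, jobid, start in sorted(starts, key=lambda r: r[2]):
        end = first_end[(id1, jobid, start)]
        kept = end is not None and earliest_start[(id1, jobid, end)] == start
        rows.append((id1, jobid, start, end if kept else None))
    return rows

T = datetime.fromisoformat
starts = [
    (1234, "abc", T("2014-05-14 10:02:29")),
    (1234, "abc", T("2014-05-14 10:02:31")),
    (5678, "abcd", T("2014-05-14 23:10:40")),
]
successes = [
    (1234, "abc", T("2014-05-14 10:02:52")),
]
matched = match_jobs(starts, successes)
for row in matched:
    print(row)
```

Here the 10:02:29 start is paired with the 10:02:52 success, while the 10:02:31 start and the abcd start come back with no end time.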
Source: https://stackoverflow.com/questions/28444842/sql-min-values-from-two-columns-across-two-tables-against-id