问题
- My real life use-case was to merge nested ranges. I've drew some sketches and then I saw the resemblance between ranges starting and ending to stack PUSH and POP operations. I understood that solving this problem will also solve the original problem.
- The op column can actually be removed from the question. When val is NULL then it is a POP operation otherwise it is a PUSH operation.
The Puzzle
A table, stack_trace ,contains the following columns:
- i - Integer value that represents a point in time.
- op - 2 possible operations: I ("in"/"push") and O ("out"/"pop").
val - The value inserted by the "in"/"push" operation or NULL for "out"/"pop" operation.
The goal is to find what was the value at the top of the stack, at each point in time (i).
e.g.
(NULL values are represented here as empty spaces)
data:
i op val
-- -- --
1 I A
2 I B
3 O
4 I C
5 O
6 O
required result:
i top_of_stack_val
-- ----------------
1 A
2 B
3 A
4 C
5 A
6
Requirements
- The solution should be a single SQL query (sub-queries are fine).
- Only the following clauses are allowed: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY.
- The use of WITH clause (CTE - Common Table Expression) is not allowed.
- The use of T-SQL, PL/SQL etc. is not allowed.
- The use of UDF (User Defined Functions) is not allowed.
- The use of variables is not allowed.
Sample data
create table stack_trace
(
i int
,op char(1)
,val char(1)
)
;
insert into stack_trace (i,op,val) values (1,'I','A');
insert into stack_trace (i,op,val) values (2,'I','B');
insert into stack_trace (i,op,val) values (3,'I','C');
insert into stack_trace (i,op,val) values (4,'I','D');
insert into stack_trace (i,op,val) values (5,'I','E');
insert into stack_trace (i,op) values (6,'O');
insert into stack_trace (i,op) values (7,'O');
insert into stack_trace (i,op) values (8,'O');
insert into stack_trace (i,op,val) values (9,'I','F');
insert into stack_trace (i,op) values (10,'O');
insert into stack_trace (i,op,val) values (11,'I','G');
insert into stack_trace (i,op,val) values (12,'I','H');
insert into stack_trace (i,op) values (13,'O');
insert into stack_trace (i,op) values (14,'O');
insert into stack_trace (i,op,val) values (15,'I','I');
insert into stack_trace (i,op,val) values (16,'I','J');
insert into stack_trace (i,op,val) values (17,'I','K');
insert into stack_trace (i,op,val) values (18,'I','L');
insert into stack_trace (i,op,val) values (19,'I','M');
insert into stack_trace (i,op) values (20,'O');
insert into stack_trace (i,op,val) values (21,'I','N');
insert into stack_trace (i,op) values (22,'O');
insert into stack_trace (i,op,val) values (23,'I','O');
insert into stack_trace (i,op) values (24,'O');
insert into stack_trace (i,op,val) values (25,'I','P');
insert into stack_trace (i,op) values (26,'O');
insert into stack_trace (i,op) values (27,'O');
insert into stack_trace (i,op,val) values (28,'I','Q');
insert into stack_trace (i,op,val) values (29,'I','R');
insert into stack_trace (i,op) values (30,'O');
insert into stack_trace (i,op) values (31,'O');
insert into stack_trace (i,op) values (32,'O');
insert into stack_trace (i,op) values (33,'O');
insert into stack_trace (i,op) values (34,'O');
insert into stack_trace (i,op) values (35,'O');
insert into stack_trace (i,op,val) values (36,'I','S');
insert into stack_trace (i,op) values (37,'O');
insert into stack_trace (i,op) values (38,'O');
insert into stack_trace (i,op,val) values (39,'I','T');
insert into stack_trace (i,op,val) values (40,'I','U');
insert into stack_trace (i,op) values (41,'O');
insert into stack_trace (i,op,val) values (42,'I','V');
insert into stack_trace (i,op,val) values (43,'I','W');
insert into stack_trace (i,op,val) values (44,'I','X');
insert into stack_trace (i,op) values (45,'O');
insert into stack_trace (i,op) values (46,'O');
insert into stack_trace (i,op,val) values (47,'I','Y');
insert into stack_trace (i,op) values (48,'O');
insert into stack_trace (i,op) values (49,'O');
insert into stack_trace (i,op,val) values (50,'I','Z');
insert into stack_trace (i,op) values (51,'O');
insert into stack_trace (i,op) values (52,'O');
Required results
i top_of_stack_val
-- ----------------
1 A
2 B
3 C
4 D
5 E
6 D
7 C
8 B
9 F
10 B
11 G
12 H
13 G
14 B
15 I
16 J
17 K
18 L
19 M
20 L
21 N
22 L
23 O
24 L
25 P
26 L
27 K
28 Q
29 R
30 Q
31 K
32 J
33 I
34 B
35 A
36 S
37 A
38
39 T
40 U
41 T
42 V
43 W
44 X
45 W
46 V
47 Y
48 V
49 T
50 Z
51 T
52
回答1:
Personally I doubt that you will end up finding SQL that you can just use in all of SQL Server, Teradata, Postgres, and Oracle and that has acceptable performance if the table is at all large.
A SQL Server solution (demo) would be as follows
SELECT i,
SUBSTRING(MAX(FORMAT(i, 'D10') + val) OVER (PARTITION BY Pos ORDER BY i
ROWS UNBOUNDED PRECEDING), 11, 8000)
FROM (SELECT st.*,
sum(CASE WHEN op = 'I' THEN 1 ELSE -1 END)
OVER (ORDER BY i ROWS UNBOUNDED PRECEDING) AS pos
FROM stack_trace st) t1
ORDER BY i;
回答2:
This is a nice puzzle.
As my main DBMS is Teradata I wrote a solution for it using Analytical functions (needs TD14.10+):
SELECT dt.*,
-- find the last item in the stack with the same position
Last_Value(val IGNORE NULLS)
Over (PARTITION BY pos
ORDER BY i) AS top_of_stack_val
FROM
(
SELECT st.*,
-- calculate the number of items in the stack
Sum(CASE WHEN op = 'I' THEN 1 ELSE -1 end)
Over (ORDER BY i
ROWS Unbounded Preceding) AS pos
FROM stack_trace AS st
) AS dt;
This solution works for Oracle, too, but PostgreSQL & SQL Server don't support the IGNORE NULLS
option for LAST_VALUE
and emulating it is quite complicated, e.g see Itzk Ben-Gan's The Last non NULL Puzzle
Edit: In fact it's not that complex, I forgot Itzik's 2nd solution, the old piggyback trick ;-)
Martin Smith's approach will work for all four DBMSes.
回答3:
Although it does require an additional step -
Generic solution for Hive, Teradata, Oracle, SQL Server and PostgreSQL
select s.i
,min (s.val) over
(
partition by s.depth
,s.depth_val_seq
) as top_of_stack_val
from (select s.i
,s.val
,s.depth
,count (s.val) over
(
partition by s.depth
order by s.i
rows between unbounded preceding and current row
) as depth_val_seq
from (select s.i
,s.val
,sum (case s.op when 'I' then 1 else -1 end) over
(
order by s.i
rows between unbounded preceding and current row
) as depth
from stack_trace s
)
s
)
s
order by i
;
回答4:
This is actually an interesting problem. What I would do is keep track of each elements position in the stack. You can do this using a cumulative sum:
select st.*,
sum(case when op = 'I' then 1 else -1 end) over (order by i) as pos
from stack_trace st;
Alas, at this point, I think you need a rather complicated join or subquery to figure out the most recent value that pos
refers to. Here is one method:
with st as (
select st.*,
sum(case when op = 'I' then 1 else -1 end) over (order by i) as pos
from stack_trace st
)
select st.*,
(select val
from st st2
where st2.i <= st.id and st2.pos = st.pos and
st2.val is not null
order by i desc
fetch first 1 row only
) as top_of_stack_val
from st;
回答5:
Teradata
select s.i
,first_value (s.val) over
(
partition by s.depth
order by s.i
reset when s.op = 'I'
) as top_of_stack_val
from (select s.i
,s.val
,s.op
,sum (case s.op when 'I' then 1 else -1 end) over
(
order by s.i
rows unbounded preceding
) as depth
from stack_trace s
)
s
order by i
;
来源:https://stackoverflow.com/questions/39936479/sql-challenge-puzzle-given-a-stack-trace-how-to-find-the-top-element-at-each