问题
For a table e.g. containing a date, price timeseries with prices every e.g. millisecond, how can this be downsampled into groups of open high low close (ohlc) rows with time interval e.g. minute?
回答1:
While option with arrays will work, the simplest option here is to use use combination of group by timeintervals with min
, max
, argMin
, argMax
aggregate functions.
SELECT
id,
minute,
max(value) AS high,
min(value) AS low,
avg(value) AS avg,
argMin(value, timestamp) AS first,
argMax(value, timestamp) AS last
FROM security
GROUP BY id, toStartOfMinute(timestamp) AS minute
ORDER BY minute
回答2:
In ClickHouse you solve this kind of problem with arrays. Let's assume a table like the following:
CREATE TABLE security (
timestamp DateTime,
id UInt32,
value Float32
)
ENGINE=MergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (id, timestamp)
You can downsample to one-minute intervals with a query like the following:
SELECT
id, minute, max(value) AS high, min(value) AS low, avg(value) AS avg,
arrayElement(arraySort((x,y)->y,
groupArray(value), groupArray(timestamp)), 1) AS first,
arrayElement(arraySort((x,y)->y,
groupArray(value), groupArray(timestamp)), -1) AS last
FROM security
GROUP BY id, toStartOfMinute(timestamp) AS minute
ORDER BY minute
The trick is to use array functions. Here's how to decode the calls:
- groupArray gathers column data within the group into an array.
- arraySort sorts the values using the timestamp order. We use a lambda function to provide the timestamp array as the sorting key for the first array of values.
- arrayElement allows us to pick the first and last elements respectively.
To keep the example simple I used DateTime for the timestamp which only samples at 1 second intervals. You can use a UInt64 column to get any precision you want. I added an average to my query to help check results.
来源:https://stackoverflow.com/questions/57698667/clickhouse-downsample-into-ohlc-time-bar-intervals