Question
I'm using InfluxDB in my project and I'm running into a problem with queries when multiple points are written at once.
I'm using influxdb-python to write 1000 unique points to InfluxDB.
influxdb-python provides a function called influxclient.write_points().
I have two options:
- Write one point per call (1000 calls), or
- Collect all 1000 points and write them in a single call.
The first option looks like this (pseudocode only), and it works:
thousand_points = [0...999]
while i < 1000:
    ...
    ...
    point = [{thousand_points[i]}] # A point must be converted to dictionary object first
    influxclient.write_points(point, time_precision="ms")
    i += 1
After writing all the points, when I write a query like this:
SELECT * FROM "mydb"
I get all the 1000 points.
To avoid the overhead of a separate write in every iteration, I wanted to try writing multiple points at once, which the write_points function supports.
write_points(points, time_precision=None, database=None, retention_policy=None, tags=None, batch_size=None)
Write to multiple time series names.
Parameters: points (list of dictionaries, each dictionary represents a point) – the list of points to be written in the database
So, what I did was:
thousand_points = [0...999]
points = []
while i < 1000:
    ...
    ...
    points.append({thousand_points[i]}) # A point must be converted to dictionary object first
    i += 1
influxclient.write_points(points, time_precision="ms")
With this change, when I query:
SELECT * FROM "mydb"
I only get 1 point as the result. I don't understand why.
Any help will be much appreciated.
Answer 1:
You might have a good case for a SeriesHelper.
In essence, you set up a SeriesHelper class in advance, and every time you discover a data point to add, you make a call. The SeriesHelper will batch up the writes for you, up to bulk_size points per write.
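To illustrate the batching idea only, here is a minimal sketch that does not use the library. BatchBuffer and its flush callback are hypothetical names made up for this example; it is not SeriesHelper itself, just a stand-in showing what "up to bulk_size points per write" means.

```python
class BatchBuffer:
    """Hypothetical helper: buffers points and flushes them in groups."""

    def __init__(self, flush, bulk_size):
        self._flush = flush          # callable that receives a list of points
        self.bulk_size = bulk_size   # maximum number of points per write
        self._pending = []

    def add(self, point):
        self._pending.append(point)
        # Write automatically once the buffer holds bulk_size points.
        if len(self._pending) >= self.bulk_size:
            self.commit()

    def commit(self):
        # Flush whatever is still buffered (does nothing when empty).
        if self._pending:
            self._flush(self._pending)
            self._pending = []


written = []  # stands in for the database: one entry per write call
buf = BatchBuffer(flush=written.append, bulk_size=250)
for i in range(1000):
    buf.add({"measurement": "mydb", "tags": {"seq": str(i)}, "fields": {"value": i}})
buf.commit()
print(len(written))  # 4 writes of 250 points each
```

In real use the flush callback would presumably be something like influxclient.write_points. Note that the write_points signature quoted in the question also has a batch_size parameter.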
Answer 2:
I know this was asked well over a year ago. However, in order to publish multiple data points in bulk to InfluxDB, it seems each data point needs to have a unique timestamp; otherwise points that share the same measurement, tags and timestamp will just be continuously overwritten.
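The overwrite behaviour can be pictured with a toy model. This is only a sketch: store_points is a hypothetical function, not InfluxDB's storage code, and it assumes that points sent in one batch without a 'time' key all receive the same server-assigned timestamp.

```python
def store_points(points, server_time="2016-12-16T00:00:00Z"):
    """Toy model: keep one stored point per (measurement, tag set, timestamp)."""
    stored = {}
    for p in points:
        # Assumption: points without 'time' all get the same server timestamp.
        ts = p.get("time", server_time)
        tags = tuple(sorted(p.get("tags", {}).items()))
        key = (p["measurement"], tags, ts)
        # A later point with the same key overwrites the earlier field values.
        stored.setdefault(key, {}).update(p["fields"])
    return stored


# 1000 points, no timestamp, no tags: they all share one key.
batch = [{"measurement": "mydb", "fields": {"value": i}} for i in range(1000)]
collapsed = store_points(batch)
print(len(collapsed))  # 1

# Same points, but each with a distinct tag value: every key is different.
tagged = [
    {"measurement": "mydb", "tags": {"seq": str(i)}, "fields": {"value": i}}
    for i in range(1000)
]
print(len(store_points(tagged)))  # 1000
```

Under this model the single-point-per-call loop from the question works because each call gets its own timestamp, while the batched call collapses to one point.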
I'd import datetime and add the following to each data point within the for loop:
'time': datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%SZ")
So each datapoint should look something like...
{'fields': data, 'measurement': measurement, 'time': datetime....}
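One caveat with the format string above: it has one-second resolution, so points created in a tight loop within the same second would still share a timestamp. A minimal sketch that avoids this is below. build_points is a hypothetical helper, and spacing the points one millisecond apart is an assumption made purely for illustration; real data should carry the time at which each value was actually measured.

```python
import datetime


def build_points(values, measurement, start=None):
    """Build point dictionaries whose timestamps are all distinct."""
    if start is None:
        start = datetime.datetime.now(datetime.timezone.utc)
    points = []
    for i, value in enumerate(values):
        # Assumption: synthetic 1 ms spacing so that no two timestamps collide.
        ts = start + datetime.timedelta(milliseconds=i)
        points.append({
            "measurement": measurement,
            "time": ts.strftime("%Y-%m-%dT%H:%M:%S.%fZ"),  # includes fractional seconds
            "fields": {"value": value},
        })
    return points


points = build_points(range(1000), "mydb")
# With a connected client, the whole list could then be written in one call:
# influxclient.write_points(points, time_precision="ms")
```

The write call is left as a comment because it needs a running InfluxDB server.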
Hope this is helpful for anybody else who runs into this!
Edit: Reading the docs shows that tags are another unique identifier, so you could instead include a tag such as 'tags': {'id': i} in each data point (assuming each iteration value is unique) if you do not wish to specify a time. (However, I haven't tried this.)
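A sketch of the tag-based variant follows. build_tagged_points and the tag name seq are hypothetical; like the answer itself, this has not been verified against a live server.

```python
def build_tagged_points(values, measurement):
    """Build point dictionaries that differ by tag instead of by timestamp."""
    return [
        {
            "measurement": measurement,
            "tags": {"seq": str(i)},   # hypothetical tag; the value is stored as a string
            "fields": {"value": value},
        }
        for i, value in enumerate(values)
    ]


tagged_points = build_tagged_points(range(1000), "mydb")
# With a connected client:
# influxclient.write_points(tagged_points, time_precision="ms")
```

Bear in mind that every distinct tag value creates a separate series, so a unique tag per point can lead to high series cardinality. A unique timestamp is usually the lighter option when one is available.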
Source: https://stackoverflow.com/questions/41178517/influxdb-write-multiple-points-vs-single-point-multiple-times