Question
I am new to d3.js and dc.js and have spent the best part of a week reading through the tutorials and API. The learning curve is relatively steep, but I am (slowly) becoming familiar with the individual manipulations. That said, I still lack the practical experience to construct what I need.
I have a JSON file with the following data structure (the record set is relatively large, about 2 million objects):
[
  {
    "index": "device_1",
    "state": -1,
    "frequencies": [
      "800PS"
    ],
    "events": [
      {
        "start": "04/07/2014 04:24:19",
        "end": "07/21/2014 08:53:19",
        "name": "event_1234"
      }
    ]
  },
  {
    "index": "device_2",
    "state": 1,
    "frequencies": [
      "2100AWS",
      "1900PCS"
    ],
    "events": [
      {
        "start": "02/20/2014 04:03:20",
        "end": "04/30/2014 07:24:35",
        "name": "event_3456"
      },
      {
        "start": "04/30/2014 07:25:37",
        "end": "07/01/2014 06:35:44",
        "name": "event_766"
      },
      {
        "start": "06/02/2014 00:02:16",
        "end": "06/02/2014 00:04:25",
        "name": "event_8967"
      },
      {
        "start": "06/11/2014 15:38:59",
        "end": "06/11/2014 15:41:15",
        "name": "event_385"
      },
      {
        "start": "06/28/2014 07:37:00",
        "end": "06/28/2014 07:39:34",
        "name": "event_8959"
      },
      {
        "start": "07/01/2014 07:06:06",
        "end": "07/03/2014 03:27:55",
        "name": "event_2654"
      },
      {
        "start": "07/03/2014 04:16:55",
        "end": "07/21/2014 08:53:19",
        "name": "event_94768"
      }
    ]
  },
  ...
]
What I am trying to achieve is to organise the data so I can create a daily uptime report per device where I gather a cumulative event time per day per device.
Effectively I am trying to convert the original data (above) into a new dataset that looks something like this:
[
  {"device": "device_1", "date": "01/01/2014", "cumulative": 2530},
  {"device": "device_2", "date": "01/01/2014", "cumulative": 1234},
  {"device": "device_1", "date": "01/02/2014", "cumulative": 456},
  {"device": "device_2", "date": "01/02/2014", "cumulative": 198},
  ...
]
* Where *cumulative* is the total number of seconds of event time for that device on that day.
Once I get to that stage I can use something like d3.nest().key().rollup().entries() to sort and group the data ready for display.
I suspect that d3 has a built-in mechanism to handle this situation, but my current approach is as follows:
Import the data set
d3.json("data.json", function(error, json_data) {
  if (error) return console.warn(error);
  ...
});
Convert the strings to date objects
var dateFormat = d3.time.format("%m/%d/%Y %H:%M:%S");
json_data.forEach(function(d) {
  d.dstart = d.events.map(function(x) { return dateFormat.parse(x.start); });
  d.dend = d.events.map(function(x) { return dateFormat.parse(x.end); });
});
Specify a start and end date range for the report at daily intervals
- Determine if an event spanned across more than 1 day, if so break the event into several segments
- Sum the cumulative duration of the daily device events
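The last two steps (splitting events at day boundaries and summing per device per day) could be sketched in plain JavaScript as follows. This is a minimal sketch, not a built-in d3 mechanism: the helper names `parseTimestamp`, `formatDay`, `splitByDay` and `dailyUptime` are hypothetical, timestamps are assumed to be local time in the `MM/DD/YYYY HH:MM:SS` format of the sample data, and overlapping events on the same device are simply added together rather than merged.

```javascript
// Parse "MM/DD/YYYY HH:MM:SS" into a local-time Date.
// The format is an assumption taken from the sample data.
function parseTimestamp(text) {
  var m = /^(\d{2})\/(\d{2})\/(\d{4}) (\d{2}):(\d{2}):(\d{2})$/.exec(text);
  if (!m) throw new Error("Unexpected timestamp: " + text);
  return new Date(+m[3], +m[1] - 1, +m[2], +m[4], +m[5], +m[6]);
}

// Format a Date as "MM/DD/YYYY" to match the target dataset.
function formatDay(date) {
  function pad(n) { return (n < 10 ? "0" : "") + n; }
  return pad(date.getMonth() + 1) + "/" + pad(date.getDate()) + "/" + date.getFullYear();
}

// Split one event interval at each local midnight it crosses.
// Returns [{date, seconds}], one entry per calendar day touched.
function splitByDay(start, end) {
  var segments = [];
  var cursor = start;
  while (cursor < end) {
    // Midnight at the start of the next day; Date normalises day overflow.
    var nextMidnight = new Date(cursor.getFullYear(), cursor.getMonth(), cursor.getDate() + 1);
    var segmentEnd = nextMidnight < end ? nextMidnight : end;
    segments.push({ date: formatDay(cursor), seconds: (segmentEnd - cursor) / 1000 });
    cursor = segmentEnd;
  }
  return segments;
}

// Build [{device, date, cumulative}] rows from the original device records.
function dailyUptime(devices) {
  var totals = {}; // "device|date" -> row object
  var rows = [];
  devices.forEach(function (d) {
    d.events.forEach(function (e) {
      splitByDay(parseTimestamp(e.start), parseTimestamp(e.end)).forEach(function (s) {
        var key = d.index + "|" + s.date;
        if (!totals[key]) {
          totals[key] = { device: d.index, date: s.date, cumulative: 0 };
          rows.push(totals[key]);
        }
        totals[key].cumulative += s.seconds;
      });
    });
  });
  return rows;
}
```

Calling `dailyUptime(json_data)` inside the `d3.json` callback would yield rows in the target shape. Note that in the sample data some events on device_2 overlap, so a day's total can exceed 86400 seconds unless overlapping intervals are merged first.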
(N.B. I do have control over the JSON data format! I could technically create the final dataset directly. However, the current format is very useful for other reports and I am keen to avoid having two data files as they are <20MB each so ideally I need to avoid changing the JSON design.)
Answer 1:
The data structure that comes to mind is an interval tree. I haven't tried this library, but it might help: interval tree.
Otherwise, at least you could skip the last step and just break events by day. Accumulation is what crossfilter is great at: use reduceSum.
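As a rough illustration of the reduceSum suggestion, assuming the events have already been broken into per-day records of the hypothetical shape `{device, date, seconds}`, the accumulation might look like this. The function and key names are made up for the example, and the `crossfilter` factory is passed in as a parameter so that the snippet does not assume how the library is loaded.

```javascript
// Hypothetical helper: combine device and day into one string key,
// since a crossfilter dimension needs a single naturally ordered value.
function deviceDayKey(segment) {
  return segment.device + "|" + segment.date;
}

// `crossfilter` is the factory function provided by the crossfilter library;
// `segments` is an array of {device, date, seconds} records (assumed shape).
function sumSecondsPerDeviceDay(crossfilter, segments) {
  var ndx = crossfilter(segments);
  var byDeviceDay = ndx.dimension(deviceDayKey);
  // reduceSum adds up `seconds` across all records that share a key.
  return byDeviceDay.group().reduceSum(function (s) { return s.seconds; });
}
```

Calling `.all()` on the returned group gives an array of `{key, value}` objects, where `value` is the summed seconds for that device and day.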
Source: https://stackoverflow.com/questions/24906877/splitting-and-grouping-records-into-daily-sets-using-d3-js-and-dc-js