I have a series of monthly gridded datasets in CSV form. I want to read them, add a few dimensions, and then write to NetCDF. I've had great experience using xarray (xray)
You can use .expand_dims() to add a new dimension and .assign_coords() to add coordinate values for that dimension. The code below adds a new_dim dimension to the ds dataset and sets the corresponding coordinate from the list_of_values you provide.
# list_of_values must contain exactly one value, because expand_dims adds a dimension of length 1
expanded_ds = ds.expand_dims("new_dim").assign_coords(new_dim=("new_dim", list_of_values))
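To make the two steps concrete, here is a minimal sketch on a tiny made-up dataset. The 2x3 grid, the variable name, and the "2015-01" label are assumptions chosen for illustration, not values from the question.

```python
import numpy as np
import xarray as xr

# Hypothetical 2x3 grid standing in for one month's data.
ds = xr.Dataset(
    {"variable_name": (("lat", "lng"), np.zeros((2, 3)))},
    coords={"lat": [10.0, 20.0], "lng": [0.0, 1.0, 2.0]},
)

# expand_dims adds a new leading dimension of length 1 to the data variables.
expanded_ds = ds.expand_dims("new_dim")

# assign_coords labels that dimension; the list needs exactly one value
# because the new dimension has length 1.
expanded_ds = expanded_ds.assign_coords(new_dim=("new_dim", ["2015-01"]))

print(expanded_ds["variable_name"].dims)
```

The existing lat and lng coordinates are left unchanged; only the data variable gains the extra dimension. Depending on your xarray version, passing a mapping such as ds.expand_dims({"new_dim": ["2015-01"]}) may do both steps at once.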
Your first example is pretty close:
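import datetime
import numpy as np
import xarray as xr

# data must have shape (len(lats), len(lngs), 1) to match the dims given below
lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)
coords = {'lat': lats, 'lng': lngs}
coords['time'] = [datetime.datetime(year, month, day)]
da = xr.DataArray(data, coords=coords, dims=['lat', 'lng', 'time'])
ds = da.to_dataset(name='variable_name')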
You'll notice a few changes in my version:
1. The 'lat' and 'lng' coordinates are passed as 1-dimensional arrays. That is what the error ValueError: Coordinate objects must be 1-dimensional is trying to tell you (by the way, if you have ideas for how to make that error message more helpful, I'm all ears!).
2. I pass a dims argument to the DataArray constructor. Passing in only a (non-ordered) dictionary of coordinates is a little dangerous because the iteration order is not guaranteed.
3. I use datetime.datetime instead of pd.datetime. The latter is simply an alias for the former.

Another sensible approach is to use concat with a list of one item once you've added 'time' as a scalar coordinate, e.g.,
lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)
coords = {'lat': lats, 'lng': lngs, 'time': datetime.datetime(year, month, day)}
da = xr.DataArray(data, coords=coords, dims=['lat', 'lng'])
expanded_da = xr.concat([da], 'time')
This version generalizes nicely to joining together data from a bunch of days: you simply make the list of DataArrays longer. In my experience, the usual reason for wanting the extra dimension in the first place is to be able to concat along it. Length-1 dimensions are not very useful otherwise.
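As a rough sketch of that generalization, the snippet below builds three monthly arrays and joins them along 'time'. The helper month_to_dataarray and the constant-valued placeholder grids are hypothetical; in practice each grid would be loaded from one of the monthly CSV files.

```python
import datetime
import numpy as np
import xarray as xr

lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)

def month_to_dataarray(data, year, month):
    """Wrap one 2D (lat, lng) grid and tag it with a scalar 'time' coordinate."""
    # Hypothetical helper: uses the first day of the month as the timestamp.
    coords = {'lat': lats, 'lng': lngs,
              'time': datetime.datetime(year, month, 1)}
    return xr.DataArray(data, coords=coords, dims=['lat', 'lng'])

# Placeholder grids filled with the month number, standing in for CSV data.
monthly = [
    month_to_dataarray(np.full((lats.size, lngs.size), float(m)), 2015, m)
    for m in (1, 2, 3)
]

# Concatenating along the scalar coordinate turns 'time' into a real dimension.
combined = xr.concat(monthly, 'time')
ds = combined.to_dataset(name='variable_name')

print(combined.dims)
```

From here, ds.to_netcdf(...) with a path of your choosing should write the combined dataset out, provided a NetCDF backend is installed.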