get mean of netcdf file using xarray

ⅰ亾dé卋堺 submitted on 2021-01-02 08:17:13

Question


I have opened a NetCDF file in Python using xarray, and the dataset summary looks like this:

Dimensions:    (latitude: 721, longitude: 1440, time: 41)
Coordinates:
  * longitude  (longitude) float32 0.0 0.25 0.5 0.75 ... 359.25 359.5 359.75
  * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
    expver     int32 1
  * time       (time) datetime64[ns] 1979-01-01 1980-01-01 ... 2019-01-01
Data variables:
    z          (time, latitude, longitude) float32 50517.914 ... 49769.473
Attributes:
    Conventions:  CF-1.6
    history:      2020-03-02 12:47:40 GMT by grib_to_netcdf-2.16.0: /opt/ecmw...

I want to get the mean of the values of z along the latitude and longitude dimensions.

I've tried to use this code:

df.mean(axis = 0)

But it removes the time coordinate and returns something like this:

Dimensions:  (latitude: 721, longitude: 1440)
Coordinates:
    expver   int32 1
Dimensions without coordinates: latitude, longitude
Data variables:
    z        (latitude, longitude) float32 49742.03 49742.03 ... 50306.242

Am I doing something wrong here? Please help me with this.


Answer 1:


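You need to specify the dimension by name (dim) instead of by position (axis).

Use df.mean(dim='longitude') to average over longitude, or df.mean(dim=['latitude', 'longitude']) to average over both.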
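As a minimal sketch of the difference, the snippet below builds a small synthetic dataset with the same dimension names as the one in the question (the grid size and values are invented for illustration) and averages it by dimension name. The time coordinate survives because only latitude and longitude are reduced.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the dataset in the question (coarse grid, 3 time steps).
time = np.array(["1979-01-01", "1980-01-01", "1981-01-01"], dtype="datetime64[ns]")
latitude = np.linspace(90.0, -90.0, 7)
longitude = np.arange(0.0, 360.0, 60.0)
rng = np.random.default_rng(0)
df = xr.Dataset(
    {"z": (("time", "latitude", "longitude"), rng.normal(5.0e4, 1.0e2, (3, 7, 6)))},
    coords={"time": time, "latitude": latitude, "longitude": longitude},
)

# Positional axis 0 of z is the time dimension, which is why axis=0 averages time away.
time_axis = df.z.get_axis_num("time")

# Naming the dimensions averages over the grid and keeps time.
over_grid = df.mean(dim=["latitude", "longitude"])

print(time_axis)
print(over_grid.z.dims)
```

Note that this is still an unweighted mean over the grid cells.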




Answer 2:


WARNING: The accepted answer gives the wrong result if you also apply it along latitude (which you need to do to fully answer the question). On a regular latitude-longitude grid the cells are not all the same size, so each cell must be weighted by its area.

Xarray solution:

To compute a weighted mean, construct the weights as in the following code:

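import numpy as np
# Weight each cell by the cosine of its latitude, which is proportional to cell area on a regular grid
weights = np.cos(np.deg2rad(df.latitude))
weights.name = "weights"
z_weighted = df.z.weighted(weights)
# Average over both horizontal dimensions; time is kept
weighted_mean = z_weighted.mean(("longitude", "latitude"))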

See this discussion in the xarray documentation for further details and an example comparison.
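To make explicit what a cosine-latitude weighted mean computes, here is a rough sketch on synthetic data (the grid, the random values, and the variable names are assumptions for illustration). It compares xarray's weighted mean with the same quantity written out by hand as sum(w * z) / sum(w).

```python
import numpy as np
import xarray as xr

# Synthetic field with the same dimension names as in the question.
latitude = np.linspace(90.0, -90.0, 19)
longitude = np.arange(0.0, 360.0, 20.0)
rng = np.random.default_rng(42)
z = xr.DataArray(
    rng.normal(5.0e4, 5.0e2, (2, latitude.size, longitude.size)),
    dims=("time", "latitude", "longitude"),
    coords={"latitude": latitude, "longitude": longitude},
    name="z",
)

# Cosine-latitude weights, one per latitude row.
weights = np.cos(np.deg2rad(z.latitude))

# Weighted mean via xarray.
builtin = z.weighted(weights).mean(("latitude", "longitude"))

# The same thing by hand: the weights broadcast along longitude, so the
# total weight is the per-row weight summed over rows times the number of columns.
numerator = (z * weights).sum(("latitude", "longitude"))
denominator = weights.sum("latitude") * z.sizes["longitude"]
manual = numerator / denominator

print(np.allclose(builtin.values, manual.values))
```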

The size of the error depends on the region over which you are averaging and on how strong the gradient of the variable is in the latitudinal direction: the larger the latitudinal extent of the region and the stronger the gradient, the worse it is. For a global temperature field, the example in the xarray documentation shows an error of well over 5 °C. The unweighted mean is colder because the poles are counted equally even though the grid cells there are much smaller.
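To get a feel for the size of the effect, the sketch below uses an idealized, purely hypothetical temperature profile, T = -30 + 60*cos(latitude) in °C, on a 2.5° global grid. This is not real data; it only illustrates that the unweighted mean comes out colder than the area-weighted one.

```python
import numpy as np
import xarray as xr

# Regular 2.5-degree global grid.
latitude = np.linspace(90.0, -90.0, 73)
longitude = np.arange(0.0, 360.0, 2.5)

# Hypothetical zonally uniform temperature: 30 degC at the equator, -30 degC at the poles.
profile = -30.0 + 60.0 * np.cos(np.deg2rad(latitude))
t = xr.DataArray(
    np.broadcast_to(profile[:, None], (latitude.size, longitude.size)).copy(),
    dims=("latitude", "longitude"),
    coords={"latitude": latitude, "longitude": longitude},
    name="t",
)

weights = np.cos(np.deg2rad(t.latitude))
weights.name = "weights"

# Every grid cell counts equally, so the small polar cells are over-represented.
unweighted = float(t.mean(("latitude", "longitude")))

# Cells are weighted by cos(latitude), approximating their relative area.
weighted = float(t.weighted(weights).mean(("latitude", "longitude")))

print(f"unweighted: {unweighted:.2f} degC, weighted: {weighted:.2f} degC")
```

For this profile the area-weighted mean has the analytic value -30 + 60*pi/4, about 17.1 °C, which the weighted result should approximate closely on this grid.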

Alternative CDO solution

As an aside, you can also do this from the command line with cdo:

cdo fldmean in.nc out.nc 

cdo accounts for the grid, so you don't need to worry about the weighting issues.



Source: https://stackoverflow.com/questions/60571445/get-mean-of-netcdf-file-using-xarray
