Re: Preliminary HDF5 Dimension documents

To: netcdf-hdf@xxxxxxxxxxxxxxxx
Subject: Re: Preliminary HDF5 Dimension documents
From: Quincey Koziol <koziol@xxxxxxxxxxxxx>
Date: Mon, 29 Sep 2003 17:21:30 -0500 (CDT)

Hi Russ,

> >     This document introduces dimensions as an optional method of composing
> > a dataspace in HDF5, so they ought to be completely analogous to netCDF
> > dimensions.
> >     One possible difference is that I wasn't planning on naming the
> > dimensions within a dataspace.  They were just going to be indexed by
> > their rank within the dataspace (i.e. the 0th dimension, the 1st
> > dimension, etc).  This could reference a named dimension through an
> > indirect dimension (see the shareability document), but the actual
> > dimensions in the dataspace weren't planned on having names associated
> > with them.
> >     Do you think this is an important requirement?  Does the netCDF API
> > require that the dimensions in a dataspace for a dataset have names, or
> > will having shared dimensions using the names of dimension objects in the
> > grouping hierarchy be sufficient?
> 
> Currently, each netCDF dimension must have a unique name.  There are
> several reasons for this:
> 
>  1.  To support functions such as
> 
>      int nc_inq_dimid (int ncid, const char *name, int *dimidp);
> 
>      which returns a dimension ID from its name.  The netCDF ID can be
>      used to get the dimension length and determine whether it is an
>      unlimited dimension.
    This seems to be the strongest reason in favor of having names.
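
A minimal sketch of the name-to-ID lookup that reason 1 depends on, using a toy in-memory table rather than the real netCDF or HDF5 libraries.  The `toy_dim` type and `toy_inq_dimid` function are hypothetical names invented for this illustration; only the general role of `nc_inq_dimid` comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model of a dimension; not the netCDF or HDF5 API. */
typedef struct {
    const char *name;         /* unique name within the file */
    size_t      length;       /* current length */
    int         is_unlimited; /* nonzero if the dimension can grow */
} toy_dim;

/* Return the index (ID) of the dimension called `name`, or -1 if there is
   none.  This plays the role that nc_inq_dimid plays in netCDF, but only
   within this toy model. */
static int toy_inq_dimid(const toy_dim *dims, int ndims, const char *name)
{
    assert(dims != NULL && name != NULL);
    for (int i = 0; i < ndims; i++) {
        if (dims[i].name != NULL && strcmp(dims[i].name, name) == 0)
            return i;
    }
    return -1;
}
```

Once the ID is known, the length and the unlimited flag can be read from the table entry, which is why a lookup like this only works if every dimension it must find carries a name.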

>  2.  To support the association between a netCDF dimension and the
>      corresponding coordinate variable, if any.  When there is a
>      corresponding coordinate variable, it is identified by having the
>      same name as the dimension.
    In the data model I was initially proposing, the scales for a dimension
would be directly attached to the dimension itself, so this wouldn't be
necessary.
    However, as I'm considering the effect of implementing a coordinate system
for a dataspace, I'm thinking about attaching the scales directly to the
dataspace instead of to the dimensions, and that might make having the same
name an important consideration.
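
To make the name-matching rule in reason 2 concrete, here is a rough sketch of finding the coordinate variable for a dimension by comparing names.  The `cv_dim` and `cv_var` types and the `cv_find_coord_var` function are hypothetical, and the extra check that the variable is one-dimensional over that same dimension reflects the usual netCDF convention as I understand it, not anything stated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types for illustration; not the netCDF or HDF5 API. */
typedef struct {
    const char *name;   /* dimension name */
    size_t      length; /* dimension length */
} cv_dim;

typedef struct {
    const char *name;      /* variable name */
    int         ndims;     /* number of dimensions used by the variable */
    int         dimids[4]; /* IDs of those dimensions */
} cv_var;

/* Return the index of the coordinate variable for dimension `dimid`,
   or -1 if none exists.  A variable qualifies when it is 1-D over that
   dimension and has the same name as the dimension. */
static int cv_find_coord_var(const cv_dim *dims, int ndims, int dimid,
                             const cv_var *vars, int nvars)
{
    assert(dims != NULL && vars != NULL);
    assert(dimid >= 0 && dimid < ndims);
    for (int i = 0; i < nvars; i++) {
        if (vars[i].ndims == 1 && vars[i].dimids[0] == dimid &&
            strcmp(vars[i].name, dims[dimid].name) == 0)
            return i;
    }
    return -1;
}
```

If scales were attached directly to the dimension object instead, this search would not be needed at all; the sketch only shows what the by-name association costs when they are not.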

>  3.  In some discipline-specific netCDF conventions that associate
>      special meanings with particular dimension names, for example,
>      in the "CDC netCDF Conventions: Gridded Data":
> 
>        http://www.cdc.noaa.gov/cdc/conventions/cdc_netcdf_standard.shtml
    Ok, I'll read through this, thanks.
    
>  4.  To distinguish between dimensions that happen to have the same
>      length but that are intended to represent distinct dimensions.
>      If two variables' dimensions are not related, we recommend
>      creating separate dimensions for them, even if they happen to
>      have the same length.
> 
> However, in netCDF-4, we have discussed supporting anonymous
> dimensions as an enhancement.  For example, if we want to provide
> explicit support for a vector or list object of variable length that
> keeps its own private length, it need not have a name if there is some
> way to get its length from the vector/list object.
    Yes, I think a dimension's name should be optional.  (It will need to be
so for backward compatibility with existing HDF5 datasets anyway... :-)

> So as a bottom line, I'd say we have to be able to support a name with
> every dimension as a default, but that it might be convenient to be
> able to explicitly specify that a dimension be anonymous as a special
> case, to support objects that "know their own length".
    Ok.
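
One way the "named by default, anonymous as a special case" idea could look in code, as a sketch only.  The `opt_dim` type and its helper functions are hypothetical; a NULL name stands in for an anonymous dimension, and nothing here reflects an actual HDF5 or netCDF-4 design decision.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical dimension record: name == NULL marks an anonymous dimension. */
typedef struct {
    const char *name;   /* NULL if the dimension is anonymous */
    size_t      length; /* dimension length */
} opt_dim;

/* Nonzero if the dimension has no name. */
static int opt_dim_is_anonymous(const opt_dim *d)
{
    assert(d != NULL);
    return d->name == NULL;
}

/* Two dimensions are "the same" only if they are the same object.
   Equal lengths alone do not make them the same dimension (reason 4). */
static int opt_dim_same(const opt_dim *a, const opt_dim *b)
{
    return a == b;
}

/* Look up a dimension by name; anonymous dimensions are never matched. */
static int opt_find_by_name(const opt_dim *dims, int ndims, const char *name)
{
    assert(dims != NULL && name != NULL);
    for (int i = 0; i < ndims; i++) {
        if (dims[i].name != NULL && strcmp(dims[i].name, name) == 0)
            return i;
    }
    return -1;
}
```

In this model an anonymous dimension can still be reached through the object that owns it, but never through a by-name query, which matches the "knows its own length" case described above.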

        Quincey