Hi Ed,
> Quincey Koziol <koziol@xxxxxxxxxxxxx> writes:
>
> > From HDF5's perspective, you have to use H5Pset_fapl_<foo>(params) to
> > choose to use a particular file driver to access a file. Probably something
> > like this should be exported/translated out to the netCDF4 layer for users
> > to
> > choose which driver to access the file with.
> > Here's the URL for the parallel HDF5 info currently:
> > http://hdf.ncsa.uiuc.edu/HDF5/PHDF5/
>
> I'm seeing three steps to parallel HDF5:
>
> 1 - Initialize MPI
> 2 - When opening/creating the file, set a property in file access
> properties.
> 3 - Every time reading or writing file, pass a correctly set transfer
> property.
I'm assuming you mean reading/writing "raw" data.
> Does that seem to sum it up?
That's some of it. You also have to make certain that the functions
listed below are called correctly.
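The three steps above can be sketched in C; this is a minimal illustration only, assuming an MPI-enabled build of HDF5 (the filename, dataset name, and buffer contents are made up, and the 5-argument H5Dcreate matches the 1.6-era API this thread discusses):

```c
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    /* Step 1: initialize MPI. */
    MPI_Init(&argc, &argv);

    /* Step 2: select the MPI-I/O file driver on a file access
     * property list before creating/opening the file.  H5Fcreate is
     * collective, so every process makes this call. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    hsize_t dims[1] = {4};
    hid_t space = H5Screate_simple(1, dims, NULL);
    hid_t dset  = H5Dcreate(file, "data", H5T_NATIVE_INT, space,
                            H5P_DEFAULT);

    /* Step 3: pass an MPI-I/O transfer property list on each raw-data
     * read/write; here every process collectively writes the same
     * small buffer. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    int buf[4] = {1, 2, 3, 4};
    H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, dxpl, buf);

    H5Pclose(dxpl);
    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);   /* collective: this is the last file ID reference */
    H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}
```

Compile with mpicc and link against the parallel HDF5 library, then launch with mpirun/mpiexec.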
> But I see below that you are also asking that "these properties must
> be set to the same values when they are used in a parallel program."
>
> What do you mean by that?
You can't have half the processes set a property to one value and the other
half set the same property to a different value (e.g. everybody must agree
that the userblock is 512 bytes :-)
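A sketch of that consistency requirement, assuming an MPI build of HDF5 (the filename and 512-byte size are just for illustration):

```c
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);

    /* Every rank sets the same value: a 512-byte userblock.  Setting
     * a rank-dependent size here would violate the consistency
     * requirement described above. */
    hid_t fcpl = H5Pcreate(H5P_FILE_CREATE);
    H5Pset_userblock(fcpl, 512);

    hid_t file = H5Fcreate("shared.h5", H5F_ACC_TRUNC, fcpl, fapl);

    H5Fclose(file);
    H5Pclose(fcpl);
    H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}
```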
> In parallel I/O do multiple processes try and create the file? Or does
> one create it, and the rest just open it? Sorry if that seems like a
> dumb question!
In MPI-I/O, file creation is a collective operation, so all the processes
participate in the create (from our perspective at least, I don't know how it
happens internally in the MPI-I/O library).
You are going to have fun learning how to do parallel programming with
MPI - think of it as multi-threaded programs with bad debugging support... :-/
Quincey
> > > For reading, what does this mean to the API, if anything?
> > Well, I've appended a list of HDF5 API functions that are required to be
> > performed collectively to the bottom of this document (I can't find the link
> > on our web-pages).
> >
> > > Everyone gets to open the file read-only, and read from it to their
> > > heart's content, confident that they are getting the most recent data
> > > at that moment. That requires no API changes.
> > >
> > > Is that it for readers? Or do they get some special additional
> > > features, like notification of data arrival, etc?
> > Users would also need the option to choose collective or
> > independent I/O when reading or writing data to the file. That reminds me -
> > are y'all planning on adding any wrappers to the H5P* routines in HDF5 which
> > set/get various properties for objects?
>
> This is truly an important question that I will treat in its own
> email thread...
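The collective-vs-independent choice mentioned above is made per transfer. A hedged sketch, assuming "dset" was opened from a file that uses the MPI-I/O driver (the helper name and buffer type are illustrative, not part of the HDF5 API):

```c
#include <mpi.h>
#include <hdf5.h>

/* Read a dataset using either collective or independent MPI-I/O,
 * selected at the transfer level via the data transfer property
 * list.  Collective transfers require all processes to call
 * H5Dread; independent transfers do not. */
static herr_t read_ints(hid_t dset, int *buf, int collective)
{
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, collective ? H5FD_MPIO_COLLECTIVE
                                      : H5FD_MPIO_INDEPENDENT);
    herr_t status = H5Dread(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
                            dxpl, buf);
    H5Pclose(dxpl);
    return status;
}
```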
>
>
> >
> > Quincey
> >
> > ==============================================================
> >
> > Collective functions:
> > H5Aclose (2)
> > H5Acreate
> > H5Adelete
> > H5Aiterate
> > H5Aopen_idx
> > H5Aopen_name
> > H5Aread (6)
> > H5Arename (A)
> > H5Awrite (3)
> >
> > H5Dclose (2)
> > H5Dcreate
> > H5Dfill (6) (A)
> > H5Dopen
> > H5Dextend (5)
> > H5Dset_extent (5) (A)
> >
> > H5Fclose (1)
> > H5Fcreate
> > H5Fflush
> > H5Fmount
> > H5Fopen
> > H5Funmount
> >
> > H5Gclose (2)
> > H5Gcreate
> > H5Giterate
> > H5Glink
> > H5Glink2 (A)
> > H5Gmove
> > H5Gmove2 (A)
> > H5Gopen
> > H5Gset_comment
> > H5Gunlink
> >
> > H5Idec_ref (7) (A)
> > H5Iget_file_id (B)
> > H5Iinc_ref (7) (A)
> >
> > H5Pget_fill_value (6)
> >
> > H5Rcreate
> > H5Rdereference
> >
> > H5Tclose (4)
> > H5Tcommit
> > H5Topen
> >
> > Additionally, these properties must be set to the same values when they
> > are used in a parallel program:
> > File Creation Properties:
> > H5Pset_userblock
> > H5Pset_sizes
> > H5Pset_sym_k
> > H5Pset_istore_k
> >
> > File Access Properties:
> > H5Pset_fapl_mpio
> > H5Pset_meta_block_size
> > H5Pset_small_data_block_size
> > H5Pset_alignment
> > H5Pset_cache
> > H5Pset_gc_references
> >
> > Dataset Creation Properties:
> > H5Pset_layout
> > H5Pset_chunk
> > H5Pset_fill_value
> > H5Pset_deflate
> > H5Pset_shuffle
> >
> > Dataset Access Properties:
> > H5Pset_buffer
> > H5Pset_preserve
> > H5Pset_hyper_cache
> > H5Pset_btree_ratios
> > H5Pset_dxpl_mpio
> >
> > Notes:
> > (1) - All the processes must participate only if this is the last
> >       reference to the file ID.
> > (2) - All the processes must participate only if all the file IDs
> >       for a file have been closed and this is the last outstanding
> >       object ID.
> > (3) - Because the raw data for an attribute is cached locally, all
> >       processes must participate in order to guarantee that future
> >       H5Aread calls return the correct results on all processes.
> > (4) - All processes must participate only if the datatype is a
> >       committed datatype, all the file IDs for the file have been
> >       closed and this is the last outstanding object ID.
> > (5) - All processes must participate only if the number of chunks in
> >       the dataset actually changes.
> > (6) - All processes must participate only if the datatype of the
> >       attribute is a variable-length datatype (sequence or string).
> > (7) - This function may be called independently if the object ID
> >       does not refer to an object that was collectively opened.
> >
> > (A) - Available only in v1.6 or later versions of the library.
> > (B) - Available only in v1.7 or later versions of the library.
>