Hi,
I've appended the output of a benchmark program that times hyperslab
accesses for netCDF data. I'm only including the results for
floating-point, since the results for variables of other types were
analogous. However, if you want to see the full benchmark results, you can
look in the files

    ~russ/hdf/netcdf-timings
    ~russ/hdf/niche-timings

The source for the program that produces the timings is in

    ~russ/sdmsrc/netcdf/nctest/timeit.c

These timings are for a four-dimensional variable of size 10x20x30x40, where
the last dimension varies fastest and the first dimension is the unlimited
dimension. The time is the sum of the user and system CPU time, as returned
by the times(3) function. The clock resolution is coarser than the precision
of these numbers suggests, but each test was repeated until at least one
second had elapsed.
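For readers who do not have timeit.c at hand, the measurement method described above can be sketched roughly as follows. This is not the code from timeit.c; it is a minimal illustration that assumes a POSIX system, and the names cpu_ticks, busy_work and time_per_call_usec are hypothetical.

```c
#include <assert.h>
#include <sys/times.h>
#include <unistd.h>

/* User + system CPU time consumed so far by this process, in clock ticks. */
static clock_t cpu_ticks(void)
{
    struct tms t;
    times(&t);
    return t.tms_utime + t.tms_stime;
}

/* Hypothetical stand-in for the operation being timed (e.g. one ncvarget). */
static volatile double sink;
static void busy_work(void)
{
    double s = 0.0;
    for (int i = 0; i < 100000; i++)
        s += i * 0.5;
    sink = s;
}

/*
 * Repeat op until at least min_sec seconds of CPU time have accumulated,
 * then return the mean CPU time per call in microseconds. The number of
 * repetitions is stored in *reps_out when reps_out is not NULL.
 */
static double time_per_call_usec(void (*op)(void), double min_sec, long *reps_out)
{
    long ticks_per_sec = sysconf(_SC_CLK_TCK);
    clock_t start = cpu_ticks();
    clock_t elapsed = 0;
    long reps = 0;

    assert(op != NULL && min_sec > 0.0 && ticks_per_sec > 0);
    do {
        op();
        reps++;
        elapsed = cpu_ticks() - start;
    } while ((double)elapsed / ticks_per_sec < min_sec);

    if (reps_out != NULL)
        *reps_out = reps;
    return 1e6 * ((double)elapsed / ticks_per_sec) / reps;
}
```

Calling time_per_call_usec(busy_work, 1.0, &reps) from a main function gives a per-call time in the same units as the tables below. Because the tick length is typically around 10 msec, averaging over a full second of accumulated CPU time is what makes sub-millisecond figures meaningful at all.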
These timings seem to show that niche does better than netCDF when the data
accessed is contiguous (e.g. reading the 1x20x30x40 cube within a single
record is about 3.9 times as fast), but that its performance degrades
significantly for hyperslabs that cross record boundaries (e.g. reading the
10x20x30x1 cube, whose values are not contiguous anywhere, is about 35 times
as slow).
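To make "contiguous" concrete, here is a small sketch that counts the contiguous runs a hyperslab breaks into. It assumes a plain row-major array with the last dimension varying fastest, and it ignores how either library actually lays out records on disk (for example, records of different variables may be interleaved), so it is only a rough model and does not predict the timings. The function name contiguous_runs is hypothetical.

```c
#include <assert.h>

#define NDIMS 4

/*
 * For a hyperslab with edge lengths count[] taken from a row-major array
 * with dimensions shape[], compute the length of each contiguous run of
 * values and the number of such runs needed to cover the hyperslab.
 */
static void contiguous_runs(const long shape[NDIMS], const long count[NDIMS],
                            long *run_len, long *n_runs)
{
    long total = 1;
    long run = 1;
    int d;

    for (d = 0; d < NDIMS; d++) {
        assert(count[d] >= 1 && count[d] <= shape[d]);
        total *= count[d];
    }

    /*
     * Walk inward from the fastest-varying dimension. A run covers the
     * requested extent of dimension d, and can continue into dimension
     * d-1 only if dimension d was requested in full.
     */
    for (d = NDIMS - 1; d >= 0; d--) {
        run *= count[d];
        if (count[d] != shape[d])
            break;
    }

    *run_len = run;
    *n_runs = total / run;
}
```

For the 10x20x30x40 variable, this model gives one run of 24000 values for the 1x20x30x40 cube and 6000 runs of a single value each for the 10x20x30x1 cube, which are the two extremes compared above.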
--Russ
netCDF
----- float_var(10,20,30,40)
time for ncvarput 240000 values    1111111.1 usec,     1/sec
time for ncvarget 1 point               83.4 usec, 11989/sec
time for ncvarget 10x1x1x1 vector    12532.3 usec,    80/sec
time for ncvarget 1x20x1x1 vector    15116.3 usec,    66/sec
time for ncvarget 1x1x30x1 vector     2924.0 usec,   342/sec
time for ncvarget 1x1x1x40 vector      284.8 usec,  3512/sec
time for ncvarget 10x20x1x1 plane   137037.0 usec,     7/sec
time for ncvarget 10x1x30x1 plane    25128.2 usec,    40/sec
time for ncvarget 10x1x1x40 plane    14857.9 usec,    67/sec
time for ncvarget 1x20x30x1 plane    26923.1 usec,    37/sec
time for ncvarget 1x20x1x40 plane    18974.4 usec,    53/sec
time for ncvarget 1x1x30x40 plane     8527.1 usec,   117/sec
time for ncvarget 10x20x30x1 cube   216666.7 usec,     5/sec
time for ncvarget 10x20x1x40 cube   181481.5 usec,     6/sec
time for ncvarget 10x1x30x40 cube    77451.0 usec,    13/sec
time for ncvarget 1x20x30x40 cube   124074.1 usec,     8/sec
niche
----- float_var(10,20,30,40)
time for ncvarput 240000 values     827777.8 usec,     1/sec
time for ncvarget 1 point             2079.3 usec,   481/sec
time for ncvarget 10x1x1x1 vector    19487.2 usec,    51/sec
time for ncvarget 1x20x1x1 vector    32323.2 usec,    31/sec
time for ncvarget 1x1x30x1 vector    46969.7 usec,    21/sec
time for ncvarget 1x1x1x40 vector     1788.6 usec,   559/sec
time for ncvarget 10x20x1x1 plane   320000.0 usec,     3/sec
time for ncvarget 10x1x30x1 plane   377777.8 usec,     3/sec
time for ncvarget 10x1x1x40 plane    19743.6 usec,    51/sec
time for ncvarget 1x20x30x1 plane   666666.7 usec,     2/sec
time for ncvarget 1x20x1x40 plane    33333.3 usec,    30/sec
time for ncvarget 1x1x30x40 plane     2501.6 usec,   400/sec
time for ncvarget 10x20x30x1 cube  7688889.2 usec,     0/sec
time for ncvarget 10x20x1x40 cube   373333.3 usec,     3/sec
time for ncvarget 10x1x30x40 cube   124074.1 usec,     8/sec
time for ncvarget 1x20x30x40 cube    31818.2 usec,    31/sec