Re: [netcdf-java] ucar.nc2.FileWriter bug fix: copySome actually copies everything at once

To: netcdf-java@xxxxxxxxxxxxxxxx
Subject: Re: [netcdf-java] ucar.nc2.FileWriter bug fix: copySome actually copies everything at once
From: John Caron <caron@xxxxxxxxxxxxxxxx>
Date: Wed, 14 Apr 2010 11:24:04 -0600

Hi Christian:

The original FileWriter has a bug confusing chunk size with nelems. It also had a limited algorithm of only carving up the outer dimension. Your patch looks like it fixes both. The latest release (4.2.20100414.1713) incorporates them. Testing from anyone would be appreciated.


Thanks for your very nice patch.

On 4/12/2010 5:34 AM, Christian D Ward-Garrison wrote:

Hello all,

I decided that the nelems parameter of copySome() was sort of awkward, so I rewrote parts of copySome() and copyVarData() to work with a maxChunkSize. The new signature of copySome() is:

    /**
     * Copies data from {@code oldVar} to {@code ncfile}. The writes are done in a series of chunks no larger than
     * {@code maxChunkSize} bytes.
     *
     * @param ncfile        the NetCDF file to write to.
     * @param oldVar        a variable from the original file to copy data from.
     * @param maxChunkSize  the size, <b>in bytes</b>, of the largest chunk to write.
     * @throws IOException  if an I/O error occurs.
     */
    private static void copySome(NetcdfFileWriteable ncfile, Variable oldVar, long maxChunkSize) throws IOException {


It is called from copyVarData() with:

            if (size <= maxSize) {
                copyAll(ncfile, oldVar);
            } else {
                copySome(ncfile, oldVar, maxSize);
            }

Much simpler!
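
A byte budget like this has to be turned into an element count somewhere before chunk shapes can be computed. The following is only a sketch of one way to do that, not the code from the patch: the class ChunkBudget and the helper maxElementsPerChunk are hypothetical names, and the element size is assumed to be what Variable.getElementSize() would report.

```java
final class ChunkBudget {
    private ChunkBudget() {}

    /**
     * Converts a byte budget into an element budget.
     * Hypothetical helper for illustration; it is not part of FileWriter.
     * Always returns at least 1 so that a copy loop can make progress.
     */
    static long maxElementsPerChunk(long maxChunkSizeBytes, int elementSizeBytes) {
        if (maxChunkSizeBytes <= 0 || elementSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return Math.max(1L, maxChunkSizeBytes / elementSizeBytes);
    }

    public static void main(String[] args) {
        // Assumed budget of 1,000,000 bytes and 4-byte floats.
        System.out.println(maxElementsPerChunk(1_000_000L, 4)); // prints 250000
    }
}
```

Clamping to a minimum of one element is a design choice of this sketch: it trades a slightly oversized chunk for a guarantee that each write advances.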

I've attached another patch. Apply it to the original FileWriter.java, not to the result of the first patch.


Regards,
Christian Ward-Garrison


-----netcdf-java-bounces@xxxxxxxxxxxxxxxx wrote: -----

    To: netcdf-java@xxxxxxxxxxxxxxxx
    From: Christian Ward-Garrison <cwardgar@xxxxxxxx>
    Sent by: netcdf-java-bounces@xxxxxxxxxxxxxxxx
    Date: 04/10/2010 12:13AM
    Subject: Re: [netcdf-java] ucar.nc2.FileWriter bug fix: copySome
    actually copies everything at once

    Hello all,

    So, I think I figured out what the original copySome() is actually
    doing. Apparently, nelems is the maximum dim0 value that will be
    used in the shapes of the Arrays that are read and written. The
    values for the other dimensions of the shapes are always maxed
    out. That's, um, interesting. In that light, Robert Bridle's code
    attempts to write chunks big enough to hold N elements, where N is
    the largest integer multiple of dim0's stride that still yields a
    chunk smaller than maxSize. Unfortunately, N will be 0 for
    sufficiently large dim0 strides.

    I mention this because I use nelems differently in my copySome()
    implementation: it is simply the maximum number of elements to
    stuff into each chunk. The sizes of the chunks *in bytes* will be
    no larger than nelems*oldVar.getElementSize().

    Regards,
    Christian Ward-Garrison


    On 4/9/2010 11:12 PM, Christian Ward-Garrison wrote:

    Hello all,

    In NJ 4.2.20100409.0054 the method ucar.nc2.FileWriter.copySome()
    is supposed to copy data for a large variable in a series of
    small chunks. As written, however, it actually attempts to copy
    everything at once. (Here's hoping that Thunderbird preserves the
    whitespace in my preformatted code samples, or else this post
    will be very tough to follow.)

      private static void copySome(NetcdfFileWriteable ncfile, Variable oldVar, int nelems) throws IOException {
        String newName = N3iosp.makeValidNetcdfObjectName(oldVar.getName());

        int[] shape = oldVar.getShape();
        int[] origin = new int[oldVar.getRank()];
        int size = shape[0];

        for (int i = 0; i < size; i += nelems) {
          origin[0] = i;
          int left = size - i;
          shape[0] = Math.min(nelems, left);

          Array data;
          try {
            data = oldVar.read(origin, shape);
        ...


    I'm not exactly sure what the intended logic was, but it's clear
    that in the first iteration of the loop, origin will be all
    zeroes (e.g. {0, 0, 0} if the rank of oldVar is 3) and that shape
    will be identical to oldVar's shape. Therefore, the code will
    attempt to read all of oldVar's data at once and an
    OutOfMemoryError will result if oldVar is large.
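
To make the first iteration concrete, here is a small standalone sketch that mimics only the shape arithmetic of the loop above, with no NetCDF classes involved. The class name OuterSliceDemo, the example shape {3, 1000, 1000}, the 4-byte element size, and the 1,000,000-byte maxSize are all assumptions chosen for illustration; nelems is derived as size / maxSize, with size assumed to be the variable's total size in bytes.

```java
import java.util.Arrays;

final class OuterSliceDemo {
    private OuterSliceDemo() {}

    /** Shape that the first loop iteration would pass to read(). */
    static int[] firstReadShape(int[] varShape, int nelems) {
        int[] shape = varShape.clone();
        // Same clamp as the loop: at i == 0, left == shape[0].
        shape[0] = Math.min(nelems, shape[0]);
        return shape;
    }

    public static void main(String[] args) {
        int[] varShape = {3, 1000, 1000};          // hypothetical variable
        long sizeBytes = 3L * 1000 * 1000 * 4;     // 12,000,000 bytes of 4-byte values
        long maxSize = 1_000_000L;                 // assumed budget in bytes
        int nelems = (int) (sizeBytes / maxSize);  // 12, which exceeds shape[0] == 3

        System.out.println(nelems);                                            // prints 12
        System.out.println(Arrays.toString(firstReadShape(varShape, nelems))); // prints [3, 1000, 1000]
    }
}
```

Under these assumptions nelems is larger than the outer dimension, so the clamp leaves the shape untouched and the single read covers the whole variable.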

    On 03/29/2010, Robert Bridle proposed some new code for
    FileWriter.copyVarData() (which calls copySome()) that would
    split the write job into chunks:

    ///////////// ORIGINAL CODE //////////////////////
          int nelems = (int) (size / maxSize);
          if (nelems <= 1)
            copyAll(ncfile, oldVar);
          else
            copySome(ncfile, oldVar, nelems);
    ////////////////////////////////////////////////////

    ////////////// PROPOSED CODE //////////////////////
    /*      if (size > maxSize)
          {
            int[] shape = oldVar.getShape();

            // determine the size of all the dimensions, other than the first.
            long sizeOfOtherDimensions = 1;
            for (int i = 1; i < shape.length; i++) {
              if (shape[i] >= 0)
                sizeOfOtherDimensions *= shape[i];
            }

            // determine number of bytes in all the dimensions, other than the first.
            long bytesInOtherDimensions = sizeOfOtherDimensions * oldVar.getElementSize();

            // first dimension chunk-size that will fit within maxSize of memory.
            int firstDimensionChunkSize = (int) (maxSize / bytesInOtherDimensions);
            //System.out.println("We can fit: " + firstDimensionChunkSize + " chunks in: " + maxSize + " bytes of memory.");

            copySome(ncfile, oldVar, firstDimensionChunkSize);
          }
          else
          {
            copyAll(ncfile, oldVar);
          }    */
    ////////////////////////////////////////////////////


    This will write the data in N chunks where N is the size of the
    outer-most dimension. But what about when the stride of the outer
    dimension is very large? For example, there's a variable from a
    massive aggregated dataset I'm working with that has the CDL:

       float pr(ensemble=8, time=1560, lat=128, lon=256);


    which means an outer-most dimension stride of 1560*128*256 =
    51,118,080. Using 32-bit floats, that would require 195 MB to
    store--quite a bit larger than the maxSize of 1 MB.
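
The arithmetic can be checked with a few lines of plain Java. This is just a sketch: the class StrideArithmetic and its method names are made up for the example, and maxSize is assumed to be 1,000,000 bytes. The second method repeats the integer division from the proposed code above.

```java
final class StrideArithmetic {
    private StrideArithmetic() {}

    /** Number of elements in one slice of the outer-most dimension. */
    static long outerStride(int[] shape) {
        long stride = 1;
        for (int i = 1; i < shape.length; i++) {
            stride *= shape[i];
        }
        return stride;
    }

    /** Same integer division as in the proposed copyVarData() code. */
    static int firstDimensionChunkSize(long maxSizeBytes, int[] shape, int elementSizeBytes) {
        long bytesInOtherDimensions = outerStride(shape) * elementSizeBytes;
        return (int) (maxSizeBytes / bytesInOtherDimensions);
    }

    public static void main(String[] args) {
        int[] shape = {8, 1560, 128, 256}; // pr(ensemble, time, lat, lon)
        System.out.println(outerStride(shape));                            // prints 51118080
        System.out.println(outerStride(shape) * 4);                        // prints 204472320 (about 195 MB)
        System.out.println(firstDimensionChunkSize(1_000_000L, shape, 4)); // prints 0
    }
}
```

A result of 0 means that not even one outer-dimension slice fits in the budget, so slicing along the first dimension alone cannot work for this variable.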

    So, I propose a different algorithm:

        /**
         * An index that computes chunk shapes. It is intended to be used to compute the origins and shapes for a series
         * of contiguous writes to a multidimensional array.
         */
        public static class ChunkingIndex extends Index {
            public ChunkingIndex(int[] shape) {
                super(shape);
            }

            /**
             * Computes the shape of the largest possible <b>contiguous</b> chunk, starting at {@link #getCurrentCounter()}
             * and with {@code size <= maxChunkSize}.
             *
             * @param maxChunkSize  the maximum size of the chunk shape. The actual size of the shape returned is likely
             *                      to be different, and can be found with {@link Index#computeSize}.
             * @return  the shape of the largest possible contiguous chunk.
             */
            public int[] computeChunkShape(int maxChunkSize) {
                int[] chunkShape = new int[rank];

                for (int iDim = 0; iDim < rank; ++iDim) {
                    chunkShape[iDim] = maxChunkSize / stride[iDim];
                    chunkShape[iDim] = (chunkShape[iDim] == 0) ? 1 : chunkShape[iDim];
                    chunkShape[iDim] = Math.min(chunkShape[iDim], shape[iDim] - current[iDim]);
                }

                return chunkShape;
            }
        }

        private static void copySome(NetcdfFileWriteable ncfile, Variable oldVar, int nelems) throws IOException {
            String newName = N3iosp.makeValidNetcdfObjectName(oldVar.getName());

            ChunkingIndex index = new ChunkingIndex(oldVar.getShape());
            while (index.currentElement() < index.getSize()) {
                try {
                    int[] chunkOrigin = index.getCurrentCounter();
                    int[] chunkShape  = index.computeChunkShape(nelems);
                    Array data = oldVar.read(chunkOrigin, chunkShape);

                    if (oldVar.getDataType() == DataType.STRING) {
                        data = convertToChar(ncfile.findVariable(newName), data);
                    }

                    if (data.getSize() > 0) { // zero when record dimension = 0
                        ncfile.write(newName, chunkOrigin, data);
                        if (debugWrite) {
                            System.out.println("write " + data.getSize() + " bytes");
                        }
                    }

                    index.setCurrentCounter(index.currentElement() + (int) Index.computeSize(chunkShape));
                } catch (InvalidRangeException e) {
                    e.printStackTrace();
                    throw new IOException(e.getMessage());
                }
            }
        }


    This will result in chunks that are *never* larger than nelems,
    regardless of oldVar's size or shape. For example, if
    oldVar.getShape() == { 5, 16, 8 } and nelems = 100, the origins
    and shapes of the chunk read/writes will be:

         origin      shape       size
    r/w: [0, 0, 0] , [1, 12, 8], 96
    r/w: [0, 12, 0], [1, 4, 8] , 32
    r/w: [1, 0, 0] , [1, 12, 8], 96
    r/w: [1, 12, 0], [1, 4, 8] , 32
    r/w: [2, 0, 0] , [1, 12, 8], 96
    r/w: [2, 12, 0], [1, 4, 8] , 32
    r/w: [3, 0, 0] , [1, 12, 8], 96
    r/w: [3, 12, 0], [1, 4, 8] , 32
    r/w: [4, 0, 0] , [1, 12, 8], 96
    r/w: [4, 12, 0], [1, 4, 8] , 32


    As you can see, none of the chunks is actually 100 elements in
    size, but given the constraints of the Netcdf API, I don't think
    it can be helped. We'd need to be able to read and write 1D
    Arrays of values from/to a specific offset in the 1D backing array.
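
For anyone who wants to reproduce the table above without a NetCDF file, here is a dependency-free sketch of the same chunk-shape rule. It does not use ucar.ma2.Index; the class ChunkPlanner and its methods are hypothetical stand-ins that recompute row-major strides and counters by hand, and it assumes the total element count fits in an int.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkPlanner {
    private ChunkPlanner() {}

    /** Row-major strides: stride[i] is the product of shape[i+1..]. */
    static int[] strides(int[] shape) {
        int[] stride = new int[shape.length];
        int product = 1;
        for (int i = shape.length - 1; i >= 0; i--) {
            stride[i] = product;
            product *= shape[i];
        }
        return stride;
    }

    /** Converts a flat element offset into a multidimensional counter. */
    static int[] counterAt(int offset, int[] stride) {
        int[] counter = new int[stride.length];
        for (int i = 0; i < stride.length; i++) {
            counter[i] = offset / stride[i];
            offset %= stride[i];
        }
        return counter;
    }

    /** Same per-dimension rule as computeChunkShape() in the post. */
    static int[] chunkShape(int[] shape, int[] stride, int[] current, int maxElems) {
        int[] chunk = new int[shape.length];
        for (int i = 0; i < shape.length; i++) {
            int extent = Math.max(1, maxElems / stride[i]);
            chunk[i] = Math.min(extent, shape[i] - current[i]);
        }
        return chunk;
    }

    /** Returns {origin, shape} pairs covering the whole array in order. */
    static List<int[][]> plan(int[] shape, int maxElems) {
        int[] stride = strides(shape);
        int total = 1;
        for (int s : shape) total *= s;

        List<int[][]> chunks = new ArrayList<>();
        int offset = 0;
        while (offset < total) {
            int[] origin = counterAt(offset, stride);
            int[] chunk = chunkShape(shape, stride, origin, maxElems);
            chunks.add(new int[][] {origin, chunk});
            int size = 1;
            for (int c : chunk) size *= c;
            offset += size; // advance past the elements just covered
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (int[][] c : plan(new int[] {5, 16, 8}, 100)) {
            System.out.println("r/w: " + Arrays.toString(c[0]) + ", " + Arrays.toString(c[1]));
        }
    }
}
```

With shape {5, 16, 8} and a limit of 100 elements, this yields ten chunks alternating between shapes [1, 12, 8] and [1, 4, 8], matching the table.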

    If you're interested, I've attached a patch containing the changes.

    Regards,
    Christian Ward-Garrison


    _______________________________________________
    netcdf-java mailing list
    netcdf-java@xxxxxxxxxxxxxxxx

    For list information or to unsubscribe, visit:
    http://www.unidata.ucar.edu/mailing_lists/




