The Future of NetCDF
Russ Rew
NetCDF Annual Update
2012-10-26
    
    
Overview
    
- Short- and long-term plans for netCDF and other data access infrastructure development
- Tentative plans for netCDF-4.3 and beyond
- Speculations about the future of scientific data access ...
    
	
Goals for Unidata data access infrastructure

The next 5 years will be challenging for Unidata's data access infrastructure efforts.

Our efforts will be focused on incremental innovations to:

- Manage a graceful transition from a simple data model (netCDF-3) to the enhanced Common Data Model of netCDF-4
- Provide better support for remote access and server-side data analysis
- Respond to the need to faithfully represent observational data as well as gridded data
- Scale up to handle larger volumes of data efficiently
- Serve a larger user community wishing to integrate satellite products, geospatial data, observations, and model outputs from growing archives

Near-term plans for netCDF

We are constrained by backward compatibility commitments:

- Don't break archives: new versions must be able to access existing netCDF data
- Don't break programs: new libraries must support previous APIs

Plans for the next year are fairly fluid. Follow changing plans on our projects site.
    
     
    
Tentative plans:

C-4.3 plans:

- CMake support for Windows (Visual Studio)
- bug and documentation fixes

Fortran-4.3 plans:

- addition of a few missing functions
- Fortran-2003 C-interoperability support?
- CMake support for Windows (Visual Studio)?
- bug and documentation fixes

Longer Term Plans

- Finish documentation conversion to Doxygen
- "Lazy open" for data from many large files
- Improve compression to GRIB2 levels
- Client support for DAP4 protocol
- Automatic packing/unpacking in library
- Support array slice query notation
- Big test data collection for tool developers
- Support high-level chunking policies
- Provide guidance on chunking & compression
- Refactor into more & smaller utilities
- Support asynchronous I/O for remote access
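
The "automatic packing/unpacking" item presumably refers to the common netCDF convention of storing floating-point values as small integers together with `scale_factor` and `add_offset` attributes. As a rough illustration of the arithmetic a library could do on the user's behalf, here is a minimal pure-Python sketch. It assumes 16-bit signed storage with one code reserved for a fill value; the function names `compute_packing`, `pack`, and `unpack` are hypothetical and not part of any netCDF API.

```python
def compute_packing(vmin, vmax, nbits=16):
    """Choose scale_factor and add_offset so [vmin, vmax] fits in nbits signed integers."""
    # Assumption: the most negative integer code is kept free for a fill value,
    # so only 2**nbits - 2 intervals are available for data.
    nintervals = 2 ** nbits - 2
    scale_factor = (vmax - vmin) / nintervals if vmax > vmin else 1.0
    add_offset = (vmax + vmin) / 2.0
    return scale_factor, add_offset


def pack(values, scale_factor, add_offset):
    """Convert floating-point values to integer codes (lossy)."""
    return [int(round((v - add_offset) / scale_factor)) for v in values]


def unpack(packed, scale_factor, add_offset):
    """Recover approximate floating-point values from integer codes."""
    return [p * scale_factor + add_offset for p in packed]


# Example: temperatures in kelvin packed into 16-bit codes and restored.
temps = [250.0, 273.15, 310.5]
sf, off = compute_packing(min(temps), max(temps))
restored = unpack(pack(temps, sf, off), sf, off)
```

Each restored value differs from the original by at most about half of `scale_factor`, which is the precision given up in exchange for the smaller storage type.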

Even Longer Term Plans

Some of these may just be crazy talk ...

- Support data access by coordinates instead of indexes
- Make more netCDF-Java advanced functionality available from C
- Implement standard requests for server-side analysis
- Keep up with HDF5 advances for high-performance computing
- Develop and implement intelligent chunking & compression
- Space Filling Curves!
- Make library updates easy for users
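
To make "data access by coordinates instead of indexes" concrete, the sketch below shows one way a caller-facing lookup might work: coordinate values are translated to array indexes by nearest-neighbour search, and the caller never sees the indexes. This is only an illustration in pure Python, assuming strictly increasing one-dimensional coordinate variables; `nearest_index` and `select_by_coords` are hypothetical names, not functions of the netCDF library.

```python
from bisect import bisect_left


def nearest_index(coords, value):
    """Return the index of the coordinate closest to `value`.

    Assumes `coords` is a non-empty, strictly increasing sequence.
    """
    i = bisect_left(coords, value)
    if i == 0:
        return 0
    if i == len(coords):
        return len(coords) - 1
    # Pick whichever neighbour is closer; ties go to the lower index.
    return i - 1 if value - coords[i - 1] <= coords[i] - value else i


def select_by_coords(grid, lats, lons, lat, lon):
    """Look up grid[j][i] by latitude/longitude rather than by index."""
    return grid[nearest_index(lats, lat)][nearest_index(lons, lon)]


# Example: a small 4 x 3 grid addressed by latitude and longitude.
lats = [30.0, 35.0, 40.0, 45.0]
lons = [-110.0, -105.0, -100.0]
grid = [[j * 10 + i for i in range(len(lons))] for j in range(len(lats))]
value = select_by_coords(grid, lats, lons, 40.2, -105.1)
```

A real implementation would also need to handle decreasing or two-dimensional coordinates, longitude wrap-around, and requests that fall outside the grid; none of that is attempted here.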

Speculations

- I/O bottlenecks for high-performance computing will worsen
- Use of massively parallel shared-nothing file systems will grow
- Data will be generated too fast to store, so filtering will become a priority
- Multi-resolution wavelet representations will get more popular
- Non-volatile memory technologies will replace most spinning disks and change programming
- Lack of organizational support will lead to losses of valuable data
- Format-independent conventions will continue to evolve too slowly

We appreciate feedback on netCDF plans!

- Other speculations?
- Questions?
- Feedback?