On 5/2/2011 12:27 PM, jerry y pan wrote:
Hi John,
Our TDS (4.2) uses some compressed netCDF files (*.nc.gz) and it works 
fine, except that the very first access to them is slow (relatively 
large files, about 400 MB each). Subsequent accesses are much 
faster, but access becomes slow again after a period of inactivity. 
I can see that TDS uncompresses these files to the temp data location. 
My question is whether TDS cleans up these temp files, so that they have 
to be decompressed again next time, which would explain the later 
slowness. If so, is there a way to keep the cache there permanently? 
Or is the faster response right after the first access perhaps due to 
an in-memory cache? Is there any configuration I could use to tweak the cache?
Thanks,
-Jerry Pan
Hi Jerry:
Yes, compressed files are uncompressed the first time they are seen, and 
that is likely why you see the slowdown.
To control how these files are cached, see:
http://www.unidata.ucar.edu/projects/THREDDS/tech/tds4.2/reference/ThreddsConfigXMLFile.html#DiskCache
I would suggest that you use
  <DiskCache>
    <alwaysUse>true</alwaysUse>
    <scour>1 hour</scour>
    <maxSize>10 Gb</maxSize>
  </DiskCache>
and choose maxSize carefully. The cache directory is 
{tomcat}/content/thredds/cache/cdm/ by default, or you can set it in the above XML.
Monitor the cache directory closely to see which files are uncompressed, 
and perhaps test accessing the datasets with and without compression and 
time the difference.
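
One way to keep an eye on the cache directory is a small script that walks it and reports the total size and the largest files. This is only a rough sketch in Python, not part of TDS; the function name summarize_cache is made up for illustration, and you would pass it whatever cache directory your server actually uses.

```python
import os

def summarize_cache(cache_dir):
    """Return (total_bytes, entries) for all files under cache_dir.

    entries is a list of (size_in_bytes, path) tuples, largest first.
    summarize_cache is a hypothetical helper, not a TDS API.
    """
    entries = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                entries.append((os.path.getsize(path), path))
            except OSError:
                # The file may have been removed (e.g. by the scour)
                # between listing and stat; skip it.
                continue
    entries.sort(reverse=True)
    total = sum(size for size, _path in entries)
    return total, entries
```

Running it from cron (or by hand before and after a few requests) and printing total and the first few entries should show which files have been uncompressed and how close the directory is to maxSize.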
Essentially this is a space/time tradeoff. I assume you don't want to 
store the files uncompressed, so you have to pay that price. The 
trick is to make maxSize big enough to keep the "working set" 
uncompressed, i.e. if there is a relatively small "hot" set of files that 
get accessed a lot, you want to give the cache enough space to keep them 
uncompressed.
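
To get a rough idea of how big the working set is once uncompressed, you can read the size field that gzip stores in the last 4 bytes of each file, without decompressing anything. The sketch below assumes ordinary single-member .gz files under 4 GB (the field holds the length modulo 2**32); the function names and the byte budget you pass in are illustrative, and how TDS interprets "10 Gb" exactly is not something this script knows.

```python
import struct

def gzip_uncompressed_size(path):
    """Read the ISIZE trailer of a gzip file.

    ISIZE is the uncompressed length modulo 2**32, so the result is
    only reliable for single-member files smaller than 4 GiB.
    """
    with open(path, "rb") as f:
        f.seek(-4, 2)  # position 4 bytes before the end of the file
        return struct.unpack("<I", f.read(4))[0]

def working_set_bytes(paths):
    """Sum the uncompressed sizes of the given .gz files."""
    return sum(gzip_uncompressed_size(p) for p in paths)

def fits_in_cache(paths, budget_bytes):
    """True if the uncompressed hot set fits within budget_bytes."""
    return working_set_bytes(paths) <= budget_bytes
```

For example, fits_in_cache(hot_files, 10 * 1000**3) checks a list of hot files against a budget of about 10 GB; leave some headroom, since other files will pass through the cache too.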
John