Hi Nathan,
Sorry this is taking a while. I'm trying to figure out some of the 
trade-offs involved in a variety of ways of handling this. I should 
have a more detailed response tomorrow.
Ethan
Nathan Potter wrote:
Ethan et al.,
After talking with Ethan on the phone today I think I can state the 
issue more clearly:
The current THREDDS Servlet Framework (TSF) does not allow the 
collection/dataset information to be retrieved via the request URL.
The API method DataRootHandler.getCatalog(java.lang.String path, 
java.net.URI baseURI) expects the "path" parameter to be the path in 
the THREDDS catalog to the catalog file. There is no restriction on 
the file name of the catalog file. The path in the THREDDS catalog to 
the file may be different from the access URL.
What this means is that when a servlet receives an access request, 
even one that comes from a valid access link in a THREDDS 
catalog(.html), the servlet only knows about the request URL, nothing 
more. If the servlet needs to get the THREDDS dataset/collection 
information (and associated metadata if any) then it has no recourse 
but to attempt to search the catalog from the highest level looking 
for a dataset with a matching "urlPath" attribute. This activity may 
fail if:
- The THREDDS catalog employs <catalogRef> elements.
- The "urlPath" is not unique within the catalog.
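To illustrate the two failure cases above, here is a minimal sketch of a top-down urlPath search. DatasetNode and UrlPathSearch are hypothetical stand-ins written for this example, not classes from the thredds.catalog package. The assumption is that a <catalogRef> is modeled as a node whose children live in another catalog file and so are not visited.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a catalog dataset node (not the thredds.catalog API). */
final class DatasetNode {
    final String name;
    final String urlPath;        // null for pure container datasets
    final boolean isCatalogRef;  // true if the children live in another catalog file
    final List<DatasetNode> children = new ArrayList<>();

    DatasetNode(String name, String urlPath, boolean isCatalogRef) {
        this.name = name;
        this.urlPath = urlPath;
        this.isCatalogRef = isCatalogRef;
    }

    /** Adds a child and returns this node so calls can be chained. */
    DatasetNode add(DatasetNode child) {
        children.add(child);
        return this;
    }
}

final class UrlPathSearch {
    /** Returns every reachable node whose urlPath equals the requested one. */
    static List<DatasetNode> findByUrlPath(DatasetNode root, String urlPath) {
        List<DatasetNode> hits = new ArrayList<>();
        collect(root, urlPath, hits);
        return hits;
    }

    private static void collect(DatasetNode node, String urlPath, List<DatasetNode> hits) {
        if (urlPath.equals(node.urlPath)) {
            hits.add(node);
        }
        if (node.isCatalogRef) {
            return; // referenced catalog is not loaded here, so its datasets are missed
        }
        for (DatasetNode child : node.children) {
            collect(child, urlPath, hits);
        }
    }
}
```

An empty result means the dataset was not found (possibly because it sits behind a catalogRef), and a result with more than one entry means the urlPath alone cannot identify the dataset.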
I think that the TSF API should be augmented with accessor methods 
that allow the DataRootHandler to return an InvDataset or an InvCatalog 
based on information that a servlet has access to at run time, i.e. 
data that can be retrieved from the HttpServletRequest object.
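As a rough sketch of what "information that a servlet has access to at run time" could look like, the helper below turns a request URL into a catalog-relative path. RequestPaths is a hypothetical name, and the "/opendap" context path in the usage note is an assumption based on the example URL later in this thread. In a servlet the two strings would come from HttpServletRequest.getRequestURL() and getContextPath().

```java
import java.net.URI;

/** Hypothetical helper: derives a catalog-relative path from a request URL. */
final class RequestPaths {
    /**
     * Returns the part of the URL path that follows the context path,
     * or null if the URL does not fall under that context.
     */
    static String relativePath(String requestUrl, String contextPath) {
        String path = URI.create(requestUrl).getPath();
        String prefix = contextPath.endsWith("/") ? contextPath : contextPath + "/";
        return path.startsWith(prefix) ? path.substring(prefix.length()) : null;
    }
}
```

For example, relativePath("http://localhost:8080/opendap/wcs/MODIS/Grid/test.hdf.html", "/opendap") yields "wcs/MODIS/Grid/test.hdf.html", which is the kind of string the proposed accessors would take.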
Nathan
On Jun 4, 2007, at 5:00 PM, Nathan Potter wrote:
On Jun 4, 2007, at 1:05 PM, Ethan Davis wrote:
Hi Nathan,
Can you explain the context for these questions? Is this on the 
server side (in Hyrax)?
Yes, server side.
Nathan Potter wrote:
Greetings,
So I am using the THREDDS API in an attempt to get the <property> 
elements for a dataset. I've run into a couple of (possibly 
related) problems.
Just to clarify our terminology. When you say "THREDDS API" you mean 
both the thredds.catalog and thredds.servlet packages? I generally 
split those apart and call the thredds.catalog package the "THREDDS 
Catalog API" and call the thredds.servlet package the "THREDDS 
Servlet Framework" (TSF).
[Note: the TSF is probably only useful for those writing servers.]
I wasn't distinguishing. But since DataRootHandler is in the TSF then 
that is where I am suggesting an API change.
** 1) I can't get the dataset information without searching.
In the HttpServletRequest I have the URL for the dataset, say:
http://localhost:8080/opendap/wcs/MODIS/Grid/test.hdf.html
Is this URL for an OPeNDAP HTML response?
Right, but the requested response isn't really meaningful in this 
discussion since all I am really after is the THREDDS dataset 
information for the atom/leaf/dataset test.hdf
Are you trying to get the property from the THREDDS catalog so you 
can use it in the OPeNDAP response?
Well... In truth it's much more complex than that, but since I will 
have to do that too we can roll with that vision for the moment.
In order for me to get THREDDS to divulge the <property> elements 
for the dataset I have to:
- Take the dataset name "wcs/MODIS/Grid/test.hdf.html" and backtrack to the
  collection name, "wcs/MODIS/Grid/".
- Ask the DataRootHandler for the InvCatalog for "wcs/MODIS/Grid/"
- Ask the InvCatalog for the InvDataset for "wcs/MODIS/Grid/"
- Search the child datasets of the "wcs/MODIS/Grid/" InvDataset for the
  one whose name (lexically) matches "wcs/MODIS/Grid/test.hdf.set"
- Read the properties of that InvDataset
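The first step in the list (backtracking from the dataset name to the collection name) is plain string handling and can be sketched as below. CollectionPaths is a hypothetical helper, not part of the TSF, and treating ".html" as a response suffix to strip is an assumption.

```java
/** Hypothetical helper for backtracking from a dataset path to its collection. */
final class CollectionPaths {
    /** Drops the given suffix (e.g. ".html") from the path if it is present. */
    static String stripSuffix(String path, String suffix) {
        return path.endsWith(suffix)
                ? path.substring(0, path.length() - suffix.length())
                : path;
    }

    /** Returns the path up to and including the last '/', or "" if there is none. */
    static String collectionOf(String datasetPath) {
        int slash = datasetPath.lastIndexOf('/');
        return slash < 0 ? "" : datasetPath.substring(0, slash + 1);
    }
}
```

With the example above, stripSuffix("wcs/MODIS/Grid/test.hdf.html", ".html") gives "wcs/MODIS/Grid/test.hdf" and collectionOf of either form gives "wcs/MODIS/Grid/".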
That seems awfully complex. (Of course there may be a more 
straightforward way that I am not aware of.)
That is about as simple as it gets. Though I would suggest you make 
sure the THREDDS configuration (TSF) knows about this dataset first 
by getting the CrawlableDataset that matches the dataset URL:
      DataRootHandler.getCrawlableDataset("wcs/MODIS/Grid/test.hdf")
      // I dropped off the trailing ".html", assuming it was the
      // OPeNDAP dataset URL extension
When I tried this I could only get CrawlableDataset objects for 
catalogs that were part of a <datasetScan>
Are you using InvDataset.findDatasetByName( String name) to find the 
child dataset?
No.
Also, depending on how you set up your dataset IDs, you could ask the 
catalog to find the dataset by ID, like
      cat.findDatasetByID( "wcs/MODIS/Grid/test.hdf")
Ahhh... I just tried that and it works. So, that greatly simplifies 
that step, thanks!
** 2) When I ask for a catalog I have to know the name of the XML 
file in which it resides.
In the above example, when I ask the DataRootHandler for the 
InvCatalog I ask for "wcs/MODIS/Grid/catalog.xml", which is all 
well and good if all of the catalogs are stored in files called 
catalog.xml. Essentially this means that anyone configuring a 
THREDDS catalog has to create a hierarchy of directories that 
mimics the organization of the collections, and all of the THREDDS 
information must be stored in files called "catalog.xml".
Why do you need to create this hierarchy of directories mimicking 
the data collection hierarchy? The TSF should keep track of your 
config catalogs and the automatically generated catalogs.
Right, but if all of the THREDDS catalog files have the name 
"catalog.xml" they can't all be in the same directory, so they have 
to live in some kind of directory hierarchy - I just figured it made 
sense to mimic the collection organization, but that's not necessary.
THREDDS does not actually require this - I can make a complex 
hierarchy of collections by using either a single (complex) top 
level catalog.xml file, or a collection of XML files in a single 
directory that employ <catalogRef> elements to create their 
organizations.
However the API breaks down in both cases.
If the catalog is composed of a collection of XML files in a single 
directory that employ <catalogRef> elements to create their 
organizations, then in order to retrieve catalog information I 
would have to KNOW how the information was organized (file names, 
directory hierarchy, etc.). But I don't know, since the catalog 
may be created by a user after compile time (although THREDDS does 
know this since it parsed all of the catalog information at start 
up), and I shouldn't have to know. For me to know would require 
that I parse the top level catalog.xml file and build the XML doc 
tree myself, at which point I can get the elusive <property> 
elements from the XML doc in memory.
If the catalog is composed of a single (complex) top level 
catalog.xml file then I would have to know that and just ask for 
the top level catalog.
(Searching the entire catalog from the top down for my dataset 
doesn't seem to work either...)
I'm sorry, I'm having a hard time following here. What are you 
trying to do and why?
For any request that is looking for one of the OPeNDAP data responses 
I need to search the THREDDS catalog for the dataset, and if found, I 
need to extract any metadata that may be in the catalog for that dataset.
Is the problem that you may not know if the dataset is contained in 
a catalog generated because of a datasetScan element or contained 
directly in one of the THREDDS config catalogs?
I think that's a separate issue.
All of these methods of writing and organizing catalogs are 
legitimate in THREDDS, and users writing THREDDS catalogs would 
likely employ one or more of these methods when writing their 
catalogs.
I propose that the THREDDS API be extended so that one can simply 
ask the DataRootHandler for an InvDataset or an InvCatalog. Like:
    InvDataset ds = drh.getDataSet("wcs/MODIS/foo.nc");
    InvCatalog cat = drh.getCatalog("wcs/MODIS/");
or possibly the InvDataset that represents a collection:
    InvDataset coll = drh.getDataSet("wcs/MODIS/");
If the DataRootHandler doesn't have it, return null.
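The lookup-or-null contract proposed here could be sketched as follows. PathRegistry is a hypothetical class used only to show the behavior, and it is not part of the TSF. A real DataRootHandler would presumably fill such a table from the catalogs it parses at start up, which is an assumption on my part.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry illustrating lookup by path with a null result for unknown paths. */
final class PathRegistry<T> {
    private final Map<String, T> byPath = new HashMap<>();

    /** Records an item (a dataset or a catalog, say) under its path. */
    void register(String path, T item) {
        byPath.put(path, item);
    }

    /** Returns the item registered under the path, or null if the path is unknown. */
    T get(String path) {
        return byPath.get(path);
    }
}
```

The caller then needs no knowledge of catalog file names or directory layout, only the path and a null check.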
Is that unreasonable?
I'll have to take a closer look at this.
Ethan
Nathan
= 
Nathan Potter                        ndp at opendap.org
OPeNDAP, Inc.                        541.752.1852
=============================================================================== 
To unsubscribe thredds, visit:
http://www.unidata.ucar.edu/mailing-list-delete-form.html
=============================================================================== 
--
Ethan R. Davis                                Telephone: (303) 497-8155
Software Engineer                             Fax:       (303) 497-8690
UCAR Unidata Program Center                   E-mail:    edavis@xxxxxxxx
P.O. Box 3000
Boulder, CO  80307-3000                       http://www.unidata.ucar.edu/
--------------------------------------------------------------------------- 