Hi Jon,
Thanks for these details on your DEWS WCS. I'd seen some of them before
in a paper Jeremy gave us, but after all this discussion of asynchronous
responses it is making a bit more sense to me.
Comments below.
Jon Blower wrote:
Hi all,
As Adit said in an earlier post, we designed and built a WCS with
asynchronous capability in DEWS. Thought it might be useful to
summarize on this list the key features of the design, which borrows
from the WPS spec.
The asynchronous behaviour is specified by two parameters, STORE and
STATUS, which both default to false (meaning that we are
backward-compatible with WCS1.0.0).
STORE=true means "Give me a URL to the data instead of the data itself"
STATUS=true means "Let me monitor the extraction process"
There are three possible behaviours (and one that makes no sense and
is disallowed):
(1) STORE=false and STATUS=false. "Fully-synchronous." The server
waits until the data have been extracted and then replies to the
client with the data as a direct response to the request.
(2) STORE=true and STATUS=false. "Semi-synchronous." The server
waits until the data have been extracted and responds to the client
with a URL to the data file.
(3) STORE=true and STATUS=true. "Asynchronous." The server replies
*immediately* with a document containing a unique job ID. The client
can then poll the server using this job ID to discover information
about the progress of the extraction. When the extraction is complete
the polling results in the URL to the data file being returned to the
client.
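
To make the three behaviours (and the disallowed combination) concrete,
here is a minimal sketch in Python. It is not DEWS code: the function
names and the mode labels returned are assumptions chosen for
illustration, and only the STORE and STATUS parameter names and their
false defaults come from the description above.

```python
def parse_flag(value):
    """Interpret a request parameter as a boolean; a missing value means false."""
    if value is None:
        return False
    return str(value).strip().lower() == "true"


def request_mode(params):
    """Map STORE/STATUS to one of the three behaviours described above.

    The returned labels are illustrative, not part of any spec.
    """
    store = parse_flag(params.get("STORE"))
    status = parse_flag(params.get("STATUS"))
    if status and not store:
        # The combination that "makes no sense and is disallowed".
        raise ValueError("STORE=false with STATUS=true is not allowed")
    if store and status:
        return "asynchronous"       # reply immediately with a job ID
    if store:
        return "semi-synchronous"   # wait, then reply with a URL to the data
    return "fully-synchronous"      # wait, then reply with the data itself
```

A request with neither parameter falls through to the fully-synchronous
case, which matches the backward-compatible default described above.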
One case that seems important to me but is not satisfied by the above is
when the client would prefer the data returned synchronously but still
wants the data even if the server can't return it till later. Or should
that be handled by a combination of the above? Which makes me think that
some kind of negotiation is needed. For instance, what about the client
that wants the data now or within an hour but doesn't want it if it will
take a day?
Perhaps the client could make a fully-synchronous request that returns an
exception (maybe with a new "AsynchronousResponseRequired" code) and then
follow up with an asynchronous request. It would be nice if the client
could get an estimate of how long the asynchronous response might take,
but the exception reports don't really support structured information
beyond the code and location.
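
A rough sketch of that fallback from the client side, in Python.
Everything here is hypothetical: the exception class simply stands in
for the proposed "AsynchronousResponseRequired" code, `send_request` is
a placeholder for whatever actually issues the WCS request, and the
`estimated_seconds` field is exactly the structured information that
exception reports do not currently carry.

```python
class AsynchronousResponseRequired(Exception):
    """Stand-in for the proposed exception code (hypothetical)."""

    def __init__(self, estimated_seconds=None):
        super().__init__("asynchronous response required")
        # Assumed extra field; today's exception reports have no place for it.
        self.estimated_seconds = estimated_seconds


def negotiate(send_request, max_wait_seconds):
    """Try a fully-synchronous request, then fall back to an asynchronous one.

    send_request(store, status) is a caller-supplied function (an assumption)
    that returns the data or a job ID, or raises AsynchronousResponseRequired.
    """
    try:
        return ("data", send_request(store=False, status=False))
    except AsynchronousResponseRequired as exc:
        estimate = exc.estimated_seconds
        if estimate is not None and estimate > max_wait_seconds:
            # The client does not want the data if it will take this long.
            return ("abandoned", None)
        return ("job", send_request(store=True, status=True))
```

With no estimate available the sketch simply goes ahead with the
asynchronous request, which is about all a client could do with the
current exception reports.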
Though this makes me think of "use exceptions only for exceptional
situations" (from "Effective Java", Josh Bloch). So maybe a different
kind of negotiation mechanism would be better.
By the way, what HTTP response code do you guys use for the above
store/status situations? And do you use the 400 (Bad Request) when you
return an exception? As I work on implementations, I keep wishing the
WCS (and other OGC) spec(s) were more clear about HTTP response codes.
So, as we continue with this asynchronous response discussion, I'd like
to make sure we have some detail on the HTTP response codes.
Jon, this leads me to an earlier comment of yours,
"I'm not sure about the use of HTTP response codes and RESTful
paradigms to manage the asynchronous download (I'm a fan of REST in
general by the way). I would recommend thinking carefully about the
complexities that this design would impose on the design of clients
(the same goes for a "serverDecide" option in the asynchronous
parameter of the WCS request)."
Can you explain this a bit more? Since we are working on top of HTTP,
this seems like a natural way to go. Do you see the complexity arising
from: 1) the lack of detail on the content of a 202 (Accepted) response,
2) the client having to check the response code before deciding how to
deal with the response, or 3) something else?
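
For what it's worth, option 2 looks fairly small on the client side.
Here is a sketch in Python, assuming a mapping that neither the WCS spec
nor the DEWS description actually defines: 200 for a completed response,
202 for an accepted asynchronous job, and 400 for an exception report.
The action names are made up for illustration.

```python
from http import HTTPStatus


def next_action(status_code):
    """Decide how a client should treat a response, given its HTTP status code.

    The code-to-action mapping is an assumption, not taken from any spec.
    """
    if status_code == HTTPStatus.OK:
        return "read-body"              # data, or a URL to the data
    if status_code == HTTPStatus.ACCEPTED:
        return "poll"                   # job accepted; check back later
    if status_code == HTTPStatus.BAD_REQUEST:
        return "read-exception-report"  # service exception in the body
    return "fail"                       # anything else is unexpected here
```

The harder part is item 1: what the body of the 202 response should
contain so the client knows where and how to poll.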
STORE=false, STATUS=true makes no sense and is disallowed (server
responds with an error).
The server can respond with an error if it does not wish to satisfy a
fully- or semi-synchronous request (because the data extraction will
take too long, for example). We chose this design instead of a
"serverDecide" option because it simplifies the client (the client
always knows what kind of response to expect).
For me, the three opposing forces, in order of importance, are: 1) make
sure the spec is clear; 2) allow for simple, easy to implement, clients;
3) allow for simple, easy to implement, servers. I worry that striving
for a simple client could make the spec less understandable by those
that will be
As Adit says, the format of the status documents was inspired by the
WPS ExecuteResponse document. I think we diverged from this for
reasons of expediency, but I think the ER format is logical: a large
data extraction is conceptually the same as a long-running processing
job.
I haven't looked at the ExecuteResponse document yet but I'm all for
using an existing spec.
There will be many possible designs but at least we know this one
works (at least for us!) and it is close to an existing OGC spec
(WPS). The DEWS WCS is a "reference implementation" for this design
and we're happy to share the code (Java web app).
Thanks again for going into some details on the DEWS WCS implementation.
Thanks everyone for a great discussion.
Ethan
Hope this helps,
Jon
--
Ethan R. Davis                       Telephone: (303) 497-8155
Software Engineer                    Fax:       (303) 497-8690
UCAR Unidata Program Center          E-mail:    edavis@xxxxxxxx
P.O. Box 3000
Boulder, CO 80307-3000               http://www.unidata.ucar.edu/