Re: [wcsplus] Design of asynchronous request in DEWS WCS

To: Jon Blower <jdb@xxxxxxxxxxxxxxxxxxxx>
Subject: Re: [wcsplus] Design of asynchronous request in DEWS WCS
From: Ethan Davis <edavis@xxxxxxxxxxxxxxxx>
Date: Fri, 26 Oct 2007 16:38:13 -0600

Hi Jon,

Thanks for these details on your DEWS WCS. I'd seen some of these
details before in a paper Jeremy gave us but after all this asynchronous
discussion it is making a bit more sense to me.


Comments below.

Jon Blower wrote:

Hi all,

As Adit said in an earlier post, we designed and built a WCS with
asynchronous capability in DEWS.  Thought it might be useful to
summarize on this list the key features of the design, which borrows
from the WPS spec.

The asynchronous behaviour is specified by two parameters, STORE and
STATUS, which both default to false (meaning that we are
backward-compatible with WCS1.0.0).

STORE=true means "Give me a URL to the data instead of the data itself"
STATUS=true means "Let me monitor the extraction process"

There are three possible behaviours (and one that makes no sense and
is disallowed):
(1) STORE=false and STATUS=false.  "Fully-synchronous."  The server
waits until the data have been extracted and then replies to the
client with the data as a direct response to the request.
(2) STORE=true and STATUS=false.  "Semi-synchronous."  The server
waits until the data have been extracted and responds to the client
with a URL to the data file.
(3) STORE=true and STATUS=true.  "Asynchronous."  The server replies
*immediately* with a document containing a unique job ID.  The client
can then poll the server using this job ID to discover information
about the progress of the extraction.  When the extraction is complete
the polling results in the URL to the data file being returned to the
client.
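
As an illustration only (this is not the DEWS code), the mapping from
the two parameters to the three behaviours could be sketched in Java as
below. The class, enum, and method names are hypothetical; the only
facts taken from the description above are that both parameters default
to false and that STATUS=true without STORE=true is rejected.

```java
// Hypothetical sketch: names are illustrative, not taken from DEWS.
final class AsyncModes {
    /** The three allowed behaviours. */
    enum Mode { FULLY_SYNCHRONOUS, SEMI_SYNCHRONOUS, ASYNCHRONOUS }

    private AsyncModes() {}

    /** An absent (null) parameter takes the default value, false. */
    static boolean parseFlag(String value) {
        return value != null && value.trim().equalsIgnoreCase("true");
    }

    /** Maps the raw STORE and STATUS values to a behaviour. */
    static Mode classify(String store, String status) {
        boolean storeFlag = parseFlag(store);
        boolean statusFlag = parseFlag(status);
        if (!storeFlag && statusFlag) {
            // The one combination that makes no sense: nothing to monitor
            // if the data come back in the response itself.
            throw new IllegalArgumentException(
                "STORE=false with STATUS=true is not allowed");
        }
        if (!storeFlag) {
            return Mode.FULLY_SYNCHRONOUS;
        }
        return statusFlag ? Mode.ASYNCHRONOUS : Mode.SEMI_SYNCHRONOUS;
    }
}
```

A request that omits both parameters classifies as FULLY_SYNCHRONOUS,
which is what keeps such a server compatible with plain WCS 1.0.0
clients.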

One case that seems important to me but is not satisfied by the above is
when the client would prefer the data returned synchronously but still
wants the data even if the server can't return it till later. Or should
that be handled by a combination of the above? Which makes me think that
some kind of negotiation is needed. For instance, what about the client
that wants the data now or within an hour but doesn't want it if it will
take a day?

Perhaps a fully-synchronous request that returns an exception (maybe
with a new "AsynchronousResponseRequired" code) followed by an
asynchronous request. It would be nice if the client could get an
estimate of how long the asynchronous response might take but the
exception reports don't really support structured information beyond the
code and location.

Though this makes me think of "use exceptions only for exceptional
situations" (from "Effective Java", Josh Bloch). So maybe a different
kind of negotiation mechanism would be better.

By the way, what HTTP response code do you guys use for the above
store/status situations? And do you use the 400 (Bad Request) when you
return an exception? As I work on implementations, I keep wishing the
WCS (and other OGC) spec(s) were more clear about HTTP response codes.
So, as we continue with this asynchronous response discussion, I'd like
to make sure we have some detail on the HTTP response codes.


Jon, this leads me to an earlier comment of yours,

   "I'm not sure about the use of HTTP response codes and RESTful
   paradigms to manage the asynchronous download (I'm a fan of REST in
   general by the way). I would recommend thinking carefully about the
   complexities that this design would impose on the design of clients
   (the same goes for a "serverDecide" option in the asynchronous
   parameter of the WCS request)."

Can you explain this a bit more? Since we are working on top of HTTP,
this seems like a natural way to go. Do you see the complexity arising
from: 1) the lack of detail on the content of a 202 (Accepted) response,
2) the client having to check the response code before deciding how to
deal with the response, or 3) something else?
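
To make point 2 concrete, here is a minimal Java sketch of the branch a
client would need if response codes carried the distinction. The mapping
is purely an assumption for discussion (200 for data in the body, 202
for an accepted asynchronous job, 400 for an exception report); nothing
in this thread settles which codes a server should use, and the class
and enum names are hypothetical.

```java
// Hypothetical sketch: the status-code mapping is assumed, not specified.
final class ResponseDispatch {
    /** What the client does next with the response. */
    enum Action { READ_DATA, POLL_LATER, READ_EXCEPTION_REPORT, FAIL }

    private ResponseDispatch() {}

    /** Chooses the next step from the HTTP status code alone. */
    static Action actionFor(int httpStatus) {
        switch (httpStatus) {
            case 200: return Action.READ_DATA;             // data (or URL) in the body
            case 202: return Action.POLL_LATER;            // job accepted, not finished
            case 400: return Action.READ_EXCEPTION_REPORT; // body holds the exception
            default:  return Action.FAIL;                  // anything else is unexpected
        }
    }
}
```

The branch itself is small; the open question is what the body of each
response contains, which is point 1.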

STORE=false, STATUS=true makes no sense and is disallowed (server
responds with an error).

The server can respond with an error if it does not wish to satisfy a
fully- or semi-synchronous request (because the data extraction will
take too long, for example).  We chose this design instead of a
"serverDecide" option because it simplifies the client (the client
always knows what kind of response to expect).

For me, the three opposing forces, in order of importance, are: 1) make
sure the spec is clear; 2) allow for simple, easy to implement, clients;
3) allow for simple, easy to implement, servers. I worry that striving
for a simple client could make the spec less understandable by those
that will be

As Adit says, the format of the status documents was inspired by the
WPS ExecuteResponse document.  I think we diverged from this for
reasons of expediency, but I think the ER format is logical: a large
data extraction is conceptually the same as a long-running processing
job.

I haven't looked at the ExecuteResponse document yet but I'm all for
using an existing spec.

There will be many possible designs but at least we know this one
works (at least for us!) and it is close to an existing OGC spec
(WPS).  The DEWS WCS is a "reference implementation" for this design
and we're happy to share the code (Java web app).


Thanks again for going into some details on the DEWS WCS implementation.

Thanks everyone for a great discussion.

Ethan

Hope this helps,
Jon


--
Ethan R. Davis                                Telephone: (303) 497-8155
Software Engineer                             Fax:       (303) 497-8690
UCAR Unidata Program Center                   E-mail:    edavis@xxxxxxxx
P.O. Box 3000
Boulder, CO  80307-3000                       http://www.unidata.ucar.edu/
---------------------------------------------------------------------------
