Re: [wcsplus] asynchronous response [was: Re: WCS 1.0+]

To: Stefano Nativi <nativi@xxxxxxxxxxx>
Subject: Re: [wcsplus] asynchronous response [was: Re: WCS 1.0+]
From: Ethan Davis <edavis@xxxxxxxxxxxxxxxx>
Date: Thu, 11 Oct 2007 23:04:41 -0600

Hi Stefano,

Stefano Nativi wrote:

Hi Ethan,

Sorry for the delay of my answer; I had to prepare my term in Padua.

I'll try to answer inline.
I agree "Asynchronous Access" is not very clear. "Store" is much clearer. Does "Persistent Storage" add some specific meaning for you to "storage"?
We used the term "persistent" to express the storage capability to last a certain time period, allowing more than one download. Perhaps "persistent" is too strong.

I think "storage" already implies keeping items for some amount of time. Is the key point for you the time period it is stored ("persistent", or "long-term"?), that it doesn't go away after the first access (?), or that more than one client can request the stored data ("shared storage"?).

I agree that "Push" capabilities might be handy. However, I think it is loaded with difficulties. The main issue is that there will be many firewall issues to get around on the client side. Most firewalls are set up to allow outgoing HTTP requests and incoming responses but not incoming requests. The other issue is that it means a client needs to also be an HTTP server so it can accept HTTP POST requests. If there are a lot of clients trying to receive PUSH responses, we are back to the firewall, which may only have one or two machines, each with one or two ports open for HTTP. There are ways to deal with all these issues (e.g., you mention the possibility of an upload server (proxy?) that then deals with the client), but the whole PUSH issue adds a lot of complexity to the already quite complex asynchronous issue. Because of this, I think we (WCS 1.0+) should skip the PUSH issue for now.
First of all, we agree that the "push" approach should be avoided for now.
As for your comment, it is possible to use a proxy application server to avoid the firewall issue.


Agreed on both counts.

One comment on the response codes: I think we should use the 201 (Created) HTTP response code rather than 302 (Found) for the immediate/store case. I think the meaning of the 201 code (the request has caused a new resource to be created and it can be found at the given URI) more closely matches this case than the meaning of the 302 code (the resource is not at the request URI but can be found at the given redirect URI). A subtle difference perhaps, but I think it is important to be careful that our mapping matches the standard meaning of the HTTP response codes.
Actually there's a subtle difference here; it mainly depends on the interface used. Let's consider the following use cases:
a) case where to use "302 Found" with Location: U2
    > I send a GET request to URI U1 to retrieve the U1 resource
    > the U1 resource representation is available at URI U2
    > the authoritative address of the resource is still U1 (clients must send their following requests to URI U1)
b) case where to use "201 Created" with Location: U2
    > I send a POST request to URI U1 to create a new resource U2
    > the created U2 resource is available at URI U2
    > the authoritative address of the resource is U2 (clients must send their following GET requests to URI U2)
In summary, a GET request should be used to retrieve a "representation" of an existing resource, while POST is used to "create" a new resource along with its authoritative address.
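
The two use cases above boil down to one client-side rule: which URI should follow-up requests go to? A minimal sketch of that rule follows, in Python. The helper name `next_request_uri` and the example URIs are hypothetical illustrations, not part of any WCS draft.

```python
from http import HTTPStatus


def next_request_uri(request_uri: str, status: int, location: str) -> str:
    """Return the URI a client should use for its follow-up requests.

    Hypothetical helper that only covers cases (a) and (b) above.
    """
    if status == HTTPStatus.FOUND:
        # Case (a), 302: the representation is at `location` for now,
        # but the request URI (U1) stays the authoritative address.
        return request_uri
    if status == HTTPStatus.CREATED:
        # Case (b), 201: a new resource was created and `location` (U2)
        # is its authoritative address.
        return location
    raise ValueError(f"status {status} is not covered by this sketch")
```

For example, `next_request_uri("http://example.org/U1", 201, "http://example.org/U2")` gives back the U2 address, while the same call with 302 gives back U1.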

That is a pretty strict interpretation of the line between resource and representation and between GET and POST. Since a WCS response can be a quite complex (possibly never to be repeated) "representation" of a resource (subset, remap, interpolate, etc.), I really think of each one as a new resource rather than a representation.

Also, for a stored WCS response, clients should be able to share the resulting URI with others and access it multiple times. The HTTP spec is very clear that the 302 response should not be cached or used multiple times, that any repeated access should go to the original URI.


So, I still feel that 201 is a more appropriate fit for this case.

I don't understand your delayed/non-stored/pull case. Isn't it implicitly the same as the delayed/stored/pull case? The server starts processing on the first request, ignores any requests till it is done, and once finished stores the data till it is requested again. Why not use the 202 response to send information about when to check again and all the other stuff recommended as content in the 202 response body?
The difference is about the redirection aspect:

a) delayed/non-stored/pull case:
    > client requests U1 resource
    > server returns the 202 Accepted (with status?)
    > .....
    > client requests U1 resource
    > server returns the 200 OK with the resource representation
    > following requests require the entire processing again.

b) delayed/stored/pull case:
    > client requests U1 resource
    > server returns the 302 Found with Location: U2
    > client requests U2 resource
    > server returns the 202 Accepted (with status?)
    > .....
    > client requests U2 resource
    > server returns the 200 OK with the resource representation
    > following requests may be directed to U2, accessing the existing resource representation (persistent store).
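
To make the difference between the two flows concrete, here is a toy in-memory simulation in Python. It is only a sketch: `SketchServer`, `fetch`, and the literal URIs "U1" and "U2" are hypothetical, and it assumes that processing starts on the first request that is not answered with a redirect, which the flows above do not specify.

```python
class SketchServer:
    """Toy in-memory server; not a real WCS implementation."""

    def __init__(self, stored: bool, polls_until_done: int = 1):
        self.stored = stored
        self.polls_until_done = polls_until_done
        self.processing_runs = 0  # how many times the full job was started
        self._pending = {}        # uri -> polls left before the job is done
        self._store = {}          # uri -> stored result (stored case only)

    def get(self, uri: str):
        """Return (status, value); value is a Location URI or a body."""
        if self.stored and uri == "U1":
            return 302, "U2"                  # case (b): redirect to U2
        if uri in self._store:
            return 200, self._store[uri]      # served from persistent store
        if uri not in self._pending:
            self.processing_runs += 1         # start a new processing job
            self._pending[uri] = self.polls_until_done
            return 202, "processing"
        self._pending[uri] -= 1
        if self._pending[uri] > 0:
            return 202, "processing"
        del self._pending[uri]
        result = f"coverage for {uri}"
        if self.stored:
            self._store[uri] = result         # keep it for later requests
        return 200, result


def fetch(server: SketchServer, uri: str):
    """Follow one redirect, then poll until the server answers 200."""
    status, value = server.get(uri)
    if status == 302:
        uri = value
        status, value = server.get(uri)
    while status == 202:
        status, value = server.get(uri)
    return uri, value
```

Fetching U1 twice from `SketchServer(stored=False)` starts the processing twice. With `SketchServer(stored=True)` the first fetch ends at U2, and later requests to U2 are served from the store without another processing run.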

I'm still not sure I understand. What happens if two users make the same request around the same time? Does the server have to do the same processing twice? Why would anyone prefer the delayed/non-stored/pull case over delayed/stored/pull?

Ah ha. Upon re-reading the "202 Accepted" section of the HTTP spec, I realize that there is nothing in the spec that says anything about the results of the accepted processing. The 202 response seems to have been targeted only at requests for processing where knowing it has been completed is all that is important. Not, as I have interpreted it, that processing is done and may have resulted in a new resource (all encoded in the body of the response or the results of a status monitor). I think our interpretation of the 202 response is the root of the difference in some of our responses.

Though I still find the 202 response the cleanest mapping to an asynchronous response. Whether the accepted processing results in an externally accessible artifact or not, the 202 response seems to capture what is going on. It is up to the body of the 202 response and any response to the "status monitor" to communicate information about any artifacts of the accepted processing.
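
As a rough illustration of what "the body of the 202 response" and a "status monitor" response could carry, here is a small Python sketch. Everything in it is an assumption made for illustration: the JSON encoding, the field names (`status`, `monitor`, `retryAfterSeconds`, `result`), and the function names are hypothetical and are not defined by the HTTP spec or by any WCS draft.

```python
import json


def accepted_response(monitor_uri: str, retry_after_seconds: int):
    """Build a hypothetical 202 response pointing at a status monitor."""
    body = {
        "status": "accepted",
        "monitor": monitor_uri,                    # where to check progress
        "retryAfterSeconds": retry_after_seconds,  # hint for when to check
    }
    return 202, {"Content-Type": "application/json"}, json.dumps(body)


def monitor_response(done: bool, result_uri=None):
    """Build a hypothetical status-monitor response.

    When the processing left an externally accessible artifact,
    its URI is reported in the (assumed) "result" field.
    """
    body = {"status": "completed" if done else "processing"}
    if done and result_uri is not None:
        body["result"] = result_uri
    return 200, {"Content-Type": "application/json"}, json.dumps(body)
```

A client would read `monitor` from the 202 body, poll that URI, and look for `result` once `status` is `completed`. If the processing produced no artifact, the monitor simply reports completion without a `result` field.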

In my opinion, this discussion and the related documentation is very interesting and I'd like to consolidate it in an OGC discussion paper. What do you think? Who is interested in co-authoring this document?

I'm not familiar with OGC discussion papers but I'm ok with consolidating the discussion for review by others outside the WCS 1.0.0+ group.


Ethan


--Stefano


--
Ethan R. Davis                                Telephone: (303) 497-8155
Software Engineer                             Fax:       (303) 497-8690
UCAR Unidata Program Center                   E-mail:    edavis@xxxxxxxx
P.O. Box 3000
Boulder, CO  80307-3000                       http://www.unidata.ucar.edu/
---------------------------------------------------------------------------
