Hi Stefano,
Stefano Nativi wrote:
Hi Ethan,
Sorry for the delay in answering; I had to prepare my term in Padua.
I'll try to answer inline.
I agree "Asynchronous Access" is not very clear. "Store" is much
clearer. Does "Persistent Storage" add some specific meaning for you
to "storage"?
We used the term "persistent" to express that the stored result lasts
for a certain period of time, allowing more than one download. Perhaps
"persistent" is too strong.
I think "storage" already implies keeping items for some amount of time.
Is the key point for you the time period it is stored ("persistent" or
"long-term"?), that it doesn't go away after the first access, or
that more than one client can request the stored data ("shared storage")?
I agree that "Push" capabilities might be handy. However, I think it
is loaded with difficulties. The main issue is that there will be
many firewall issues to get around on the client side. Most firewalls
are set up to allow outgoing HTTP requests and incoming responses but
not incoming requests. The other issue is that it means a client
needs to also be an HTTP server so it can accept HTTP POST requests.
If there are a lot of clients trying to receive PUSH responses, we
are back to the firewall which may only have one or two machines each
with one or two ports open for HTTP. There are ways to deal with all
these issues (e.g., you mention the possibility of an upload server
(proxy?) that then deals with the client), but the whole PUSH issue adds
a lot of complexity to the already quite complex asynchronous issue.
Because of this, I think we (WCS 1.0+) should skip the PUSH issue for
now.
First of all, we agree that the "push" approach should be avoided for
now.
As for your comment, it is possible to use a proxy application server
to avoid the firewall issue.
Agreed on both counts.
One comment on the response codes: I think we should use the 201
(Created) HTTP response code rather than 302 (Found) for the
immediate/store case. I think the meaning of the 201 code (the
request has caused a new resource to be created and it can be found
at the given URI) more closely matches this case than the meaning of
the 302 code (the resource is not at the request URI but can be found
at the given redirect URI). A subtle difference perhaps but I think
it is important to be careful that our mapping matches the standard
meaning of the HTTP response codes.
Actually there's a subtle difference here; it mainly depends on the
interface used. Let's consider the following use cases:
a) case where to use "302 Found" with Location: U2
> I send to U1 URI a GET request to retrieve the U1 resource
> the U1 resource representation is available at URI U2
> the authoritative address of the resource is still U1 (clients
must send their following requests to U1 URI)
b) case where to use "201 Created" with Location: U2
> I send to U1 URI a POST request for creating a new resource U2
> the created U2 resource is available at U2 URI
> the authoritative address of the resource is U2 (clients must
send their following GET requests to U2 URI)
In summary, a GET request should be used to retrieve a
"representation" of an existing resource, while POST is used to
"create" a new resource along with its authoritative address.
That is a pretty strict interpretation of the line between resource and
representation and between GET and POST. Since a WCS response can be
quite a complex (possibly never to be repeated) "representation" of a
resource (subset, remap, interpolate, etc), I really think of each one
as a new resource rather than a representation.
Also, for a stored WCS response, clients should be able to share the
resulting URI with others and access it multiple times. The HTTP spec is
very clear that the 302 response should not be cached or used multiple
times, that any repeated access should go to the original URI.
So, I still feel that 201 is a more appropriate fit for this case.
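
For reference, the two mappings being compared can be sketched as a toy
dispatcher. This is only an illustration of the 302 and 201 cases as
described in this thread, not a WCS implementation; the function name
respond, the URIs /u1 and /u2, and the (status, headers) return shape are
all assumptions made for the example.

```python
from http import HTTPStatus

# Hypothetical URIs standing in for U1 and U2 from the discussion.
U1 = "/u1"
U2 = "/u2"


def respond(method, uri):
    """Return (status, headers) for the two cases, as a toy illustration."""
    if method == "GET" and uri == U1:
        # 302 case: the representation is at U2, but U1 stays authoritative,
        # so clients keep sending later requests to U1.
        return HTTPStatus.FOUND, {"Location": U2}
    if method == "POST" and uri == U1:
        # 201 case: a new resource was created and U2 is its own address,
        # so clients send later GET requests straight to U2.
        return HTTPStatus.CREATED, {"Location": U2}
    return HTTPStatus.NOT_FOUND, {}
```

Both branches return the same Location header; the only difference is
which URI the client is expected to treat as authoritative afterwards.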
I don't understand your delayed/non-stored/pull case. Isn't it
implicitly the same as the delay/store/pull? The server starts
processing on the first request, ignores any requests till it is
done, and once finished stores the data till it is requested again.
Why not use the 202 response to send information about when to check
again and all the other stuff recommended as content in the 202
response body?
The difference is about the redirection aspect:
a) delayed/non-stored/pull case:
> client requests U1 resource
> server returns the 202 Accepted (with status?)
> .....
> client requests U1 resource
> server returns the 200 OK with the resource representation
> following requests require the entire processing again.
b) delayed/stored/pull case:
> client requests U1 resource
> server returns the 302 Found with Location: U2
> client requests U2 resource
> server returns the 202 Accepted (with status?)
> .....
> client requests U2 resource
> server returns the 200 OK with the resource representation
> following requests may be directed to U2, accessing the
existing resource representation (persistent store).
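
To make the difference between the two flows concrete, here is a small
in-memory simulation. Everything in it is hypothetical: the class names,
the polls_needed parameter (how many polls the processing takes), and the
URIs /u1 and /u2. It does no real HTTP and no waiting between polls; it
only counts how often the expensive processing is started.

```python
class NonStoredServer:
    """Case a): 202 until done, then 200 once; nothing is kept."""

    def __init__(self, polls_needed=2):
        self.polls_needed = polls_needed
        self.polls = 0
        self.runs = 0  # how many times processing was started

    def get(self, uri):
        if self.polls == 0:
            self.runs += 1  # a fresh request starts the processing
        self.polls += 1
        if self.polls <= self.polls_needed:
            return 202, None
        self.polls = 0  # result is handed over and then forgotten
        return 200, "coverage"


class StoredServer:
    """Case b): 302 to /u2, 202 at /u2 until done, then 200 every time."""

    def __init__(self, polls_needed=2):
        self.polls_needed = polls_needed
        self.polls = 0
        self.runs = 0

    def get(self, uri):
        if uri == "/u1":
            if self.runs == 0:
                self.runs += 1  # processing is started only once
            return 302, "/u2"
        self.polls += 1
        if self.polls <= self.polls_needed:
            return 202, None
        return 200, "coverage"  # stored result, served repeatedly


def fetch(server, uri="/u1"):
    """Poll until a 200 arrives, following a redirect if one is offered."""
    while True:
        status, payload = server.get(uri)
        if status == 302:
            uri = payload
        elif status == 200:
            return uri, payload
```

Calling fetch twice on a NonStoredServer starts the processing twice,
while a StoredServer keeps answering from /u2 after a single run.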
I'm still not sure I understand. What happens if two users make the same
request around the same time? Does the server have to do the same
processing twice? Why would anyone prefer the delayed/non-stored/pull
case over delayed/stored/pull?
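
One way a server could avoid doing the same processing twice is to key
stored results on a normalized form of the request. A rough sketch under
that assumption follows; JobRegistry, submit, the /stored/ URI layout, and
the parameter names are invented for illustration, and nothing in this
thread says a WCS server must work this way.

```python
import hashlib
from urllib.parse import urlencode


class JobRegistry:
    """Maps equivalent requests to one shared job URI (hypothetical design)."""

    def __init__(self):
        self.jobs = {}     # request key -> job URI
        self.started = 0   # number of processing runs actually started

    @staticmethod
    def key(params):
        # Sort parameters so that ordering differences do not matter.
        canonical = urlencode(sorted(params.items()))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

    def submit(self, params):
        k = self.key(params)
        if k not in self.jobs:
            self.started += 1              # only the first request triggers work
            self.jobs[k] = "/stored/" + k  # made-up URI layout
        return self.jobs[k]
```

Two clients submitting the same parameters in a different order get the
same URI back, and the started counter stays at one.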
Ah ha. Upon re-reading the "202 Accepted" section of the HTTP spec, I
realize that there is nothing in the spec that says anything about the
results of the accepted processing. The 202 response seems to have been
targeted only at requests for processing where knowing it has been
completed is all that is important. Not, as I have interpreted it, that
processing is done and may have resulted in a new resource (all encoded
in the body of the response or the results of a status monitor). I think
our interpretation of the 202 response is the root of the difference in
some of our responses.
Still, I find the 202 response the cleanest mapping to an
asynchronous response. Whether the accepted processing results in an
externally accessible artifact or not, the 202 response seems to capture
what is going on. It is up to the body of the 202 response and any
response to the "status monitor" to communicate information about any
artifacts of the accepted processing.
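
As a sketch of what such a 202 body might carry, the helper below builds
a response with a pointer to a status monitor and, optionally, the URI of
the artifact the processing will leave behind. The function name, the
field names, and the choice of JSON are all assumptions for illustration;
the HTTP spec only recommends that the body indicate the current status
and give either a status monitor or an estimate of completion.

```python
import json
from http import HTTPStatus


def accepted_response(monitor_uri, retry_after_s, result_uri=None):
    """Build (status, headers, body) for an accepted asynchronous request."""
    body = {
        "status": "processing",
        "monitor": monitor_uri,                # where to ask about progress
        "retry_after_seconds": retry_after_s,  # hint for when to check again
    }
    if result_uri is not None:
        # Only present when the processing leaves a retrievable artifact.
        body["result"] = result_uri
    headers = {"Content-Type": "application/json"}
    return HTTPStatus.ACCEPTED, headers, json.dumps(body)
```

The same 202 status serves both situations; only the body tells the
client whether there will be something to fetch afterwards.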
In my opinion, this discussion and the related documentation are very
interesting and I'd like to consolidate it in an OGC discussion
paper. What do you think? Who is interested in co-authoring this
document?
I'm not familiar with OGC discussion papers but I'm ok with
consolidating the discussion for review by others outside the WCS 1.0.0+
group.
Ethan
--Stefano
--
Ethan R. Davis Telephone: (303) 497-8155
Software Engineer Fax: (303) 497-8690
UCAR Unidata Program Center E-mail: edavis@xxxxxxxx
P.O. Box 3000
Boulder, CO 80307-3000 http://www.unidata.ucar.edu/
---------------------------------------------------------------------------