XHR LC comments

Julian Reschke

2008-05-04 09:47:13 UTC

Review of <http://www.w3.org/TR/2008/WD-XMLHttpRequest-20080415/>.

General points:

a) I'm confused about the approach to this document. On the one hand,
we're being told that it can't define anything not already in use (and
that new stuff belongs into XHR2), on the other hand it relies on HTML5,
which is a moving target. It's good that this stuff is being written
down, but if it relies on HTML5, I'd propose to consider other
publication options.

b) Algorithms: the spec uses a method to describe algorithms that IMHO
is extremely hard to read (see for instance send() method). This may be
good for implementors, but seems to be bad for everybody else.
Minimally, the lists should be structured for better readability.

c) Structure: It would be nice if Section 4 had more structure. Right
now it's ugly to navigate and refer to.

2.1 Dependencies

"DOM

A conforming user agent must support some subset of the
functionality defined in DOM Events and DOM Core that this specification
relies upon. [DOM2Events] [DOM3Core]"

That reads a bit strange. Must the subset be non-empty?

2.2 Terminology

"Two URIs are same-origin if after performing scheme-based normalization
on both URIs as described in section 5.3.3 of RFC 3987 the scheme, ihost
and port components are identical. If either URI does not have an ihost
component the URIs must not be considered same-origin. [RFC3987]"

Why are we referring to the IRI spec (RFC 3987) when talking about URIs,
as defined in RFC 3986?
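
For readers following along, the same-origin rule quoted above can be sketched in JavaScript. This is only an illustration: it leans on the `URL` class (which lowercases the scheme and host and drops default ports) as a stand-in for the RFC 3987 normalization the draft cites, and the name `isSameOrigin` is made up for this sketch.

```javascript
// Hypothetical helper: compares scheme, host and port of two absolute URLs.
// Assumption: the URL class's built-in normalization stands in for the
// "scheme-based normalization" the draft refers to.
function isSameOrigin(a, b) {
  const ua = new URL(a);
  const ub = new URL(b);
  // A URL without a host component (e.g. file: or data:) is never same-origin.
  if (ua.hostname === "" || ub.hostname === "") {
    return false;
  }
  return ua.protocol === ub.protocol &&
         ua.hostname === ub.hostname &&
         ua.port === ub.port;
}
```

With this sketch, `http://example.org/a` and `HTTP://EXAMPLE.ORG:80/b` compare as same-origin, while two `file:` URLs do not.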

3. Security Considerations

"Apart from requirements affecting security made throughout this
specification implementations may, at their discretion, not expose
certain headers, such as HttpOnly cookies."

"...such as headers containing HttpOnly cookies".

Besides that: this may be a non-optimal example unless we can point to a
definition of "HttpOnly cookies". Can we?

4. The XMLHttpRequest Object

"onreadystatechange of type EventListener

This attribute is an event handler DOM attribute and must be
invoked whenever a readystatechange event is targated at the object."

s/targated/targeted/

"If stored method case-insensitively matches CONNECT, DELETE, GET,
HEAD, OPTIONS POST, PUT, TRACE, or TRACK let stored method be the
canonical uppercase form of the matched method name."

- missing comma after OPTIONS
- TRACK??? There's probably a rationale for that. If there is, please
include it in the spec.
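
The normalization step quoted above can be sketched as follows. The list is copied from the quoted draft text (with the comma after OPTIONS restored); the helper name `normalizeMethod` is an assumption made for illustration, not something the draft defines.

```javascript
// Method names listed in the quoted draft text.
const KNOWN_METHODS = ["CONNECT", "DELETE", "GET", "HEAD", "OPTIONS",
                       "POST", "PUT", "TRACE", "TRACK"];

// Hypothetical helper: returns the canonical uppercase form for a listed
// method and leaves any other method name untouched.
// Assumes ASCII method names, as HTTP tokens are.
function normalizeMethod(method) {
  const upper = method.toUpperCase();
  return KNOWN_METHODS.includes(upper) ? upper : method;
}
```

Under this sketch, `normalizeMethod("get")` yields `"GET"`, while an unlisted method such as `"propfind"` is passed through unchanged.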

"If the user argument was not omitted and is not null let stored user be
user encoded using the encoding specified in the relevant
authentication scheme or UTF-8 if the scheme fails to specify an encoding."

Why is XHR talking about the encoding here? Is "stored user" a string or
a byte array?

(same for password)

"Abort the send() algorithm, set response entity body to "null" and
reset the list of request headers."

This is a very confusing statement, until one realizes that it's in the
description of "open", not "send". It would be good to rephrase this so
it becomes clear that this aborts a *previous* send request.

"If the value argument is null terminate these steps. (Do not raise an
exception.)."

This makes it impossible to set empty headers, which are allowed in
HTTP. Even worse, it silently fails.

This is profiling of the base spec, and I don't see any compelling
reason to do so. Don't do it.

"For security reasons, these steps should be terminated if the header
argument case-insensitively matches one of the following headers:

* Accept-Charset
* Accept-Encoding
* Connection
* Content-Length
* Content-Transfer-Encoding
* Date
* Expect
* Host
* Keep-Alive
* Referer
* TE
* Trailer
* Transfer-Encoding
* Upgrade
* Via "

It's unclear why there's a security reason not to allow things like
"Accept-Charset" or "Accept-Encoding". Please explain.
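
As a rough illustration of the check being discussed, a case-insensitive match against the quoted list might look like this. The names `BLOCKED_HEADERS` and `isBlockedHeader` are hypothetical; the list itself is taken verbatim from the draft text quoted above.

```javascript
// Header names from the quoted draft text, stored in lowercase so the
// comparison is case-insensitive.
const BLOCKED_HEADERS = new Set([
  "accept-charset", "accept-encoding", "connection", "content-length",
  "content-transfer-encoding", "date", "expect", "host", "keep-alive",
  "referer", "te", "trailer", "transfer-encoding", "upgrade", "via",
]);

// Hypothetical helper: true if setRequestHeader() would terminate for this name.
function isBlockedHeader(name) {
  return BLOCKED_HEADERS.has(name.toLowerCase());
}
```

The sketch shows only the mechanics of the match; it says nothing about why each header is on the list, which is the open question here.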

General comment on "setRequestHeader(header, value), method": the way it
is specified makes it impossible for a client to reliably set headers.
We need a way to either retrieve the current value for inspection, or a
way to reset the header. Or both.
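
To make the request concrete, here is a minimal sketch of a header list that supports both inspection and reset. Neither accessor exists in the draft under review; the class name `RequestHeaderList` and its methods are purely hypothetical, and combining repeated values with ", " is an assumption about how a user agent might merge them.

```javascript
// Hypothetical model of a request header list with the two missing operations.
class RequestHeaderList {
  constructor() {
    this.headers = new Map(); // lowercase name -> combined value
  }

  // Models setRequestHeader(): repeated calls combine values (assumed ", ").
  append(name, value) {
    const key = name.toLowerCase();
    const current = this.headers.get(key);
    this.headers.set(key, current === undefined ? value : current + ", " + value);
  }

  // Proposed: retrieve the current value for inspection (null if unset).
  get(name) {
    const value = this.headers.get(name.toLowerCase());
    return value === undefined ? null : value;
  }

  // Proposed: reset the header so a fresh value can be set reliably.
  reset(name) {
    this.headers.delete(name.toLowerCase());
  }
}
```

With only `append`, a caller cannot tell whether a value is already present; `get` and `reset` are the two ways out suggested above.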

"If stored method is GET act as if the data argument is null."

Another case of HTTP/1.1 being profiled. Don't do it.

"Serialize data into a namespace well-formed XML document and encoded
using the encoding given by data.inputEncoding, when not null, or UTF-8
otherwise. Or, if this fails because the Document cannot be serialized
act as if data is null."

Silent failure????

"If no Content-Type header has been set using setRequestHeader() append
a Content-Type header to the list of request headers with a value of
application/xml;charset=charset where charset is the encoding used to
encode the document."

This will result in an invalid Content-Type header if the UA has
initialized the headers with a default (which I think the spec currently
allows; and at least one UA was reported to do). See comment above about
header handling.
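
One reading of this concern can be sketched as follows. The model is hypothetical: it assumes the user agent has pre-populated a default Content-Type that was not set through setRequestHeader(), so the quoted step still appends its own value.

```javascript
// Hypothetical model of the quoted send() step. `headers` is a list of
// [name, value] pairs; `authorSet` holds the lowercase names of headers
// the author set through setRequestHeader().
function appendDefaultContentType(headers, authorSet, charset) {
  if (!authorSet.has("content-type")) {
    headers.push(["Content-Type", "application/xml;charset=" + charset]);
  }
  return headers;
}

// Counts how many Content-Type entries the request would carry.
function countContentType(headers) {
  return headers.filter(([name]) => name.toLowerCase() === "content-type").length;
}

// Assumption: the user agent pre-populated a default before send() ran.
const withUaDefault = appendDefaultContentType(
  [["Content-Type", "text/plain"]], new Set(), "UTF-8");
console.log(countContentType(withUaDefault)); // 2
```

Two Content-Type entries (or one combined "text/plain, application/xml;charset=UTF-8" value) is not a valid Content-Type, which is the failure described above.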

"While downloading the resource the following rules are to be observed."

That reads strange. HTTP requests do not "download" resources. Say
"while executing the request".

"If the user agent supports HTTP State Management it should persist,
discard and send cookies (as received in the Set-Cookie and Set-Cookie2
response headers, and sent in the Cookie header) as applicable. [RFC2965]"

This should probably include a reference to the Set-Cookie (not
Set-Cookie2) spec as well (RFC2109).

"If the user agent implements server-driven content-negotiation it
should set Accept-Encoding and Accept-Charset headers as appropriate; it
must not automatically set the Accept."

s/set the Accept/set the Accept header/

"Responses to such requests must have the content-encodings
automatically decoded. [RFC2616]"

"Such requests" is a bit fuzzy here. Please rephrase in terms of what
the response contains (such as Content-Encoding header being present etc).

"// The following script:
var client = new XMLHttpRequest();
client.open("GET", "test.txt", true);
client.send();
client.onreadystatechange = function() {
  if(this.readyState == 3) {
    print(this.getAllResponseHeaders());
  }
}

// ...should output something similar to the following text:
Date: Sun, 24 Oct 2004 04:58:38 GMT
Server: Apache/1.3.31 (Unix)
Keep-Alive: timeout=15, max=99
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/plain; charset=utf-8"

I think examples like this would be more readable (and take less space)
when using the synchronous mode.
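
A synchronous variant of the quoted example might look roughly like the sketch below. The function name `fetchResponseHeaders` is made up, and the constructor is passed in as a parameter only so the sketch can be exercised outside a browser; in a browser one would pass `XMLHttpRequest`.

```javascript
// Sketch of the quoted example in synchronous mode.
// XHR is an XMLHttpRequest-compatible constructor supplied by the caller.
function fetchResponseHeaders(url, XHR) {
  const client = new XHR();
  client.open("GET", url, false); // third argument false = synchronous
  client.send(null);              // returns only once the response has arrived
  return client.getAllResponseHeaders();
}
```

In the terms of the original example this would be used as `print(fetchResponseHeaders("test.txt", XMLHttpRequest))`, with no readystatechange handler needed. Note that browsers discourage synchronous requests because they block the calling thread.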

"status of type unsigned short, readonly

On getting, if available, it must return the HTTP status code sent by
the server (typically 200 for a successful request). Otherwise, if not
available, the user agent must raise an INVALID_STATE_ERR exception."

This may be incorrect when the UA caches (304 vs 200).

"statusText of type DOMString, readonly

On getting, if available, it must return the HTTP status text sent
by the server (appears after the status code). Otherwise, if not
available, the user agent must raise an INVALID_STATE_ERR exception."

Really? It seems to me that if somebody really implements this, clients
are likely to break. Why not allow an empty string here?

Finally, my other main issue with this spec is that it is silent about the
recommended behaviour for unsafe methods, about which RFC 2616 says in
Section 9.1.1
(<http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.9.1.1>):

"Implementors should be aware that the software represents the user in
their interactions over the Internet, and should be careful to allow the
user to be aware of any actions they might take which may have an
unexpected significance to themselves or others.

In particular, the convention has been established that the GET and HEAD
methods SHOULD NOT have the significance of taking an action other than
retrieval. These methods ought to be considered "safe". This allows user
agents to represent other methods, such as POST, PUT and DELETE, in a
special way, so that the user is made aware of the fact that a possibly
unsafe action is being requested."

Thus, allowing a web page to submit a PUT, POST or DELETE request
without user interaction seems to be a very dangerous thing to me, and
the spec should point that out (see also
<http://ietf.osafoundation.org:8080/bugzilla/show_bug.cgi?id=237>).
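
The distinction RFC 2616 draws can be expressed in a few lines. This is a sketch only: the helper name `isSafeMethod` is hypothetical, and what a user agent or script should do for the unsafe case is exactly the open question raised above.

```javascript
// Per the RFC 2616 Section 9.1.1 text quoted above, GET and HEAD are
// the methods that ought to be considered "safe".
const SAFE_METHODS = new Set(["GET", "HEAD"]);

// Hypothetical helper: true if the method should not have the
// significance of taking an action other than retrieval.
function isSafeMethod(method) {
  return SAFE_METHODS.has(method.toUpperCase());
}
```

A caller could use such a check to decide when a request such as PUT, POST or DELETE deserves explicit user awareness before it is sent.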

Sunava Dutta

2008-05-05 21:58:35 UTC

Julian Reschke wrote:
a) I'm confused about the approach to this document. On the one hand,
we're being told that it can't define anything not already in use (and
that new stuff belongs into XHR2), on the other hand it relies on HTML5,
which is a moving target. It's good that this stuff is being written
down, but if it relies on HTML5, I'd propose to consider other
publication options.

+1. I had suggested something along the lines of not linking to other specifications that are moving targets, but other publication options are fine if we do decide to go this route.

"Ensure all new entities like constants, flags etc. are versioned or in a new object.
The draft frequently cross-references other W3C specifications. For example, the spec makes references to DOM 3 Events and Core, Namespaces in XML, Window Object 1.0 etc. (some of which are drafts themselves). We fail to see the value in implicitly embedding other large specs. Simplicity and standing on its own would be good."

Julian Reschke wrote:

b) Algorithms: the spec uses a method to describe algorithms that IMHO
is extremely hard to read (see for instance send() method). This may be
good for implementors, but seems to be bad for everybody else.
Minimally, the lists should be structured for better readability.

Julian Reschke wrote:
c)
"- TRACK??? There's probably a rationale for that. If there is, please
include it in the spec."

TRACK is unsafe and should be removed. I remember reading about this a while back. Here's something I found on the web: http://www.aqtronix.com/Advisories/AQ-2003-02.txt

Julian Reschke wrote:
d)
"If the user argument was not omitted and is not null let stored user be
user encoded using the encoding specified in the relevant
authentication scheme or UTF-8 if the scheme fails to specify an encoding."

Why is XHR talking about the encoding here? Is "stored user" a string or
a byte array?

(same for password)

+1. I'm not quite sure what this means or why it is relevant.

Julian Reschke wrote:
e)
"For security reasons, these steps should be terminated if the header
argument case-insensitively matches one of the following headers:

* Accept-Charset
* Accept-Encoding
* Connection
* Content-Length
* Content-Transfer-Encoding
* Date
* Expect
* Host
* Keep-Alive
* Referer
* TE
* Trailer
* Transfer-Encoding
* Upgrade
* Via "

It's unclear why there's a security reason not to allow things like
"Accept-Charset" or "Accept-Encoding". Please explain."

+1. I've given this feedback before but haven't heard anything back. We should mention why these are bad and also consider what UAs currently allow.

http://lists.w3.org/Archives/Public/public-webapi/2008Apr/0191.html

Julian Reschke

2008-05-06 07:34:25 UTC

Sunava,

it would be helpful if you'd use a mail client that can properly quote
:-) In your mail your text appears as if it was indirectly quoted by
myself... I have reformatted your reply so it becomes clear again who
said what.

Post by Sunava Dutta
c)
"- TRACK??? There's probably a rationale for that. If there is, please
include it in the spec."

TRACK is unsafe and should be removed. I remember reading about this a while back. Here's something I found on the web: http://www.aqtronix.com/Advisories/AQ-2003-02.txt

That implies that Microsoft closed the vulnerability with IIS 6.0, so
I'm not entirely sure why a spec in last call in 2008 needs to speak
about it.

There are surely other old servers that have other vulnerabilities that
could be exploited using XHR; should we consider all of these?

That being said, I'm ok with *mentioning* the issue somewhere, but just
enumerating TRACK along with TRACE, as if this was a standard HTTP
method, is *highly* confusing.

...

BR, Julian

Sunava Dutta

2008-05-06 18:34:20 UTC

Ahh, I see my mail client can do that. I just need to make a few changes. Having standardized guidance here would be very helpful :-p.

Anne van Kesteren

2008-05-12 15:53:04 UTC

On Sun, 04 May 2008 11:47:13 +0200, Julian Reschke <***@gmx.de> wrote:

Post by Julian Reschke
Review of <http://www.w3.org/TR/2008/WD-XMLHttpRequest-20080415/>.
a) I'm confused about the approach to this document. On the one hand,
we're being told that it can't define anything not already in use (and
that new stuff belongs into XHR2), on the other hand it relies on HTML5,
which is a moving target. It's good that this stuff is being written
down, but if it relies on HTML5, I'd propose to consider other
publication options.

The problem is that concepts such as "origin" and determining the encoding
of a text/html stream are not defined anywhere else. It's not really clear
to me what to do about that.

Post by Julian Reschke
b) Algorithms: the spec uses a method to describe algorithms that IMHO
is extremely hard to read (see for instance send() method). This may be
good for implementors, but seems to be bad for everybody else.
Minimally, the lists should be structured for better readability.

Could you elaborate on what kind of change you envision? I'm not sure how
they are not structured right now.

Post by Julian Reschke
c) Structure: It would be nice if Section 4 had more structure. Right
now it's ugly to navigate and refer to.

This is better in XMLHttpRequest Level 2. I'd rather not revise that entire
section editorially as it might introduce new errors.

Post by Julian Reschke
2.1 Dependencies
"DOM
A conforming user agent must support some subset of the
functionality defined in DOM Events and DOM Core that this specification
relies upon. [DOM2Events] [DOM3Core]"
That reads a bit strange. Must the subset be non-empty?

Yes, as stated it must be a subset that matches what XMLHttpRequest
requires from the eventing and core specifications.

Post by Julian Reschke
2.2 Terminology
"Two URIs are same-origin if after performing scheme-based normalization
on both URIs as described in section 5.3.3 of RFC 3987 the scheme, ihost
and port components are identical. If either URI does not have an ihost
component the URIs must not be considered same-origin. [RFC3987]"
Why are we referring to the IRI spec (RFC 3987) when talking about URIs,
as defined in RFC 3986?

For scheme-based normalization and ihost. Maybe I should use IRI instead
of URI?

Post by Julian Reschke
3. Security Considerations
"Apart from requirements affecting security made throughout this
specification implementations may, at their discretion, not expose
certain headers, such as HttpOnly cookies."
"...such as headers containing HttpOnly cookies".

Done.

Post by Julian Reschke
Besides that: this may be a non-optimal example unless we can point to a
definition of "HttpOnly cookies". Can we?

I don't believe we can, but since this was put in mostly for HttpOnly
cookies I'd rather not remove that. I think it will be clear enough for
people reading the document.

Post by Julian Reschke
4. The XMLHttpRequest Object
"onreadystatechange of type EventListener
This attribute is an event handler DOM attribute and must be
invoked whenever a readystatechange event is targated at the object."
s/targated/targeted/

Already fixed.

Post by Julian Reschke
"If stored method case-insensitively matches CONNECT, DELETE, GET,
HEAD, OPTIONS POST, PUT, TRACE, or TRACK let stored method be the
canonical uppercase form of the matched method name."
- missing comma after OPTIONS

Fixed.

Post by Julian Reschke
- TRACK??? There's probably a rationale for that. If there is, please
include it in the spec.

It's a security issue, as should be clear from the next bullet point.

Post by Julian Reschke
"If the user argument was not omitted and is not null let stored user be
user encoded using the encoding specified in the relevant
authentication scheme or UTF-8 if the scheme fails to specify an
encoding."
Why is XHR talking about the encoding here? Is "stored user" a string or
a byte array?
(same for password)

They're a string (in the API).

Post by Julian Reschke
"Abort the send() algorithm, set response entity body to "null" and
reset the list of request headers."
This is a very confusing statement, until one realizes that it's in the
description of "open", not "send". It would be good to rephrase this so
it becomes clear that this aborts a *previous* send request.

Added a note.

Post by Julian Reschke
"If the value argument is null terminate these steps. (Do not raise an
exception.)."
This makes it impossible to set empty headers, which are allowed in
HTTP. Even worse, it silently fails.

Empty headers can be set using the empty string, no? Not raising an
exception is consistent with implementations and I don't think it matters
much as it doesn't have any effect.
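
The difference between a null value and the empty string can be sketched as follows. The function `setHeader` and the use of a `Map` are hypothetical stand-ins for the quoted setRequestHeader() step, not the draft's actual algorithm.

```javascript
// Hypothetical model of the quoted setRequestHeader() step.
// Returns true if a header was stored, false if the step terminated.
function setHeader(headers, name, value) {
  if (value === null) {
    return false; // terminates silently: no exception, nothing stored
  }
  headers.set(name.toLowerCase(), String(value));
  return true;
}
```

In this model, passing `null` leaves the header list untouched, whereas passing `""` stores a header with an empty value.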

Post by Julian Reschke
This is profiling of the base spec, and I don't see any compelling
reason to do so. Don't do it.

How is it profiling?

Post by Julian Reschke
"For security reasons, these steps should be terminated if the header
argument case-insensitively matches one of the following headers:

* Accept-Charset
* Accept-Encoding
* Connection
* Content-Length
* Content-Transfer-Encoding
* Date
* Expect
* Host
* Keep-Alive
* Referer
* TE
* Trailer
* Transfer-Encoding
* Upgrade
* Via "
It's unclear why there's a security reason not to allow things like
"Accept-Charset" or "Accept-Encoding". Please explain.

This was done based on implementation feedback. I haven't investigated
what the reasons were for the various headers. If implementors read this
maybe they could chime in and point it out.

Post by Julian Reschke
General comment on "setRequestHeader(header, value), method": the way it
is specified makes it impossible for a client to reliably set headers.
We need a way to either retrieve the current value for inspection, or a
way to reset the header. Or both.

http://lists.w3.org/Archives/Public/public-webapi/2008May/0139.html

Post by Julian Reschke
"If stored method is GET act as if the data argument is null."
Another case of HTTP/1.1 being profiled. Don't do it.

This was done on request of implementations.

Post by Julian Reschke
"Serialize data into a namespace well-formed XML document and encoded
using the encoding given by data.inputEncoding, when not null, or UTF-8
otherwise. Or, if this fails because the Document cannot be serialized
act as if data is null."
Silent failure????

Yes.

Post by Julian Reschke
"If no Content-Type header has been set using setRequestHeader() append
a Content-Type header to the list of request headers with a value of
application/xml;charset=charset where charset is the encoding used to
encode the document."
This will result in an invalid Content-Type header if the UA has
initialized the headers with a default (which I think the spec currently
allows; and at least one UA was reported to do). See comment above about
header handling.

Rephrased.

Post by Julian Reschke
"While downloading the resource the following rules are to be observed."
That reads strange. HTTP requests do not "download" resources. Say
"while executing the request".

Thanks, fixed.

Post by Julian Reschke
"If the user agent supports HTTP State Management it should persist,
discard and send cookies (as received in the Set-Cookie and Set-Cookie2
response headers, and sent in the Cookie header) as applicable.
[RFC2965]"
This should probably include a reference to the Set-Cookie (not
Set-Cookie2) spec as well (RFC2109).

I believe it used to do that and it was pointed out that that
specification is not useful in practice and would actually do more harm
than good. I'm not really sure what to do here.

Post by Julian Reschke
"If the user agent implements server-driven content-negotiation it
should set Accept-Encoding and Accept-Charset headers as appropriate; it
must not automatically set the Accept."
s/set the Accept/set the Accept header/

Fixed due to other changes.

Post by Julian Reschke
"Responses to such requests must have the content-encodings
automatically decoded. [RFC2616]"
"Such requests" is a bit fuzzy here. Please rephrase in terms of what
the response contains (such as Content-Encoding header being present
etc).

I simply dropped "to such requests". I hope that's ok.

Post by Julian Reschke
"// The following script:
var client = new XMLHttpRequest();
client.open("GET", "test.txt", true);
client.send();
client.onreadystatechange = function() {
  if(this.readyState == 3) {
    print(this.getAllResponseHeaders());
  }
}

// ...should output something similar to the following text:
Date: Sun, 24 Oct 2004 04:58:38 GMT
Server: Apache/1.3.31 (Unix)
Keep-Alive: timeout=15, max=99
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/plain; charset=utf-8"
I think examples like this would be more readable (and take less space)
when using the synchronous mode.

I would like to avoid encouraging authors to use the synchronous API.

Post by Julian Reschke
"status of type unsigned short, readonly
On getting, if available, it must return the HTTP status code sent by
the server (typically 200 for a successful request). Otherwise, if not
available, the user agent must raise an INVALID_STATE_ERR exception."
This may be incorrect when the UA caches (304 vs 200).

That's why it says typically.

Post by Julian Reschke
"statusText of type DOMString, readonly
On getting, if available, it must return the HTTP status text sent
by the server (appears after the status code). Otherwise, if not
available, the user agent must raise an INVALID_STATE_ERR exception."
Really? It seems to me that if somebody really implements this, clients
are likely to break. Why not allow an empty string here?

This is what clients have implemented as far as I can tell. Though the
HTTP status text could be the empty string, if that's what you mean...

Post by Julian Reschke
Finally, my main other issue with this spec that it is silent about
the recommended behaviour for unsafe methods, about which RFC2616 says
in Section 9.1.1
"Implementors should be aware that the software represents the user in
their interactions over the Internet, and should be careful to allow
the user to be aware of any actions they might take which may have an
unexpected significance to themselves or others.
In particular, the convention has been established that the GET and
HEAD methods SHOULD NOT have the significance of taking an action
other than retrieval. These methods ought to be considered "safe".
This allows user agents to represent other methods, such as POST, PUT
and DELETE, in a special way, so that the user is made aware of the
fact that a possibly unsafe action is being requested."
Thus, allowing a web page to submit a PUT, POST or DELETE request
without user interaction seems to be a very dangerous thing to me, and
the spec should point that out (see also
<http://ietf.osafoundation.org:8080/bugzilla/show_bug.cgi?id=237>).

All requirements from HTTP are taken over unless explicitly stated so I
don't think this is needed.

Kind regards,

--

Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>

Julian Reschke

2008-05-14 13:02:10 UTC

Post by Anne van Kesteren
On Sun, 04 May 2008 11:47:13 +0200, Julian Reschke

Post by Julian Reschke
Review of <http://www.w3.org/TR/2008/WD-XMLHttpRequest-20080415/>.
a) I'm confused about the approach to this document. On the one hand,
we're being told that it can't define anything not already in use (and
that new stuff belongs into XHR2), on the other hand it relies on
HTML5, which is a moving target. It's good that this stuff is being
written down, but if it relies on HTML5, I'd propose to consider other
publication options.

The problem is that concepts such as "origin" and determining the encoding
of a text/html stream are not defined anywhere else. It's not really
clear to me what to do about that.

In some cases, it may be possible to copy the current definition. In
other cases, it may be possible just not to depend on it (for instance,
by not specifying encoding sniffing).

Post by Anne van Kesteren

Post by Julian Reschke
b) Algorithms: the spec uses a method to describe algorithms that IMHO
is extremely hard to read (see for instance send() method). This may
be good for implementors, but seems to be bad for everybody else.
Minimally, the lists should be structured for better readability.

Could you elaborate on what kind of change you envision? I'm not sure
how they are not structured right now.

An example would be steps 8..11 in the description of open():

- these steps deal with credentials, and the whole list would be more
readable if each group of steps that belong together were structured
that way;

- optimally, things like this shouldn't be expressed as a set of
instructions, but in a declarative way.

Post by Anne van Kesteren

Post by Julian Reschke
c) Structure: It would be nice if Section 4 had more structure. Right
now it's ugly to navigate and refer to.

This is better in XMLHttpRequest Level 2. I'd rather not revise that
entire section editorially as it might introduce new errors.

But then, it makes a comparison with XHR2 harder. Please reconsider.

Post by Anne van Kesteren

Post by Julian Reschke
2.1 Dependencies
"DOM
A conforming user agent must support some subset of the
functionality defined in DOM Events and DOM Core that this
specification relies upon. [DOM2Events] [DOM3Core]"
That reads a bit strange. Must the subset be non-empty?

Yes, as stated it must be a subset that matches what XMLHttpRequest
requires from the eventing and core specifications.

Then it would be clearer if it said "the subset" instead of "some subset".

Post by Anne van Kesteren

Post by Julian Reschke
2.2 Terminology
"Two URIs are same-origin if after performing scheme-based
normalization on both URIs as described in section 5.3.3 of RFC 3987
the scheme, ihost and port components are identical. If either URI
does not have an ihost component the URIs must not be considered
same-origin. [RFC3987]"
Why are we referring to the IRI spec (RFC3987) when talking about
URIs, as defined RFC3986?

For scheme-based normalization and ihost. Maybe I should use IRI instead
of URI?

Well, if we're talking about URIs (and I think we are), then we need to
refer to the RFC3986 grammar and comparison rules.
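
To make the comparison under discussion concrete, here is a minimal sketch that treats two URIs as same-origin when scheme, host and port match. It relies on the WHATWG `URL` class found in current JavaScript runtimes, which is an assumption of this example and is neither the RFC 3986 nor the RFC 3987 algorithm; `sameOrigin` and `DEFAULT_PORTS` are hypothetical names that do not appear in the draft.

```javascript
// Hypothetical helper, not taken from the draft.
// Default ports are filled in so that "http://example.org" and
// "http://example.org:80" compare equal (a scheme-based normalization).
const DEFAULT_PORTS = { "http:": "80", "https:": "443" };

function sameOrigin(a, b) {
  // new URL() throws on relative references; both inputs must be absolute.
  const ua = new URL(a);
  const ub = new URL(b);
  // A URI without a host component is never considered same-origin.
  if (ua.hostname === "" || ub.hostname === "") {
    return false;
  }
  const portOf = (u) => u.port || DEFAULT_PORTS[u.protocol] || "";
  return (
    ua.protocol === ub.protocol &&
    ua.hostname === ub.hostname &&
    portOf(ua) === portOf(ub)
  );
}
```

The host comparison is case-insensitive only because the `URL` parser lowercases host names; a definition written purely in terms of RFC 3986 would have to state that normalization step explicitly.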

Post by Anne van Kesteren

Post by Julian Reschke
Besides that: this may be a non-optimal example unless we can point to
a definition of "HttpOnly cookies". Can we?

I don't believe we can, but since this was put in mostly for HttpOnly
cookies I'd rather not remove that. I think it will be clear enough for
people reading the document.

So why don't we refer to the specification for httpOnly? Do you consider
it a problem that it's a Microsoft document?

Post by Anne van Kesteren

Post by Julian Reschke
- TRACK??? There's probably a rationale for that. If there is, please
include it in the spec.

It's a security issue, as should be clear from the next bullet point.

As TRACK doesn't seem to be documented anywhere, and is no longer
implemented in current IIS versions, I'd really like to see this made a
footnote. The way it's written now is just totally confusing to every
reader who doesn't know the full story around it.

Post by Anne van Kesteren

Post by Julian Reschke
"If the user argument was not omitted and is not null let stored user
be user encoded using the encoding specified in the relevant
authentication scheme or UTF-8 if the scheme fails to specify an encoding."
Why is XHR talking about the encoding here? Is "stored user" a string
or a byte array?
(same for password)

They're a string (in the API).

If they are strings, then talking about character encoding doesn't
make any sense here.

Post by Anne van Kesteren

Post by Julian Reschke
"If the value argument is null terminate these steps. (Do not raise an
exception.)."
This makes it impossible to set empty headers, which are allowed in
HTTP. Even worse, it silently fails.

Empty headers can be set using the empty string, no? Not raising an
exception is consistent with implementations and I don't think it
matters much as it doesn't have any effect.

Sorry, was reading one thing, but thinking about something else.

Thinking of it, could you please add a clarification that setting to an
empty string is legal, and MUST NOT be ignored? I recall that
Microsoft's original XHR (ActiveX) implementation got that wrong, not
setting the header at all.
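
The distinction being requested can be sketched with a toy model of the request header list. This is only an illustration of the null versus empty-string cases; it is not the draft's full setRequestHeader() algorithm, and the `headers` array and the function shape are assumptions of the example.

```javascript
// Toy model: `headers` is an array of [name, value] pairs.
function setRequestHeader(headers, name, value) {
  if (value === null) {
    // A null value terminates the steps silently, as the draft says.
    return headers;
  }
  // An empty string is a legal header value and must not be dropped.
  headers.push([name, String(value)]);
  return headers;
}
```

With this model, passing `null` leaves the list untouched, while passing `""` adds a header whose value is the empty string.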

Post by Anne van Kesteren

Post by Julian Reschke
"For security reasons, these steps should be terminated if the header
* Accept-Charset
* Accept-Encoding
* Connection
* Content-Length
* Content-Transfer-Encoding
* Date
* Expect
* Host
* Keep-Alive
* Referer
* TE
* Trailer
* Transfer-Encoding
* Upgrade
* Via "
It's unclear why there's a security reason not to allow things like
"Accept-Charset" or "Accept-Encoding". Please explain.

This was done based on implementation feedback. I haven't investigated
what the reasons were for the various headers. If implementors read this
maybe they could chime in and point it out.

Please. And if they don't, please remove all headers for which nobody
can explain why they are in this list.
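
For reference, the check that the quoted list implies can be sketched as follows. The header names are the ones quoted above; `BLOCKED_HEADERS` and `isBlockedHeader` are hypothetical names, and the case-insensitive match reflects the fact that HTTP header field names are case-insensitive.

```javascript
// Header names from the list quoted above, stored in lower case.
const BLOCKED_HEADERS = new Set([
  "accept-charset", "accept-encoding", "connection", "content-length",
  "content-transfer-encoding", "date", "expect", "host", "keep-alive",
  "referer", "te", "trailer", "transfer-encoding", "upgrade", "via",
]);

// Returns true if setRequestHeader() would terminate for this header name.
function isBlockedHeader(name) {
  return BLOCKED_HEADERS.has(String(name).toLowerCase());
}
```

Removing an entry for which no rationale is found would then be a one-line change to the set.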

Post by Anne van Kesteren

Post by Julian Reschke
General comment on "setRequestHeader(header, value), method": the way
it is specified makes it impossible for a client to reliably set
headers. We need a way to either retrieve the current value for
inspection, or a way to reset the header. Or both.

http://lists.w3.org/Archives/Public/public-webapi/2008May/0139.html

Yes, we continue to disagree on this.

Post by Anne van Kesteren

Post by Julian Reschke
"If stored method is GET act as if the data argument is null."
Another case of HTTP/1.1 being profiled. Don't do it.

This was done on request of implementations.

That's IMHO not sufficient reason to do it. Please add a convincing
rationale, or leave this to the HTTP WG.

Post by Anne van Kesteren

Yes.

Very bad.

Does anybody rely on that? I would be very surprised.

Post by Anne van Kesteren

Post by Julian Reschke
"If no Content-Type header has been set using setRequestHeader()
append a Content-Type header to the list of request headers with a
value of application/xml;charset=charset where charset is the
encoding used to encode the document."
This will result in an invalid Content-Type header if the UA has
initialized the headers with a default (which I think the spec
currently allows; and at least one UA was reported to do). See comment
above about header handling.

Rephrased.

Pointer?

Post by Anne van Kesteren

Post by Julian Reschke
"If the user agent supports HTTP State Management it should persist,
discard and send cookies (as received in the Set-Cookie and
Set-Cookie2 response headers, and sent in the Cookie header) as
applicable. [RFC2965]"
This should probably include a reference to the Set-Cookie (not
Set-Cookie2) spec as well (RFC2109).

I believe it used to do that and it was pointed out that that
specification is not useful in practice and would actually do more harm
than good. I'm not really sure what to do here.

Well, the one that is not used in practice is RFC2965, not RFC2109. That
being said, you probably need to reference both.

Post by Anne van Kesteren

Post by Julian Reschke
var client = new XMLHttpRequest();
client.open("GET", "test.txt", true);
client.send();
client.onreadystatechange = function() {
if(this.readyState == 3) {
print(this.getAllResponseHeaders());
}
}
Date: Sun, 24 Oct 2004 04:58:38 GMT
Server: Apache/1.3.31 (Unix)
Keep-Alive: timeout=15, max=99
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/plain; charset=utf-8"
I think examples like this would be more readable (and take less
space) when using the syncr. mode.

I would like to avoid encouraging authors to use the synchronous API.

Disagreed. I think readability and compactness are more important here.

Post by Anne van Kesteren

Post by Julian Reschke
"status of type unsigned short, readonly
On getting, if available, it must return the HTTP status code sent by
the server (typically 200 for a successful request). Otherwise, if not
available, the user agent must raise an INVALID_STATE_ERR exception."
This may be incorrect when the UA caches (304 vs 200).

That's why it says typically.

Hm, no.

When the UA caches, and the server sent 304, the client will potentially
see a 200. This would contradict what *this* paragraph says.
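
A toy model of the situation, assuming a hypothetical cache sitting between the script and the network (the function name and the `Map`-based cache are inventions of this sketch, not anything the draft defines):

```javascript
// Toy model: what status a script might observe when the UA revalidates
// a cached entry. `cache` is a Map from URL to a stored response body.
function statusSeenByScript(cache, url, serverStatus) {
  if (serverStatus === 304 && cache.has(url)) {
    // The server sent 304, but the UA answers from its cache,
    // so the script may observe 200 instead.
    return 200;
  }
  return serverStatus;
}
```

In this model the value exposed to the script differs from "the HTTP status code sent by the server", which is the mismatch with the quoted paragraph.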

Post by Anne van Kesteren

Post by Julian Reschke
"statusText of type DOMString, readonly
On getting, if available, it must return the HTTP status text
sent by the server (appears after the status code). Otherwise, if not
available, the user agent must raise an INVALID_STATE_ERR exception."
Really? It seems to me that if somebody really implements this,
clients are likely to break. Why not allow an empty string here?

This is what clients have implemented as far as I can tell. Though the
HTTP status text could be the empty string, if that's what you mean...

Does the "if not available" apply to any of the existing
implementations? Why would it be "not available"? Please clarify.

Post by Anne van Kesteren

Post by Julian Reschke
Finally, my main other issue with this spec that it is silent about
the recommended behaviour for unsafe methods, about which RFC2616 says
in Section 9.1.1
"Implementors should be aware that the software represents the user in
their interactions over the Internet, and should be careful to allow
the user to be aware of any actions they might take which may have an
unexpected significance to themselves or others.
In particular, the convention has been established that the GET and
HEAD methods SHOULD NOT have the significance of taking an action
other than retrieval. These methods ought to be considered "safe".
This allows user agents to represent other methods, such as POST, PUT
and DELETE, in a special way, so that the user is made aware of the
fact that a possibly unsafe action is being requested."
Thus, allowing a web page to submit a PUT, POST or DELETE request
without user interaction seems to be a very dangerous thing to me, and
the spec should point that out (see also
<http://ietf.osafoundation.org:8080/bugzilla/show_bug.cgi?id=237>).

All requirements from HTTP are taken over unless explicitly stated so I
don't think this is needed.

Well, the spec repeats lots of things specified somewhere else already.

The warning from the HTTP spec is relevant and should appear here, as
XHR is related to UAs, and existing UAs are known to ignore this
security consideration.
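
The distinction RFC2616 draws can be sketched in a few lines. Only the classification of GET and HEAD as "safe" comes from the quoted text; the names `SAFE_METHODS` and `isSafeMethod` are hypothetical, and what a UA does with the result (prompting, marking the request, and so on) is left open here.

```javascript
// Methods that Section 9.1.1 of RFC2616 says ought to be considered "safe".
const SAFE_METHODS = new Set(["GET", "HEAD"]);

// HTTP method names are case-sensitive, so no case folding is done here.
function isSafeMethod(method) {
  return SAFE_METHODS.has(method);
}
```

A UA following the quoted advice would treat any request for which `isSafeMethod` returns false, such as POST, PUT or DELETE, as a possibly unsafe action.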

BR, Julian

Ian Hickson

2008-05-14 20:18:51 UTC