Issue7606
Created on 2009-12-30 21:46 by pitrou, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| xmlrpc_server_ascii_traceback.patch | vstinner, 2010-01-31 02:31 | |
| Messages (13) | |||
|---|---|---|---|
| msg97063 | Author: Antoine Pitrou (pitrou) |
Date: 2009-12-30 21:46 | |
I configured my buildbot to use a non-ascii path to the interpreter and
test_xmlrpc fails as follows:
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 59091)
Traceback (most recent call last):
File
"/home/buildbot/cpython-ucs4-nonascii-€/3.1.pitrou-ubuntu-wide/build/Lib/xmlrpc/server.py",
line 448, in do_POST
size_remaining = int(self.headers["content-length"])
ValueError: invalid literal for int() with base 10: 'I am broken'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File
"/home/buildbot/cpython-ucs4-nonascii-€/3.1.pitrou-ubuntu-wide/build/Lib/socketserver.py",
line 281, in _handle_request_noblock
self.process_request(request, client_address)
File
"/home/buildbot/cpython-ucs4-nonascii-€/3.1.pitrou-ubuntu-wide/build/Lib/socketserver.py",
line 307, in process_request
self.finish_request(request, client_address)
File
"/home/buildbot/cpython-ucs4-nonascii-€/3.1.pitrou-ubuntu-wide/build/Lib/socketserver.py",
line 320, in finish_request
self.RequestHandlerClass(request, client_address, self)
File
"/home/buildbot/cpython-ucs4-nonascii-€/3.1.pitrou-ubuntu-wide/build/Lib/socketserver.py",
line 614, in __init__
self.handle()
File
"/home/buildbot/cpython-ucs4-nonascii-€/3.1.pitrou-ubuntu-wide/build/Lib/http/server.py",
line 352, in handle
self.handle_one_request()
File
"/home/buildbot/cpython-ucs4-nonascii-€/3.1.pitrou-ubuntu-wide/build/Lib/http/server.py",
line 346, in handle_one_request
method()
File
"/home/buildbot/cpython-ucs4-nonascii-€/3.1.pitrou-ubuntu-wide/build/Lib/xmlrpc/server.py",
line 472, in do_POST
self.send_header("X-traceback", traceback.format_exc())
File
"/home/buildbot/cpython-ucs4-nonascii-€/3.1.pitrou-ubuntu-wide/build/Lib/http/server.py",
line 410, in send_header
self.wfile.write(("%s: %s\r\n" % (keyword, value)).encode('ASCII',
'strict'))
UnicodeEncodeError: 'ascii' codec can't encode character '\u20ac' in
position 93: ordinal not in range(128)
----------------------------------------
======================================================================
FAIL: test_fail_with_info (test.test_xmlrpc.FailingServerTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File
"/home/buildbot/cpython-ucs4-nonascii-€/3.1.pitrou-ubuntu-wide/build/Lib/test/test_xmlrpc.py",
line 555, in test_fail_with_info
p.pow(6,8)
xmlrpc.client.ProtocolError: <ProtocolError for 127.0.0.1:57828/RPC2:
500 Internal Server Error>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File
"/home/buildbot/cpython-ucs4-nonascii-€/3.1.pitrou-ubuntu-wide/build/Lib/test/test_xmlrpc.py",
line 562, in test_fail_with_info
self.assertTrue(e.headers.get("X-traceback") is not None)
AssertionError: False is not True
----------------------------------------------------------------------
|
|||
| msg97064 | Author: Martin v. Löwis (loewis) |
Date: 2009-12-30 22:03 | |
> self.send_header("X-traceback", traceback.format_exc())
That's fairly tricky. send_header expects two strings (bytes are
not acceptable), and also requires these strings to be ASCII.
This is why it breaks: format_exc returns a non-ASCII string.
I see two options:
a) allow non-Unicode values for keyword and value in send_header,
and have xmlrpc.server encode the header itself, or
b) properly MIME-encode value if it contains non-ASCII characters
(keyword really must be ASCII, I think).
Not sure whether there is any precedent for UTF-8 in HTTP
headers.
|
|||
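The failure analysed in msg97064 can be reproduced in isolation. The sketch below is illustrative only: `format_header_line` is a hypothetical helper that copies the encoding expression visible in the traceback from `http/server.py`, and the sample path is a shortened stand-in for the buildbot path.

```python
def format_header_line(keyword: str, value: str) -> bytes:
    # Same expression as in the traceback: strict ASCII encoding of the header line.
    return ("%s: %s\r\n" % (keyword, value)).encode("ASCII", "strict")


# A pure-ASCII value is encoded without trouble.
ok = format_header_line("X-traceback", "plain ascii text")

# A value containing the euro sign (as in the buildbot path) cannot be encoded.
try:
    format_header_line("X-traceback", 'File "/home/buildbot/cpython-\u20ac/server.py"')
except UnicodeEncodeError as exc:
    failure = exc
else:
    failure = None

print(ok)
print(failure)
```

Any fix therefore has to either make the value ASCII before it reaches `send_header`, or change how `send_header` encodes values.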
| msg97068 | Author: R. David Murray (r.david.murray) |
Date: 2009-12-30 23:30 | |
A little googling came up with this page:
http://publib.boulder.ibm.com/infocenter/tivihelp/v2r1/topic/com.ibm.itame.doc/am61_webseal_admin570.htm
Their solution is to URI-encode the UTF-8 encoded data. However, the following article references the RFCs, which appear to call for RFC 2047 (MIME) encoded words:
http://stackoverflow.com/questions/324470/http-headers-encoding-decoding-in-java |
|||
| msg97069 | Author: Antoine Pitrou (pitrou) |
Date: 2009-12-30 23:38 | |
If it's only about transmitting the string representation of the traceback, perhaps we can simply use "replace" or "ignore" as the error handler? |
|||
| msg97071 | Author: Martin v. Löwis (loewis) |
Date: 2009-12-30 23:49 | |
David: I think it's a little bit more complicated. RFC 2616 says that
the value of a header is *TEXT, which is defined as
The TEXT rule is only used for descriptive field contents and values
that are not intended to be interpreted by the message parser. Words
of *TEXT MAY contain characters from character sets other than
ISO-8859-1 only when encoded according to the rules of RFC 2047
So I think send_header should change in the following way:
a) if isinstance(value, bytes): send value as-is
b) if value can be encoded in latin-1: encode in latin-1, then send as-is
c) otherwise: MIME-encode as UTF-8, using the following algorithm
1. count the number of non-ASCII characters by encoding with
ascii/ignore and comparing the result lengths
2. if fewer than 10% of the characters are non-ASCII, use the Q encoding
3. otherwise, use the B encoding
The purpose of the algorithm in c) would be that text containing a few
non-latin characters still comes out right even if the receiver fails to
decode the header.
The same change would also apply to the client-side of sending headers.
On the receiving side, we should offer an option to decode headers (both
for client and server); this should be an option because senders may not
comply with RFC 2616. Reading should then proceed as follows:
1. check whether there are MIME markers in the text
2. if so, MIME-decode
3. if not, decode as latin-1
|
|||
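The sending and receiving rules proposed in msg97071 can be sketched as follows. This is an illustration under stated assumptions, not the change that was eventually committed: the names `q_encode`, `encode_header_value` and `decode_header_value` are hypothetical, the whole value is emitted as one encoded word (the 75-character limit RFC 2047 places on encoded words is ignored), and the standard library's `email.header.decode_header` stands in for the MIME-decoding step.

```python
import base64
import string
from email.header import decode_header

# Bytes that may appear unescaped inside a Q-encoded word (conservative choice).
SAFE = (string.ascii_letters + string.digits + "!*+-/").encode("ascii")


def q_encode(raw: bytes) -> str:
    """RFC 2047 'Q' encoding of raw bytes (hypothetical helper)."""
    out = []
    for byte in raw:
        if byte in SAFE:
            out.append(chr(byte))
        elif byte == 0x20:
            out.append("_")            # space is written as underscore
        else:
            out.append("=%02X" % byte)  # everything else is hex-escaped
    return "".join(out)


def encode_header_value(value) -> bytes:
    # a) bytes are sent as-is
    if isinstance(value, bytes):
        return value
    # b) values representable in latin-1 are sent as latin-1
    try:
        return value.encode("latin-1")
    except UnicodeEncodeError:
        pass
    # c) otherwise MIME-encode as UTF-8, choosing Q or B by the 10% rule
    non_ascii = len(value) - len(value.encode("ascii", "ignore"))
    raw = value.encode("utf-8")
    if non_ascii * 10 < len(value):
        word = "=?utf-8?q?" + q_encode(raw) + "?="
    else:
        word = "=?utf-8?b?" + base64.b64encode(raw).decode("ascii") + "?="
    return word.encode("ascii")


def decode_header_value(raw: bytes) -> str:
    text = raw.decode("latin-1")
    # 1. look for MIME markers; 3. without them, the latin-1 text is the result
    if "=?" not in text or "?=" not in text:
        return text
    # 2. MIME-decode each chunk
    parts = []
    for chunk, charset in decode_header(text):
        if isinstance(chunk, bytes):
            parts.append(chunk.decode(charset or "latin-1"))
        else:
            parts.append(chunk)
    return "".join(parts)


encoded = encode_header_value("build/cpython-\u20ac/Lib")
print(encoded)
print(decode_header_value(encoded))
```

With the Q encoding, the ASCII portion of a mostly-ASCII value stays largely readable even if the receiver never decodes the header, which is the motivation given for step c).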
| msg97072 | Author: Martin v. Löwis (loewis) |
Date: 2009-12-30 23:51 | |
Antoine: sure, to fix the issue at hand, we can work around it. However, the issue of sending non-ASCII headers in HTTP remains, and should also be fixed. |
|||
| msg98593 | Author: STINNER Victor (vstinner) |
Date: 2010-01-31 02:31 | |
#7608 was a duplicate issue. Copy of my message (msg98091):
-----
SimpleXMLRPCRequestHandler.do_POST() writes the traceback in the HTTP header "X-traceback". But an HTTP header value is ASCII only, whereas a traceback can contain any character (e.g. a non-ASCII character from a directory name, as in this issue).
A simple fix would be to use the ASCII charset with the backslashreplace error handler. The attached patch uses:
trace = str(trace.encode('ASCII', 'backslashreplace'), 'ASCII')
Is there an easier method to escape non-ASCII characters without double conversion (unicode->bytes and bytes->unicode)?
-----
I also copied my patch to this issue. |
|||
| msg98594 | Author: STINNER Victor (vstinner) |
Date: 2010-01-31 02:39 | |
pitrou> If it's only about transmitting the string representation of the
pitrou> traceback, perhaps we can simply use "replace" or "ignore" as the error
pitrou> handler?
Both replace and ignore lose information. My patch keeps all information by using backslashreplace. This is consistent with Python's own behaviour: Python writes tracebacks to stderr, which also uses the backslashreplace error handler. |
|||
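The difference between the three error handlers discussed in msg97069 and msg98594 is easy to see on a small example. The snippet below is only a sketch: the sample `trace` string is a made-up, shortened traceback line, and the last conversion is the expression quoted from the attached patch.

```python
# Shortened, made-up traceback line containing the euro sign from the buildbot path.
trace = 'File "/home/buildbot/cpython-\u20ac/xmlrpc/server.py", line 448'

# "replace" substitutes a question mark, so the original character is lost.
replaced = trace.encode("ascii", "replace").decode("ascii")

# "ignore" drops the character entirely.
ignored = trace.encode("ascii", "ignore").decode("ascii")

# "backslashreplace" keeps the code point as an ASCII escape sequence.
escaped = str(trace.encode("ASCII", "backslashreplace"), "ASCII")

print(replaced)
print(ignored)
print(escaped)

# The escaped form now passes the strict ASCII encoding used for header lines.
header_line = ("X-traceback: %s\r\n" % escaped).encode("ASCII", "strict")
```

Only the `backslashreplace` result still tells the reader which character appeared in the path.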
| msg103275 | Author: STINNER Victor (vstinner) |
Date: 2010-04-15 23:20 | |
What do you think about my solution (convert the traceback to ASCII to avoid the encoding issue)? If you would like to support non-ASCII characters in HTTP headers, you should open a new issue. For compatibility, I prefer to use pure ASCII headers because I fear that third-party programs don't support non-ASCII headers. |
|||
| msg103322 | Author: Antoine Pitrou (pitrou) |
Date: 2010-04-16 13:27 | |
> What do you think about my solution (convert the traceback to ASCII to
> avoid the encoding issue)?
It's fine with me. Perhaps you should add a comment to explain why this is necessary. |
|||
| msg103323 | Author: STINNER Victor (vstinner) |
Date: 2010-04-16 13:28 | |
Committed: r80112 (py3k). Waiting for the buildbots before the backport to 3.1. |
|||
| msg103335 | Author: STINNER Victor (vstinner) |
Date: 2010-04-16 15:48 | |
> Committed: r80112 (py3k)
Looks good: r80118 (3.1). |
|||
| msg103382 | Author: STINNER Victor (vstinner) |
Date: 2010-04-17 00:35 | |
If anyone would like to work on non-ASCII HTTP headers, please open a new issue with a pointer to this one. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:55 | admin | set | github: 51855 |
| 2010-04-17 00:35:53 | vstinner | set | messages: + msg103382 |
| 2010-04-16 15:48:46 | vstinner | set | status: open -> closed resolution: fixed messages: + msg103335 |
| 2010-04-16 13:28:37 | vstinner | set | messages: + msg103323 |
| 2010-04-16 13:27:50 | pitrou | set | messages: + msg103322 |
| 2010-04-15 23:20:24 | vstinner | set | messages: + msg103275 |
| 2010-04-13 23:37:47 | vstinner | link | issue8242 dependencies |
| 2010-02-27 14:43:50 | flox | set | nosy: + flox |
| 2010-01-31 02:39:27 | vstinner | set | messages: + msg98594 |
| 2010-01-31 02:31:06 | vstinner | set | files: + xmlrpc_server_ascii_traceback.patch nosy: + vstinner messages: + msg98593 keywords: + patch |
| 2009-12-30 23:51:05 | loewis | set | messages: + msg97072 |
| 2009-12-30 23:49:24 | loewis | set | messages: + msg97071 |
| 2009-12-30 23:38:03 | pitrou | set | messages: + msg97069 |
| 2009-12-30 23:30:32 | r.david.murray | set | nosy: + r.david.murray messages: + msg97068 |
| 2009-12-30 22:03:05 | loewis | set | messages: + msg97064 |
| 2009-12-30 21:46:35 | pitrou | create | |
