bpo-36274: Encode request lines with surrogate escapes #12315

tipabu · 2019-03-13T23:59:51Z

While this is out of spec according to RFC 7230 (which limits
expected octets to some subset of ASCII), it is often useful to
be able to mimic an out-of-spec client when testing a server or
application.

Don't use Latin-1 (though that would be in keeping with how we
handle headers and bodies) to encourage callers to write
RFC-compliant clients. Rather, use surrogate escape sequences
('\udc80' - '\udcff') to increase friction while still
allowing out-of-spec requests to be expressible.
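
For illustration, a minimal sketch of the encoding behavior described above, assuming the request line is encoded with the `ascii` codec and the `surrogateescape` error handler. The helper name `encode_request_line` is hypothetical and is not the code in this PR's diff.

```python
def encode_request_line(request_line: str) -> bytes:
    """Hypothetical helper: ASCII-encode, mapping '\\udc80'-'\\udcff' to bytes 0x80-0xff."""
    return request_line.encode("ascii", "surrogateescape")


# Plain ASCII passes through unchanged.
plain = encode_request_line("GET /index.html HTTP/1.1")

# Surrogate escapes become the raw high bytes (here the UTF-8 bytes of "é").
escaped = encode_request_line("GET /caf\udcc3\udca9 HTTP/1.1")

# Ordinary non-ASCII text is still rejected, so a caller has to opt in
# deliberately by spelling out the escapes.
try:
    encode_request_line("GET /caf\u00e9 HTTP/1.1")
    rejected = False
except UnicodeEncodeError:
    rejected = True
```

With Latin-1, the last call would succeed silently; with surrogate escapes, only strings that already contain the escape code points get through.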

This is the second fix proposed in the bug report; the first was submitted as #12314 so reviewers can decide between fixes.

https://bugs.python.org/issue36274

ZackerySpytz · 2019-07-08T05:41:30Z

Lib/test/test_httplib.py

I believe there is too much duplication in the unit tests for both of your pull requests. Please use a loop or two (like what is done in test_invalid_headers()).
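
A rough sketch of what a loop-based test could look like, assuming `unittest`'s `subTest()` is used to keep the cases separate. The test class, the cases, and the `encode_request_line` helper are hypothetical and are not taken from `Lib/test/test_httplib.py`.

```python
import unittest


def encode_request_line(request_line):
    # Hypothetical stand-in for the encoding step under test.
    return request_line.encode("ascii", "surrogateescape")


class RequestLineEncodingTests(unittest.TestCase):
    def test_surrogate_escaped_request_lines(self):
        cases = (
            ("GET / HTTP/1.1", b"GET / HTTP/1.1"),
            ("GET /\udc80 HTTP/1.1", b"GET /\x80 HTTP/1.1"),
            ("GET /\udcff HTTP/1.1", b"GET /\xff HTTP/1.1"),
        )
        for text, expected in cases:
            with self.subTest(text=text):
                self.assertEqual(encode_request_line(text), expected)

    def test_unescaped_non_ascii_is_rejected(self):
        for text in ("GET /\xe9 HTTP/1.1", "GET /\u20ac HTTP/1.1"):
            with self.subTest(text=text):
                with self.assertRaises(UnicodeEncodeError):
                    encode_request_line(text)
```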

tipabu · Jul 8, 2019

Sure; done. Any preference on which approach to take, though?

jaraco

This change looks good to me and is my preferred option from those presented. The only thing I'd like to see is some explanation in the tests linking to the justification (already made in the issue tracker).

jaraco · 2019-09-11T11:42:12Z

After reviewing this request with @ericsnowcurrently, we've decided that this approach is dangerous in that it has the potential to expose users unexpectedly to non-compliant behavior, whereas currently they are assured compliance. In particular, if a user had input from a source where it was surrogate-escaped non-ASCII Unicode, the request would currently be rejected but now will be accepted. Instead, we would like to see a more explicit opt-in, such as through a separate method or through a setting on the call and/or client object.

As a result, I'll be closing this PR and the alternate and will follow up in the bug.
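
To make the concern concrete, here is a small sketch of how surrogate-escaped text can reach a client without the caller asking for it. The byte string and the way the request line is assembled are assumptions for illustration only.

```python
# Bytes from some external source (a filename, say) that are not valid UTF-8.
raw = b"/files/\xff\xfe"

# Decoding with surrogateescape, as is commonly done for filesystem paths,
# yields a str that carries the bad bytes as '\udcff' and '\udcfe'.
path = raw.decode("utf-8", "surrogateescape")
request_line = f"GET {path} HTTP/1.1"

# Strict ASCII encoding rejects this request line.
try:
    request_line.encode("ascii")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# With surrogateescape the same line is accepted, and the out-of-spec
# bytes go onto the wire unchanged.
lenient = request_line.encode("ascii", "surrogateescape")
```

The caller never wrote an escape sequence by hand, yet the request would be sent. That is the case an explicit opt-in is meant to guard against.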

the-knights-who-say-ni added the CLA signed label Mar 13, 2019

bedevere-bot added the awaiting review label Mar 13, 2019

auvipy approved these changes May 31, 2019

bedevere-bot added awaiting core review and removed awaiting review labels May 31, 2019

ZackerySpytz reviewed Jul 8, 2019

tipabu force-pushed the bpo-36274-surrogate-escape branch from d57b342 to 0573040 July 8, 2019 18:09

jaraco approved these changes Sep 11, 2019

bedevere-bot added awaiting merge and removed awaiting core review labels Sep 11, 2019

jaraco closed this Sep 11, 2019

jaraco mentioned this pull request Sep 11, 2019

bpo-36274: Encode request lines as Latin-1 #12314

Closed

tipabu deleted the bpo-36274-surrogate-escape branch July 17, 2020 18:54
