-
Notifications
You must be signed in to change notification settings - Fork 384
Closed
Labels
Description
Background
This is a follow-on from #289 and also encompasses some or all of #241, #274, #262, #332, #242, #321, and #338. My goal is to make this general story better, ASAP, while leaving remaining work clearly defined (regardless of who ends up taking a swing at it).
I need to doublecheck some of those linked issues that I've not spent a lot of time on, but the bulk of discussion is in #289 and #274, much of it driven by @pfmoore who (as a Python 3 on Windows user) serves as a very competent & useful counterweight to myself (a default-Python-2 on Unix developer). Any solution that works for both of us is likely one that serves most users.
The basics that fall out of what's been said there so far:
- Where & how bytes/strings are decoded/encoded, needs to be better defined/documented/contracted;
- Encoding/decoding should be pushed as far out as possible, to the "edges" of the code, to limit how often it happens & becomes an issue;
- But also centralized further - the code handling it needs to be clearly labeled & no faffing about should happen outside that one area.
- The default behavior should encompass as many use cases as is reasonable to do, and should degrade as gracefully as it can otherwise (re: exceptions, vs replacements, vs ???);
- But users should also be given the ability to work around that default cleanly & fully whenever it doesn't function for their use case.
- Ideally, we'll balance these two things well instead of using "the user can twiddle the knobs we provided" as an excuse to let the default behavior suffer. But this is hard and time is short, so...
TODO
- Triple check existing locations where we explicitly or implicitly encode/decode, and move them all somewhere centralized & obvious (without otherwise altering them).
- Right now the only explicit mentions of
(encode|decode)\(are a bare handful inrunner.py, so there's likely others that are implicit instead. - Not sure what the "most obvious" place is, but given most of this is directly tied to
run()and friends, it may still just live in theRunnerclass orrunner.pygenerally.
- Right now the only explicit mentions of
- Identify & implement additional tests (integration or regular) for use cases not represented by the
funky_characters_in_stdoutintegration tests- Those tests should have caught some of the basic e.g.
cat tree -> kaboombugs that snuck in during 0.11.x, but did not. - I identified the basic reason why & have updated it so they fail again, in a branch.
- But they only represent a single slice of the problem - look at the discussion in Fixed a number of encoding issues (in
runcontext) #274, as well as the other top level issues above, and see what other tests can be added that currently fail. - Ideally at least some of these would go in the regular, not integration, suite.
- Those tests should have caught some of the basic e.g.
- Get those tests passing, either based off of Fixed a number of encoding issues (in
runcontext) #274 or with something new - If not implied in above: probably makes sense to add more encoding config toggles (maybe up to "one for each stream") so users can override them granularly; the single
encodingis nice and all, but won't work in cases where e.g. one's invoke-invoking terminal is Encoding A but the environment the subprocess is operating in is using Encoding B. - Handle, or surface into this or other tickets, any TODOs added in the branch left undone
- See if @pfmoore has time for immediate feedback to those changes (sadly I wasn't able to give him a thumbs up for his offer of making more PRs on his end, back in Feb) just in case.
- Re: comments below, update
read_proc_outputso we always expect bytes:- Enhance docstring(s) to make it clear that any file-like objects given explicitly, need to yield bytes
- Remove the "if" statement
- Update our test suites to use
BytesIOinstead ofStringIO(& any related string literal changes, or maybe a wrapper if lazy).
- Update changelog entry for any fully-closed tickets & their authors