Further adventures with dead IO threads and threading timeouts

This is a follow-up to #351, specifically:

- The general cause of that issue was errors within IO or other worker threads causing them to stop consuming from a subprocess' pipe, which in turn locked the subprocess once the pipe filled, which then locked up the main Invoke process (as the subprocess never exits).
- The fix was to add a 1s timeout for (non-stdin) IO thread joins, which appeared sufficient in our test cases.
- Integration tests were added to exhibit this, specifically barfing out `/usr/share/dict/words` (~2.4 MB of text)
    - On the side (later? can't recall, too lazy to look up) another test was added that incidentally exposes the same issue - one that deals with Watcher errors, and since Watchers are run in the stdout-handling worker thread body (`Runner._handle_output`), exceptions from them will have the same effect as an unexpected IO-handling exception.
- Much later, when performing incidentally similar tests in Fabric 2, I found that large output was getting mysteriously truncated. Immediately thought of "oh this must be a shutdown/flush problem", and yup...it was that 1s thread join timeout! Setting it to `None` makes the problem mysteriously go away.
    - Generally, the problem is that reading/parsing/etc 2.4 MB of text chunkwise over the network - even to localhost - takes long enough that waiting 1s after process exit is insufficient (In my testing, the thread is naturally joined after about 2-3s, and of course it is reasonable to expect even larger remote jobs to take even longer to wrap up.)
- So how to reconcile these two things?
    - Is there a better way to deal with the thread death than this timeout? Feels like there should be; we have 'dead thread' detection so `wait()` ought to be able to continue to cleanup even if the subprocess hasn't finished...
     - Otherwise, we need to deal with it as a regular timeout issue of making it configurable and probably having Fabric 2 default it to a longer time, like 5 or 10 seconds.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Further adventures with dead IO threads and threading timeouts #432

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Further adventures with dead IO threads and threading timeouts #432

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions