Errors in the IO threads can cause everything to hang, still #351

@bitprophet

Running into this under #350 - if the encoding code changes in a way that throws errors within the IO loops, the whole run tends to hang until Ctrl-C'd.

Feels like the same problem as #289 (comment), and indeed I'm only able to notice it when I turn debug logging on (which made it very obvious very fast, so yay past me I guess). The difference is that this happens with commands that definitely don't require user input (spec, i.e. our test suite), so something else is going on.

Offhand, I'm wondering if it has something to do with the IO thread no longer consuming from the subprocess' pipe and said pipe filling up, which I think would be another way to keep the child from ever exiting (putting us back in limbo). This sounds super familiar, so I bet it's come up before and I just can't find it when I search.
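To illustrate the suspected mechanism, here is a minimal sketch in plain `subprocess` terms (not Invoke's own code). It assumes the child writes more than the OS pipe buffer holds, which is commonly around 64 KiB on Linux. The helper name `child_blocks_until_drained` and the 1 MiB payload are made up for this illustration.

```python
import subprocess
import sys

# Child writes 1 MiB to stdout. That is assumed to exceed the OS pipe
# buffer, so the write blocks once the pipe is full and nobody reads.
CHILD = "import sys; sys.stdout.write('x' * (1 << 20)); sys.stdout.flush()"


def child_blocks_until_drained(timeout=1.0):
    """Return (stuck, bytes_read, returncode) for an unread, then drained, child."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    try:
        # Nobody is reading the pipe here, simulating a dead IO thread.
        proc.wait(timeout=timeout)
        stuck = False
    except subprocess.TimeoutExpired:
        stuck = True
    # Draining the pipe unblocks the child's write so it can exit.
    data = proc.stdout.read()
    proc.stdout.close()
    returncode = proc.wait(timeout=10)
    return stuck, len(data), returncode
```

If the theory holds, the child should still be running when the timeout expires and should exit cleanly only once the parent reads the pipe.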

I also don't see a great way to actually handle this, because "child process is blocking on our own ability to read from it" is indistinguishable from "child process is sitting around waiting for something legit". Unless we can make Runner.wait check for its IO threads' health and abort early if it notices they've died...will look into that.
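One possible shape for that health check, as a rough sketch only: this is not Invoke's actual `Runner.wait`, and `WatchedThread` and `wait_with_health_check` are hypothetical names. It assumes the IO threads record any exception they die with, so the waiting loop can poll for it instead of blocking on the child forever.

```python
import subprocess
import sys
import threading
import time


class WatchedThread(threading.Thread):
    """Thread that records any exception raised by its target (hypothetical helper)."""

    def __init__(self, target, args=()):
        super().__init__(daemon=True)
        self._fn = target
        self._fn_args = args
        self.exception = None

    def run(self):
        try:
            self._fn(*self._fn_args)
        except Exception as e:
            # Keep the error so the waiting side can notice the thread died.
            self.exception = e


def wait_with_health_check(proc, threads, poll_interval=0.05):
    """Wait for proc to exit, aborting early if any IO thread has died with an error."""
    while proc.poll() is None:
        for t in threads:
            if t.exception is not None:
                # The child may be blocked on a full pipe; kill it rather than hang.
                proc.kill()
                proc.wait()
                raise RuntimeError("IO thread died") from t.exception
        time.sleep(poll_interval)
    return proc.returncode
```

Polling costs a little latency, but it sidesteps having to tell "blocked on us" apart from "legitimately waiting": a dead IO thread is treated as fatal either way. Whether killing the child is the right reaction is a design choice this sketch does not settle.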


Side note: at first I thought it was related to #275, but a) I was unable to recreate the issue there, b) inspecting the behavior of the forked pre-execve code doesn't show it raising exceptions in this case (it's the parent process' IO threads only), and c) I can get this both with pty on and off.
