@rosa @dhh I can't follow the reasoning here. A Puma plugin is a good idea for a more robust process. If there is a memory leak, Puma would then kill it and keep the app responsive (or not?). This happened a lot with delayed_job, so I can see it happening with solid_queue as well. Background jobs can be a bottleneck for the app too. Is there any other way to keep a supervisor running when a problem occurs? I think the plugin should still be part of the gem and maintained by Rails. |
|
I don't understand what you mean. If there's a memory leak in your job handling, you want that process to be isolated, so it can be restarted independently of the web server. |
|
Yes, but there is no defined process to restart it, since solid_queue is database-backed like delayed_job, is there? With delayed_job I have had issues with the process not restarting in the past. A job can fail silently; sometimes it does not crash but becomes unresponsive and stale. In such cases, the process manager may not detect that the worker is no longer effectively processing jobs, and so does not restart it. Jobs can also pile up in such cases. This Puma plugin seems to address this problem. It checks for timeouts and memory leaks and kills the process when necessary to keep the app responsive. (At least that was my understanding so far; it has a better chance of detecting a failure.) Therefore I am not sure isolation provides more resilience here. I feel there is a benefit to depending on the web server (Puma specifically), and that this is part of a bigger background job problem rather than a question of whether Rails favors a specific server. |
I'm not sure using a database as storage plays a role in this 🤔 At least I can't see why that'd be the case.
It doesn't do that. It just checks if the Puma process is still running, and if it's not, it stops itself as well. Conversely, if Solid Queue dies, Puma is stopped as well. It can't know if a particular worker has been processing a job for a long time without making progress (for example, if the job has an infinite loop). The same goes for memory usage. It'd be nice to have the supervisor monitor memory usage and the time consumed by each job, certainly something for the future, but this is all independent of the Puma plugin. |
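For readers following along, the liveness check described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `process_alive?` and `watch_peer` are hypothetical names, and the polling interval is an assumption. It only shows the idea of "check whether the other process still exists and stop when it is gone".

```ruby
# Hypothetical helper (not taken from the plugin): returns true if a process
# with the given PID currently exists.
def process_alive?(pid)
  Process.kill(0, pid) # signal 0 only checks for existence; nothing is delivered
  true
rescue Errno::ESRCH
  false                # no such process
rescue Errno::EPERM
  true                 # process exists but belongs to another user
end

# Hypothetical watchdog: polls the peer process and runs the given shutdown
# block once the peer is gone. The interval is an assumed value.
def watch_peer(peer_pid, interval: 2)
  Thread.new do
    sleep interval while process_alive?(peer_pid)
    yield
  end
end
```

A check like this only sees whether the process exists. It says nothing about a worker stuck in an infinite loop or one that is using too much memory, which is the distinction made in the comment above.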
Since background jobs are not held in memory as they would be with Sidekiq, jobs can pile up in the database and crash the app completely. There is a vulnerability here. There is a strong chance that the process manager does not restart solid_queue.
Yes, the plugin itself does not do that, but Puma does, doesn't it? When the thread running the background job has a problem, Puma could kill that thread. If solid_queue is isolated, Puma would never see it.
I might be wrong, but it seems like Puma already does that for solid_queue with this plugin. Therefore I suggest keeping this plugin as part of the gem for now. |
|
I don't think we have a shared understanding here. There's no path where database jobs "piling up" will crash the app. You'll just have a backlog. Anyway, we're going to proceed with a single way of operating SQ. Appreciate your input, but going to move on from this discussion at this time. We have a lot of work to do to get 1.0 out the door, and we're going to focus on that. |
|
Crash in the sense that, say, an email application is no longer sending emails because jobs are piling up. And with a huge backlog, the app cannot recover anytime soon (which is a vulnerability). At that point, it does not matter that the pages are still being served when the core components of the app, which depend on background jobs, are no longer running. There is currently no process that detects that. I had assumed that the Puma plugin was originally meant for that and that this could be monitored from there (possibly in async mode; now that I've checked the code, this is not the case). I still think this could be monitored from a web server, and in the worst case the plugin should stop the app completely when a job is stale or slow-running, so that an unbounded backlog of jobs is not created. |
|
Sorry, one last input here. Here is an example, so hopefully it will make more sense. The plugin could be useful for something like this, where we monitor each job within a thread and restart it when needed: |
After discussing with DHH, we don't want to be tied to a particular server and this won't be the recommended way to run Solid Queue with Rails, so just remove the provided plugin. It can still be used but it just won't be provided directly by the gem.
|
After reviewing the setup flow further, we're going to keep this puma plugin to ensure that out-of-the-box deployment on a single server doesn't require any form of setup or complication. For scale and performance, it's obviously going to be a bit compromised, but we have a very clear path to move to a dedicated job server in the new stock config/deploy.yml setup and we'll control this plugin activation via an ENV switch. |
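As a rough illustration of the ENV switch mentioned above, the plugin could be gated like this. The variable name `SOLID_QUEUE_IN_PUMA` and the helper `run_solid_queue_in_puma?` are assumptions made for this sketch; the comment above does not name the switch.

```ruby
# Assumed switch name; the comment above does not say what the variable is called.
SWITCH = "SOLID_QUEUE_IN_PUMA"

# Hypothetical helper: true when the switch is set to any non-blank value.
def run_solid_queue_in_puma?(env = ENV)
  !env[SWITCH].to_s.strip.empty?
end

# In config/puma.rb the plugin line would then be conditional, for example:
#   plugin :solid_queue if run_solid_queue_in_puma?
```

On a single server the switch would be set so jobs run alongside Puma. On a deployment with a dedicated job server it would be left unset on the web hosts.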
After discussing with @dhh, we don't want to be tied to a particular server here, and this won't be the recommended way to run Solid Queue with Rails, so just remove the provided plugin. It can still be used but it just won't be provided directly by the gem.