helm: Allow configuring agent runtime#1697
Conversation
2020cb9 to
734c8a7
Compare
There was a problem hiding this comment.
Pull request overview
This PR aims to make Helm-installed agents configurable to run with the Go runtime (while keeping python as the per-agent default), and adjusts default resource requests/limits to better fit local/kind clusters.
Changes:
- Added
spec.runtimeto multiple agentagent.yamltemplates (defaulting to"python"). - Introduced a new runtime-related value in agent chart
values.yamlfiles. - Reduced default resource requests/limits in the
helm/kagentparent chart and set agent runtime defaults togothere.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| helm/kagent/values.yaml | Sets agent runtimes to go and reduces resources for agents and kagent-tools. |
| helm/agents/promql/values.yaml | Adds a runtime-related value/comment to the promql agent chart values. |
| helm/agents/promql/templates/agent.yaml | Adds spec.runtime templating (defaults to python). |
| helm/agents/observability/values.yaml | Adds a runtime-related value/comment to the observability agent chart values. |
| helm/agents/observability/templates/agent.yaml | Adds spec.runtime templating (defaults to python). |
| helm/agents/kgateway/values.yaml | Adds a runtime-related value/comment to the kgateway agent chart values. |
| helm/agents/kgateway/templates/agent.yaml | Adds spec.runtime templating (defaults to python). |
| helm/agents/k8s/values.yaml | Adds a runtime-related value/comment to the k8s agent chart values. |
| helm/agents/k8s/templates/agent.yaml | Adds spec.runtime templating (defaults to python). |
| helm/agents/istio/values.yaml | Adds a runtime-related value/comment to the istio agent chart values. |
| helm/agents/istio/templates/agent.yaml | Adds spec.runtime templating (defaults to python). |
| helm/agents/helm/values.yaml | Adds a runtime-related value/comment to the helm agent chart values. |
| helm/agents/helm/templates/agent.yaml | Adds spec.runtime templating (defaults to python). |
| helm/agents/cilium-policy/values.yaml | Adds a runtime-related value/comment to the cilium-policy agent chart values. |
| helm/agents/cilium-policy/templates/agent.yaml | Adds spec.runtime templating (defaults to python). |
| helm/agents/cilium-manager/values.yaml | Adds a runtime-related value/comment to the cilium-manager agent chart values. |
| helm/agents/cilium-manager/templates/agent.yaml | Adds spec.runtime templating (defaults to python). |
| helm/agents/cilium-debug/values.yaml | Adds a runtime-related value/comment to the cilium-debug agent chart values. |
| helm/agents/cilium-debug/templates/agent.yaml | Adds spec.runtime templating (defaults to python). |
| helm/agents/argo-rollouts/values.yaml | Adds a runtime-related value/comment to the argo-rollouts agent chart values. |
| helm/agents/argo-rollouts/templates/agent.yaml | Adds spec.runtime templating (defaults to python). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
EItanya
left a comment
There was a problem hiding this comment.
I like the addition of runtime to the agent charts, but the changes to the parent chart are mostly breaking so I don't think we can make those
| version: ${VERSION} | ||
| repository: file://../tools/grafana-mcp | ||
| condition: tools.grafana-mcp.enabled, agents.observability-agent.enabled | ||
| condition: grafana-mcp.enabled, observability-agent.enabled |
There was a problem hiding this comment.
Why did this get updated? IIRC this happened before and was incorrect
There was a problem hiding this comment.
Because otherwise you can't pass in the values for the sub charts.
If you keep the values aligned with the names of the subcharts, those values can be passed to the subcharts, so users can override e.g. resources etc.
There was a problem hiding this comment.
Also see the values.yaml basically all values under
grafana-mcp (the name of the dependency) will propagate to the nested charts.
There was a problem hiding this comment.
FYI. Would be great if this sort of change could at least be highlighted in the release notes since this is ultimately a breaking change. I certainly didn't expect that my current Helm values setup would need to be rewritten in a patch release that doesn't even mention changes to Helm values - "Allow configuring agent runtime" doesn't really highlight this 😄
Because otherwise you can't pass in the values for the sub charts.
To this point... This is incorrect - yes you can still pass values through to the sub-chart. The defaults here (e.g. resources) were not being passed through. But there was nothing stopping anyone from configuring their own values.yaml correctly with something like
sub-chart:
runtime: "go" Also, FYI, the side-effect of moving enabled under sub-chart: is that now, that key/value pair enabled: true will always be passed through to the sub-chart as well.
| limits: | ||
| cpu: 1000m | ||
| memory: 1Gi | ||
| k8s-agent: |
There was a problem hiding this comment.
It is a mandatory change as otherwise the values will not propagate to the subcharts. The current value structure is broken.
If we want to be able to override values of the subcharts we need name the values at the root.
There was a problem hiding this comment.
Also see the Chart.yaml.
Values will only propagate to the nested/dependency charts if the keys in the yaml here match the name/alias of a chart dependency.
So if we have a dependency -name: k8s-agent then the value structure here must be k8s-agent: in order for the values to propagate. agents.k8s-agent will not propagate. This is simply how Helm works.
There was a problem hiding this comment.
Tested on main, and subchart values are currently broken and not propagated properly when they are set here so these changes make sense to me
|
Thanks for the questions in the review. I see the helm chart isn't following helm practices and has quite a lot more flaws in the design. Without changing value structure, none of the values would propagate to the subcharts. E.g. I'm even thinking to drop the subcharts approach and make all part of the same chart. Then we don't have to change the value structure. We would end up with a single helm chart. Using Helm templates to loop over the agents. I can also implement it that way. Simplifying the helm chart. Making the things more idiomatic etc. Let me know. I'm planning to contribute more and to address a whole bunch of gaps. So we can do a couple more production grade configurations. Also would love to improve the Kagent cli to make it more useable, similar like the cilium cli e.g. |
0cb4c5b to
8454497
Compare
|
This PR is looking great, the only thing I'll say is that I don't think we should change the default runtime yet. We're planning to do that in |
jmhbh
left a comment
There was a problem hiding this comment.
Tested this locally, works great!
| limits: | ||
| cpu: 1000m | ||
| memory: 1Gi | ||
| k8s-agent: |
There was a problem hiding this comment.
Tested on main, and subchart values are currently broken and not propagated properly when they are set here so these changes make sense to me
|
@EItanya tell me what to change, or make suggestions. Too prevent more back and forth... Would love to get this in so my deployment start working and I can contribute in the next improvements. |
For example, you set the runtime to all agents in the |
Signed-off-by: Marco Franssen <marco.franssen@gmail.com>
Signed-off-by: Marco Franssen <marco.franssen@gmail.com>
Signed-off-by: Marco Franssen <marco.franssen@gmail.com>
in 0.10 the swap to go is planned Co-authored-by: Marco Franssen <marco.franssen@gmail.com> Signed-off-by: Marco Franssen <marco.franssen@gmail.com>
af953ce to
c8e116d
Compare
|
All feedback handled @EItanya, removed the runtime override for now. |
This PR allows to configure the agent runtime as Go. In the agent.yaml templates I kept the default as python.
In the kagent chart I set the runtime by default to Go. It needs much less memory and CPU which also allows better use on local kind clusters which are more resource constrainted.
This way people can more easily run and try kagent on their local system. The resources are still overprovisioned to leave some room for more heavy use. Real production use can still override the values to higher numbers.