<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Flant staff on Medium]]></title>
        <description><![CDATA[Stories by Flant staff on Medium]]></description>
        <link>https://medium.com/@flant_com?source=rss-71bfdb9446bd------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*qL904Ae77X5SwVJnQsDg7w.png</url>
            <title>Stories by Flant staff on Medium</title>
            <link>https://medium.com/@flant_com?source=rss-71bfdb9446bd------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Fri, 10 Apr 2026 01:09:38 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@flant_com/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[10 years of werf: The Cloud Native story we made together]]></title>
            <link>https://blog.werf.io/werf-project-history-10-years-f092486e4224?source=rss-71bfdb9446bd------2</link>
            <guid isPermaLink="false">https://medium.com/p/f092486e4224</guid>
            <category><![CDATA[devops]]></category>
            <category><![CDATA[werf]]></category>
            <category><![CDATA[cloud-native]]></category>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[kubernetes]]></category>
            <dc:creator><![CDATA[Flant staff]]></dc:creator>
            <pubDate>Thu, 22 Jan 2026 09:23:23 GMT</pubDate>
            <atom:updated>2026-01-22T09:23:23.726Z</atom:updated>
<content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*NRYkix8Hhz7gRDNMDxGH3g.png" /><figcaption>werf brief stats</figcaption></figure><p>werf’s first commit was on January 22, 2016, and the project is now celebrating its 10th anniversary. To mark the occasion, we decided to take a look at its key moments, milestones, wins, and future plans.</p><p>Here’s a brief timeline:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*zVPtLbNosrmEDmiXNpXHmQ.png" /></figure><p>Below, you can find a more detailed story of how werf evolved.</p><h3>werf v0 (dapp): January 2016 — December 2018</h3><p>The project was originally known as <em>dapp</em>, and it started as a tool created in a DevOps agency for building container images within CI/CD pipelines. Its focus on efficiency made it special: the incremental approach allowed reusing results from previous builds.</p><p>The first version was written in Ruby. We used a Vagrantfile-like <em>Dappfile</em> for defining build configurations, which featured an imperative Ruby-based DSL.</p><pre>dimg &#39;symfony-demo-app&#39; do<br>  docker.from &#39;ubuntu:16.04&#39;<br><br>  git do<br>    add &#39;/&#39; do<br>      to &#39;/demo&#39;<br>      stage_dependencies.before_setup &#39;composer.json&#39;, &#39;composer.lock&#39;<br>    end<br>  end<br><br>  shell do<br>    before_install do<br>      run &#39;apt-get update&#39;,<br>          &#39;apt-get install -y curl php7.0&#39;,<br>          # add the phpapp user<br>          &#39;groupadd -g 242 phpapp&#39;,<br>          &#39;useradd -m  -d /home/phpapp -g 242 -u 242 phpapp&#39;<br>    end<br>    install do<br>      run &#39;apt-get install -y php7.0-sqlite3 php7.0-xml php7.0-zip&#39;,<br>          # install composer<br>          &#39;curl -LsS https://getcomposer.org/download/1.4.1/composer.phar -o /usr/local/bin/composer&#39;,<br>          &#39;chmod a+x /usr/local/bin/composer&#39;<br>    end<br>    
before_setup do<br>      # modify source code permissions and run composer install<br>      run &#39;chown phpapp:phpapp -R /demo &amp;&amp; cd /demo&#39;,<br>          &quot;su -c &#39;composer install&#39; phpapp&quot;<br>    end<br>    setup do<br>      # use the current date as the application version<br>      run &#39;echo `date` &gt; /demo/version.txt&#39;,<br>          &#39;chown phpapp:phpapp /demo/version.txt&#39;<br>    end<br>  end<br><br>  # the port must match the port specified in start.sh<br>  docker.expose 8000<br>end</pre><p>The standout feature was its <em>stage-based build model</em>. Every single instruction created a new Docker image (a “stage”). This whole approach became the basis for build orchestration and efficient caching. Even in its first version, werf/dapp supported building any number of images in parallel using a shared configuration.</p><h4>June 2016: Advanced build debugging tools</h4><p>Early on, dapp gained build debugging tools. The user could hop into a container interactively at any build stage — before instructions were executed, after they completed, or upon an error. This greatly simplified the development and debugging of complex build scenarios and became one of dapp’s key features at the time.</p><h4>July 2016: Chef support for describing assembly instructions</h4><p>As builds grew more complex, it became clear that defining modular build logic in plain shell was a pain. To address this, we introduced support for Chef, a tool widely used in our company at the time.</p><p>This way, we could use Chef recipes during the image build process while ensuring no Chef artifacts or cookbooks cluttered up the final image. 
Thus, dapp gained a powerful mechanism for modularity and build logic reuse:</p><pre>dimg do<br>  docker.from &#39;ubuntu:16.04&#39;<br><br>  git.add(&#39;/&#39;).to(&#39;/app&#39;)<br>  docker.workdir &#39;/app&#39;<br><br>  docker.cmd [&#39;/bin/bash&#39;, &#39;-lec&#39;, &#39;bundle exec ruby app.rb&#39;]<br>  docker.expose 4567<br><br>  chef do<br>    cookbook &#39;apt&#39;<br>    cookbook &#39;rvm&#39;<br><br>    recipe &#39;ruby&#39;<br>    recipe &#39;bundle_gems&#39;<br>    recipe &#39;app_config&#39;<br>  end<br>end</pre><h4>March 2017: Deploying to Kubernetes via Helm</h4><p>Next year, dapp gained the capability to deploy applications to Kubernetes. It worked by calling the system’s Helm, and you just had to have a chart in the .helm directory.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/887/0*zJZ83i7dICH275Dz.png" /><figcaption>Presenting the `dapp deploy` command to deploy Helm charts in Kubernetes (2017)</figcaption></figure><p>But dapp did more than just call Helm. It introduced linking the build and deploy steps right in the templates, using special Helm functions to handle images. We also added a basic way to track resource statuses during a deployment.</p><h4>December 2017: Smart cleanup of the container registry</h4><p>With more and more builds and tags piling up, cleaning up the container registry became a priority. So, at the end of 2017, we added a cleanup feature to dapp. It was a garbage collection system that worked with different policies (like for branches, commits, or tags) and was smart enough to check which images were being used in Kubernetes.</p><p><em>What are the challenges of cleaning up the container registry? 
Read more about our approach </em><a href="https://blog.werf.io/cleaning-up-container-images-with-werf-ec35b5d46569"><em>here</em></a><em>.</em></p><h4>March 2018: Support for YAML configuration and Ansible for describing assembly instructions (Chef gets replaced)</h4><p>At this point, dapp shifted to a more user-friendly and familiar approach by adding YAML support. Moving from the imperative Ruby DSL to a declarative syntax made configurations easier to read, more predictable, and simpler to manage (particularly in complex projects).</p><p>At the same time, we moved away from Chef in favor of Ansible for build instructions. Chef turned out to be too cumbersome and didn’t live up to our expectations. Ansible, on the other hand, let us keep the modular and declarative style we wanted, while making it easier to get started with and more straightforward to run. Here’s how it looked:</p><pre>dimg: ~<br>from: alpine:latest<br>git:<br>- add: /<br>  to: /app<br>  owner: app<br>  group: app<br>  excludePaths:<br>  - public/assets<br>  - vendor<br>  - .helm<br>  stageDependencies:<br>    install:<br>    - package.json<br>    - Bowerfile<br>    - Gemfile.lock<br>    - &quot;app/assets/*&quot;<br>- url: https://github.com/kr/beanstalkd.git<br>  add: /<br>  to: /build<br>ansible:<br>  beforeInstall:<br>  - name: &quot;Create non-root main application user&quot;<br>    user:<br>      name: app<br>      comment: &quot;Non-root main application user&quot;<br>      uid: 7000<br>      shell: /bin/bash<br>      home: /app<br>  - name: &quot;Disable docs and man files installation in dpkg&quot;<br>    copy:<br>      content: |<br>        path-exclude=/usr/share/man/*<br>        path-exclude=/usr/share/doc/*<br>      dest: /etc/dpkg/dpkg.cfg.d/01_nodoc<br>  install:<br>  - name: &quot;Precompile assets&quot;<br>    shell: |<br>      set -e<br>      export RAILS_ENV=production<br>      source /etc/profile.d/rvm.sh<br>      cd /app<br>      bundle exec rake assets:precompile<br>    
args:<br>      executable: /bin/bash</pre><h4>October 2018: Spinning off deployment tracking into kubedog</h4><p>The logic for tracking deployments (including statuses, events, logs, and waiting for resources to become ready) was moved out of dapp and into its own project: <a href="https://github.com/werf/kubedog">kubedog</a>.</p><p>Separating kubedog into a standalone library allowed us to isolate this feature, make it reusable for other tools, and evolve its monitoring capabilities independently of the main product. Over time, kubedog became a key component of the werf ecosystem and was also adopted by the community for other projects.</p><h4>November 2018: Support for secret values and files in deployment configs</h4><p>dapp gained support for secret values and secret files for deployment configurations. That addressed the challenge of securely storing and using sensitive data like tokens, access keys, TLS certificates, and private keys — without having them stored in plain text in Git (or your Helm charts). Secrets were decrypted only at deployment time for use in Kubernetes templates and manifests.</p><h4>December 2018: Seamless migration from Ruby to Go</h4><p>With the project’s growth, expanding use cases, and feedback from our users, it became obvious that the Ruby implementation was holding us back. So, we commenced the migration to Go.</p><p>The migration was done gradually to ensure a smooth experience for users, allowing us to keep developing the product while preserving its stability. In the end, we got rid of the old limitations and made it easier to integrate with the Kubernetes ecosystem.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*xOrL7mjxkqNc7rOs.png" /><figcaption><em>Lines of code in Ruby vs. 
Go in dapp/werf on the way to the v1 release</em></figcaption></figure><h3>werf v1: December 2018 — March 2020</h3><h4>January 2019: <strong>A new name: werf</strong></h4><p>This is when the project got its new name: <em>werf</em>. We invited everyone in the company and the wider community to participate in discussions and voting for the name. We ended up with around 100 suggestions, from sea and pirate themes to more abstract and technical concepts.</p><p>After the vote, three front-runners were identified:</p><ul><li>grog — 32%</li><li>flimb — 29.7%</li><li>werf — 27%</li></ul><p>Although <em>werf</em> did not take first place in the vote, the team ultimately went with it. It just “clicked”: it best reflected the project’s spirit, evoking a place of assembly and creation — a “shipyard” (“werf” in Dutch) — and fit organically into the brand’s vision and future plans. <em>(By the way, note that we prefer to use a lowercase initial when spelling “werf.”)</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/550/1*kCUBiuU43l3QnrNul-qApw.png" /><figcaption>werf: new naming and logo of the project since January 2019</figcaption></figure><h4><strong>January 2019:</strong> The werf update manager (multiwerf)</h4><p><em>multiwerf</em> implemented our approach to handling versions through update channels. Its job was to auto-update werf and, crucially, to isolate a specific version for the active shell session.</p><p>The . $(multiwerf use 1.1 stable --as-file) shell command updated werf automatically in the background from a channel set by the user (<em>stable</em> for v1.1) and “pinned” that version for the active shell session. 
So you could use multiple werf versions on the same machine without any conflicts.</p><h4>January 2019: Availability on all major operating systems</h4><p>This is when werf was made available for all major operating systems, with testing and distribution now covering Linux, macOS, and Windows.</p><h4>April 2019: Switching to our own fork of Helm</h4><p>A major architectural choice for v1 was to build Helm right into the werf binary. Both the Helm client and Tiller ran inside the same werf process during deployment, so you didn’t need to install anything extra in your Kubernetes cluster. That approach ensured werf was fully compatible with Helm 2 while getting rid of most of its operational headaches and improving security.</p><p><em>Fun fact: Helm didn’t adopt this Tiller-less approach until Helm 3 came out in November 2019.</em></p><p>Having Helm built-in allowed us to level up our deployment process. Instead of treating Helm like a black box with the --wait flag, werf started keeping a close eye on resources. It could track statuses, print events and logs, and stop the deployment process instantly on an error instead of waiting for a timeout. All the monitoring logic for this came from our kubedog project, which we had already spun off into its own solid library.</p><h4>August 2019: 3-way merge is implemented in our Helm 2 fork</h4><p>By this point, werf had begun contributing to Helm upstream while at the same time developing its own Helm fork to deliver value to users more quickly and to experiment with features the official Helm lacked. One such extension was the 3-way merge for updating Kubernetes resources.</p><p>Having 3-way merge in our Helm 2 fork meant we could apply changes to existing resources way more accurately, because we <em>considered their actual state</em> in the cluster. This feature <a href="https://blog.werf.io/3-way-merge-patches-helm-werf-beb7eccecdfe">was introduced</a> in werf before Helm 3 showed up. 
It significantly improved the predictability and safety of updates.</p><p>To enhance reliability, werf also introduced locks to prevent parallel deployments of the same release. This helped prevent race conditions and inconsistent states when multiple deployments happened at once.</p><h4>August 2019: First 1,000 stars on GitHub</h4><p>The project surpassed the 1,000-star milestone on GitHub, a clear signal of growing interest and recognition from the Open Source community.</p><h4>September 2019: Dockerfile support</h4><p>Adding Dockerfile support was a major step in welcoming more users and making migration easier. werf learned to work with both its own Stapel syntax and classic Dockerfiles.</p><h4>December 2019: Distributed locking put into a separate project</h4><p>We pulled out the Go library for distributed locking from werf and made it into a separate tool: <a href="https://github.com/werf/lockgate">lockgate</a>. It supports both local file locks and distributed locking via Kubernetes or an HTTP lock server, making it suitable for different infrastructure scenarios.</p><p>The library was well-received by the community, garnering external contributions and seeing adoption in other projects (over 250 ⭐ on GitHub), which proved it was a genuinely useful tool — again, not just for werf!</p><h3>werf v1.1: March 2020 — November 2020</h3><h4>March 2020: Content-based tagging</h4><p>Support for content-based tags <a href="https://blog.werf.io/content-based-tagging-in-werf-eb96d22ac509">was introduced</a>. 
This laid the groundwork for the subsequent move away from contextual tagging strategies (based on branches, commits, or CI) and toward a unified approach to image handling — at the configuration, cleanup, and storage levels — and made our use of the container registry way more efficient.</p><h4>April 2020: Distributed layer storage based on container registry</h4><p>From this point on, werf took on the task of synchronizing parallel builders that use the same container registry. The mechanism was modeled after how Docker handles its local storage when saving, selecting, and ensuring layer immutability.</p><p>The key difference was the scale: Docker synchronized processes that use a single host’s local storage, whereas werf applied this principle to a distributed environment. This way, multiple builders could work with the same container registry at the same time — no conflicts.</p><h3>werf v1.2: November <strong>2020 — April 2024</strong></h3><h4>December 2020: Bundle support is added</h4><p><em>Bundling</em> was a new way to ship a Helm chart and all its container images as a single artifact package. Bundles contain everything you need to deploy an application, and you can distribute and use them without relying on the original Git repo.</p><p>Published bundles can be deployed with werf or any tool that supports Helm charts in OCI registries (like Helm, Argo CD, Flux). You can copy them between container registries, export them as tar files, and use them offline or in air-gapped environments. When creating a bundle, werf automatically includes image details, tags, passed values, and global annotations and labels in the chart. 
This enables reproducible and standalone deployments — all without being tied to the project’s Git repository.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*fR8lUVVSLVXwKOLlggCayg.png" /><figcaption>werf bundles simplified shipping a Helm chart with all related container images</figcaption></figure><h4>May 2021: Secure update manager to replace multiwerf (trdl)</h4><p>multiwerf <a href="https://blog.werf.io/introducing-trdl-an-open-source-solution-for-secure-and-continuous-software-delivery-7c0068f2d02b">was replaced</a> by <a href="https://trdl.dev/">trdl</a> (“true delivery”) — an update manager focused on the secure delivery of binaries from a Git repository to the user’s host. trdl was designed as a secure update channel that eliminates a whole class of risks associated with artifact substitution, compromise, or uncontrolled distribution.</p><p>trdl’s security is based on the combination of Git, a TUF repository, and HashiCorp Vault. Such a combination prevents supply chain attacks, verifies the integrity and authenticity of updates, and minimizes potential damage even if individual components are compromised.</p><p>trdl is not limited to werf and can be used for the secure release and distribution of any software. The CLI supports a wide range of scenarios, but the core update and usage approach known from multiwerf has been kept intact: . &quot;$(trdl use werf 1.2 stable)&quot; pulls the correct version of werf, runs a cryptographic check on it, and activates it in your current shell. 
That way, you can safely use different versions of tools on the same machine without any conflicts or global installs.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*q6Xib9StzB9NGPcfF-_H8Q.png" /><figcaption>Releasing a new software version (v1.0.1) with trdl</figcaption></figure><h4>December 2021: Online tutorial for developers dedicated to Kubernetes and deployment with werf</h4><p>An <a href="https://werf.io/guides.html">online tutorial</a> was launched, aimed at developers and DevOps engineers seeking to master Kubernetes and practical application delivery with werf. It integrated theory with step-by-step practical guides, covering a spectrum from basic concepts to advanced CI/CD scenarios.</p><p>The tutorial was tailored to popular languages and frameworks, featuring examples of applications and infrastructure (IaC). It allowed users to choose a familiar technology stack to learn Kubernetes and practice with werf on real-world use cases.</p><h4>February 2022: Secure image builds; no privileged daemon required</h4><p>werf got experimental support for Buildah, which enabled secure image builds in a rootless mode without using the Docker daemon. The old Docker-based build workflow was also preserved for situations where it was needed.</p><p>At the same time, the werf release process was updated to include ready-to-use images for running builds with Docker as well as right inside a Kubernetes cluster. This streamlined werf’s integration into diverse CI/CD environments and broadened its potential applications.</p><h4>August 2022: Telemetry was introduced</h4><p>werf gained telemetry to help us analyze how the tool is being used. 
It collects data on versions, update channels, project activity, runtime environments, CLI command usage, and build metrics.</p><p>Telemetry became the basis for understanding how werf is used in the real world, and helped us to assess the stability of releases and identify bottlenecks in the delivery process.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*6nhWWlThv8PE1xzeXmDrvg.png" /><figcaption>The number of active projects that have been using werf throughout 2025</figcaption></figure><h4>December 2022: <strong>werf joined CNCF!</strong></h4><p>The werf project <a href="https://blog.werf.io/werf-joins-cncf-4767462dd8a6">was accepted</a> into the CNCF (Cloud Native Computing Foundation), marking its official recognition in the Cloud Native space. This confirmed the project’s maturity and openness, signaling its readiness for wider adoption while encouraging greater community involvement in its development and integration with other CNCF tools.</p><h4>May 2023: Abandoning the Helm fork and launching Nelm</h4><p>The first commit to Nelm marked a new stage in the evolution of werf’s deployment mechanism. By this point, it had become clear that further development of the custom Helm fork was constrained by Helm’s own architecture. 
So we decided to abandon the fork and rewrite the deployment subsystem from scratch while maintaining backward compatibility.</p><h4>2023–2024: Community involvement</h4><p>During these years, werf was showcased at several offline events:</p><ul><li>at KCD Czech &amp; Slovak 2023 in Bratislava;</li><li>at the CNCF Project Pavilion during KubeCon + CloudNativeCon Europe 2023 in Amsterdam;</li><li>at the CNCF Project Pavilion during KubeCon + CloudNativeCon Europe 2024 in Paris.</li></ul><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2F7CfpyTOQ2Mc%3Ffeature%3Doembed&amp;display_name=YouTube&amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3D7CfpyTOQ2Mc&amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2F7CfpyTOQ2Mc%2Fhqdefault.jpg&amp;type=text%2Fhtml&amp;schema=youtube" width="854" height="480" frameborder="0" scrolling="no"><a href="https://medium.com/media/298ab2957fc85c7dac9b7e87862c253d/href">https://medium.com/media/298ab2957fc85c7dac9b7e87862c253d/href</a></iframe><h3>werf v2: April 2024 — …</h3><h4>March 2025: Nelm 1.0 released</h4><p>Nelm 1.0 <a href="https://blog.werf.io/nelm-cli-helm-compatible-alternative-5648b191f0af">was released</a>. It is a stable, backward-compatible alternative to Helm 3, designed to work with Helm charts. Nelm comes as a standalone CLI tool and can also be used as a library in other tools.</p><h4>November 2025: Ongoing Nelm development and how it compares with Helm 4</h4><p>Nelm adoption is growing, and so is the list of features it provides. After the long-awaited Helm 4 release — its first major update in years — the community asked whether Nelm was still relevant. 
Thus, we published a <a href="https://blog.werf.io/nelm-helm-4-comparison-edf0a696f602">detailed overview</a> of how Nelm continues to evolve and remains a decent alternative with a broader feature set and a dedicated user base.</p><h4>2024–2025: Community involvement</h4><p>werf was featured at:</p><ul><li>the CNCF YouTube channel, featuring the <a href="https://www.youtube.com/watch?v=-ny_SXusAks">“From improving Helm to developing Nelm: the evolution of deployments in werf”</a> webinar;</li><li>the <a href="https://www.youtube.com/watch?v=TEZVeWsirsw">“Specialized Templating” episode</a> of the “You Choose!” YouTube show (Ch. 05, Ep. 05), alongside a diverse set of CNCF tools: Porter, Radius, Score, and PipeCD;</li><li>FOSSASIA Summit 2025 in Bangkok.</li></ul><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2F-fMRksRL30E%3Ffeature%3Doembed&amp;display_name=YouTube&amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3D-fMRksRL30E&amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2F-fMRksRL30E%2Fhqdefault.jpg&amp;type=text%2Fhtml&amp;schema=youtube" width="854" height="480" frameborder="0" scrolling="no"><a href="https://medium.com/media/c56b2c207211106d64c170af09dd9d15/href">https://medium.com/media/c56b2c207211106d64c170af09dd9d15/href</a></iframe><h4>January 2026: <strong>werf turns 10!</strong></h4><p>The werf project is celebrating its 10th anniversary. Over this period, it has evolved from a simple experiment in incremental Docker image building to a mature Open Source ecosystem for application delivery in Kubernetes. 
This ecosystem includes its own tools for building, deploying, and distributing software, as well as a number of standalone projects.</p><h3>The werf ecosystem in numbers</h3><p>werf (<a href="https://werf.io/">website</a> + <a href="https://github.com/werf/werf">GitHub</a>):</p><ul><li>4600+ GitHub ⭐</li><li>1300+ releases</li><li>18,000+ active projects using werf</li><li>15,000+ commits</li><li>60+ contributors</li><li>6000+ merged pull requests</li></ul><p>Other projects include:</p><ul><li><a href="https://github.com/werf/nelm">Nelm</a>: 1000+ ⭐, 45 releases, 800+ commits</li><li><a href="https://github.com/werf/trdl">trdl</a> (<a href="https://trdl.dev/">website</a>): 290+ ⭐, 45 releases, 900+ commits</li><li><a href="https://github.com/werf/kubedog">kubedog</a>: 700+ ⭐, 38 releases, 650+ commits</li><li><a href="https://github.com/werf/lockgate">lockgate</a>: 250+ ⭐, 1 release, ~100 commits</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/869/1*4aB2t0Aqdcq6PzeHODFmew.png" /><figcaption>GitHub star history for werf’s repositories</figcaption></figure><h3>What’s next for werf</h3><p>But we’re not stopping here — we have big plans ahead:</p><ul><li>A new build architecture featuring deep integration with Docker BuildKit.</li><li>Enhanced supply chain security, including image signing, verification, SBOMs, and vulnerability scanning.</li><li>A Nelm operator to integrate with tools like Argo CD and Flux, or to be used on its own (<a href="https://github.com/werf/nelm/issues/494">Issue #494</a>).</li><li>The ability to patch Helm chart resources (<a href="https://github.com/werf/nelm/issues/115">Issue #115</a>).</li><li>An alternative to Helm templates (but not a replacement!): our TypeScript experiment is almost ready for you to try (<a href="https://github.com/werf/nelm/pull/502">PR #502</a>, <a href="https://github.com/werf/nelm/issues/54">Issue #54</a>).</li><li>…and a whole lot more. 
Stay tuned!</li></ul><hr><p><a href="https://blog.werf.io/werf-project-history-10-years-f092486e4224">10 years of werf: The Cloud Native story we made together</a> was originally published in <a href="https://blog.werf.io">werf blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[How Nelm compares to Helm 4: Current differences and future plans]]></title>
            <link>https://blog.werf.io/nelm-helm-4-comparison-edf0a696f602?source=rss-71bfdb9446bd------2</link>
            <guid isPermaLink="false">https://medium.com/p/edf0a696f602</guid>
            <category><![CDATA[cloud-native]]></category>
            <category><![CDATA[werf]]></category>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[helm]]></category>
            <dc:creator><![CDATA[Flant staff]]></dc:creator>
            <pubDate>Thu, 04 Dec 2025 15:24:22 GMT</pubDate>
            <atom:updated>2025-12-04T15:25:04.142Z</atom:updated>
<content:encoded><![CDATA[<p>The recent release of Helm 4 provides an excellent opportunity to compare it with the alternative we’ve been developing in werf, Nelm. This article examines the new features of both projects, details their differences, and outlines the future roadmap for Nelm.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Bn1K9Xq4fQ6ozVcID8ZN2g.png" /></figure><h3>Helm 4 evolution</h3><p>Helm 4 introduced a set of <a href="https://helm.sh/docs/overview/#new-features">new features</a> for the Cloud Native community. Perhaps the most significant user-facing changes were the adoption of Kubernetes Server-Side Apply (SSA) instead of the 3-Way Merge (to resolve issues with incorrect resource updates) and kstatus-based resource watching. The rest of the new features are mostly focused on reducing technical debt.</p><p>While implementing SSA is a noteworthy achievement deserving of its own release, the community <a href="https://www.reddit.com/r/kubernetes/comments/1ova60o/comment/noj55n5/">seemed to expect more</a> from Helm 4. Among the most popular feature requests were an <a href="https://www.reddit.com/r/kubernetes/comments/1ova60o/comment/noi6hxr/">alternative</a> to Go templates and <a href="https://www.reddit.com/r/kubernetes/comments/1ova60o/comment/nok0a8r/">improved handling</a> of Custom Resource Definition (CRD) deployments.</p><p>The pace of Helm’s development accelerated leading up to the new release. 
However, given the tremendous adoption of Helm in the industry and strict backward compatibility requirements, further significant architectural changes will likely be postponed until the next major Helm release.</p><h3>How Nelm differs from Helm 4</h3><p><a href="https://github.com/werf/nelm">Nelm</a> is a modern alternative to Helm 4, focused on introducing major new features while maintaining backward compatibility with Helm charts and releases.</p><p>Nelm <a href="https://blog.werf.io/werf-2-nelm-replacing-helm-a11980c2bdda">was created</a> within werf in response to its users’ need for a more powerful deployment process. Later, it became a standalone project that can be used on its own (without werf) to deploy Helm charts to Kubernetes. Under the hood, Nelm uses parts of the Helm codebase, but the most troublesome ones, particularly the deployment engine, have been rewritten from scratch.</p><p>While Nelm <a href="https://blog.werf.io/ssa-vs-3wm-in-helm-werf-nelm-4d7996354ebe">has supported</a> Server-Side Apply for a long time, it has more user-facing features to offer and continues to evolve. For instance, it recently introduced resource lifecycle management via the werf.io/delete-policy, werf.io/ownership, and werf.io/deploy-on annotations. Let’s examine the key differences between Nelm and Helm 4.</p><h4>1. Deploying CRDs</h4><p>Helm recommends placing Custom Resource Definitions (CRDs) in the chart’s crds directory. However, resources in this directory cannot be updated and are only deployed during the initial release installation. The crds directory is ignored during subsequent helm upgrade operations.</p><p>As a workaround, some users deploy CRDs as regular resources by putting them in the templates directory. However, such an approach makes it harder to maintain the deployment order. 
On top of that, since CRD manifests are so large, you risk hitting the release Secret’s size limit.</p><p>To get around these issues, some well-known Open Source charts even create a <a href="https://github.com/prometheus-community/helm-charts/tree/aeadc9d62dac30f32c3c5d3d0eadc2bc689d94a5/charts/kube-prometheus-stack/charts/crds">separate subchart</a> just for deploying CRDs.</p><p>With Nelm, you just move your CRDs into the crds directory. Nelm features a fully-fledged deployment mechanism for this directory, so CRDs get updated and deployed every time you run an upgrade.</p><h4>2. Defining deployment order</h4><p>In Helm, deployment order is typically defined using Helm hooks. This method is adequate for simple Jobs that need to run before or after a rollout.</p><p>But what if a Job requires a Deployment to be running first? Or what if a Job must be run halfway through the release? No standard solutions exist in Helm for these scenarios.</p><p>Before each rollout, Nelm builds a graph of operations on the Kubernetes cluster’s resources, which defines their deployment order:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*G0yz0ayxVFIFxHnRqoio7w.png" /></figure><p>It also provides a simple yet powerful way of setting this order: the <a href="https://github.com/werf/nelm/?tab=readme-ov-file#werfiodeploy-dependency-id-annotation">werf.io/deploy-dependency</a> annotation. This annotation creates a dependency between operations in the graph, thus defining their rollout sequence. For example, the following configuration:</p><pre>kind: Deployment<br>metadata:<br>  name: backend<br>  annotations:<br>    werf.io/deploy-dependency-db: state=ready,kind=StatefulSet,name=postgres</pre><p>… means that the backend Deployment will only be created or updated after the postgres StatefulSet is created/updated and ready. 
The graph will look like this:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*636U5zIz61VZSS1IF0iLgw.png" /></figure><p>The werf.io/deploy-dependency annotation works for both regular resources and hooks. We plan to add support for specifying dependencies on entire charts in the future.</p><p>As an alternative, Nelm also features the <a href="https://github.com/werf/nelm/?tab=readme-ov-file#werfioweight-annotation">werf.io/weight</a> annotation. It works similarly to helm.sh/hook-weight but applies to both hooks and regular resources.</p><p>There’s also the <a href="https://github.com/werf/nelm/?tab=readme-ov-file#idexternal-dependencywerfioname-annotation">external-dependency.werf.io/resource</a> annotation, which lets you specify a dependency for resources outside of the Helm release, such as a Secret that an operator creates.</p><p>Of course, regular Helm hooks and their weights are also supported.</p><h4>3. Resource lifecycle</h4><p>Helm lets you prevent a resource from being deleted using helm.sh/resource-policy: keep and control when hooks are deleted using helm.sh/hook-delete-policy. But what if you need to deploy an immutable Job mid-release? Or clean up a regular resource after its deployment? Or manage the same resource across different releases?</p><p>We recently added to Nelm a whole new set of features for managing resource lifecycle:</p><ol><li>The <a href="https://github.com/werf/nelm/?tab=readme-ov-file#werfiodelete-policy-annotation">werf.io/delete-policy</a> annotation, which is similar to helm.sh/hook-delete-policy, allows a resource to be recreated instead of updated (before-creation), recreated only upon encountering a “field is immutable” error (before-creation-if-immutable), or deleted after a successful (succeeded) or failed (failed) deployment. 
This annotation, like all others in Nelm, applies to both hooks and regular resources.</li><li>The <a href="https://github.com/werf/nelm/?tab=readme-ov-file#werfioownership-annotation">werf.io/ownership</a> annotation enables hook-like behavior for regular resources. Specifically, it prevents applying or validating release annotations for the resource, and it protects the resource from deletion if it has been removed from the chart or if the release itself is being deleted.</li><li>Another annotation, <a href="https://github.com/werf/nelm/?tab=readme-ov-file#werfiodeploy-on-annotation">werf.io/deploy-on</a>, allows rendering a resource only during a release install, upgrade, rollback, or uninstall, similar to what you can already do with Helm hooks. Still, using this annotation does not convert the resource into a hook.</li></ol><p>With these annotations, it is possible to replicate the behavior of a Helm hook without formally declaring one. For example, this hook:</p><pre>metadata:<br>  annotations:<br>    helm.sh/hook: pre-install<br>    helm.sh/hook-delete-policy: before-hook-creation</pre><p>… is similar to the following non-hook resource:</p><pre>metadata:<br>  annotations:<br>    werf.io/deploy-on: install<br>    werf.io/delete-policy: before-creation<br>    werf.io/ownership: anyone</pre><p>In general, we recommend Nelm users avoid using Helm hooks when authoring charts. This simplifies charts, allows for more flexible resource behavior, and accelerates rollouts by eliminating the separate hook deployment phase. However, using hooks may still be justified if maintaining compatibility with vanilla Helm is a requirement.</p><h4>4. Advanced resource tracking</h4><p>Helm 3 included a basic mechanism for waiting for certain regular Kubernetes resources to become ready. 
Helm 4 replaced it with <a href="https://pkg.go.dev/sigs.k8s.io/cli-utils/pkg/kstatus/status">kstatus</a>, which improved readiness detection accuracy, but did not introduce any fundamental changes.</p><p>Nelm features its own advanced resource tracking system. Compared to Helm 4, it:</p><ul><li>is more accurate than kstatus at detecting when a resource is ready;</li><li>can track not just readiness, but also whether a resource exists or not, and can detect and react to errors like failing probes;</li><li>supports readiness tracking for popular Custom Resources with manually defined <a href="https://github.com/werf/kubedog/blob/6ffc5a117ada8447acd9204d381215ab038b9395/pkg/tracker/generic/contrib_resource_status_rules.yaml">rule sets</a>;</li><li>determines the readiness of other Custom Resources heuristically, which works for most resources (no false positives);</li><li>displays real-time status, errors, logs, and events for resources in the terminal during deployment.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*H0zioQ7H5MMwhcZL_6FzVQ.png" /><figcaption>Detailed Kubernetes resources’ tracking while installing a release via Nelm</figcaption></figure><p>Tracking requires no initial configuration but can be fine-tuned or disabled via command-line flags and annotations.</p><h4>5. Encrypting values.yaml and other files</h4><p>Helm doesn’t have built-in support for encrypted files in a chart; this functionality is provided by the <a href="https://github.com/jkroepke/helm-secrets">helm-secrets</a> plugin.</p><p>Nelm, on the other hand, comes with out-of-the-box support for encrypted values files and any other encrypted files in the chart’s secrets directory. 
Working with secrets in Nelm is easier than using the helm-secrets plugin.</p><p>Generate a secret key and create an encrypted values file:</p><pre>NELM_SECRET_KEY=$(nelm chart secret key create)<br>nelm chart secret values-file edit secret-values.yaml</pre><p>After that, you can use the encrypted values just like any other values:</p><pre># templates/secret.yaml<br>kind: Secret<br>stringData:<br>  mySecret: {{ .Values.mySecretValue }}</pre><pre>nelm release install -n foo -r bar</pre><p>On top of that, Nelm can encrypt <a href="https://github.com/werf/nelm/?tab=readme-ov-file#encrypted-arbitrary-files">arbitrary files</a> within the chart’s secrets directory.</p><h4>6. Release planning</h4><p>Nelm natively implements an analog of Helm’s <a href="https://github.com/databus23/helm-diff">helm diff</a> plugin. The nelm release plan install command accurately displays the changes that will be applied to the Kubernetes cluster’s resources during the next rollout.</p><p>The output is precise because it is based on the plan of resource operations that is devised before every deployment. On top of that, unlike helm diff, this plan is based on resource updates performed via Server-Side Apply, not a 3-Way Merge.</p><p>We’re also working on a way to create and save a plan with a single command (nelm release plan install --save-plan) and then pass it to another command (nelm release install --use-plan). This means you can approve a plan and be certain that Nelm will not perform any unplanned actions. This workflow cannot be implemented with Helm and helm diff.</p><h3>What’s missing in Nelm</h3><p>First of all, Nelm does not support Helm 3 CLI plugins. They depend on the Helm CLI, including its command structure, options, and even on the way the logs are rendered. Achieving compatibility would require rewriting a significant portion of the Helm codebase, which would be time-consuming and largely pointless. 
Instead, we are implementing the functionality of the most popular plugins natively within Nelm (e.g., helm diff and helm secrets).</p><p>Secondly, Nelm lacks support for post-renderers. Instead, we’ll introduce a replacement for Go templates (more on that below) and provide out-of-the-box resource patching, eliminating the need to install external plugins or configure anything. The reasoning behind this approach is detailed in issues <a href="https://github.com/werf/nelm/issues/54">#54</a> and <a href="https://github.com/werf/nelm/issues/115">#115</a>.</p><p>Currently, Nelm cannot be used with Argo CD or Flux. We will address this via a <a href="https://github.com/werf/nelm/issues/494">Nelm operator</a>, with its Custom Resources being deployed via Argo CD, Flux, or any other GitOps tool.</p><p>Finally, tools like Helmfile and Helmwave are not compatible with Nelm. We will likely resolve this by implementing a native Nelmfile accessible right from the Nelm CLI. The Helmwave project is even <a href="https://github.com/helmwave/helmwave/issues/1100">considering</a> switching to Nelm itself.</p><h3>What’s next for Nelm after the Helm 4.0 release</h3><p>Nelm serves as the deployment engine for werf, a tool currently used in over 20,000 projects. On top of that, Nelm is actively being integrated into other products, such as the <a href="https://github.com/deckhouse/deckhouse">Deckhouse Kubernetes Platform</a>. Being such an essential building block secures a solid future for Nelm, thanks to our commitment to further developing it.</p><p>The Helm 4.0 release didn’t really change this picture. Thanks to its early focus on bringing new capabilities to users who needed them, Nelm is still far ahead in features and improvements, and we expect this lead to grow. Over the past year, we have stabilized Nelm v1, refactored the entire codebase, and added many new features. 
We are also excited to have two new full-time developers joining the Nelm team very soon and to see increasing community engagement in the project’s development.</p><h3>Future plans</h3><p>Over the next six months, we intend to release Nelm v2, migrate to the Helm 4 codebase, and release the Nelm operator for Argo CD and Flux integration.</p><p>Plans for the next year include an <a href="https://github.com/werf/nelm/issues/54">alternative to Go templates</a> (our <a href="https://github.com/werf/nelm/pull/497">current proposal</a> involves using TypeScript for that), chart patching, and downloading charts directly from Git.</p><p>We will continue to actively develop Nelm, just as we have been developing and supporting <a href="https://github.com/werf/werf">werf</a> for nine years. You can learn more about Nelm and try it out in our <a href="https://github.com/werf/nelm">GitHub repository</a>.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=edf0a696f602" width="1" height="1" alt=""><hr><p><a href="https://blog.werf.io/nelm-helm-4-comparison-edf0a696f602">How Nelm compares to Helm 4: Current differences and future plans</a> was originally published in <a href="https://blog.werf.io">werf blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Canary Deployment in Kubernetes Using Argo Rollouts and Istio]]></title>
            <link>https://blog.deckhouse.io/canary-in-k8s-using-argo-rollouts-and-istio-0d41ba5e1f85?source=rss-71bfdb9446bd------2</link>
            <guid isPermaLink="false">https://medium.com/p/0d41ba5e1f85</guid>
            <category><![CDATA[devops]]></category>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[canary-deployments]]></category>
            <category><![CDATA[istio]]></category>
            <category><![CDATA[argo-rollouts]]></category>
            <dc:creator><![CDATA[Flant staff]]></dc:creator>
            <pubDate>Thu, 13 Nov 2025 11:41:43 GMT</pubDate>
            <atom:updated>2025-11-13T11:41:43.870Z</atom:updated>
            <content:encoded><![CDATA[<p>There are plenty of articles online exploring the theory and practice of different Kubernetes deployment strategies. Still, I think there’s more to say — and to show. My name is Rinat Mukaev, I’m a DevOps Engineer at Deckhouse, and today, I’d like to look at an alternative way of running a canary deployment, this time using passive health checks with Argo Rollouts and Istio. This setup is perfect for when clients in the cluster connect to your app using its Service name.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*XSeslVZpRjckOeZw8OX6KQ.png" /></figure><h3>Canary Deployment Pros and Quirks</h3><p>Before we dive in, let’s quickly recap why canary deployments are so cool. The core idea is to gradually shift traffic to the new app version. This way, you can test fresh features on a small subgroup of real-world users before releasing the new version to everyone. Canary deployments have a few other cool perks too:</p><ol><li><strong>Saving Resources.</strong> The transition to the new version is smooth and easy. You can gradually scale up instances running the new version while scaling down instances with the old (stable) version on board. No need to double your resource usage, creating a duplicate, full-sized environment.</li><li><strong>Easy Rollbacks.</strong> If something goes wrong, you just switch the traffic back to the good ol’ stable version.</li><li><strong>Zero-Downtime Updates.</strong> Since user traffic is shifted between versions gradually, users won’t experience any downtime or interruptions.</li></ol><p>Keep in mind, however, that with canary deployments, you’re running two different versions of your app at the same time, which can lead to some tricky situations.</p><p>If your app uses a database, you have to be careful. During a canary release, you’ll have two different versions of your app hitting the same database. 
Thus, you have to make sure your DB schema works with both the old and the new version. You can address this issue by, say, doing migrations in steps — first, a migration to get the schema ready for the new version, and then another one to clean up old stuff after the deployment is complete. Another option is to avoid any backward-incompatible changes to the database.</p><p>On top of that, you have to make sure that user sessions are “sticky,” meaning a user consistently hits the same version of the application. Otherwise, their requests may bounce between the new and old version. One way to address this is by using Hash / Consistent Hashing.</p><p>In this article, I’ll show how to implement a canary deployment using passive health checks with Argo Rollouts and the Istio service mesh. The Ingress NGINX Controller works well if traffic reaches your app through an ingress, but if workloads inside the cluster talk to each other by their Service name, Ingress NGINX won’t do the trick. That’s where Istio comes into play. On top of that, Istio features much more capable observability and security tools.</p><p>Since most applications receive traffic from the outside, we’ll first look at how to shift traffic at the Ingress NGINX level. Then, we’ll dive into how the upgrade works when you’re making requests to an internal service.</p><h3>Getting the Environment Ready</h3><p>We’re going to set up a canary deployment using the Open Source Deckhouse Kubernetes Platform Community Edition (DKP CE). 
Here’s what we’ll need:</p><ul><li><a href="https://github.com/deckhouse/deckhouse">Deckhouse Kubernetes Platform CE</a></li><li><a href="https://github.com/kubernetes/ingress-nginx">Ingress NGINX Controller</a></li><li><a href="https://github.com/istio/istio">Istio service mesh</a></li><li><a href="https://github.com/argoproj/argo-rollouts">Argo Rollouts</a></li><li>Prometheus (specifically, the <a href="https://github.com/deckhouse/prompp">Deckhouse Prom++</a> flavor)</li></ul><p>To get the DKP CE cluster up and running, follow <a href="https://deckhouse.io/products/kubernetes-platform/gs/">the quick start guide</a>. Once the platform is ready, there are three more prep steps to take.</p><p>The first one is to create an Ingress NGINX Controller so users can reach our app. Make sure to enable the enableIstioSidecar parameter — it puts the NGINX controller under Istio’s control:</p><pre>apiVersion: deckhouse.io/v1<br>kind: IngressNginxController<br>metadata:<br>  name: nginx<br>spec:<br>  ingressClass: nginx<br>  inlet: HostPort<br>  enableIstioSidecar: true<br>  hostPort:<br>    httpPort: 80<br>    httpsPort: 443<br>  nodeSelector:<br>    node-role.kubernetes.io/worker: &quot;&quot;<br>  tolerations:<br>  - effect: NoSchedule<br>    key: node-role.kubernetes.io/worker<br>    operator: Exists</pre><p>Next, enable <a href="https://deckhouse.io/modules/istio/">the Istio module</a>. d8 is the Deckhouse Kubernetes Platform’s CLI manager.</p><pre># d8 system module enable istio</pre><p>Wait for the tasks to finish. In DKP, you can monitor the progress by checking the task queue:</p><pre># d8 system queue list<br>Summary:<br>- &#39;main&#39; queue: empty.<br>- 124 other queues (0 active, 124 empty): 0 tasks.<br>- no tasks to handle.</pre><p>Now that the task queue is empty, we’re clear to continue.</p><p>Time to install Argo Rollouts. 
While you’d normally use a GitOps approach to install tools in real-world scenarios, we’ll keep it simple for now and just apply the manifests manually:</p><pre>kubectl create namespace argo-rollouts<br>kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml</pre><h3>Setting Up Canary Deployment</h3><p>Let’s look at the key components of our setup:</p><ol><li>NGINX Ingress Controller — An NGINX-based controller for receiving user traffic and routing it to our target application.</li><li>Istio — Our service mesh; we use it to intelligently route traffic between app versions.</li><li>Argo Rollouts — An operator + set of CRDs (Custom Resource Definitions) for implementing more advanced deployment strategies like canary and blue/green, which you don’t get with vanilla Kubernetes.</li><li>Deckhouse Prom++ — A built-in DKP solution for collecting metrics.</li></ol><p>As for our app, we’ll create a simple Go service featuring three different versions. The first one, v1, is the stable one; it’ll respond with 200 OK to all requests. v2, our “buggy” version, will throw a 500 — Internal Server Error to every other request, thus simulating an issue with the “new” app. Finally, v3 (the “fixed” version of v2) will return 200 OK to all requests.</p><p>Here’s what our final traffic flow and component setup will look like:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*qdjWxHqs0GH6nd_L0arqZw.png" /></figure><h3>Getting the Manifests Ready</h3><p>Istio handles our user traffic distribution. 
Let’s apply the manifests needed for our application:</p><pre>---<br>apiVersion: v1<br>kind: Service<br>metadata:<br>  name: app<br>spec:<br>  ports:<br>  - port: 80<br>    targetPort: http<br>    protocol: TCP<br>    name: http<br>  selector:<br>    app: backend<br>---<br>apiVersion: networking.istio.io/v1alpha3<br>kind: VirtualService<br>metadata:<br>  name: app-vsvc<br>spec:<br>  hosts:<br>  - app<br>  http:<br>  - name: primary<br>    route:<br>    - destination:<br>        host: app<br>        subset: stable<br>      weight: 100<br>    - destination:<br>        host: app<br>        subset: canary<br>      weight: 0<br><br>---<br>apiVersion: networking.istio.io/v1alpha3<br>kind: DestinationRule<br>metadata:<br>  name: app-destrule<br>spec:<br>  host: app<br>  subsets:<br>  - name: stable<br>    labels:<br>      app: backend<br>  - name: canary<br>    labels:<br>      app: backend</pre><p>Note the DestinationRule manifest. Istio lets you define the so-called subsets — sets of endpoints of the same Deployment selected by labels. In this case, we define a subset named stable for the stable version and another named canary for the new version.</p><p>In the VirtualService, we set up two destinations. At the start, the stable version receives 100% of the traffic while the canary gets 0%. 
As the canary deployment kicks in, the Argo Rollouts operator will tweak these numbers to route more traffic to the new version.</p><p>Let’s create the Rollout:</p><pre>---<br>apiVersion: argoproj.io/v1alpha1<br>kind: Rollout<br>metadata:<br>  name: app<br>spec:<br>  strategy:<br>    canary:<br>      analysis:<br>        templates:<br>        - templateName: success-rate<br>        startingStep: 1<br>        args:<br>        - name: service-name<br>          value: app.default.svc.cluster.local<br>      trafficRouting:<br>        istio:<br>          virtualService:<br>            name: app-vsvc<br>            routes:<br>            - primary<br>          destinationRule:<br>            name: app-destrule<br>            canarySubsetName: canary<br>            stableSubsetName: stable<br>      steps:<br>      - setWeight: 20<br>      - pause: {duration: 1m}<br>      - setWeight: 40<br>      - pause: {duration: 1m}<br>      - setWeight: 60<br>      - pause: {duration: 1m}<br>      - setWeight: 80<br>      - pause: {duration: 1m}<br>  selector:<br>    matchLabels:<br>      app: backend<br>  template:<br>    metadata:<br>      labels:<br>        app: backend<br>    spec:<br>      containers:<br>      - name: backend<br>        image: rinamuka/canary:v1<br>        ports:<br>        - name: http<br>          containerPort: 80<br>          protocol: TCP<br>        resources:<br>          requests:<br>            memory: 5Mi<br>            cpu: 5m<br>          limits:<br>            memory: 10Mi<br>            cpu: 10m</pre><p>A Rollout is an Argo Rollouts CRD that acts as a wrapper around a Deployment object, but with extra settings for your deployment strategy. As the deployment proceeds, the operator patches the DestinationRule object by changing its subset labels, and the VirtualService to alter how traffic is split between the different subsets.</p><p>As you can see, the manifest is similar to a regular vanilla Deployment, except for the canary section. 
It has three parts:</p><ol><li>analysis: defines which health check template to use. Note the startingStep parameter — it controls at which step the analysis of the new version (i.e., querying Prometheus) commences. Here, the check runs in the background, but you can also run it “inline” as a separate step.</li><li>trafficRouting: links our Rollout to a destinationRule and a virtualService.</li><li>steps: defines the actual canary deployment plan: what percentages of traffic to shift at a time, and how long to wait before moving more traffic to the new version.</li></ol><p>To integrate the app into the service mesh, you just need to add the istio-injection: enabled label to the app namespace. Once that’s done, a sidecar container running the Istio agent will be injected into each app pod.</p><p>Let’s now create an Ingress object to expose our application to the outside world:</p><pre>apiVersion: networking.k8s.io/v1<br>kind: Ingress<br>metadata:<br>  name: app<br>  annotations:<br>    cert-manager.io/cluster-issuer: letsencrypt<br>    nginx.ingress.kubernetes.io/service-upstream: &quot;true&quot;<br>    nginx.ingress.kubernetes.io/upstream-vhost: app.default.svc<br><br>spec:<br>  tls:<br>  - hosts:<br>    - app.31.184.210.137.sslip.io<br>    secretName: app-tls<br>  rules:<br>    - host: app.31.184.210.137.sslip.io<br>      http:<br>        paths:<br>          - path: /<br>            pathType: Prefix<br>            backend:<br>              service:<br>                name: app<br>                port:<br>                  number: 80</pre><p>Note the two important Ingress annotations:</p><ol><li>nginx.ingress.kubernetes.io/service-upstream: “true”: this annotation instructs the Ingress controller to send requests to the service’s ClusterIP instead of directly to the pods. 
In this case, the istio-proxy sidecar will only intercept traffic to the Service CIDR range, while the rest of the requests are routed directly.</li><li>nginx.ingress.kubernetes.io/upstream-vhost: app.default.svc. In Istio, all routing relies on the Host header. So, instead of making Istio aware of our public domain (app.31.184.210.137.sslip.io), we just use the internal one it already knows about.</li></ol><p>The next step is to create the AnalysisTemplate manifest. This resource defines the process for checking if the new app version is running properly. In our case, it will query the cluster’s Prometheus (which comes with DKP out of the box) and check the percentage of 5xx errors. If it’s more than 5%, the deployment is canceled and all traffic is shifted back to the stable app version. If it’s less, we keep sending more traffic to the new app version:</p><pre>apiVersion: argoproj.io/v1alpha1<br>kind: AnalysisTemplate<br>metadata:<br>  name: success-rate<br>spec:<br>  args:<br>    - name: service-name<br>    - name: api-token<br>      valueFrom:<br>        secretKeyRef:<br>          name: rollout-token<br>          key: token<br>  metrics:<br>    - name: success-rate<br>      interval: 1m<br>      successCondition: result[0] &gt;= 0.95<br>      failureLimit: 2<br>      provider:<br>        prometheus:<br>          address: https://prometheus.d8-monitoring:9090<br>          insecure: true<br>          headers:<br>            - key: Authorization<br>              value: &quot;Bearer {{ args.api-token }}&quot;<br>          query: |<br>            sum(irate(istio_requests_total{reporter=&quot;source&quot;,destination_service=~&quot;{{args.service-name}}&quot;,response_code!~&quot;5.*&quot;}[5m])) /<br>            sum(irate(istio_requests_total{reporter=&quot;source&quot;,destination_service=~&quot;{{args.service-name}}&quot;}[5m]))</pre><p>DKP’s built-in Prometheus requires authorization, so you have to add a Kubernetes token to your requests. 
To get one, we’ll create a ServiceAccount, a Role, and a RoleBinding, and then a Secret to store the token for that ServiceAccount.</p><pre>---<br>apiVersion: v1<br>kind: ServiceAccount<br>metadata:<br>  name: rollout<br>  namespace: default<br><br>---<br>apiVersion: rbac.authorization.k8s.io/v1<br>kind: ClusterRole<br>metadata:<br>  name: app:prometheus-access<br>rules:<br>- apiGroups: [&quot;monitoring.coreos.com&quot;]<br>  resources: [&quot;prometheuses/http&quot;]<br>  resourceNames: [&quot;main&quot;, &quot;longterm&quot;]<br>  verbs: [&quot;get&quot;,&quot;create&quot;]<br><br>---<br>apiVersion: rbac.authorization.k8s.io/v1<br>kind: ClusterRoleBinding<br>metadata:<br>  name: app:prometheus-access<br>roleRef:<br>  apiGroup: rbac.authorization.k8s.io<br>  kind: ClusterRole<br>  name: app:prometheus-access<br>subjects:<br>- kind: ServiceAccount<br>  name: rollout<br>  namespace: default<br><br>---<br>apiVersion: v1<br>kind: Secret<br>metadata:<br>  name: rollout-token<br>  annotations:<br>    kubernetes.io/service-account.name: rollout<br>type: kubernetes.io/service-account-token</pre><h3>Updating the App</h3><p>Time to update our app. The easiest way to watch the traffic switch over is in <a href="https://github.com/kiali/kiali">Kiali</a>, Istio’s web UI. 
Let’s open it up and commence the rollout for the new v2 version.</p><p>Here is what happens after the tag in the Rollout is changed:</p><ol><li>A replica with the new image version is created (revision 2).</li></ol><pre>root@master-0:~# kubectl argo rollouts get rollout app<br>Name:            app<br>Namespace:       default<br>Status:          ॥ Paused<br>Message:         CanaryPauseStep<br>Strategy:        Canary<br>  Step:          1/8<br>  SetWeight:     20<br>  ActualWeight:  20<br>Images:          rinamuka/canary:v1 (stable)<br>                 rinamuka/canary:v2 (canary)<br>Replicas:<br>  Desired:       1<br>  Current:       2<br>  Updated:       1<br>  Ready:         2<br>  Available:     2<br><br>NAME                             KIND        STATUS     AGE    INFO<br>⟳ app                            Rollout     ॥ Paused   3m28s<br>├──# revision:2<br>│  └──⧉ app-655bb4c96c           ReplicaSet  ✔ Healthy  14s    canary<br>│     └──□ app-655bb4c96c-p2lsv  Pod         ✔ Running  14s    ready:2/2<br>└──# revision:1<br>   └──⧉ app-84975c75b            ReplicaSet  ✔ Healthy  3m28s  stable<br>      └──□ app-84975c75b-tfkqv   Pod         ✔ Running  3m28s  ready:2/2</pre><p>2. Next, the steps from the Rollout manifest are executed. Argo Rollouts patches the DestinationRule and VirtualService objects.</p><p>It adds a subset to the DestinationRule using the ReplicaSet’s hash label. 
One of the subsets points to the old (stable) version of the app, while the other points to the new (canary) one.</p><pre>subsets:<br>    - labels:<br>        app: backend<br>        rollouts-pod-template-hash: 84975c75b<br>      name: stable<br>    - labels:<br>        app: backend<br>        rollouts-pod-template-hash: 655bb4c96c<br>      name: canary</pre><p>In the VirtualService, the destination weights get changed:</p><pre>spec:<br>    hosts:<br>    - app<br>    http:<br>    - name: primary<br>      route:<br>      - destination:<br>          host: app<br>          subset: stable<br>        weight: 80 # Initially, weight had a value of 100<br>      - destination:<br>          host: app<br>          subset: canary<br>        weight: 20 # Initially, weight had a value of  0</pre><p>On top of that, the Argo Rollouts operator creates an AnalysisRun object, which checks metrics in Prometheus. You can see the status of these checks by describing the object.</p><pre>Status:<br>  Completed At:  2025-08-23T11:59:39Z<br>  Dry Run Summary:<br>  Message:  Metric &quot;success-rate&quot; assessed Failed due to failed (3) &gt; failureLimit (2)<br>  Metric Results:<br>    Count:   3<br>    Failed:  3<br>    Measurements:<br>      Finished At:  2025-08-23T11:57:39Z<br>      Phase:        Failed<br>      Started At:   2025-08-23T11:57:39Z<br>      Value:        [0.9033333333333333]<br>      Finished At:  2025-08-23T11:58:39Z<br>      Phase:        Failed<br>      Started At:   2025-08-23T11:58:39Z<br>      Value:        [0.7958333333333333]<br>      Finished At:  2025-08-23T11:59:39Z<br>      Phase:        Failed<br>      Started At:   2025-08-23T11:59:39Z<br>      Value:        [0.8058333333333334]</pre><p>All three checks failed: the success rate stayed below 0.95, meaning the proportion of 500 errors was greater than 5%. 
And in Kiali, you can see exactly how the traffic got switched.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1nnre0IbHREuhBaJ8fmN4w.png" /></figure><p>The check failed, so we’re switching back to the stable version:</p><pre>Normal   RolloutResumed          2m29s                  rollouts-controller  Rollout is resumed<br>  Normal   Updated VirtualService  2m29s                  rollouts-controller  VirtualService `app-vsvc` set to desiredWeight &#39;40&#39;<br>  Normal   TrafficWeightUpdated    2m29s                  rollouts-controller  Traffic weight updated from 20 to 40<br>  Normal   RolloutStepCompleted    2m29s                  rollouts-controller  Rollout step 3/8 completed (setWeight: 40)<br>  Normal   AnalysisRunRunning      2m29s                  rollouts-controller  Background Analysis Run &#39;app-655bb4c96c-2&#39; Status New: &#39;Running&#39; Previous: &#39;&#39;<br>  Warning  AnalysisRunFailed       29s                    rollouts-controller  Background Analysis Run &#39;app-655bb4c96c-2&#39; Status New: &#39;Failed&#39; Previous: &#39;Running&#39;</pre><p>Once it rolls back, the canary pod gets deleted, and all traffic goes to the stable version:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1SpAC6D4RbQ-jJmlvD--Jg.png" /></figure><p>We have now covered updating to a buggy release and the subsequent rollback. Let’s now update the application to a good release (v3). 
Insert the v3 image tag in the Rollout:</p><pre>Name:            app<br>Namespace:       default<br>Status:          ॥ Paused<br>Message:         CanaryPauseStep<br>Strategy:        Canary<br>  Step:          1/8<br>  SetWeight:     20<br>  ActualWeight:  20<br>Images:          rinamuka/canary:v1 (stable)<br>                 rinamuka/canary:v3 (canary)<br>Replicas:<br>  Desired:       1<br>  Current:       2<br>  Updated:       1<br>  Ready:         2<br>  Available:     2<br><br>NAME                             KIND         STATUS        AGE    INFO<br>⟳ app                            Rollout      ॥ Paused      11m<br>├──# revision:3<br>│  └──⧉ app-76dfffd666           ReplicaSet   ✔ Healthy     52s    canary<br>│     └──□ app-76dfffd666-6tk68  Pod          ✔ Running     51s    ready:2/2<br>├──# revision:2<br>│  ├──⧉ app-655bb4c96c           ReplicaSet   • ScaledDown  7m46s  delay:passed<br>│  └──α app-655bb4c96c-2         AnalysisRun  ✖ Failed      4m38s  ✖ 3<br>└──# revision:1<br>   └──⧉ app-84975c75b            ReplicaSet   ✔ Healthy     11m    stable<br>      └──□ app-84975c75b-tfkqv   Pod          ✔ Running     11m    ready:2/2</pre><p>Let’s see how the check went:</p><pre>Status:<br>  Dry Run Summary:<br>  Metric Results:<br>    Consecutive Success:  3<br>    Count:                3<br>    Measurements:<br>      Finished At:  2025-08-23T12:04:33Z<br>      Phase:        Successful<br>      Started At:   2025-08-23T12:04:33Z<br>      Value:        [1]<br>      Finished At:  2025-08-23T12:05:33Z<br>      Phase:        Successful<br>      Started At:   2025-08-23T12:05:33Z<br>      Value:        [1]<br>      Finished At:  2025-08-23T12:06:33Z<br>      Phase:        Successful<br>      Started At:   2025-08-23T12:06:33Z<br>      Value:        [1]</pre><p>All three checks were successful, so traffic will be gradually shifted to the new version:</p><pre>Name:            app<br>Namespace:       default<br>Status:          ॥ Paused<br>Message:         
CanaryPauseStep<br>Strategy:        Canary<br>  Step:          5/8<br>  SetWeight:     60<br>  ActualWeight:  60<br>Images:          rinamuka/canary:v1 (stable)<br>                 rinamuka/canary:v3 (canary)<br>Replicas:<br>  Desired:       1<br>  Current:       2<br>  Updated:       1<br>  Ready:         2<br>  Available:     2<br><br>NAME                             KIND         STATUS        AGE    INFO<br>⟳ app                            Rollout      ॥ Paused      19m<br>├──# revision:3<br>│  ├──⧉ app-76dfffd666           ReplicaSet   ✔ Healthy     9m1s   canary<br>│  │  └──□ app-76dfffd666-6tk68  Pod          ✔ Running     9m     ready:2/2<br>│  └──α app-76dfffd666-3         AnalysisRun  ◌ Running     5m53s  ✔ 6<br>├──# revision:2<br>│  ├──⧉ app-655bb4c96c           ReplicaSet   • ScaledDown  15m    delay:passed<br>│  └──α app-655bb4c96c-2         AnalysisRun  ✖ Failed      12m    ✖ 3<br>└──# revision:1<br>   └──⧉ app-84975c75b            ReplicaSet   ✔ Healthy     19m    stable<br>      └──□ app-84975c75b-tfkqv   Pod          ✔ Running     19m    ready:2/2</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*LCniQkxH4LE_juHCjcFGAw.png" /></figure><p>For the final test, let’s try the same thing, but hit the service directly instead of going through Ingress NGINX. This is exactly the kind of situation that Ingress NGINX can’t handle.</p><p>Let’s fire up a client pod that will send requests to the service name (app.default) instead of the Ingress host. Traffic switching and load balancing still work here because the Istio sidecar is handling all the routing:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/883/1*SOCGjehsFSwo21PI4be_kQ.png" /></figure><h3>Summary</h3><p>As you see, it is possible to set up a canary deployment for your Kubernetes application with just a couple of tools, Argo Rollouts and Istio. This article only covered the basic features of Argo Rollouts. 
On top of the features I talked about, it also integrates with HPA and VPA, handles different load balancers for traffic routing, pulls metrics from various sources, and more. So you can really tweak it to your liking.</p><p>The cool thing about the approach I described above is that everything runs automatically: switching traffic, checking if the new release works smoothly, and rolling back in case of issues. Plus, you don’t need to mess with your app’s manifests too much, as a Rollout object is basically the same as a standard Deployment. Finally, Istio is a great base to build on later if you need more advanced deployment features.</p><p>One final friendly reminder: each deployment strategy has its own pros and cons. Choose the one that best suits your application.</p><h3>Useful links</h3><ul><li><a href="https://argo-rollouts.readthedocs.io/en/stable/">Argo Rollouts documentation</a></li><li><a href="https://istio.io/latest/docs/reference/config/networking/destination-rule/">Istio documentation</a></li><li><a href="https://github.com/alladinattar/canary-article">Repository with the source code</a></li></ul><h3>Related books</h3><ul><li>Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation by Jez Humble and David Farley</li><li>Cloud Native Patterns: Designing Change-Tolerant Software by Cornelia Davis</li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=0d41ba5e1f85" width="1" height="1" alt=""><hr><p><a href="https://blog.deckhouse.io/canary-in-k8s-using-argo-rollouts-and-istio-0d41ba5e1f85">Canary Deployment in Kubernetes Using Argo Rollouts and Istio</a> was originally published in <a href="https://blog.deckhouse.io">Deckhouse blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[How to build a home cluster with VMs running in containers for a couple hundred dollars]]></title>
            <link>https://blog.deckhouse.io/run-vms-in-containers-at-home-6048778d417f?source=rss-71bfdb9446bd------2</link>
            <guid isPermaLink="false">https://medium.com/p/6048778d417f</guid>
            <category><![CDATA[virtualization]]></category>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[devops]]></category>
            <category><![CDATA[deckhouse]]></category>
            <dc:creator><![CDATA[Flant staff]]></dc:creator>
            <pubDate>Wed, 22 Oct 2025 12:06:21 GMT</pubDate>
            <atom:updated>2025-10-28T08:45:27.335Z</atom:updated>
            <content:encoded><![CDATA[<p>Hi all! I’m Valery Khorunzhin, a solution architect on the Deckhouse team. And I’ve set up my own virtualization environment.</p><p>My journey to virtualization started with Obsidian, which doesn’t have a built-in sync feature. I have a small rented VPS, but I ran into an issue where running a simple Docker container would spike the server’s CPU, making even an SSH connection lag.</p><p>While looking for a solution, I realized I wanted more than just to host a single app; I wanted to easily move and scale my entire infrastructure using a declarative configuration. The idea of building my own Kubernetes cluster was very appealing.</p><p>However, containers, despite all their benefits, can introduce security risks. If one of my applications gets compromised, an attacker could potentially break out and access other containers. This may become a huge issue, given that I’m not able to monitor the cluster 24/7.</p><p>So I decided I needed full-fledged virtualization. I went with the <a href="https://github.com/deckhouse/virtualization">Deckhouse Virtualization Platform Community Edition</a> (DVP CE), which is an Open Source tool running on top of Deckhouse Kubernetes Platform. By the way, to avoid confusion, I’ll just call Deckhouse Kubernetes Platform and its components “Deckhouse” from here on out.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*S1aBX69_m-D4ZBa723rJNw.png" /></figure><p>In this post, I’m going to show you how to build a home virtualization cluster from scratch with DVP CE. We’ll pick the hardware, get it ready, set up the network, install the platform, and launch the first few VMs along with their storage.</p><h3>Getting the Home Cluster Ready</h3><p>For my home cluster, I bought three mini-PCs (~$100 apiece) and a gigabit switch ($15) to connect them.
The total cost came out to be $345.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*qplhS1mPi4PX9pNcyd4uCg.png" /></figure><p>PC Specifications:</p><ul><li>Processor: Intel N150 (4 cores, 4 threads, 3.6 GHz)</li><li>RAM: 16 GB</li><li>Storage: 500 GB SSD</li></ul><h4>Prepping the Cluster</h4><p>Before deploying the cluster, you have to perform a number of preliminary steps to get DNS, the network, and SSH access up and running.</p><p>We’re going to install the <a href="https://github.com/deckhouse/virtualization">DVP CE platform</a> on the cluster.</p><p>Your nodes will need static IPs on the same subnet so they can talk to each other. We will stick to the “one master node and two worker nodes” scheme. For storage, we’ll use the <a href="https://deckhouse.io/modules/sds-replicated-volume/stable/">sds-replicated-volume</a> module from Deckhouse.</p><p>Here’s what our cluster would look like conceptually:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Ly16SeiDP5UPHoRAJHwuBA.png" /></figure><p>To make sure everything works, you’ll need to configure your DNS records. I will be running the installation from a Windows machine, so I need to add those domains to the hosts file and point them to the master node’s IP:</p><pre>api.homecluster.com<br>argocd.homecluster.com<br>dashboard.homecluster.com<br>documentation.homecluster.com<br>dex.homecluster.com<br>grafana.homecluster.com<br>hubble.homecluster.com<br>istio.homecluster.com<br>istio-api-proxy.homecluster.com<br>kubeconfig.homecluster.com<br>openvpn-admin.homecluster.com<br>prometheus.homecluster.com<br>status.homecluster.com<br>upmeter.homecluster.com</pre><p>You&#39;ll also need to generate an SSH key on your main computer using ssh-keygen and then copy it over to the authorized_keys file on the master node.</p><h4>Installing the OS</h4><p>I went with Ubuntu 24.04 for all the servers in the cluster. You’ll need to install it on each one.
I won’t go into detail on this step, but I will highlight a couple of points.</p><p>During installation, I recommend setting up a static IP for the server right away to avoid having to manually configure it later. After installation, the configs will be in /etc/netplan. In my case, the server uses two network interfaces: Wi-Fi for internet access and the gigabit switch for inter-server communication.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*L-7jdLmdw3O2UAOdGV9vVA.png" /><figcaption><em>Configuring the network</em></figcaption></figure><p>For our distributed storage, we will need block devices. Those can be unpartitioned devices or partitions that have been created but not formatted. In my case, the partitions are laid out as follows:</p><ul><li>/boot — 1 GB</li><li>/ — ext4, 100 GB</li></ul><p>With the leftover space, just create a partition and pick the “leave unformatted” option. That’s what we’ll use for the sds-replicated-volume storage.</p><h3>Deploying a Cluster</h3><h4><strong>Deploying a Master Node</strong></h4><p>Since my PC is running Windows, I’ll use it to install the platform. First, you need to make sure the master node is accessible over SSH with key-based authentication.</p><p>Next, per <a href="https://deckhouse.io/products/virtualization-platform/gs/bm/step2.html">the second step of the DVP homecluster quick start guide</a>, you need to enter the domain name template for your cluster. In my case, it’s %s.homecluster.com:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Rf-i7SKglaMd1m57FsPF6A.png" /></figure><p>After that, click “Next: Platform installation”, and you will get a ready-made config with the domain name template filled in.</p><p>The only thing you have to do is copy and change the internalNetworkCIDRs parameter that specifies the cluster subnet. This is necessary if our servers use more than one network interface.
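For reference, in the generated config.yml this parameter sits in the StaticClusterConfiguration document. A fragment might look like the following sketch (the resource kind and field follow the Deckhouse documentation; the subnet is just an example):

```yaml
# config.yml fragment (sketch)
apiVersion: deckhouse.io/v1
kind: StaticClusterConfiguration
# Address space of the cluster's internal network,
# matching the inter-server (switch) subnet
internalNetworkCIDRs:
- 10.0.4.0/24
```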
For me, this is 10.0.4.0/24, which is my Ethernet subnet.</p><blockquote>Keep in mind the parameters that define the subnets reserved for the cluster’s needs. There must be no overlaps with other server networks. If there are, change either the external settings or these parameters.</blockquote><p>Save the resulting config to a file named config.yml.</p><p>Then the installation of DVP CE on the master node will commence:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*utOJda6MlhwPhPoA54PXuA.png" /></figure><p>Once the command completes, the master node deployment is finished.</p><h4><strong>Setting up Worker Nodes</strong></h4><p>A master node is great for system stuff, but your cluster is pretty much useless without worker nodes. You’ll run your user workloads (pods, VMs, and so on) on those. So I have to set up two more nodes.</p><p>First off, let’s create a NodeGroup for our worker nodes:</p><pre>sudo -i d8 k create -f - &lt;&lt; EOF<br>apiVersion: deckhouse.io/v1<br>kind: NodeGroup<br>metadata:<br>  name: worker<br>spec:<br>  nodeType: Static<br>  staticInstances:<br>    count: 2<br>    labelSelector:<br>      matchLabels:<br>        role: worker<br>EOF</pre><p>Note the count parameter — it tells Deckhouse how many nodes to run in this group.</p><p>Next, you need to configure the worker nodes. Deckhouse does the heavy lifting here; you just need to make sure the master and worker nodes can communicate with each other over SSH.
To do so, generate an SSH key on the master node with an empty passphrase:</p><pre>ssh-keygen -t rsa -f /dev/shm/caps-id -C &quot;&quot; -N &quot;&quot;</pre><p>Next, create the <a href="https://deckhouse.io/modules/node-manager/cr.html#sshcredentials">SSHCredentials</a> resource on the cluster:</p><pre>sudo -i d8 k create -f - &lt;&lt;EOF<br>apiVersion: deckhouse.io/v1alpha1<br>kind: SSHCredentials<br>metadata:<br>  name: caps<br>spec:<br>  user: caps<br>  privateSSHKey: &quot;`cat /dev/shm/caps-id | base64 -w0`&quot;<br>EOF</pre><p>Now, you need to add the public key you just created to the authorized_keys file for the caps user on your worker nodes. Let’s print it out so you can copy it:</p><pre>cat /dev/shm/caps-id.pub</pre><p>Next, SSH into each worker node and run these commands as root (paste your public key in place of &lt;SSH-PUBLIC-KEY&gt;). They will create the caps user and add your key:</p><pre>export KEY=&#39;&lt;SSH-PUBLIC-KEY&gt;&#39; # Insert your public SSH key here<br>useradd -m -s /bin/bash caps<br>usermod -aG sudo caps<br>echo &#39;caps ALL=(ALL) NOPASSWD: ALL&#39; | sudo EDITOR=&#39;tee -a&#39; visudo<br>mkdir /home/caps/.ssh<br>echo $KEY &gt;&gt; /home/caps/.ssh/authorized_keys<br>chown -R caps:caps /home/caps<br>chmod 700 /home/caps/.ssh<br>chmod 600 /home/caps/.ssh/authorized_keys</pre><p>To add a node to the Deckhouse cluster, you need to create a definition of the static node (<a href="https://deckhouse.io/modules/node-manager/cr.html#staticinstance">StaticInstance</a>) and make sure the master node can access the worker nodes over SSH. Let’s do it.</p><p>Return to the master node and create a StaticInstance for each worker node.
In it, specify the IP address of the target node (use the IP from the internal node network) and the name of the entity being created (the name parameter, which must be unique for each node):</p><pre>export NODE=&lt;NODE-IP-ADDRESS&gt; # Enter the IP address of the node to connect to the cluster<br>sudo -i d8 k create -f - &lt;&lt;EOF<br>apiVersion: deckhouse.io/v1alpha1<br>kind: StaticInstance<br>metadata:<br>  name: dvp-worker<br>  labels:<br>    role: worker<br>spec:<br>  address: &quot;$NODE&quot;<br>  credentialsRef:<br>    kind: SSHCredentials<br>    name: caps<br>EOF</pre><p>Run the following command to see if StaticInstance resources are ready:</p><pre>d8 k get staticinstances.deckhouse.io -w</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/564/1*re6ZaNBIqIQyCO7-_iKY-A.png" /></figure><p>Once new nodes are ready, you will see the following output:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/616/1*HXzK0BkqVsQGqayxd1aAEA.png" /></figure><p>Now, check the nodes with d8 k get no:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/566/1*NmyvRQsmUzD1IIV7g4qx6w.png" /></figure><p>Nice, the nodes are up!</p><h4><strong>Installing Software-Defined Storage</strong></h4><p>One of the reasons I settled on a three-node configuration is the need to store data reliably. The basic idea here is to store multiple copies of data.
Right now, we need to get our replicated storage configured, and we’ll use Deckhouse’s sds-replicated-volume module for that.</p><p>First, let’s enable the required modules:</p><pre>sudo -i d8 k create -f - &lt;&lt;EOF<br>---<br>apiVersion: deckhouse.io/v1alpha1<br>kind: ModuleConfig<br>metadata:<br>  name: sds-node-configurator<br>spec:<br>  version: 1<br>  enabled: true<br>---<br>apiVersion: deckhouse.io/v1alpha1<br>kind: ModuleConfig<br>metadata:<br>  name: sds-replicated-volume<br>spec:<br>  version: 1<br>  enabled: true<br>EOF</pre><p>Wait for the sds-replicated-volume module to start:</p><pre>sudo -i d8 k wait module sds-replicated-volume --for=&#39;jsonpath={.status.phase}=Ready&#39; --timeout=1200s</pre><p>In Deckhouse, pretty much everything — modules, system images, etc. — is managed by the <em>deckhouse deployment</em> running in the d8-system namespace. Whenever you enable or tweak modules, a bunch of hooks run in the background. You can see what’s happening by checking the Deckhouse queue using the d8 platform queue list command. Let’s run watch d8 platform queue list and wait for that list to clear out:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/941/1*WVH8I1vBwISeRpfPa5NgKQ.png" /></figure><p>Here is what an empty queue looks like:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/817/1*DiiFSWl9bY1tpyKs-UPoWQ.png" /></figure><p>Let’s see what block devices we have (use the d8 k get bd command):</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/924/1*E4qzT6zK15cZoEVKFQ4NYA.png" /></figure><p>sds-replicated-volume features <em>thin</em> and <em>thick</em> pools for data storage. <em>Thick</em> pools occupy the entire allocated space right from the start, while <em>thin</em> pools use only the portion of disk space that is needed at the moment.</p><p><em>Thick</em> pools are faster, but storage provisioning takes more time. On top of that, snapshots don’t work with <em>thick</em> pools. 
<em>Thin</em> pools save space and provision volumes faster, with the inherent risk of the total provisioned space exceeding the actual storage capacity. So you have to monitor the actual disk usage.</p><p>Let’s create an LVMVolumeGroup for each node. You’ll need to substitute the node name and the block device name into the following command:</p><pre>d8 k apply -f - &lt;&lt;EOF<br>apiVersion: storage.deckhouse.io/v1alpha1<br>kind: LVMVolumeGroup<br>metadata:<br>  name: &quot;vg-on-worker-0&quot;<br>spec:<br>  type: Local<br>  local:<br>    # Replace it with the name of the node for which the volume group is being created <br>    nodeName: &quot;worker-0&quot;<br>  blockDeviceSelector:<br>    matchExpressions:<br>      - key: kubernetes.io/metadata.name<br>        operator: In<br>        values:<br>          # Replace with block device names for which the volume group is being created<br>          - dev-ef4fb06b63d2c05fb6ee83008b55e486aa1161aa<br>  # The name of the LVM volume group that will include the above block devices on the selected node<br>  actualVGNameOnTheNode: &quot;vg&quot;<br>  thinPools:<br>    - name: thin-pool-0<br>      size: 70%<br>EOF</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1011/1*p1dwe6n1urRWJYzdKN-hEA.png" /></figure><p>Next, create a <em>thin</em> pool:</p><pre>d8 k apply -f - &lt;&lt;EOF<br>apiVersion: storage.deckhouse.io/v1alpha1<br>kind: ReplicatedStoragePool<br>metadata:<br>  name: thin-pool<br>spec:<br>  type: LVMThin<br>  lvmVolumeGroups:<br>    - name: vg-1-on-homecluster0<br>      thinPoolName: thin-pool-0<br>    - name: vg-1-on-homecluster1<br>      thinPoolName: thin-pool-0<br>    - name: vg-1-on-homecluster2<br>      thinPoolName: thin-pool-0<br>EOF</pre><p>With sds-replicated-volume, the user doesn’t set up a StorageClass manually but instead configures a higher-level entity called ReplicatedStorageClass.</p><p>Let’s create the ReplicatedStorageClass (see <a 
href="https://deckhouse.io/modules/sds-replicated-volume/stable/cr.html#replicatedstorageclass">the documentation</a> to decide which replication option is best for you):</p><pre>d8 k apply -f - &lt;&lt;EOF<br>apiVersion: storage.deckhouse.io/v1alpha1<br>kind: ReplicatedStorageClass<br>metadata:<br>  name: replicated-storage-class<br>spec:<br>  # The name of the storage pool we created earlier<br>  storagePool: thin-pool<br>  # What to do when the PVC is being deleted  <br>  # Can be &quot;Delete&quot; or &quot;Retain&quot;  <br>  # [More info...](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#reclaiming)<br>  reclaimPolicy: Delete<br>  # Replicas can run on any available node, but only one replica per volume on any single node  <br>  # Our cluster doesn&#39;t have any zones (no nodes with the topology.kubernetes.io/zone label)<br>  topology: Ignored<br>  # This mode keeps the volume up for reads and writes even if a replica goes down  <br>  # Data is stored in three separate copies on different nodes<br>  replication: ConsistencyAndAvailability<br>EOF</pre><p>Double-check that it all got created:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*rf69em05I4YSwexVEhmRcg.png" /></figure><p>Now, make it the default StorageClass:</p><pre>DEFAULT_STORAGE_CLASS=replicated-storage-class<br>sudo -i d8 k patch mc global --type=&#39;json&#39; -p=&#39;[{&quot;op&quot;: &quot;replace&quot;, &quot;path&quot;: &quot;/spec/settings/defaultClusterStorageClass&quot;, &quot;value&quot;: &quot;&#39;&quot;$DEFAULT_STORAGE_CLASS&quot;&#39;&quot;}]&#39;</pre><h4><strong>Enabling the Virtualization Module</strong></h4><p>Alright, the moment of truth. 
Let’s enable the virtualization module:</p><pre>sudo -i d8 k create -f - &lt;&lt;EOF<br>apiVersion: deckhouse.io/v1alpha1<br>kind: ModuleConfig<br>metadata:<br>  name: virtualization<br>spec:<br>  enabled: true<br>  settings:<br>    dvcr:<br>      storage:<br>        persistentVolumeClaim:<br>          size: 50G<br>        type: PersistentVolumeClaim<br>    virtualMachineCIDRs:<br>    # Subnets from which to assign IP addresses to the VMs<br>    - 10.66.10.0/24<br>    - 10.66.20.0/24<br>    - 10.66.30.0/24<br>  version: 1<br>EOF</pre><p>Wait for it to report as Ready:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/720/1*kpRWYtZbLCFeQ0-7HeA4pQ.png" /></figure><p>Next, wait for the Deckhouse queue to become empty. This might take some time:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1NsQu2ZRRc5CmEbIBeM7jA.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/820/1*v65BEIx6dNR92SXMqJ0q5w.png" /></figure><p>If you went with a <em>thick</em> pool during the storage setup, make sure that all pods in the d8-virtualization namespace are in the Running state:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/733/1*W-lp11bhewg2FTTv0HJyyQ.png" /></figure><h4><strong>Setting Up Ingress and DNS</strong></h4><p>First, let’s check that the Kruise controller manager pod is up and running:</p><pre>d8 k -n d8-ingress-nginx get po -l app=kruise</pre><p>Time to get the Ingress controller installed:</p><pre>sudo -i d8 k apply -f - &lt;&lt;EOF<br># NGINX Ingress controller settings<br># https://deckhouse.io/modules/ingress-nginx/cr.html#ingressnginxcontroller<br>apiVersion: deckhouse.io/v1<br>kind: IngressNginxController<br>metadata:<br>  name: nginx<br>spec:<br>  ingressClass: nginx<br>  # The way external traffic flows into the cluster<br>  inlet: HostPort<br>  hostPort:<br>    httpPort: 80<br>    httpsPort: 443<br>  # Defines which nodes the Ingress controller will run on  <br>  # You may want to change it<br>  
nodeSelector:<br>    node-role.kubernetes.io/control-plane: &quot;&quot;<br>  tolerations:<br>  - effect: NoSchedule<br>    key: node-role.kubernetes.io/control-plane<br>    operator: Exists<br>EOF</pre><p>The controller’s pod should now be Running:</p><pre>d8 k -n d8-ingress-nginx get po -l app=controller</pre><h4><strong>Creating a User and Setting Up Monitoring</strong></h4><p>Let’s create a user to access the cluster and its web interface:</p><pre>sudo -i d8 k apply -f - &lt;&lt;&quot;EOF&quot;<br>apiVersion: deckhouse.io/v1<br>kind: ClusterAuthorizationRule<br>metadata:<br> name: admin<br>spec:<br> # List of Kubernetes RBAC accounts<br> subjects:<br> - kind: User<br>   name: admin@deckhouse.io<br> # Preset access level template<br> accessLevel: SuperAdmin<br> # Allow the user to run kubectl port-forward<br> portForwarding: true<br>---<br># Static user parameters<br># Version of the Deckhouse API<br>apiVersion: deckhouse.io/v1<br>kind: User<br>metadata:<br> name: admin<br>spec:<br> # user e-mail<br> email: admin@deckhouse.io<br> # The hash for the &quot;password&quot; temporary password<br> # Generate your own or use this one for testing:<br> # echo &quot;password&quot; | htpasswd -BinC 10 &quot;&quot; | cut -d: -f2<br> # You may want to change it<br> password: $2y$10$5.7NBl2MtHbQNzpc4/NOGeBU8lO73qDrc1jMjo.DQz8.X.PuZB7Ji<br>EOF</pre><p>Now, head over to grafana.homecluster.com:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*BVeZpftVoNek-7qQgk-pQg.png" /></figure><p>As you can see, the cluster’s network is running, serving the requested pages. The screenshot shows the cluster metrics; from there, you can browse specific dashboards.</p><blockquote><em>Notes:</em></blockquote><blockquote><em>If you want to access your cluster from the internet, a simple (though not perfect) trick is to use </em><a href="https://codex.so/ssh-tunnel"><em>reverse SSH port forwarding</em></a><em> with the autossh tool to keep the tunnel up. 
In this case, change the cluster’s domain name. You can do that by tweaking the </em><em>.spec.settings.modules.publicDomainTemplate parameter in the </em><em>mc global entity (just run </em><em>kubectl edit mc global).</em></blockquote><h4><strong>Creating a Project and a Virtual Machine</strong></h4><p>VMs run in so-called <em>projects</em>, so it’s time to create one and move on to what we’ve been working towards: creating a virtual machine.</p><p>Let’s create a test project:</p><pre>d8 k create -f - &lt;&lt;EOF<br>apiVersion: deckhouse.io/v1alpha2<br>kind: Project<br>metadata:<br> name: test-project<br>spec:<br> description: test-project<br> projectTemplateName: default<br> parameters:<br>  # Project quotas<br>  resourceQuota:<br>   requests:<br>    cpu: 16<br>   limits:<br>    cpu: 16<br>  networkPolicy: NotRestricted<br>  # Project admins<br>  administrators:<br>   - subject: User<br>     name: admin<br>EOF</pre><p>And the image:</p><pre>d8 k apply -f - &lt;&lt;EOF<br>apiVersion: virtualization.deckhouse.io/v1alpha2<br>kind: VirtualImage<br>metadata:<br>  name: ubuntu-22-04<br>  namespace: test-project<br>spec:<br>  # Save the image to DVCR<br>  storage: ContainerRegistry<br>  # Image source<br>  dataSource:<br>    type: HTTP<br>    http:<br>      url: https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img<br>EOF</pre><p>Verify that the image has been created and wait for it to become Ready:</p><pre>d8 k -n test-project get vi -w</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/820/1*MyJDYbhPMlQDA8ytri7_9w.png" /></figure><p>Create a virtual disk based on the image:</p><pre>d8 k apply -f - &lt;&lt;EOF<br>apiVersion: virtualization.deckhouse.io/v1alpha2<br>kind: VirtualDisk<br>metadata:<br>  name: linux-vm-root<br>  namespace: test-project<br>spec:<br>  # Virtual disk parameters<br>  persistentVolumeClaim:<br>    # Set a size larger than the unpacked image<br>    size: 10Gi<br>    # Insert your StorageClass name here<br>    storageClassName: 
i-sds-replicated-thin-r2<br>  # Data source to use for the disk<br>  dataSource:<br>    type: ObjectRef<br>    objectRef:<br>      kind: VirtualImage<br>      name: ubuntu-22-04<br>EOF</pre><p>Our StorageClass’s WaitForFirstConsumer parameter basically means the disk won’t be created until something needs it (our VM). This setting makes sure the disk is created on the same node as the VM, which reduces disk latency. Create a VM:</p><pre>d8 k apply -f - &lt;&lt;&quot;EOF&quot;<br>apiVersion: virtualization.deckhouse.io/v1alpha2<br>kind: VirtualMachine<br>metadata:<br>  name: linux-vm<br>  namespace: test-project<br>spec:<br>  # The VM class name<br>  virtualMachineClassName: generic<br>  # Scripts for bootstrapping the VM<br>  provisioning:<br>    type: UserData<br>    # A sample cloud-init script that creates a &#39;cloud&#39; user (password &#39;cloud&#39;) and installs qemu-guest-agent<br>    userData: |<br>      #cloud-config<br>      package_update: true<br>      packages:<br>        - qemu-guest-agent<br>      runcmd:<br>        - systemctl daemon-reload<br>        - systemctl enable --now qemu-guest-agent.service<br>      ssh_pwauth: True<br>      users:<br>      - name: cloud<br>        passwd: &#39;$6$rounds=4096$saltsalt$fPmUsbjAuA7mnQNTajQM6ClhesyG0.yyQhvahas02ejfMAq1ykBo1RquzS0R6GgdIDlvS.kbUwDablGZKZcTP/&#39;<br>        shell: /bin/bash<br>        sudo: ALL=(ALL) NOPASSWD:ALL<br>        lock_passwd: False<br>      final_message: &quot;The system is finally up, after $UPTIME seconds&quot;<br>  # VM resource settings<br>  cpu:<br>    # Number of CPU cores<br>    cores: 1<br>    # Request 10% of a single physical core<br>    coreFraction: 10%<br>  memory:<br>    # RAM size<br>    size: 1Gi<br>  # A list of the disks and images used by the VM<br>  blockDeviceRefs:<br>    # The order here sets the boot priority<br>    - kind: VirtualDisk<br>      name: linux-vm-root<br>EOF</pre><p>All you have to do is wait for the VM to start:</p><pre> d8 k -n 
test-project get vm -w</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/820/1*bqa55Zf2UCX7NK7SBNrXIQ.png" /></figure><p>You can now connect to the VM over SSH. I’ll do this using the d8 v tool provided by DVP:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/805/1*ehYCAK6qi5Z9YQSKhA7OYQ.png" /></figure><p>The virtual machine is up and running.</p><h4><strong>What resources are available for VMs in a cluster like this?</strong></h4><p>Each of the cluster’s worker nodes provides about 10 GB of RAM and 4 CPU cores for running virtual machines. In total, the cluster features around 20 GB of RAM and 8 CPU cores for VMs.</p><p>Master node:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*r2fZO3Vy900cEpV7Byiu-Q.png" /></figure><p>First worker node (a VM was launched and then stopped on it):</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*hMtW-kbzhVr1SeURhXSVVA.png" /></figure><p>Second worker node:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*wuOmoPWpI8kOKndpgr1XKA.png" /></figure><h3>Conclusion</h3><p>We’ve set up a home virtualization cluster and have taken the first steps in using it — with a declarative approach, monitoring, and data replication. The result is a flexible, fault-tolerant environment that can be scaled and used for various purposes, such as testing, learning, pet projects, or home services. In total, the cluster installation took about 1.5 hours (not counting the OS setup).</p><p>Currently, I’m using this setup for a personal Nextcloud. On top of that, I plan to deploy a personal GitLab server. And the best part? 
Experimenting with web apps and Telegram bots is no longer a headache, since I don’t have to worry about “where the heck do I host this?” anymore.</p><hr><p><a href="https://blog.deckhouse.io/run-vms-in-containers-at-home-6048778d417f">How to build a home cluster with VMs running in containers for a couple hundred dollars</a> was originally published in <a href="https://blog.deckhouse.io">Deckhouse blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Nelm 1.0 released: Helm-chart compatible alternative to Helm 3]]></title>
            <link>https://blog.werf.io/nelm-cli-helm-compatible-alternative-5648b191f0af?source=rss-71bfdb9446bd------2</link>
            <guid isPermaLink="false">https://medium.com/p/5648b191f0af</guid>
            <category><![CDATA[cloud-native]]></category>
            <category><![CDATA[werf]]></category>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[helm]]></category>
            <dc:creator><![CDATA[Flant staff]]></dc:creator>
            <pubDate>Thu, 03 Apr 2025 06:50:47 GMT</pubDate>
            <atom:updated>2025-04-08T05:04:24.343Z</atom:updated>
            <content:encoded><![CDATA[<p>Initially, <a href="https://github.com/werf/werf">werf</a>, a CNCF Sandbox tool for building containers and deploying to Kubernetes, was built upon our Helm 3 fork, which accumulated quite a few new features and fixes for Helm 3.</p><p>However, some werf users were only interested in the deployment part of werf, without building containers and other non-deployment functionality. For these users we even maintained the werf helm … set of commands, which was basically our Helm 3 fork exposed. As the werf deployment subsystem became more complex, <strong>we decided to separate it into the </strong><a href="https://github.com/werf/nelm"><strong>Nelm</strong></a><strong> project</strong>, which we initiated at the end of December 2023. And now, with the release of Nelm CLI, Nelm has reached its 1.0 milestone!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*FOae2JLDYPRZxf39G_CGrw.png" /></figure><h3>What is Nelm?</h3><p><a href="https://github.com/werf/nelm">Nelm</a> is an Open Source CLI tool to manage Helm charts and deploy them to Kubernetes. Nelm is based on the Helm 3 codebase; it does almost everything Helm can do, improves upon it, and even adds some extra functionality. Nelm is backward-compatible with Helm charts and Helm releases, so it will be easy for Helm users to migrate to Nelm. For those familiar with werf, Nelm is werf without <a href="https://werf.io/docs/v2/usage/project_configuration/giterminism.html">giterminism</a> and without building, distributing, and cleaning up container images.</p><p>Let’s now dive into the key advantages of Nelm compared to Helm 3.</p><h3>Advanced resource ordering</h3><p>First of all, the Helm deployment subsystem in Nelm has been rewritten from scratch. During deployment, Nelm builds a Directed Acyclic Graph (DAG) of all operations we intend to perform in a cluster to do a release, which is then executed. 
The DAG allowed us to implement advanced resource ordering capabilities, such as:</p><ul><li>werf.io/weight annotation — it is similar to helm.sh/hook-weight, except it also works for non-hook resources, and resources with the same weight are deployed in parallel;</li><li>werf.io/deploy-dependency-&lt;id&gt; annotation that makes Nelm wait for another resource to be ready or merely present in the cluster before deploying the annotated resource. This is the most powerful and efficient way to arrange the order in which resources will be deployed by Nelm;</li><li>&lt;id&gt;.external-dependency.werf.io/resource annotation that makes Nelm wait for the readiness of non-release resources, such as resources created by third-party operators;</li><li>Helm ordering capabilities (i.e., Helm hooks and Helm hook weights), which are also supported.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*BeDAOxDcYrZ8UWcb" /><figcaption>Nelm weights for Kubernetes resources and deploy dependencies</figcaption></figure><h3>Server-Side Apply replaces 3-Way Merge</h3><p>In Nelm, <a href="https://kubernetes.io/docs/reference/using-api/server-side-apply/">Server-Side Apply</a> (SSA) has taken the place of the problematic <a href="https://helm.sh/docs/faq/changes_since_helm2/#improved-upgrade-strategy-3-way-strategic-merge-patches">Helm 3-Way Merge</a> (3WM).</p><p>3WM is a client-side mechanism to make a patch for updating a resource in a cluster. Its issues stem from the fact that it assumes that all previous release manifests were successfully applied to the cluster, which is not always the case. For example, if some resources weren’t updated due to being invalid or if a release was aborted too early, then upon the next release, incorrect 3WM patches might be produced. 
This results in a seemingly “successful” Helm release with wrong changes silently being applied to the cluster, which is a very serious issue.</p><p>In 2019, Kubernetes introduced Server-Side Apply (SSA) for resource updates, which became stable in v1.22 (released in August 2021). With SSA, the patches are made in Kubernetes itself instead of client-side in Helm. SSA effectively resolves the issues associated with 3WM, and it is widely adopted by other deployment tools, like Flux. Unfortunately, it will take a lot of work to replace 3WM with SSA in Helm. However, since in Nelm, the deployment subsystem has been rewritten from scratch, we went SSA-first from the very beginning, thus solving long-standing issues of 3-Way Merge.</p><h3>Resource state tracking</h3><p>Nelm has powerful resource tracking built from the ground up:</p><ul><li>Reliable detection of resource readiness, presence, absence, or failures;</li><li>The readiness of Custom Resources is determined heuristically by analyzing their status fields. Works for about half of Custom Resources. No false positives;</li><li>Some dependent resources, like Pods of Deployments, are automatically found and individually tracked;</li><li>The table with the current information (statuses, errors, and more) about the tracked resources is printed every few seconds during the deployment;</li><li>Tracking can be configured on a per-resource basis using annotations.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*b9ltW4Rbm2VsXr_Y" /><figcaption>Nelm output displayed while installing a chart</figcaption></figure><h3>Printing logs and events during deploy</h3><p>During deployment, Nelm finds Pods of the release resources being deployed and periodically prints their container logs to your console. On top of that, the werf.io/show-service-messages: &quot;true&quot; annotation lets you print resource events as well. 
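For instance, the annotation can be set directly on a resource in your chart. A minimal sketch (the Deployment name here is purely illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp  # illustrative name
  annotations:
    # Ask Nelm to print this resource's Kubernetes events during deploy
    werf.io/show-service-messages: "true"
```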
Log/event printing can be tuned with annotations.</p><h3>Encrypted values and encrypted files</h3><p>nelm chart secret commands manage encrypted values files such as secret-values.yaml or arbitrary encrypted files like secret/mysecret.txt. Those files are decrypted in-memory during templating and can be referenced in templates as .Values.my.secret.value and {{ werf_secret_file &quot;mysecret.txt&quot; }} respectively.</p><h3>Release planning</h3><p>The nelm release plan install command explains exactly what’s going to happen in a cluster during the next release. It shows 100% accurate diffs between what the resources in the cluster look like right now and what they will look like after the next deployment, utilizing robust dry-run Server-Side Apply instead of client-side trickery.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*eJaOC4DSvY6cr8Wi" /><figcaption>Planning your deployments with Nelm is akin to executing `terraform plan`</figcaption></figure><h3>Future plans</h3><p>Here is a sneak peek at what’s on our roadmap for Nelm:</p><ul><li>Implement an alternative to Helm templating;</li><li>Implement an option to pull charts directly from Git;</li><li>Expose a public Go API for embedding Nelm into third-party software;</li><li>Enhance the CLI experience with new commands and improve the consistency between the reimplemented commands and original Helm commands;</li><li>Overhaul the chart dependency management;</li><li>Migrate the built-in secret management to Mozilla SOPS.</li></ul><h3>Try it</h3><p><a href="https://github.com/werf/nelm?tab=readme-ov-file#install">Install Nelm</a> and follow the <a href="https://github.com/werf/nelm?tab=readme-ov-file#quickstart">Nelm quickstart</a>. Considering migrating from Helm for your deployments? Read more about <a href="https://github.com/werf/nelm/?tab=readme-ov-file#helm-compatibility">Helm compatibility</a>.</p><p>Let us know what you think! 
We’d love your feedback.</p><hr><p><a href="https://blog.werf.io/nelm-cli-helm-compatible-alternative-5648b191f0af">Nelm 1.0 released: Helm-chart compatible alternative to Helm 3</a> was originally published in <a href="https://blog.werf.io">werf blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Blue-Green Deployments: a Guide to Deploying One or More Applications]]></title>
            <link>https://blog.werf.io/blue-green-deployments-a-guide-to-deploying-one-or-more-applications-61e2de67ad19?source=rss-71bfdb9446bd------2</link>
            <guid isPermaLink="false">https://medium.com/p/61e2de67ad19</guid>
            <category><![CDATA[deployment]]></category>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[werf]]></category>
            <category><![CDATA[gitlab]]></category>
            <category><![CDATA[devops]]></category>
            <dc:creator><![CDATA[Flant staff]]></dc:creator>
            <pubDate>Thu, 21 Nov 2024 10:14:51 GMT</pubDate>
            <atom:updated>2024-11-22T14:14:44.000Z</atom:updated>
<content:encoded><![CDATA[<p>By: DevOps engineer Yuri Shakhov.</p><p>A few weeks ago, I was tasked with setting up a seamless application deployment for one of our customers. I explored different approaches for this task and settled on a strategy known as <a href="https://en.wikipedia.org/wiki/Blue%E2%80%93green_deployment">blue-green deployment</a>. Unfortunately, I couldn’t find any practical examples of how to do it. The articles I came across only covered the general theory. So I had to explore the blue-green deployment approach on my own. Now I’m eager to share the lessons I took away from it.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*IJvuy-RjJogixCzBvl8AFQ.png" /></figure><p>In this piece, I will guide you through the entire process of deploying an application using the blue-green approach. That being said, I am not going to go over the various deployment strategies and the pros and cons that accompany them. See <a href="https://spacelift.io/blog/kubernetes-deployment-strategies">this article</a> to learn more about blue-green and other deployment strategies.</p><p>I’ll split this article into two parts — we’ll start by looking at how blue-green deployment works, and then we’ll discuss how to deploy multiple apps from one repo using werf bundles. Note that there are multiple ways to implement this strategy: e.g., you can use third-party tools such as a service mesh, Argo CD, and so on. In my case, I will use <a href="https://werf.io/?utm_source=web&amp;utm_medium=medium&amp;utm_campaign=blue_green_141124">werf</a> for deployment, describe all the resources as Helm templates, and opt for GitLab to initiate the deployment (I assume you are familiar with those technologies). 
The twist here is that I stick to native Kubernetes entities and mechanisms (such as labels).</p><p>For simplicity’s sake, I will refer to the green and blue application instances as “versions”. Also, this article will not cover database migration, although for some applications that might be a necessity.</p><h3>Basic Blue-Green Deployment</h3><p>Suppose you would like to deploy a newer version of your application. Here’s how you can do it without experiencing downtime using a “blue-green” approach:</p><ol><li>First, deploy the application (the deploy_app pipeline) with the corresponding Deployment and Service.</li><li>Next, prepare everything you will need for version switching. That is, deploy an Ingress with the appropriate Service name (the deploy_ingress pipeline).</li></ol><p>Here’s what the interface for running these two pipelines would look like in GitLab:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/986/1*IqO9NlSKNKG1iBlt8G9CEA.png" /></figure><p>Define a deploy_version variable to substitute into Helm templates, which will be either blue or green (its value will be derived from the GitLab CI). Add the corresponding labels to the Deployment and Service:</p><pre>{{ $deploy_version := &quot;&quot; }}<br>{{ if .Values.werf.deploy_version }}<br>{{ $deploy_version = print &quot;-&quot; .Values.werf.deploy_version }}<br>{{ end }}<br>---<br>apiVersion: apps/v1<br>kind: Deployment<br>metadata:<br>  name: {{ .Chart.Name }}{{ $deploy_version }}<br>  labels:<br>    app: {{ .Chart.Name }}{{ $deploy_version }}<br>...<br>---<br>apiVersion: v1<br>kind: Service<br>metadata:<br>  name: {{ .Chart.Name }}{{ $deploy_version }}<br>spec:<br>  selector:<br>    app: {{ .Chart.Name }}{{ $deploy_version }}<br>...</pre><p>Now, create an Ingress so that traffic can reach the pod. A Service with the proper name (either blue or green) will allow you to reach the specific application version. 
The Ingress template in this case would look as follows:</p><pre>{{ $deploy_version := &quot;&quot; }}<br>{{ if .Values.werf.deploy_version }}<br>{{ $deploy_version = print &quot;-&quot; .Values.werf.deploy_version }}<br>{{ end }}<br>---<br>apiVersion: networking.k8s.io/v1<br>kind: Ingress<br>metadata:<br>  name: example<br>  labels:<br>    deploy-version: {{ .Values.werf.deploy_version | quote }}<br>spec:<br>  ingressClassName: nginx<br>  rules:<br>  - host: example.com<br>    http:<br>      paths:<br>      - path: /<br>        pathType: Prefix<br>        backend:<br>          service:<br>            name: {{ .Chart.Name }}{{ $deploy_version }}<br>            port:<br>              name: http<br>  tls:<br>  - hosts:<br>    - example.com<br>    secretName: {{ .Chart.Name }}-tls</pre><p>Instead of routing traffic using the Ingress, you can do it using the Service, forwarding traffic to the blue or green Deployment based on labels. However, I don’t recommend doing it this way — you will not be able to reference the Deployment by the Service name in the cluster. This may be a problem if you need to check whether the update went smoothly, since you will not be able to access the new version before all traffic is routed to it.</p><p>Another option is to create a second Ingress pointing to the inactive version with a different domain for testing. In that case, you will need to secure it with authorization to restrict access.</p><p>Now let’s take a look at the pipeline. When deploying an application, you must set the deploy_version variable to the app version to be deployed. Here’s how you can do that with werf:</p><pre>werf converge --set &quot;werf.deploy_version=${DEPLOY_VERSION}&quot;</pre><p>Also, when you’re deploying, you need to check that the version you’re deploying isn’t getting any traffic yet — this way, users won’t be affected during the rollout. 
To do so, check which Service the Ingress currently routes traffic to in the cluster and look for blue or green in its name.</p><p>See the complete .gitlab-ci.yml file below:</p><pre>stages:<br> - deploy_app<br> - deploy_ingress<br><br>.check_upstreams: &amp;check_upstreams<br> - APP_CURRENT_ACTIVE=$(werf kubectl -n ${WERF_NAMESPACE} get ingress example --output=custom-columns=&#39;SVCs:..service.name&#39; --no-headers --ignore-not-found | awk -F &#39;-&#39; {&#39;print $NF&#39;})<br><br>.deploy_app:<br> stage: deploy_app<br> script:<br>   - *check_upstreams<br>   - if [[ ${APP_CURRENT_ACTIVE} == ${UPSTREAM} ]];<br>     then<br>       tput setaf 9 &amp;&amp; echo &quot;You are trying to deploy to the active version, the deployment process is halted!&quot; &amp;&amp; exit 1;<br>     else<br>       werf converge<br>         --release example-${UPSTREAM}<br>         --set &quot;werf.deploy_version=${UPSTREAM}&quot;;<br>     fi;<br> allow_failure: false<br><br>.deploy_ingress:<br> stage: deploy_ingress<br> script:<br>   - *check_upstreams<br>   - if [[ ${APP_CURRENT_ACTIVE} == ${DEPLOY_VERSION} ]];<br>     then<br>       tput setaf 9 &amp;&amp; echo &quot;You are trying to switch to the active version, the deployment process is halted!&quot; &amp;&amp; exit 1;<br>     else<br>       werf converge<br>       --set &quot;werf.deploy_version=${DEPLOY_VERSION}&quot;;<br>     fi;<br><br>Deploy to blue:<br> extends: .deploy_app<br> environment:<br>   name: production<br> variables:<br>   UPSTREAM: &quot;blue&quot;<br><br>Deploy to green:<br> extends: .deploy_app<br> environment:<br>   name: production<br> variables:<br>   UPSTREAM: &quot;green&quot;<br><br>Switch to blue:<br> extends: .deploy_ingress<br> environment:<br>   name: production<br> variables:<br>   DEPLOY_VERSION: &quot;blue&quot;<br><br>Switch to green:<br> extends: .deploy_ingress<br> environment:<br>   name: production<br> variables:<br>   DEPLOY_VERSION: &quot;green&quot;</pre><p>So here’s the list of 
the steps we’ve taken so far:</p><ol><li>We’ve updated our Helm templates (Deployment, Service, and Ingress) by adding the version “color” to each of them.</li><li>We’ve created a CI that:</li></ol><ul><li>Deploys the application to blue and green.</li><li>Deploys the Ingress to route traffic to the desired version.</li><li>Checks that the version being deployed is not active.</li></ul><p>Now, let’s get to the bundle part.</p><h3>Deploying multiple applications with werf bundles</h3><p>Why would you need a bundle? Let’s say you need to deploy multiple apps at once. It makes the most sense to keep them all in one repo. The bundle mechanism allows you to publish a chart of an application and deploy it later on — no access to a specific Git repository is required. All you need is access to the container registry where that bundle is stored. That makes delivering app charts a breeze.</p><p>The werf bundle is designed to do just that. I won’t get into all the specifics here — just take a look at the <a href="https://werf.io/documentation/v1.2/usage/distribute/bundles.html">documentation</a> for more details and examples of how it works.</p><p>Bundles are created in the main application repository. Here, I’ll focus just on the deployment process. In the CI file, specify the application names and the corresponding variables for each application: the repository, the bundle tag, and the Ingress name:</p><pre>variables:<br> FIRST_REPO_BUNDLE: registry.gitlab.awesome.com/frontend/first<br> FIRST_TAG: &quot;0.1&quot;<br> FIRST_INGRESS: first<br>...<br><br># apps_for_matrix &amp; apps_for_bash must be the same!<br><br>.apps_for_matrix: &amp;apps_for_matrix<br> [&quot;FIRST&quot;, &quot;SECOND&quot;, &quot;THIRD&quot;, &quot;FOURTH&quot;, &quot;FIFTH&quot;]<br><br>.apps_for_bash: &amp;apps_for_bash<br> APPLICATIONS=(&quot;FIRST&quot; &quot;SECOND&quot; &quot;THIRD&quot; &quot;FOURTH&quot; &quot;FIFTH&quot;)</pre><p>The new pipeline will have three separate stages. 
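For completeness, a publish job in an application repository might look roughly like this. This is a hypothetical sketch, not code from the project at hand: the stage name is illustrative, and the registry path and tag simply mirror the FIRST_* variables above:

```yaml
# Illustrative publish job in the application repository
publish_bundle:
  stage: publish
  script:
    # Package the chart and push it as a bundle to the container registry
    - werf bundle publish
      --repo registry.gitlab.awesome.com/frontend/first
      --tag "0.1"
```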
When deploying multiple applications from a single repository, you have to make sure that all the applications are in the same state. To do so, let’s create a dedicated job called check_upstream that will check the version states of the applications.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*50bO8Pp9s6JnXoJUYdo7Ag.png" /></figure><p>At this stage, the system must ensure that the following basic conditions are met:</p><ul><li>All deployed applications must share the same active version (either blue or green).</li><li>If the application has not yet been deployed to the cluster, it shouldn’t have any active versions.</li></ul><pre>stages:<br> - check_upstreams<br> - deploy_apps<br> - deploy_ingresses<br><br>.base_werf: &amp;base_werf<br> - set -x<br> - type trdl &amp;&amp; source $(trdl use werf 2)<br> - werf version<br> - type werf &amp;&amp; source $(werf ci-env gitlab --verbose --as-file)<br><br>.check_upstreams: &amp;check_upstreams<br> - *base_werf<br> - *apps_for_bash<br> - |<br>   GREEN=false<br>   BLUE=false<br>   EMPTY=0<br><br>   for APP in ${APPLICATIONS[@]}<br>   do<br>     REPOSITORY_INGRESS=${APP}_INGRESS<br>     APP_CURRENT_ACTIVE=$(werf kubectl -n ${WERF_NAMESPACE} get ingress ${!REPOSITORY_INGRESS} --output=custom-columns=&#39;SVCs:..service.name&#39; --no-headers --ignore-not-found | awk -F &#39;-&#39; {&#39;print $NF&#39;})<br><br>     EMPTY=$((EMPTY+1))<br>     if [[ ${APP_CURRENT_ACTIVE} == &quot;green&quot; ]];<br>       then GREEN=true;<br>     elif [[ ${APP_CURRENT_ACTIVE} == &quot;blue&quot; ]];<br>       then BLUE=true;<br>     elif [[ -z ${APP_CURRENT_ACTIVE} ]];<br>       then EMPTY=$((EMPTY-1));<br>     else<br>       tput setaf 9 &amp;&amp; echo &quot;Something is wrong! 
Version status is invalid&quot; &amp;&amp; exit 1;<br>     fi;<br>   done<br><br>   if [[ ${GREEN} != ${BLUE} ]];<br>     then<br>     if [[ ${GREEN} == &quot;true&quot; ]];<br>       then COLOR=&quot;green&quot;;<br>       tput setaf 14 &amp;&amp; echo &quot;The app version statuses are the same — green — you can proceed with the deployment&quot;;<br>     elif [[ ${BLUE} == &quot;true&quot; ]];<br>       then COLOR=&quot;blue&quot;;<br>       tput setaf 14 &amp;&amp; echo &quot;The app version statuses are the same — blue — you can proceed with the deployment&quot;;<br>     fi;<br>   elif [[ ${EMPTY} = 0 ]];<br>     then tput setaf 14 &amp;&amp; echo &quot;No Ingress for these applications is detected in the cluster, you can proceed with the deployment&quot;;<br>   else<br>     tput setaf 9 &amp;&amp; echo &quot;The app version statuses are different, the deployment process is halted!!!&quot; &amp;&amp; exit 1;<br>   fi;<br><br>Check_upstreams:<br> stage: check_upstreams<br> script:<br>   - *check_upstreams<br> environment:<br>   name: production<br> when: always<br> allow_failure: false</pre><p>We will use the bundle mechanism to deploy the application. When running the deployment command, you must pass all the required parameters. Note that the release name for each application must be unique (use the --release parameter to specify it). This is essential, as sharing the same release name will result in the new deployment overwriting the previous one. 
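To illustrate, a unique release name can be derived from the last segment of the bundle repo path plus the color and the environment. A standalone sketch of that naming scheme (the values are illustrative, and the cut field number depends on how deep your registry path is):

```shell
# Compose a unique Helm release name per application and per version ("color")
REPOSITORY_BUNDLE="registry.gitlab.awesome.com/frontend/first"
DEPLOY_VERSION="green"
CI_ENVIRONMENT_SLUG="production"

# Take the project name (the last path segment) and append the color and environment
RELEASE="$(echo "${REPOSITORY_BUNDLE}" | cut -d / -f3)-${DEPLOY_VERSION}-${CI_ENVIRONMENT_SLUG}"
echo "${RELEASE}"  # first-green-production
```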
During the deployment stage, the system will automatically create the necessary number of deployment jobs via the parallel:matrix function.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ItFPldHL5tqJTsiaiyE5kg.png" /></figure><p>Your deployment configuration might look something like this:</p><pre>.deploy_apps: &amp;deploy_apps<br> stage: deploy_apps<br> before_script:<br>   - *base_werf<br>   - REPOSITORY_BUNDLE=${REPOSITORY_NAME}_REPO_BUNDLE<br>   - REPOSITORY_TAG=${REPOSITORY_NAME}_TAG<br>   - REPOSITORY_INGRESS=${REPOSITORY_NAME}_INGRESS<br>   - APP_CURRENT_ACTIVE=$(werf kubectl -n ${WERF_NAMESPACE} get ingress ${!REPOSITORY_INGRESS} --output=custom-columns=&#39;SVCs:..service.name&#39; --no-headers --ignore-not-found | awk -F &#39;-&#39; {&#39;print $NF&#39;})<br>   - |<br>     if [[ ${APP_CURRENT_ACTIVE} = ${DEPLOY_VERSION} ]];<br>       then tput setaf 9 &amp;&amp; echo &quot;You are trying to deploy to the active version, the deployment process is halted!!!&quot; &amp;&amp; exit 1;<br>     fi;<br> script:<br>   - werf cr login -u nobody -p ${BUNDLE_PULLER_PASSWORD} ${!REPOSITORY_BUNDLE}<br>   - werf bundle apply<br>     --release $(echo ${!REPOSITORY_BUNDLE} | cut -d / -f4)-${DEPLOY_VERSION}-${CI_ENVIRONMENT_SLUG}<br>     --repo ${!REPOSITORY_BUNDLE}<br>     --tag ${!REPOSITORY_TAG}<br>     --set &quot;werf.deploy_version=${DEPLOY_VERSION}&quot;<br> when: manual<br><br>Deploy to Green:<br> extends: .deploy_apps<br> stage: deploy_apps<br> environment:<br>   name: production<br> parallel:<br>   matrix:<br>     - REPOSITORY_NAME: *apps_for_matrix<br> variables:<br>   DEPLOY_VERSION: &quot;green&quot;</pre><p>Congrats: you’ve created a deployment pipeline that allows you to deploy different applications from a single repository using pre-published bundles.</p><h3>Conclusion</h3><p>The blue-green approach helps you deploy application updates reliably and quickly. 
You can test your new version before sending users to it, thereby making the entire process smoother. As for bundles, they really shine in cases when you need to deploy several apps at once. This makes application management and updates more transparent and centralized, which is essential for large projects.</p><p>In this article, we’ve gone through blue-green deployments using GitLab CI and demonstrated how to deploy multiple apps from a single repository. Hope this makes it easier for you to work with GitLab deployments and write your own CI scripts!</p><hr><p><a href="https://blog.werf.io/blue-green-deployments-a-guide-to-deploying-one-or-more-applications-61e2de67ad19">Blue-Green Deployments: a Guide to Deploying One or More Applications</a> was originally published in <a href="https://blog.werf.io">werf blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Kwasm review: run WebAssembly apps in Kubernetes clusters]]></title>
            <link>https://blog.deckhouse.io/kwasm-review-5c9482090161?source=rss-71bfdb9446bd------2</link>
            <guid isPermaLink="false">https://medium.com/p/5c9482090161</guid>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[wasm]]></category>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[kind]]></category>
            <category><![CDATA[webassembly]]></category>
            <dc:creator><![CDATA[Flant staff]]></dc:creator>
            <pubDate>Tue, 12 Nov 2024 10:11:18 GMT</pubDate>
            <atom:updated>2024-11-12T10:11:18.633Z</atom:updated>
<content:encoded><![CDATA[<p>This article, written by our DevOps engineer Dmitry Silkin, is a continuation of our series reviewing tools for running WebAssembly applications in Kubernetes clusters. In our <a href="https://blog.deckhouse.io/running-webassembly-applications-in-a-kubernetes-cluster-managed-by-deckhouse-42d9fbdc7056">previous piece</a>, we deployed a Wasm application to a cluster managed by <a href="https://deckhouse.io/products/kubernetes-platform/?utm_source=web&amp;utm_medium=medium&amp;utm_campaign=kwasm_121124">Deckhouse Kubernetes Platform</a> using the platform’s built-in tools. This time, we will use an off-the-shelf operator called Kwasm.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*BVBMo3x41kgEhA49hICPqQ.png" /></figure><h3>What is Kwasm?</h3><p><a href="http://kwasm.sh/">Kwasm</a> is a Kubernetes operator that supports running WebAssembly applications on Kubernetes cluster nodes. <a href="https://github.com/KWasm/kwasm-node-installer">Kwasm-node-installer</a> — a component of the operator — installs the containerd binaries and makes the necessary configuration changes. It runs on nodes annotated with kwasm.sh/kwasm-node=true.</p><p>The installer then downloads the required containerd-shim binaries to the cluster nodes and makes changes to the containerd configuration (refer to <a href="https://wasmlabs.dev/articles/docker-without-containers/">this article</a> to learn more about WebAssembly). After that, you can run Wasm applications on those nodes.</p><p>Kwasm supports a plethora of cloud platforms as well as local installations, ranging from kind to AWS, GCP, and Azure clouds:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*hKqfP7btfgHpsNSwkpCwCg.png" /><figcaption><a href="https://kwasm.sh/">Source</a></figcaption></figure><h3>Installing Kwasm</h3><p>Let’s install the operator in a Kubernetes cluster. First, we have to create a cluster. Let’s use the kind tool. 
Our cluster will consist of three nodes:</p><pre>kind: Cluster<br>apiVersion: kind.x-k8s.io/v1alpha4<br>nodes:<br>- role: control-plane<br>- role: worker<br>- role: worker</pre><p>Create a cluster:</p><pre>kind create cluster --config=./kind.yaml</pre><p>Ascertain that the cluster has been created:</p><pre>kubectl get nodes<br>NAME                 STATUS   ROLES           AGE   VERSION<br>kind-control-plane   Ready    control-plane   59s   v1.24.0<br>kind-worker          Ready    &lt;none&gt;          40s   v1.24.0<br>kind-worker2         Ready    &lt;none&gt;          40s   v1.24.0</pre><p>Once the cluster is ready, proceed to install the Kwasm operator:</p><pre>helm repo add kwasm http://kwasm.sh/kwasm-operator/<br>helm install -n kwasm --create-namespace kwasm-operator kwasm/kwasm-operator<br>kubectl annotate node --all kwasm.sh/kwasm-node=true</pre><p>Make sure that the operator has been installed:</p><pre>kubectl get pods -n kwasm -o wide<br>NAME                                       READY   STATUS      RESTARTS   AGE   IP           NODE                 NOMINATED NODE   READINESS GATES<br>kind-control-plane-provision-kwasm-cfbvg   0/1     Completed   0          55s   10.244.0.5   kind-control-plane   &lt;none&gt;           &lt;none&gt;<br>kind-worker-provision-kwasm-n5c95          0/1     Completed   0          55s   10.244.1.3   kind-worker          &lt;none&gt;           &lt;none&gt;<br>kind-worker2-provision-kwasm-zqj5z         0/1     Completed   0          55s   10.244.2.2   kind-worker2         &lt;none&gt;           &lt;none&gt;<br>kwasm-operator-7f7d456678-hxsgx            1/1     Running     0          72s   10.244.1.2   kind-worker          &lt;none&gt;           &lt;none&gt;</pre><h3>Running a sample Wasm application</h3><p>Once the operator has made changes to the containerd configuration on the nodes, create a separate runtime class to run the WebAssembly containers.</p><pre>kubectl apply -f -&lt;&lt;EOF<br>---<br>apiVersion: 
node.k8s.io/v1<br>kind: RuntimeClass<br>metadata:<br>  name: wasmedge<br>handler: wasmedge<br>EOF</pre><p>Now you are all set to run the Wasm application. Let’s run the wasi-demo pod with the wasmedge/example-wasi:latest image as an example.</p><pre>kubectl apply -f -&lt;&lt;EOF<br>---<br>apiVersion: v1<br>kind: Pod<br>metadata:<br>  labels:<br>    run: wasi-demo<br>  name: wasi-demo<br>spec:<br>  containers:<br>  - args:<br>    - /wasi_example_main.wasm<br>    - &quot;50000000&quot;<br>    image: wasmedge/example-wasi:latest<br>    name: wasi-demo<br>  restartPolicy: Never<br>  runtimeClassName: wasmedge<br>EOF</pre><p>Once the pod has been Completed, take a look at its logs:</p><pre>kubectl logs wasi-demo<br>Random number: 63685983<br>Random bytes: [247, 43, 129, 227, 3, 56, 148, 40, 154, 241, 96, 85, 109, 140, 104, 71, 188, 245, 165, 107, 146, 202, 215, 21, 50, 33, 54, 193, 175, 35, 142, 108, 150, 30, 229, 50, 105, 139, 110, 170, 187, 234, 41, 249, 213, 65, 146, 27, 88, 115, 30, 147, 95, 155, 203, 183, 143, 0, 139, 108, 12, 141, 255, 191, 11, 254, 40, 189, 186, 19, 196, 136, 51, 114, 103, 119, 130, 105, 99, 177, 192, 158, 122, 120, 160, 9, 241, 73, 209, 235, 22, 158, 35, 6, 223, 217, 3, 215, 114, 4, 52, 11, 49, 191, 33, 253, 80, 254, 255, 176, 137, 38, 53, 190, 18, 194, 53, 143, 251, 1, 147, 254, 206, 130, 195, 77, 93, 151]<br>Printed from wasi: This is from a main function<br>This is from a main function<br>The env vars are as follows.<br>PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin<br>KUBERNETES_SERVICE_HOST: 10.96.0.1<br>HOSTNAME: wasi-demo<br>KUBERNETES_PORT: tcp://10.96.0.1:443<br>KUBERNETES_SERVICE_PORT: 443<br>KUBERNETES_SERVICE_PORT_HTTPS: 443<br>KUBERNETES_PORT_443_TCP_ADDR: 10.96.0.1<br>KUBERNETES_PORT_443_TCP_PROTO: tcp<br>KUBERNETES_PORT_443_TCP_PORT: 443<br>KUBERNETES_PORT_443_TCP: tcp://10.96.0.1:443<br>The args are as follows.<br>/wasi_example_main.wasm<br>50000000<br>File content is This is in a file</pre><p>Cool, the 
WebAssembly application works!</p><h3>Conclusion</h3><p>As we’ve seen, Kwasm significantly streamlines the deployment of Wasm applications into a Kubernetes cluster. However, it is worth noting that while the operator claims support for a large number of Kubernetes distributions, the developers themselves <a href="https://github.com/KWasm/kwasm-operator/blob/main/README.md#kwasm-operator">warn</a> that the tool should only be used for evaluation purposes.</p><p>As of now, <a href="https://www.spinkube.dev/">SpinKube</a>, which is powered by <a href="https://github.com/spinkube/runtime-class-manager">Kwasm technologies</a>, is well worth a closer look: it is being developed at a more rapid pace. We’ll definitely cover it in one of our upcoming articles on WebAssembly technologies.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=5c9482090161" width="1" height="1" alt=""><hr><p><a href="https://blog.deckhouse.io/kwasm-review-5c9482090161">Kwasm review: run WebAssembly apps in Kubernetes clusters</a> was originally published in <a href="https://blog.deckhouse.io">Deckhouse blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Running WebAssembly applications in a Kubernetes cluster managed by Deckhouse]]></title>
            <link>https://blog.deckhouse.io/running-webassembly-applications-in-a-kubernetes-cluster-managed-by-deckhouse-42d9fbdc7056?source=rss-71bfdb9446bd------2</link>
            <guid isPermaLink="false">https://medium.com/p/42d9fbdc7056</guid>
            <category><![CDATA[deckhouse]]></category>
            <category><![CDATA[kubernetes-cluster]]></category>
            <category><![CDATA[devops]]></category>
            <category><![CDATA[webassembly]]></category>
            <category><![CDATA[kubernetes]]></category>
            <dc:creator><![CDATA[Flant staff]]></dc:creator>
            <pubDate>Tue, 10 Sep 2024 09:13:43 GMT</pubDate>
            <atom:updated>2024-09-10T09:15:59.874Z</atom:updated>
            <content:encoded><![CDATA[<p>The high performance and security of <a href="https://wasmlabs.dev/articles/docker-without-containers/">WebAssembly (Wasm) technology</a> are increasingly making us all take a closer look at it. I decided to find out what it is about and how it works. The idea was to try Wasm in Kubernetes — that way, I could take advantage of all the orchestrator pros such as resource sharing, fault tolerance, scalability, and so on.</p><p>But running Wasm applications in plain vanilla Kubernetes is not as easy as it sounds, since setting up runtime environments on worker nodes is tricky. The built-in K8s tools are simply not designed to make the node customization process convenient for a casual user. Of course, you can configure a single node on your own. The problem with this approach is that if you need to try out different runtimes or run a large number of applications, you want cluster scaling to be as easy as possible. In this case, managing nodes declaratively also makes perfect sense. So I thought I’d use the <a href="https://deckhouse.io/products/kubernetes-platform/?utm_source=web&amp;utm_medium=medium&amp;utm_campaign=wasm_0924">Deckhouse Kubernetes Platform (DKP)</a> to try to run a Wasm application. This platform greatly streamlines the deployment and management of Kubernetes clusters.</p><p>My name is Yegor Lazarev, I’m a DevOps engineer at Flant. In this article, I will show you how to run Wasm applications in Kubernetes clusters managed by DKP. We will set up an environment, install the necessary components, and run a simple WebAssembly module.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*OAqbJvDoc7pupmG0Qr8PVQ.png" /></figure><h3>Configuring a NodeGroup</h3><p>I guess it makes sense to separate the regular workloads and Wasm workloads so that there is a dedicated worker for experiments. To do so, let’s create a NodeGroup that the platform will use to manage individual nodes. 
When configuring, you should add labels to the NodeGroup nodes. This way, you can use NodeSelector to assign the workloads to the appropriate nodes:</p><pre>kubectl create -f -&lt;&lt;EOF<br>apiVersion: deckhouse.io/v1<br>kind: NodeGroup<br>metadata:<br>  name: wasm<br>spec:<br>  cloudInstances:<br>    classReference:<br>      kind: YandexInstanceClass<br>      name: worker<br>    maxPerZone: 1<br>    minPerZone: 1<br>    zones:<br>    - ru-central1-a<br>  disruptions:<br>    approvalMode: Automatic<br>  kubelet:<br>    containerLogMaxFiles: 4<br>    containerLogMaxSize: 50Mi<br>    resourceReservation:<br>      mode: Auto<br>  nodeTemplate:<br>    labels:<br>      node.deckhouse.io/group: wasm<br>  nodeType: CloudEphemeral<br>EOF</pre><p>Once the NodeGroup is created, DKP will provision one virtual machine in the cloud of the YandexInstanceClass=worker class in the ru-central1-a zone and add the node.deckhouse.io/group=wasm label to it.</p><h3>Installing the WasmEdge runtime</h3><p>To run Wasm applications, Kubernetes requires a specialized runtime that implements the WebAssembly System Interface (WASI). In this article, we will use WasmEdge. On top of that, we’ll need to update the containerd configuration to reflect the new runtimes. The NodeGroupConfiguration resource allows you to run bash scripts on nodes, so let’s use it to install WasmEdge and do some additional configuration.</p><p>Check if the WASI bin file is available and download it if there isn’t one. Next, <a href="https://deckhouse.io/products/kubernetes-platform/documentation/v1/modules/040-node-manager/#custom-node-settings">use bashbooster</a> to merge the main containerd config with the config from /etc/containerd/conf.d/*.toml. 
Modifying /etc/containerd/config.toml will result in containerd being restarted as well:</p><pre>kubectl create -f -&lt;&lt;EOF<br>apiVersion: deckhouse.io/v1alpha1<br>kind: NodeGroupConfiguration<br>metadata:<br>  name: wasm-additional-shim.sh<br>spec:<br>  bundles:<br>    - &#39;*&#39;<br>  content: |<br>    [ -f &quot;/bin/containerd-shim-wasmedge-v1&quot; ] || curl -L https://github.com/containerd/runwasi/releases/download/containerd-shim-wasmedge%2Fv0.3.0/containerd-shim-wasmedge-$(uname -m | sed s/arm64/aarch64/g | sed s/amd64/x86_64/g).tar.gz | tar -xzf - -C /bin<br><br>    mkdir -p /etc/containerd/conf.d<br>    bb-sync-file /etc/containerd/conf.d/additional_shim.toml - containerd-config-changed &lt;&lt; &quot;EOF&quot;<br>    [plugins]<br>      [plugins.&quot;io.containerd.grpc.v1.cri&quot;]<br>        [plugins.&quot;io.containerd.grpc.v1.cri&quot;.containerd]<br>          [plugins.&quot;io.containerd.grpc.v1.cri&quot;.containerd.runtimes]<br>            [plugins.&quot;io.containerd.grpc.v1.cri&quot;.containerd.runtimes.wasmedge]<br>              runtime_type = &quot;io.containerd.wasmedge.v1&quot;<br>              [plugins.&quot;io.containerd.grpc.v1.cri&quot;.containerd.runtimes.wasmedge.options]<br>                BinaryName = &quot;/bin/containerd-shim-wasmedge-v1&quot;<br>    EOF<br>  nodeGroups:<br>    - &quot;wasm&quot;<br>  weight: 30<br>EOF</pre><h3>Defining new RuntimeClasses</h3><p>Now that WasmEdge is installed, you have to define a new RuntimeClass. 
This will allow you to specify how to run a particular workload: use the default runtime or another one by explicitly specifying runtimeClassName in the pod’s spec:</p><pre>kubectl apply -f -&lt;&lt;EOF<br>---<br>apiVersion: node.k8s.io/v1<br>kind: RuntimeClass<br>metadata:<br>  name: wasmedge<br>handler: wasmedge<br>EOF</pre><h3>Running a test Wasm application</h3><p>First, make sure that the platform has finished configuring the node and updated the containerd configuration:</p><pre>root@test-wasm-75934c42-5956c-l5m7f:~# grep wasm /etc/containerd/config.toml<br>        [plugins.&quot;io.containerd.grpc.v1.cri&quot;.containerd.runtimes.wasmedge]<br>          runtime_type = &quot;io.containerd.wasmedge.v1&quot;<br>          [plugins.&quot;io.containerd.grpc.v1.cri&quot;.containerd.runtimes.wasmedge.options]<br>            BinaryName = &quot;/bin/containerd-shim-wasmedge-v1&quot;</pre><p>Now you can run the test Wasm application. To do so, create a Job with a basic WebAssembly module. In the Job, specify the NodeSelector and the newly created wasmedge RuntimeClass:</p><pre>kubectl apply -f -&lt;&lt;EOF<br>apiVersion: batch/v1<br>kind: Job<br>metadata:<br>  name: wasm-test<br>spec:<br>  template:<br>    spec:<br>      containers:<br>      - image: wasmedge/example-wasi:latest<br>        name: wasm-test<br>        resources: {}<br>      restartPolicy: Never<br>      runtimeClassName: wasmedge<br>      nodeSelector:<br>        node.deckhouse.io/group: wasm<br>  backoffLimit: 1<br>EOF</pre><p>Check the pod’s status and logs to make sure everything is running smoothly:</p><pre>root@test-master-0:~# kubectl get pods<br>NAME              READY   STATUS      RESTARTS   AGE<br>wasm-test-2g5jl   0/1     Completed   0          18s<br><br>root@test-master-0:~# kubectl logs wasm-test-2g5jl<br>Random number: -700610054<br>Random bytes: [163, 184, 229, 154, 4, 145, 145, 96, 181, 77, 64, 159, 123, 45, 5, 134, 93, 193, 207, 74, 129, 113, 204, 174, 188, 152, 172, 151, 125, 78, 199, 
177, 127, 112, 116, 255, 188, 180, 47, 110, 22, 241, 63, 87, 78, 168, 36, 202, 168, 90, 248, 79, 38, 59, 204, 128, 141, 92, 209, 205, 129, 51, 71, 214, 91, 237, 115, 145, 77, 136, 166, 115, 221, 66, 123, 186, 19, 39, 122, 204, 103, 221, 89, 97, 148, 57, 250, 255, 165, 53, 14, 241, 97, 138, 147, 201, 204, 29, 76, 219, 128, 48, 143, 165, 138, 231, 62, 235, 190, 94, 142, 63, 197, 37, 57, 241, 33, 99, 240, 215, 216, 33, 68, 141, 82, 21, 152, 93]<br>Printed from wasi: This is from a main function<br>This is from a main function<br>The env vars are as follows.<br>KUBERNETES_SERVICE_PORT_HTTPS: 443<br>KUBERNETES_PORT_443_TCP: tcp://10.222.0.1:443<br>KUBERNETES_PORT_443_TCP_ADDR: 10.222.0.1<br>KUBERNETES_PORT_443_TCP_PROTO: tcp<br>KUBERNETES_SERVICE_PORT: 443<br>HOSTNAME: wasm-test-2g5jl<br>PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin<br>KUBERNETES_SERVICE_HOST: 10.222.0.1<br>KUBERNETES_PORT: tcp://10.222.0.1:443<br>KUBERNETES_PORT_443_TCP_PORT: 443<br>The args are as follows.<br>/wasi_example_main.wasm<br>File content is This is in a file</pre><p>If this is the case, the pod will have a status of Completed, which means that the Job has been executed and the pod has finished its operation without errors.</p><p>In the logs, you should see a random number and lots of random bytes generated by the application just like it was intended. This also means that the application had access to the environment and file system.</p><h3>Running a test Wasm application with an init container</h3><p>Now let’s make things a little more challenging. Quite often there is a need to run init or sidecar containers in pods from regular container images. To do so, you will have to define a different runtime for each container. However, the runtimeClassName is defined at the pod level, not the container level.</p><p>Containerd supports container runtime switching, so you will need a tool that can determine which runtime to use for a particular container. 
The regular runc used by default in the cluster doesn’t support this. Fortunately, the beta version of crun does.</p><p>First, you will have to build crun yourself, as it does not support WasmEdge if you install it from the official repositories using a package manager. NodeGroupConfiguration can help you with this:</p><pre>kubectl apply -f -&lt;&lt;EOF<br>apiVersion: deckhouse.io/v1alpha1<br>kind: NodeGroupConfiguration<br>metadata:<br>  name: crun-install.sh<br>spec:<br>  bundles:<br>  - &#39;*&#39;<br>  content: |<br>    if ! [ -x /usr/local/bin/crun ]; then<br>      apt-get update &amp;&amp; apt-get install -y make git gcc build-essential pkgconf libtool libsystemd-dev libprotobuf-c-dev libcap-dev libseccomp-dev libyajl-dev go-md2man autoconf python3 automake<br>      cd /root<br>      [ -f &quot;/root/.wasmedge/bin/wasmedge&quot; ] || curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash<br>      git clone https://github.com/containers/crun &amp;&amp; cd crun<br>      ./autogen.sh<br>      source /root/.wasmedge/env &amp;&amp; ./configure --with-wasmedge<br>      make<br>      make install<br>      cd .. 
&amp;&amp; rm -rf crun<br>    fi<br>    echo &quot;crun has been installed&quot;<br>    mkdir -p /etc/containerd/conf.d<br>    bb-sync-file /etc/containerd/conf.d/add_crun.toml - containerd-config-changed &lt;&lt; &quot;EOF&quot;<br>    [plugins]<br>      [plugins.&quot;io.containerd.grpc.v1.cri&quot;]<br>        [plugins.&quot;io.containerd.grpc.v1.cri&quot;.containerd]<br>          [plugins.&quot;io.containerd.grpc.v1.cri&quot;.containerd.runtimes]<br>            [plugins.&quot;io.containerd.grpc.v1.cri&quot;.containerd.runtimes.crun]<br>              runtime_type = &quot;io.containerd.runc.v2&quot;<br>              pod_annotations = [&quot;*.wasm.*&quot;, &quot;wasm.*&quot;, &quot;module.wasm.image/*&quot;, &quot;*.module.wasm.image&quot;, &quot;module.wasm.image/variant.*&quot;]<br>              [plugins.&quot;io.containerd.grpc.v1.cri&quot;.containerd.runtimes.crun.options]<br>                BinaryName = &quot;/usr/local/bin/crun&quot;<br>    EOF<br>  nodeGroups:<br>  - wasm<br>  weight: 30<br>EOF</pre><p>The code snippet above installs the WasmEdge library (as opposed to the containerd WasmEdge shim we installed earlier in this article), as well as the required dependencies, and builds crun. On top of that, you have to add a new container runtime to the /etc/containerd/config.toml configuration, just like we did earlier for the Wasm one.</p><p>Note the pod_annotations: this is a list of annotations to be passed to both the runtime environment and the container’s OCI annotations. 
I’ll explain why this is necessary in a minute.</p><p>Next, create a new RuntimeClass:</p><pre>kubectl apply -f -&lt;&lt;EOF<br>---<br>apiVersion: node.k8s.io/v1<br>kind: RuntimeClass<br>metadata:<br>  name: crun<br>handler: crun<br>EOF</pre><p>Now, try to run your workload:</p><pre>kubectl apply -f -&lt;&lt;EOF<br>apiVersion: batch/v1<br>kind: Job<br>metadata:<br>  name: wasm-test<br>spec:<br>  template:<br>    metadata:<br>      annotations:<br>        module.wasm.image/variant: compat-smart<br>    spec:<br>      initContainers:<br>      - name: hello<br>        image: busybox:latest<br>        command: [&#39;sh&#39;, &#39;-c&#39;, &#39;echo &quot;Hello, Medium!&quot;&#39;]<br>      containers:<br>      - image: wasmedge/example-wasi:latest<br>        name: wasm-test<br>        resources: {}<br>      restartPolicy: Never<br>      runtimeClassName: crun<br>      nodeSelector:<br>        node.deckhouse.io/group: wasm<br>  backoffLimit: 1<br>EOF</pre><p>The runtimeClassName: crun parameter indicates that crun, rather than the default runc, is now used for starting containers. In turn, the module.wasm.image/variant: compat-smart annotation tells crun which mode to operate in.</p><p>For this to work, you’ll have to add the following OCI annotation to the Wasm image when building:</p><pre>...<br>&quot;annotations&quot;: {<br> &quot;run.oci.handler&quot;: &quot;wasm&quot;<br>},<br>...</pre><p>Crun uses pod_annotations in the containerd configuration and the compat-smart annotation on the K8s object to figure out which workload to run itself and which one to delegate to the Wasm runtime.</p><p>Examine the pod’s state and its logs. 
You should see the same thing in the logs as before:</p><pre>root@test-master-0:~# kubectl get pods<br>NAME              READY   STATUS      RESTARTS   AGE<br>wasm-test-pn4gv   0/1     Completed   0          32s<br><br>root@test-master-0:~# kubectl logs wasm-test-pn4gv<br>Defaulted container &quot;wasm-test&quot; out of: wasm-test, hello (init)<br>Random number: -158793507<br>Random bytes: [210, 246, 181, 132, 184, 214, 110, 71, 198, 68, 154, 182, 253, 103, 116, 207, 5, 205, 185, 81, 19, 28, 61, 61, 85, 26, 222, 111, 239, 110, 21, 68, 119, 245, 153, 190, 105, 175, 191, 163, 48, 198, 41, 207, 155, 30, 122, 166, 23, 56, 59, 168, 91, 57, 103, 213, 145, 10, 130, 224, 28, 5, 73, 176, 206, 111, 37, 241, 38, 57, 98, 158, 150, 115, 249, 233, 194, 156, 13, 109, 85, 130, 232, 91, 253, 16, 8, 233, 92, 162, 237, 197, 151, 112, 52, 140, 83, 179, 31, 48, 233, 56, 54, 75, 43, 239, 233, 169, 169, 81, 36, 52, 59, 66, 102, 40, 52, 202, 34, 56, 167, 229, 197, 25, 72, 136, 147, 254]<br>Printed from wasi: This is from a main function<br>This is from a main function<br>The env vars are as follows.<br>PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin<br>HOSTNAME: wasm-test-pn4gv<br>KUBERNETES_PORT: tcp://10.222.0.1:443<br>KUBERNETES_PORT_443_TCP: tcp://10.222.0.1:443<br>KUBERNETES_PORT_443_TCP_PROTO: tcp<br>KUBERNETES_PORT_443_TCP_PORT: 443<br>KUBERNETES_PORT_443_TCP_ADDR: 10.222.0.1<br>KUBERNETES_SERVICE_HOST: 10.222.0.1<br>KUBERNETES_SERVICE_PORT: 443<br>KUBERNETES_SERVICE_PORT_HTTPS: 443<br>HOME: /<br>The args are as follows.<br>/wasi_example_main.wasm<br>File content is This is in a file</pre><p>Check the init container’s logs:</p><pre>root@test-master-0:~# kubectl logs wasm-test-pn4gv -c hello<br>Hello, Medium!</pre><h3>Conclusion</h3><p>Running WebAssembly applications in Kubernetes may not sound like an easy task, but with <a href="https://deckhouse.io/products/kubernetes-platform/?utm_source=web&amp;utm_medium=medium&amp;utm_campaign=wasm_0924">Deckhouse 
Kubernetes Platform</a> it becomes a fairly straightforward process. This article delved into setting up the environment, installing the necessary components, and running a test Wasm application. I hope you will find all this information useful.</p><p>The DKP provides many features for managing a Kubernetes cluster. We will share new practices and tips in upcoming articles. Stay tuned!</p><p>Feel free to ask any questions you have and contribute suggestions in the comments below. You can also submit your question to the Deckhouse <a href="https://t.me/deckhouse">Telegram chat</a> or create an Issue in the <a href="https://github.com/deckhouse/deckhouse">Deckhouse repository on GitHub</a>. We will be happy to help you. Please <a href="https://github.com/deckhouse/deckhouse">star the project</a> if you like it.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=42d9fbdc7056" width="1" height="1" alt=""><hr><p><a href="https://blog.deckhouse.io/running-webassembly-applications-in-a-kubernetes-cluster-managed-by-deckhouse-42d9fbdc7056">Running WebAssembly applications in a Kubernetes cluster managed by Deckhouse</a> was originally published in <a href="https://blog.deckhouse.io">Deckhouse blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Server-Side Apply instead of 3-Way Merge: How werf 2.0 solves Helm 3 challenges]]></title>
            <link>https://blog.werf.io/ssa-vs-3wm-in-helm-werf-nelm-4d7996354ebe?source=rss-71bfdb9446bd------2</link>
            <guid isPermaLink="false">https://medium.com/p/4d7996354ebe</guid>
            <category><![CDATA[werf]]></category>
            <category><![CDATA[helm]]></category>
            <category><![CDATA[kubernetes]]></category>
            <dc:creator><![CDATA[Flant staff]]></dc:creator>
            <pubDate>Thu, 25 Jul 2024 05:59:37 GMT</pubDate>
            <atom:updated>2024-07-25T06:03:44.066Z</atom:updated>
<content:encoded><![CDATA[<p>In werf 1.2, a mechanism called <em>3-Way Merge</em> (3WM) was used for updating Kubernetes resources. It was inherited from Helm 3, whose fork werf used under the hood. While 3-Way Merge solved some of the <em>2-Way Merge</em> issues, many of the challenges that caused incorrect resource updates remained unaddressed.</p><p>In <a href="https://blog.werf.io/werf-2-nelm-replacing-helm-a11980c2bdda">werf 2.0</a> and Nelm, we took it a step further and replaced 3-Way Merge with <a href="https://kubernetes.io/docs/reference/using-api/server-side-apply/"><em>Server-Side Apply</em></a> (SSA), a more modern and robust mechanism for updating Kubernetes resources. It solves all 3-Way Merge issues and ensures that resources in the cluster are updated correctly during deployments.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/576/1*5F5DNu87Q6JxM0i3H_Kz-A.png" /><figcaption>Manifest sample performing Server-Side Apply in Kubernetes</figcaption></figure><p>This article discusses the 3WM-related challenges Helm 3 users have faced and then shows how SSA helps overcome them.</p><p><strong><em>Note</em></strong><em>. Refer to </em><a href="https://blog.werf.io/3-way-merge-patches-helm-werf-beb7eccecdfe"><em>one of our earlier articles</em></a><em> to learn more about 3-Way Merge and 2-Way Merge.</em></p><h3>Invalid resource updates in Helm 3</h3><p>If you rely on 3WM (e.g., Helm 3 and/or werf 1.2) to update resources in a cluster, the deployed resources often do not match their description in the Helm chart. 
Let’s try to simulate such a scenario.</p><p>Suppose there is a Helm chart with the Deployment (chart/templates/deployment.yaml):</p><pre>apiVersion: apps/v1<br>kind: Deployment<br>metadata:<br>  name: myapp<br>spec:<br>  selector:<br>    matchLabels:<br>      app: myapp<br>  template:<br>    metadata:<br>      labels:<br>        app: myapp<br>    spec:<br>      containers:<br>      - name: main<br>        image: nginx</pre><p>… and the Job hook (chart/templates/job.yaml):</p><pre>apiVersion: batch/v1<br>kind: Job<br>metadata:<br>  name: myjob<br>  annotations:<br>    helm.sh/hook: &quot;post-install,post-upgrade&quot;<br>spec:<br>  backoffLimit: 0<br>  template:<br>    spec:<br>      restartPolicy: Never<br>      containers:<br>      - name: main<br>        image: alpine<br>        command: [&quot;echo&quot;, &quot;succeeded&quot;]</pre><p>Let’s release it using the latest Helm 3 version:</p><pre>$ helm upgrade --install myapp chart<br>Release &quot;myapp&quot; has been upgraded. Happy Helming!</pre><p>Now, in the chart’s Deployment, let’s replace the main container with two containers called <em>backend</em> and <em>frontend</em>:</p><pre>...<br>      containers:<br>      - name: backend<br>        image: nginx<br>      - name: frontend<br>        image: nginx</pre><p>… while also “accidentally” breaking the Job hook:</p><pre>...<br>      containers:<br>      - name: main<br>        image: alpine<br>        command: [&quot;fail&quot;]</pre><p>The second release will fail (just as you’d expect):</p><pre>$ helm upgrade --install myapp chart<br>Error: UPGRADE FAILED: post-upgrade hooks failed: 1 error occurred:<br>       * job myjob failed: BackoffLimitExceeded</pre><p>Given that we have an error in Job in the chart, let’s fix it:</p><pre>...<br>      containers:<br>      - name: main<br>        image: alpine<br>        command: [&quot;echo&quot;, &quot;succeeded&quot;]</pre><p>…and at the same time rename the new containers in the Deployment from <em>backend</em> 
and <em>frontend</em> to <em>app</em> and <em>proxy</em> (seeing as their original names aren’t really fitting):</p><pre>...<br>      containers:<br>      - name: app<br>        image: nginx<br>      - name: proxy<br>        image: nginx</pre><p>Now, let’s run a third release (a successful one this time):</p><pre>$ helm upgrade --install myapp chart<br>Release &quot;myapp&quot; has been upgraded. Happy Helming!</pre><p>And check if the Deployment in the chart and in the Helm release match the Deployment in the cluster:</p><pre>$ cat chart/templates/deployment.yaml<br>...<br>      containers:  # correct      <br>      - name: app<br>      - name: proxy<br><br>$ helm get manifest myapp<br>...<br>      containers:  # correct<br>      - name: app<br>      - name: proxy<br><br>$ kubectl get deploy myapp -oyaml<br>...<br>      containers:  # INCORRECT<br>      - name: app<br>      - name: proxy<br>      - name: backend<br>      - name: frontend</pre><p>Well, it looks like the Deployment in the chart/release has two containers, but the Deployment in the cluster has four of them for whatever reason: two valid <em>app</em> and <em>proxy</em> containers and two legacy <em>frontend</em> and <em>backend</em> ones.</p><p>What is more, repeating the release won’t rid you of the unnecessary <em>frontend</em> and <em>backend</em> containers:</p><pre>$ helm upgrade --install myapp chart<br>$ kubectl get deploy myapp -oyaml<br>...<br>      containers:<br>      - name: app<br>      - name: proxy<br>      - name: backend<br>      - name: frontend</pre><p>Rolling back to the very first revision won’t help either:</p><pre>$ helm rollback myapp 1<br>$ kubectl get deploy myapp -oyaml<br>...<br>      containers:<br>      - name: main<br>      - name: backend<br>      - name: frontend</pre><p>At this point, the easiest way to get rid of the unwanted containers is to manually delete them in the cluster via kubectl edit.</p><p>Notably, this case is not unique — pretty much the same thing can 
happen with most of the resources. Meanwhile, the trigger may be not only a failed release but also a canceled one (i.e., when Helm gets an INT, TERM, or KILL signal).</p><h3>The root of this phenomenon and what to do about it</h3><p>The thing is that some resource fields are missing from the chart yet present in the cluster. It is hard to tell whether Helm should remove those fields or not.</p><p>But why not then just delete everything that isn’t in the chart’s resource manifest? The answer is that Kubernetes or Kubernetes operators can make changes to a resource that Helm must never delete. For example, Istio may add an Istio Proxy sidecar container to the Deployment. In this case, Helm should not delete this sidecar container, even though it is not in the chart.</p><p>To figure out what to do, Helm must divide up the “extra” fields — those that only exist in the resource in the cluster — into fields it controls and those it does not control. It can delete fields it controls, but <strong>it cannot mess with the fields it does not control</strong>.</p><p>When using helm upgrade, the resource fields from the new release and the previous <em>successful</em> release are considered the fields that Helm controls. The issue usually emerges when a previous release was <em>unsuccessful</em> or <em>canceled</em> but still brought in some important changes, such as new controllable fields.</p><p>In the end, the greater the number of releases that fail or are canceled, the more orphaned fields are left in the cluster resources. In some cases, they are perfectly harmless. In others, <strong>they can lead to denial of service or even data corruption/loss</strong>.</p><p>The worst part is that there is no simple solution to this issue within Helm. 
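For comparison, kubectl’s client-side apply tackles the same “which fields do I own” problem by recording the last applied manifest in an annotation on the resource itself. A trimmed, illustrative example (in reality, the annotation holds the full JSON manifest):</p><pre>metadata:<br>  annotations:<br>    kubectl.kubernetes.io/last-applied-configuration: |<br>      {&quot;apiVersion&quot;:&quot;apps/v1&quot;,&quot;kind&quot;:&quot;Deployment&quot;,&quot;metadata&quot;:{&quot;name&quot;:&quot;myapp&quot;},&quot;spec&quot;:{...}}</pre><p>Helm, however, keeps no such per-resource record — it only has the release manifests. 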
One way would be to devise a new approach for Helm releases, where for each individual resource, its last applied state would be recorded.</p><p>However, there is a better way: replacing 3-Way Merge with Server-Side Apply.</p><h3>What is Server-Side Apply?</h3><p>In Kubernetes 1.22, a new way to update resources in a cluster, called Server-Side Apply, became generally available. Let’s now compare resource updates via 3WM versus SSA.</p><p><strong>Upgrading a resource using 3WM</strong> requires you to do the following:</p><ol><li>Retrieve the resource manifest from the latest successful release.</li><li>Retrieve the resource manifest from the chart.</li><li>Retrieve the resource manifest from the cluster.</li><li>Compose a 3WM patch based on those three manifests.</li><li>Send an HTTP PATCH request to Kubernetes containing a 3WM patch.</li></ol><p><strong>Upgrading a resource using SSA</strong> requires you to complete the following two steps:</p><ol><li>Retrieve the resource manifest from the chart.</li><li>Send an HTTP PATCH request to Kubernetes containing the resource manifest.</li></ol><p>SSA’s advantages include:</p><ul><li>Ease of use.</li><li>No need to keep track of the last applied resource manifest — Kubernetes keeps track of it itself.</li><li>No need to know which resource fields are controlled and which are not. Kubernetes stores this information in the resource’s managedFields field.</li><li>Updating the resource and storing information about the controlled fields in a single atomic operation.</li></ul><p>If it were possible to replace 3WM with SSA in Helm, there would be no need to look at manifests from previous releases—except for the cases where you need to figure out which resources need to be deleted entirely if they have already been removed from the chart. 
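To make the server-side bookkeeping concrete, here is a trimmed, illustrative managedFields entry, roughly as Kubernetes might store it for the Deployment from the example above (the manager name and exact contents vary by client and cluster):</p><pre>metadata:<br>  managedFields:<br>  - manager: helm<br>    operation: Apply<br>    apiVersion: apps/v1<br>    fieldsV1:<br>      f:spec:<br>        f:template:<br>          f:spec:<br>            f:containers:<br>              k:{&quot;name&quot;:&quot;app&quot;}: {}<br>              k:{&quot;name&quot;:&quot;proxy&quot;}: {}</pre><p>Any field not listed under a given manager is left untouched when that manager applies again. 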
Implementing SSA would completely eliminate the issue of orphaned fields in the cluster’s resources.</p><h3>Server-Side Apply in Helm, werf, and other tools</h3><p>Flux, Argo CD, and kubectl/kustomize feature SSA support, although so far only Flux has it enabled by default. Unfortunately, <strong>SSA was never implemented in Helm 3</strong>, even though SSA support had been available since Kubernetes 1.14 as an alpha feature (you could enable it via a feature gate), became beta in 1.16, and went GA (enabled by default) in Kubernetes 1.22.</p><p>In werf 2.0, we developed and implemented a new deployment engine called <a href="https://github.com/werf/nelm">Nelm</a> that succeeded Helm 3. Not only did we add a lot of new stuff to Nelm, but we completely replaced 3WM with SSA as well.</p><p>Introducing SSA helped us resolve a number of other issues, such as <a href="https://github.com/helm/helm/issues/6969">this one</a> (which still plagues Helm even though it originally surfaced in version 2 many years ago). SSA has also allowed us to implement a few features, such as automatically discarding resource changes made manually with kubectl edit.</p><p>SSA first appeared in werf 1.2, where we have been running it in experimental mode (including in production) for over a year now. All werf 2.0 users use SSA by default. The best part is that all the issues previously associated with 3WM have now been eliminated. 
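If you want to get a feel for SSA outside of Helm or werf, any recent kubectl can do it (a sketch; substitute your own manifest and resource names):</p><pre>$ kubectl apply --server-side --field-manager=my-deployer -f deployment.yaml<br>$ kubectl get deploy myapp -oyaml --show-managed-fields</pre><p>The first command asks the API server to apply the manifest and record my-deployer as the owner of the applied fields; the second shows the resulting ownership records. 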
At this point, we recommend werf 2.0 and SSA for production use.</p><p>As for werf 1.2 users: the <a href="https://werf.io/docs/v2/resources/migration_from_v1_2_to_v2_0.html">migration</a> to werf 2.0 is very easy and, apart from having to validate the Helm charts more rigorously, little needs to be changed.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=4d7996354ebe" width="1" height="1" alt=""><hr><p><a href="https://blog.werf.io/ssa-vs-3wm-in-helm-werf-nelm-4d7996354ebe">Server-Side Apply instead of 3-Way Merge: How werf 2.0 solves Helm 3 challenges</a> was originally published in <a href="https://blog.werf.io">werf blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[werf 2.0 is out with a new deployment engine Nelm replacing Helm]]></title>
            <link>https://blog.werf.io/werf-2-nelm-replacing-helm-a11980c2bdda?source=rss-71bfdb9446bd------2</link>
            <guid isPermaLink="false">https://medium.com/p/a11980c2bdda</guid>
            <category><![CDATA[werf]]></category>
            <category><![CDATA[cicd]]></category>
            <category><![CDATA[helm]]></category>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[cncf]]></category>
            <dc:creator><![CDATA[Flant staff]]></dc:creator>
            <pubDate>Thu, 16 May 2024 08:00:55 GMT</pubDate>
            <atom:updated>2024-05-16T08:04:14.410Z</atom:updated>
<content:encoded><![CDATA[<p>For four years, we have been developing and improving werf 1.2. Now, we are proud to unveil werf 2.0 stable! It accumulates all changes delivered to werf throughout the last 300+ releases and comes with Nelm — our new deployment engine, replacing Helm. Nelm is backward compatible with Helm, so there’s no need to make any special changes to the charts — you can use them just like before.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*S1szfWTLdr_g9jfXyRpp4A.png" /></figure><h3>A brief reminder of what werf is</h3><p><em>Feel free to skip this part of the article if you are an existing werf user or just aware of this project!</em></p><p><a href="https://werf.io/">werf</a> is an Open Source tool for powering your Kubernetes-based CI/CD pipelines. It works alongside the CI system of your choice and handles the entire CI/CD lifecycle: builds container images, deploys them to Kubernetes clusters, and eventually deletes them.</p><p>To deploy your apps with werf, you just need a Git repository with a Helm chart, a simple werf.yaml file and a Dockerfile. With such a repo in place, run the werf converge command to build the images, publish them to the container registry, and deploy them to your Kubernetes cluster.</p><p>To make all this possible, werf relies on well-known technologies such as Docker, Buildah and Helm (or, from now on, Nelm) under the hood. But it’s more than a mere wrapper. For example, werf brings several unique features, such as distributed caching out-of-the-box, automatic tagging based on the image content, smart container registry cleanup based on special Git policies, and a number of other niceties.</p><p>werf has been a <a href="https://www.cncf.io/projects/werf/">CNCF Sandbox project</a> since December 2022. In the past month, we’ve seen 10,000 active projects using werf (usually, one such project equals one Git repository where werf is applied). 
Today, werf boasts almost 4,000 stars on <a href="https://github.com/werf/werf/">GitHub</a> and 8 years of very active and robust development.</p><h3>What Nelm is and what the future holds for it</h3><p><a href="https://github.com/werf/nelm">Nelm</a> is the biggest change in werf 2.0, so let’s take a closer look at it. We can briefly describe Nelm as our (partial) reimplementation of Helm 4 — the release we have all been waiting for but have never seen.</p><p>Helm itself consists of two key components: a chart subsystem and a resource deployment subsystem. We essentially rewrote the deployment subsystem from scratch (while maintaining backward compatibility). We have also improved and continue to refine the chart subsystem.</p><p><strong><em>Note!</em></strong><em> Currently, you can try </em><a href="https://github.com/werf/nelm"><em>Nelm</em></a><em> only as part of werf. However, in the future, it will become a standalone tool with a convenient API and the option to integrate it into other CI/CD solutions.</em></p><p>Here is what Nelm brings to werf:</p><ul><li>The 3-Way Merge has been replaced by Server-Side Apply — a much more robust mechanism for updating resources in a cluster.</li><li>The werf plan command shows the changes that will be made to the cluster during the next deployment.</li><li>Resource operations (including tracking) during deployment have been efficiently parallelized.</li><li>CRD deployment has been improved.</li><li>Resource tracking has been significantly improved and revamped.</li><li>Numerous Helm bugs and deployment-related issues (e.g., <a href="https://github.com/helm/helm/issues/6969">#6969</a>) have been fixed.</li></ul><p><em>(</em><a href="https://github.com/werf/werf/discussions/5657"><em>Here is</em></a><em> our GitHub discussion thread where most of these Nelm features were announced as we implemented them.)</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*U3jW1Lpim740NrRcMIpoBQ.png" 
/><figcaption>Nelm in werf displaying deployment progress, logs and events</figcaption></figure><p>We’re working on a few more features, like the ability to set direct resource dependencies instead of using hooks, weights, and init containers, as well as the ability for regular resources to use all the advanced hook features. We will announce them later as they become generally available.</p><p>You can learn more about Nelm in the next article we will publish soon. <em>(By the way, </em><a href="https://t.me/werf_io"><em>our Telegram group</em></a><em> is a great way to stay tuned for updates!)</em></p><h3>How to try werf v2.0</h3><p>Nelm behaves slightly differently in some cases (compared to Helm), such as stricter validation of charts. That’s why we decided to make it the default engine only in v2.0, not in werf v1.2.</p><p><strong>werf v2.0 is almost fully compatible with v1.2</strong> — <a href="https://werf.io/documentation/v2/resources/migration_from_v1_2_to_v2_0.html">here is the list</a> of backward-incompatible changes (it’s really tiny!). We recommend upgrading to version 2.0; the upgrade is much easier than the migration from werf v1.1 to v1.2 was. What about werf v1.2? It goes into <em>maintenance</em> mode — no new features are planned for it.</p><p>Another big change is about version numbering — starting with werf 2.0, <strong>we will stick to semantic versioning</strong> and plan to release a major version about once a year. This will allow us to streamline and speed up development without compromising backward compatibility: breaking changes will be reserved for major releases, while minor and patch updates will remain fully compatible.</p><p>Use the following command to try werf v2.0:</p><pre>source $(trdl use werf 2 stable)</pre><p>As a reminder, werf comes with several <a href="https://werf.io/about/release_channels.html">release channels</a>:</p><ul><li><strong>Alpha</strong>. 
Quick to deliver new features, but may be unstable.</li><li><strong>Beta</strong>. Best suited for more extensive testing of new features in order to find problems.</li><li><strong>Early-Access</strong>. Safe enough for non-critical environments and for local development; allows you to get new features earlier.</li><li><strong>Stable</strong>. Generally safe and recommended for widespread use in any environment as a default option.</li><li><strong>Rock-Solid</strong>. The most stable channel; recommended for critical environments with strict SLA demands.</li></ul><h3>The long road from werf v1.2 to v2.0</h3><p>Now, back to those very “300+ releases” mentioned earlier — werf has indeed accumulated quite a lot of new features and changes over all those years. Here are some of the most significant features that emerged in the process of making werf 1.2 (not related to Nelm):</p><ol><li>Building Dockerfiles in werf using Buildah under Linux, Windows, and macOS.</li><li>Layered caching in the registry for Dockerfiles.</li><li>The development mode (--dev), which lets you stop worrying about determinism and intermediate commits while debugging and developing.</li><li>Out-of-the-box support for building images for arbitrary platforms and for multiple platforms at once.</li><li>A new dependencies directive for images has been added to werf.yaml (as of version 1.2.60).</li><li>The werf bundle render command to render bundle manifests for further deployment by third-party tools or for debugging.</li><li>The werf kube-run command, which is similar to werf run, but instead of a local container, it runs a pod in a K8s cluster.</li><li>Status tracking and event collection for all resource types, not just Deployments/StatefulSets/DaemonSets/Jobs.</li><li>The option to wait for an external (out-of-release) Kubernetes resource to be ready before deploying a release resource.</li><li>Migration to the new <a href="https://trdl.dev">trdl</a> update manager. 
By the way, it is another Open Source project cultivated by the werf team.</li></ol><h3>Official werf resources</h3><ul><li><a href="https://werf.io">werf website</a></li><li><a href="https://t.me/werf_io">werf Telegram chat</a></li><li><a href="https://github.com/werf/werf">werf repo on GitHub</a></li><li><a href="https://github.com/werf/nelm">Nelm repo on GitHub</a></li></ul><p>Let us know how your migration to werf v2.0 is going and stay tuned for more news regarding werf v2.0.x &amp; Nelm updates!</p><hr><p><a href="https://blog.werf.io/werf-2-nelm-replacing-helm-a11980c2bdda">werf 2.0 is out with a new deployment engine Nelm replacing Helm</a> was originally published in <a href="https://blog.werf.io">werf blog</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>