<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[GrepMyMind - Medium]]></title>
        <description><![CDATA[Feel free to grep &amp; grok your way through my thoughts on Kubernetes, programming, tech &amp; other random bits of knowledge. My randomness is my own &amp; not those of any company I might be working for. I may be right, I may be wrong, but as Deep Thought said, “42.” - Medium]]></description>
        <link>https://grepmymind.com?source=rss----8def803f7d93---4</link>
        <image>
            <url>https://cdn-images-1.medium.com/proxy/1*TGH72Nnw24QL3iV9IOm4VA.png</url>
            <title>GrepMyMind - Medium</title>
            <link>https://grepmymind.com?source=rss----8def803f7d93---4</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Tue, 09 Jun 2026 18:05:58 GMT</lastBuildDate>
        <atom:link href="https://grepmymind.com/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Argo CD’s ApplicationSet: Dynamic Deployments Across The Fleet]]></title>
            <link>https://grepmymind.com/argo-cd-applicationset-dynamic-deployments-across-the-fleet-7b4e4607f1e4?source=rss----8def803f7d93---4</link>
            <guid isPermaLink="false">https://medium.com/p/7b4e4607f1e4</guid>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[deployment-automation]]></category>
            <category><![CDATA[gitops]]></category>
            <category><![CDATA[argo-cd]]></category>
            <dc:creator><![CDATA[Mike Tougeron]]></dc:creator>
            <pubDate>Mon, 11 Mar 2024 18:11:35 GMT</pubDate>
            <atom:updated>2024-03-11T18:11:35.858Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*V9WvVgFVBO_Cwxh5KIW7Ug.jpeg" /><figcaption>A series of square cubes representing a deployment across multiple Kubernetes clusters. Generated with <a href="https://firefly.adobe.com/">https://firefly.adobe.com/</a></figcaption></figure><p>Argo CD provides a wide variety of methods to deploy your application(s) to Kubernetes clusters. An Application defines the source of the deployment and the cluster you want to deploy it to but an ApplicationSet allows you to deploy an Application across multiple clusters. Let’s start simple and break down each piece until we have a full-fledged, dynamic deployment system.</p><p>We’ll start with the following Application that deploys the <a href="https://github.com/mtougeron/k8s-pvc-tagger">k8s-pvc-tagger</a> using Helm.</p><pre>apiVersion: argoproj.io/v1alpha1<br>kind: Application<br>metadata:<br>  name: k8s-pvc-tagger<br>  namespace: argocd<br>spec:<br>  destination:<br>    namespace: k8s-pvc-tagger <br>    server: https://my-clusters-cluster.example.com<br>  project: default<br>  source:<br>    chart: k8s-pvc-tagger<br>    repoURL: https://mtougeron.github.io/helm-charts/<br>    targetRevision: 2.0.8<br>    helm:<br>      releaseName: k8s-pvc-tagger</pre><p>This is great if you’re deploying to a single cluster but what about if you have more than one? You could create multiple Application resources but that would be a pain and isn’t scalable. This is where Argo CD’s ApplicationSet comes into play.</p><p>An ApplicationSet (<a href="https://argo-cd.readthedocs.io/en/stable/operator-manual/applicationset/">docs</a>) allows you to automatically generate a list of Application resources to deploy in a templated fashion. For example, you can use an ApplicationSet to deploy the same Application to multiple clusters or use it to deploy your application based on pull requests to a repository. 
In this post, I’ll show you how to use the ApplicationSet generators (<a href="https://argo-cd.readthedocs.io/en/stable/operator-manual/applicationset/Generators/">docs</a>) to build several powerful deployment patterns that simplify the way you use Argo CD.</p><p>Before we get into specific examples, it’s important to understand how an ApplicationSet works. You can use a variety of generators to determine what the deployment should look like. These generators can be based on resources like files or directories in a repo, open pull requests, or labels on the Kubernetes clusters that are registered in Argo CD. The generators can also be combined with the matrix or merge generators to create complex selection criteria. In addition, the git generator can read values from the files it matches and make them available to the template. I’ll go through these different generators in the sections below.</p><p>In my opinion, the simplest generator is the clusters generator (<a href="https://argo-cd.readthedocs.io/en/stable/operator-manual/applicationset/Generators-Cluster/">docs</a>), which allows you to deploy to multiple clusters based on the cluster labels assigned when a Kubernetes cluster was added to Argo CD. This generator filters the available clusters based on those labels and creates an Application resource for each cluster that is found. 
Here’s an example of that approach.</p><pre>apiVersion: argoproj.io/v1alpha1<br>kind: ApplicationSet<br>metadata:<br>  name: k8s-pvc-tagger<br>  namespace: argocd<br>spec:<br>  goTemplate: true<br>  generators:<br>  - clusters:<br>      selector:<br>        matchLabels:<br>          environment: production<br>  template:<br>    metadata:<br>      name: &#39;k8s-pvc-tagger-{{.name}}&#39;<br>    spec:<br>      destination:<br>        namespace: k8s-pvc-tagger <br>        server: &#39;{{.server}}&#39;<br>      project: default<br>      source:<br>        chart: k8s-pvc-tagger<br>        repoURL: https://mtougeron.github.io/helm-charts/<br>        targetRevision: 2.0.8<br>        helm:<br>          releaseName: k8s-pvc-tagger</pre><p>First, it’s important to use the {{.name}} variable (or similar) in the name so that each generated Application resource is unique. Otherwise you will have conflicts and that’s never a good thing. Second, you’ll see the {{.server}} variable that defines the server URL of the cluster that the Application is being deployed to. The rest looks like it did with a standard Application.</p><p>But what if you wanted to deploy different versions of k8s-pvc-tagger based on the environment? After all, it’s always good to test in non-prod first, right? An ApplicationSet allows for this as well. In this example, we’re defining that the stage environment should run version 2.0.8 while production is still running 2.0.7. 
We’re able to use the templating options to dynamically decide which version to run.</p><pre>apiVersion: argoproj.io/v1alpha1<br>kind: ApplicationSet<br>metadata:<br>  name: k8s-pvc-tagger<br>  namespace: argocd<br>spec:<br>  goTemplate: true<br>  generators:<br>  - clusters:<br>      selector:<br>        matchLabels:<br>          environment: stage<br>      values:<br>        version: 2.0.8<br>  - clusters:<br>      selector:<br>        matchLabels:<br>          environment: production<br>      values:<br>        version: 2.0.7<br>  template:<br>    metadata:<br>      name: &#39;k8s-pvc-tagger-{{.name}}&#39;<br>    spec:<br>      destination:<br>        namespace: k8s-pvc-tagger <br>        server: &#39;{{.server}}&#39;<br>      project: default<br>      source:<br>        chart: k8s-pvc-tagger<br>        repoURL: https://mtougeron.github.io/helm-charts/<br>        targetRevision: &#39;{{.values.version}}&#39;<br>        helm:<br>          releaseName: k8s-pvc-tagger</pre><p>Let’s take this a step further though. What if we wanted dev to use a pre-release version, stage to run 2.0.8 and production to run 2.0.7? This gets a little more complicated because Helm with Argo CD doesn’t allow you to install a chart without defining a version. 
This means we need to get a bit tricky and toggle between installing the Helm chart from git or the chart repository.</p><pre>apiVersion: argoproj.io/v1alpha1<br>kind: ApplicationSet<br>metadata:<br>  name: k8s-pvc-tagger<br>  namespace: argocd<br>spec:<br>  goTemplate: true<br>  generators:<br>  - clusters:<br>      selector:<br>        matchLabels:<br>          environment: dev<br>      values:<br>        version: HEAD<br>  - clusters:<br>      selector:<br>        matchLabels:<br>          environment: stage<br>      values:<br>        version: 2.0.8<br>  - clusters:<br>      selector:<br>        matchLabels:<br>          environment: production<br>      values:<br>        version: 2.0.7<br>  template:<br>    metadata:<br>      name: &#39;k8s-pvc-tagger-{{.name}}&#39;<br>    spec:<br>      destination:<br>        namespace: k8s-pvc-tagger <br>        server: &#39;{{.server}}&#39;<br>      project: default<br>      source:<br>        chart: &#39;{{if ne .values.version &quot;HEAD&quot;}}k8s-pvc-tagger{{end}}&#39;<br>        path: &#39;{{if eq .values.version &quot;HEAD&quot;}}charts/k8s-pvc-tagger{{end}}&#39;<br>        repoURL: &#39;{{if ne .values.version &quot;HEAD&quot;}}https://mtougeron.github.io/helm-charts/{{else}}https://github.com/mtougeron/k8s-pvc-tagger{{end}}&#39;<br>        targetRevision: &#39;{{.values.version}}&#39;<br>        helm:<br>          releaseName: k8s-pvc-tagger</pre><p>In that example, if the .values.version is HEAD we set an empty value for chart and instead set the path to the Helm chart in git. Similarly we toggle between the chart repository and the git repo in the repoURL field. 
This is handy for doing an automated pre-release to the dev clusters whenever main is updated while staggering the deployments to the stage and production environments.</p><p>If we want to alter the values for the Helm chart based on the cluster’s environment, we can go one step further with the templating and set the valuesObject in the Application&#39;s template. In this example, for the dev environment we’ll run it with debug: true so that we can see more details in the logs. We’ll also adjust the amount of CPU requested because we run a larger cluster in production than we do in the other environments.</p><pre>apiVersion: argoproj.io/v1alpha1<br>kind: ApplicationSet<br>metadata:<br>  name: k8s-pvc-tagger<br>  namespace: argocd<br>spec:<br>  ...<br>  template:<br>    spec:<br>      source:<br>        helm:<br>          valuesObject:<br>            debug: &#39;{{if eq .values.environment &quot;dev&quot;}}true{{end}}&#39;<br>            resources:<br>              requests:<br>                cpu: &#39;{{if eq .values.environment &quot;production&quot;}}100m{{else}}50m{{end}}&#39;</pre><p>Taking the idea to the next level, let’s run a version of a deployment based on a PR. This is helpful for testing changes before those changes are merged. 
In this scenario we’ll use the helm-guestbook chart that Argo CD provides.</p><pre>apiVersion: argoproj.io/v1alpha1<br>kind: ApplicationSet<br>metadata:<br>  name: guestbook<br>  namespace: argocd<br>spec:<br>  goTemplate: true<br>  generators:<br>  - pullRequest:<br>      github:<br>        owner: argoproj<br>        repo: argocd-example-apps<br>        labels:<br>        - ok-to-test<br>  template:<br>    metadata:<br>      name: &#39;guestbook-{{.branch_slug}}-{{.number}}&#39;<br>      labels:<br>        branch: &#39;guestbook-{{.branch}}&#39;<br>    spec:<br>      destination:<br>        namespace: &#39;guestbook-{{.branch_slug}}-{{.number}}&#39;<br>        server: https://kubernetes.default.svc<br>      project: default<br>      source:<br>        path: helm-guestbook<br>        repoURL: https://github.com/argoproj/argocd-example-apps<br>        targetRevision: &#39;{{.head_sha}}&#39;<br>        helm:<br>          releaseName: guestbook<br>          valuesObject:<br>            ingress:<br>              hosts:<br>              - &#39;https://guestbook-{{.branch_slug}}-{{.number}}.example.com&#39;<br>      syncPolicy:<br>        syncOptions:<br>        - CreateNamespace=true</pre><p>Breaking down the changes by section, you’ll see the generators section now uses pullRequest (<a href="https://argo-cd.readthedocs.io/en/stable/operator-manual/applicationset/Generators-Pull-Request/">docs</a>). In our case, we’re using GitHub for the source but it supports options like GitLab, Bitbucket, and others. It defines the repo that the PRs are from and only creates an Application for a PR if the PR has the label ok-to-test on it. This helps prevent tests from running unless you’ve authorized them.</p><p>In the next section you’ll see that it uses .branch_slug and .number to add more information to the name so that it is unique. 
You might have also noted that we added labels to the metadata so that we can filter in the Argo CD UI to all the Application resources created for a branch in the repo for guestbook. Most importantly, the targetRevision is set to the .head_sha so that it uses the code from the PR’s revision.</p><p>In the valuesObject we dynamically assign the hosts so that each PR has its own URL to test against. Other values can be customized as well so that the deployment for the PR best represents the changes being made.</p><p>Lastly, the spec.destination.namespace is unique per branch &amp; PR as well. This allows for each PR to be deployed into its own Kubernetes Namespace for isolation. In order for this to work it also needs to have the CreateNamespace=true option set.</p><p>The merge generator is pretty cool IMO because it can allow for filtering the clusters found from the clusters generator based on the values found in the git generator. Let’s take an example where you have 100 clusters but for some reason you want to only install the k8s-pvc-tagger Helm chart into 10 of them. You <em>could</em> label each cluster with a flag that defines which clusters run that app. However, if you decided to add or remove it from a cluster you have to add that new label to the cluster which is generally a more operations focused task. Wouldn’t it be easier to just drop a values file into a directory of a git repo and have it automatically installed? 
Or have a single file that defines which version of a Helm chart to install?</p><pre>apiVersion: argoproj.io/v1alpha1<br>kind: ApplicationSet<br>metadata:<br>  name: k8s-pvc-tagger<br>  namespace: argocd<br>spec:<br>  goTemplate: true<br>  generators:<br>  - merge:<br>      mergeKeys:<br>      - name<br>      generators:<br>      - clusters:<br>          selector:<br>            matchLabels:<br>              argocd.argoproj.io/secret-type: cluster<br>      - git:<br>          repoURL: https://github.com/mtougeron/my-deploy-repo<br>          revision: HEAD<br>          files:<br>          - path: &quot;clusters/*.yaml&quot;<br>    selector:<br>      matchExpressions:<br>      - key: k8s-pvc-tagger<br>        operator: Exists<br>  template:<br>    metadata:<br>      name: &#39;k8s-pvc-tagger-{{.name}}&#39;<br>    spec:<br>      destination:<br>        namespace: k8s-pvc-tagger <br>        server: &#39;{{.server}}&#39;<br>      project: default<br>      source:<br>        chart: k8s-pvc-tagger<br>        repoURL: https://mtougeron.github.io/helm-charts/<br>        targetRevision: &#39;{{index . &quot;k8s-pvc-tagger&quot;}}&#39;<br>        helm:<br>          releaseName: k8s-pvc-tagger</pre><p>In the clusters directory of the mtougeron/my-deploy-repo repository there is a set of YAML files, each of which has the name of a cluster along with each chart to install and its version.</p><pre>name: my-cluster-name<br>k8s-pvc-tagger: 2.0.8<br>guestbook: HEAD<br>some-other-app: 1.2.3</pre><p>Argo CD will first get the list of clusters that exist and merge that list with the files found in that directory. It will then filter that list to the files that have a variable called k8s-pvc-tagger. Lastly, it uses the value of that variable to set the targetRevision to install.</p><p>While not specific to an ApplicationSet, a feature that I really like in Argo CD is the ability to use sources instead of source for an Application. 
This allows you to use more than one repository in your deployment. Why would you want this, you ask? A common practice is to use an open source Helm chart but have your own configuration repository. Let’s say I had a configuration repository that contains my values file(s) for the Helm chart.</p><pre>├── guestbook<br>├── k8s-pvc-tagger<br>│   ├── dev.yaml<br>│   ├── production.yaml<br>│   └── stage.yaml<br>└── some-other-app</pre><p>Now I want to use these Helm values files when rendering the chart via Argo CD. I set up two sources (instead of using source): one for the Helm chart and one that references my-config-repo where the values file(s) live. The values files are stored in the values directory and broken down by chart. The second source aliases the my-config-repo repository as $values so that the first source can use it to locate the files.</p><pre>spec:<br>  template:<br>    spec:<br>      sources:<br>      - repoURL: https://mtougeron.github.io/helm-charts/<br>        chart: k8s-pvc-tagger<br>        targetRevision: 2.0.8<br>        helm:<br>          releaseName: k8s-pvc-tagger<br>          valueFiles:<br>            - values.yaml<br>            - $values/{{.metadata.labels.environment}}.yaml<br>      - repoURL: https://github.com/mtougeron/my-config-repo<br>        path: &#39;values/k8s-pvc-tagger&#39;<br>        targetRevision: HEAD<br>        ref: values</pre><p>As you see in that example, it also dynamically points to the values file for the environment label set for the cluster in Argo CD.</p><p>When you put it all together, as seen below, you have a powerful way to dynamically filter and set the version of the charts you want to install on each cluster.</p><pre>apiVersion: argoproj.io/v1alpha1<br>kind: ApplicationSet<br>metadata:<br>  name: k8s-pvc-tagger<br>  namespace: argocd<br>spec:<br>  goTemplate: true<br>  generators:<br>  - clusters:<br>      selector:<br>        matchLabels:<br>          environment: dev<br>      values:<br>        version: 
HEAD<br>  - merge:<br>      mergeKeys:<br>      - name<br>      generators:<br>      - clusters:<br>          selector:<br>            matchLabels:<br>              argocd.argoproj.io/secret-type: cluster<br>              environment: stage<br>      - git:<br>          repoURL: https://github.com/mtougeron/my-config-repo<br>          revision: HEAD<br>          files:<br>          - path: &quot;clusters/*.yaml&quot;<br>    selector:<br>      matchExpressions:<br>      - key: k8s-pvc-tagger<br>        operator: Exists<br>  - merge:<br>      mergeKeys:<br>      - name<br>      generators:<br>      - clusters:<br>          selector:<br>            matchLabels:<br>              argocd.argoproj.io/secret-type: cluster<br>              environment: production<br>      - git:<br>          repoURL: https://github.com/mtougeron/my-config-repo<br>          revision: HEAD<br>          files:<br>          - path: &quot;clusters/*.yaml&quot;<br>    selector:<br>      matchExpressions:<br>      - key: k8s-pvc-tagger<br>        operator: Exists<br>  template:<br>    metadata:<br>      name: &#39;k8s-pvc-tagger-{{.name}}&#39;<br>    spec:<br>      destination:<br>        namespace: k8s-pvc-tagger <br>        server: &#39;{{.server}}&#39;<br>      project: default<br>      sources:<br>      - repoURL: &#39;{{if ne .values.version &quot;HEAD&quot;}}https://mtougeron.github.io/helm-charts/{{else}}https://github.com/mtougeron/k8s-pvc-tagger{{end}}&#39;<br>        chart: &#39;{{if ne .values.version &quot;HEAD&quot;}}k8s-pvc-tagger{{end}}&#39;<br>        path: &#39;{{if eq .values.version &quot;HEAD&quot;}}charts/k8s-pvc-tagger{{else}}{{index . 
&quot;k8s-pvc-tagger&quot;}}{{end}}&#39;<br>        targetRevision: &#39;{{.values.version}}&#39;<br>        helm:<br>          releaseName: k8s-pvc-tagger<br>          valueFiles:<br>            - values.yaml<br>            - $values/{{.metadata.labels.environment}}.yaml<br>      - repoURL: https://github.com/mtougeron/my-config-repo<br>        path: &#39;values/k8s-pvc-tagger&#39;<br>        targetRevision: HEAD<br>        ref: values</pre><p>Hopefully you’ve found these examples helpful and agree that using an ApplicationSet is a powerful way to do deployments. If you have any questions, I’m available on the <a href="https://slack.cncf.io">CNCF Slack</a> and I’d be happy to provide more details. You can also watch some of my talks (<a href="https://www.youtube.com/watch?v=rh95h0uOEc8">GitOps Me Some of That! Managing Hundreds of Clusters with Argo CD</a> or <a href="https://www.youtube.com/watch?v=casLvZWlIDw">Hundreds of Clusters Sitting in a Tree with Argo CD</a>) on the same subject.</p><hr><p><a href="https://grepmymind.com/argo-cd-applicationset-dynamic-deployments-across-the-fleet-7b4e4607f1e4">Argo CD’s ApplicationSet: Dynamic Deployments Across The Fleet</a> was originally published in <a href="https://grepmymind.com">GrepMyMind</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Kubernetes clusters for everyone using vcluster]]></title>
            <link>https://grepmymind.com/kubernetes-clusters-for-everyone-using-vcluster-8d2de91243d4?source=rss----8def803f7d93---4</link>
            <guid isPermaLink="false">https://medium.com/p/8d2de91243d4</guid>
            <category><![CDATA[kubernetes-operations]]></category>
            <category><![CDATA[devops]]></category>
            <category><![CDATA[virtualization]]></category>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[vcluster]]></category>
            <dc:creator><![CDATA[Mike Tougeron]]></dc:creator>
            <pubDate>Mon, 20 Dec 2021 18:37:46 GMT</pubDate>
            <atom:updated>2021-12-20T18:37:46.427Z</atom:updated>
            <content:encoded><![CDATA[<p>I recently started playing around with a powerful Kubernetes tool called <a href="https://www.vcluster.com/">vcluster</a> from <a href="https://loft.sh/">Loft Labs</a>. vcluster provides an easy way of creating virtual Kubernetes clusters inside of a regular cluster but scoped within a namespace. What’s really neat is that the resources created can still be restricted by the host cluster’s RBAC, quotas and other security policies. While I’ve only started to scratch the surface of what vcluster can do, I can already see some long-term, high-impact use-cases.</p><figure><img alt="a cube with other cubes attached to it" src="https://cdn-images-1.medium.com/max/1024/1*IyFJHCWTOyAOYa1D8_aYaA.jpeg" /></figure><h3>Custom Operators use-case</h3><p>In my environment, tenant users are not allowed to create or modify <a href="https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/">CustomResourceDefinitions</a> (CRDs) that <a href="https://kubernetes.io/docs/concepts/extend-kubernetes/operator/">Custom Operators</a> use (they can run the operator but not manage the CRDs). They have to go through a ticketing &amp; deployment process which adds overhead for the SRE team to review and delays for the development team. It’s unfortunately a lose-lose situation but the multi-tenancy restrictions and requirements must be respected at all times. It’s even more difficult to develop a new Custom Operator where the CRD hasn’t been fully designed yet. Tools like <a href="https://kind.sigs.k8s.io/">kind</a> are useful for doing development on these operators but they don’t always let you test with the full application stack running alongside them.</p><p>But this is where vcluster can come to the rescue! There can be a cluster where the tenant team has a namespace, with appropriate RBAC permissions, from which they can launch a virtual Kubernetes cluster using vcluster. 
The team can then do their development using that virtualized cluster. Where things get interesting is if this is taken to the next level and the <em>production</em> application is run inside of the vcluster. Suddenly the development team can run their custom operator however they’d like and it would still respect the multi-tenant nature of the host cluster. For me, running production workloads with vcluster is still a long way off but the potential for it has me super excited. Even if running the entire application stack isn’t feasible, running the custom operator could be.</p><h3>Development environment use-case</h3><p>Another powerful option for vcluster is to manage developer workflows. Just as git allows you to easily toggle between branches, vcluster could allow you to toggle between virtual clusters per feature branch. Each feature branch could be fully deployed in isolation from the others. When the feature branch work is finished, the virtual cluster is removed and all resources are automatically cleaned up. Because of the way that pods are created on the nodes, the host cluster’s namespace quotas still apply. This protects against the ever frustrating resource creep that can happen.</p><h3>Install and run vcluster</h3><p>Setting up a virtual cluster using vcluster is wickedly straightforward. There’s a little bit of prep work from the cluster operator team and then the tenants can manage the vcluster themselves.</p><h4>Setting up the namespace</h4><p>Start by creating a new namespace (in this case we’ll call it team-touge) and give the tenant the necessary access.</p><pre>kubectl create namespace team-touge</pre><p>Create a RoleBinding that gives users admin access to the team-touge namespace. 
Depending on your environment you may want to use something more restrictive than admin but for my use-case it is appropriate.</p><pre>kind: RoleBinding<br>apiVersion: rbac.authorization.k8s.io/v1<br>metadata:<br>  name: team-admin<br>  namespace: team-touge<br>subjects:<br>  - kind: Group<br>    name: some-team-group-name<br>    apiGroup: rbac.authorization.k8s.io<br>roleRef:<br>  kind: ClusterRole<br>  name: admin<br>  apiGroup: rbac.authorization.k8s.io</pre><p>After access to the host cluster namespace has been granted, the tenants can take over and manage everything from this point forward.</p><h4>Set up vcluster</h4><p>After the <a href="https://www.vcluster.com/docs/getting-started/setup">vcluster CLI is installed</a> we need to create a YAML config file that tells vcluster to run as non-root.</p><pre># vcluster.yaml<br>securityContext:<br>  runAsUser: 12345<br>  runAsNonRoot: true</pre><p>We then use this config file when launching the virtual cluster called touge inside of the team-touge namespace.</p><pre>$&gt; vcluster create touge -n team-touge -f vcluster.yaml<br>[info]   execute command: helm upgrade touge vcluster --repo <a href="https://charts.loft.sh">https://charts.loft.sh</a> --version 0.5.0-alpha.7 --kubeconfig /var/folders/rn/hrzkvjz5325dvtxz2ztzyf480000gp/T/1416406882 --namespace team-touge --install --repository-config=&#39;&#39; --values /var/folders/rn/hrzkvjz5325dvtxz2ztzyf480000gp/T/3591452462 --values vcluster.yaml<br>[done] √ Successfully created virtual cluster touge in namespace team-touge. 
Use &#39;vcluster connect touge --namespace team-touge&#39; to access the virtual cluster</pre><p>After a few seconds we now have some new pods running in the team-touge namespace.</p><pre>$&gt; kubectl get pods -n team-touge<br>coredns-7bbd4f6c46-pvqcg-x-kube-system-x-touge  1/1  Running 0 1m16s<br>touge-0                                         2/2  Running 0 3m30s</pre><p>The touge-0 pod is vcluster with the k3s control-plane and the coredns-7bbd4f6c46-pvqcg-x-kube-system-x-touge pod is the coredns deployment used inside the virtual cluster. Now that vcluster is running we can connect to that virtual cluster via port-forwarding. This means that you can configure RBAC rules for whether or not a user is allowed to connect to it at all. You could also set up <a href="https://www.vcluster.com/docs/operator/external-access">an Ingress for connectivity</a> if you’d prefer.</p><pre>$&gt; vcluster connect touge -n team-touge<br>[done] √ Virtual cluster kube config written to: ./kubeconfig.yaml. You can access the cluster via `kubectl --kubeconfig ./kubeconfig.yaml get namespaces`<br>[info]   Starting port-forwarding at 8443:8443<br>Forwarding from 127.0.0.1:8443 -&gt; 8443<br>Forwarding from [::1]:8443 -&gt; 8443</pre><p>In another window I configure my environment to use this newly generated kube config and get a list of pods in the kube-system namespace.</p><pre>$&gt; export KUBECONFIG=$(pwd)/kubeconfig.yaml<br>$&gt; kubectl get pods -n kube-system</pre><pre>NAME                       READY   STATUS    RESTARTS   AGE<br>coredns-7bbd4f6c46-pvqcg   1/1     Running   0          7m31s</pre><p>If you look back, you’ll see this pod name matches the coredns-7bbd4f6c46-pvqcg-x-kube-system-x-touge pod that we saw earlier. vcluster will sync the pods (but not the deployment) and delegate scheduling to the host cluster. 
That way it ends up on a real node and can serve real workloads.</p><h3>Inspecting the virtual cluster</h3><p>An interesting aspect of the way vcluster works is how it creates “fake” nodes that the workloads run on. Once a pod is scheduled on one of the host cluster’s nodes, that node will appear inside of the virtual cluster as well, with partial data. For example, the IP address of the node will be different and the node Conditions are specific to the virtual cluster’s management of the node. The only pods it shows as running on the node are those that are running inside the virtual cluster. This keeps the data segmented from the host cluster and any other vclusters that may be running.</p><pre>$&gt; kubectl get nodes<br>NAME                                  STATUS   ROLES    AGE   VERSION<br>vmss-agent-worker-touge-cmixj000000   Ready    &lt;none&gt;   10m   v1.20.11+k3s2</pre><pre>$&gt; kubectl describe node vmss-agent-worker-touge-cmixj000000<br>Name:               vmss-agent-worker-touge-cmixj000000<br>Roles:              &lt;none&gt;<br>Labels:             beta.kubernetes.io/arch=amd64<br>                    beta.kubernetes.io/os=linux<br>                    kubernetes.io/arch=amd64<br>                    kubernetes.io/hostname=fake-vmss-agent-worker-touge-cmixj000000<br>                    kubernetes.io/os=linux<br>                    vcluster.loft.sh/fake-node=true<br>&lt;snip&gt;<br>Non-terminated Pods:          (1 in total)<br>  Namespace                   Name                        CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE<br>  ---------                   ----                        ------------  ----------  ---------------  -------------  ---<br>  kube-system                 coredns-7bbd4f6c46-pkdgp    100m (0%)     1 (6%)      70Mi (0%)        170Mi (0%)     10m</pre><p>Let’s create a secret in the virtual cluster’s kube-system namespace.</p><pre>$&gt; kubectl create secret generic my-secret -n kube-system 
--from-literal=TEST=foo<br>secret/my-secret created</pre><pre>$&gt; kubectl get secret -n kube-system my-secret<br>NAME        TYPE     DATA   AGE<br>my-secret   Opaque   1      20s</pre><p>But if we try to access this secret from the host cluster it won’t exist.</p><pre>$&gt; kubectl get secret -n kube-system my-secret<br>Error from server (NotFound): secrets &quot;my-secret&quot; not found</pre><pre>$&gt; kubectl get secret -n team-touge my-secret<br>Error from server (NotFound): secrets &quot;my-secret&quot; not found</pre><p>Only resources such as pods, services and ingresses are sync’d with the host cluster by default. <a href="https://www.vcluster.com/docs/config-reference#syncer-flags">Additional resources can be sync’d</a> between the host and virtual cluster but this may require additional RBAC permissions that a standard user won’t have.</p><h3>Installing Vault Secrets Operator</h3><p>As I’ve <a href="https://grepmymind.com/kubernetes-vault-afd5d250302c">talked about before, I love the Vault Secrets Operator</a> so let’s install it on the new virtual cluster.</p><p>If you remember, in the host cluster, we gave users the ability to admin the team-touge namespace but we didn’t give them access to create new namespaces. However, they do have admin access to the virtual cluster (more on that later) so a new namespace can still be created. This new namespace exists only within the virtual cluster.</p><pre>$&gt; kubectl create namespace vault-secrets-operator<br>namespace/vault-secrets-operator created</pre><p>Next we’ll add the chart repositories and install the charts. 
PLEASE NOTE that this is NOT how you would install Vault in a production environment.</p><pre>$&gt; helm repo add hashicorp <a href="https://helm.releases.hashicorp.com">https://helm.releases.hashicorp.com</a><br>&quot;hashicorp&quot; has been added to your repositories<br>$&gt; helm repo add ricoberger <a href="https://ricoberger.github.io/helm-charts">https://ricoberger.github.io/helm-charts</a><br>&quot;ricoberger&quot; has been added to your repositories</pre><pre>$&gt; helm repo update<br>Hang tight while we grab the latest from your chart repositories...<br>...Successfully got an update from the &quot;ricoberger&quot; chart repository<br>...Successfully got an update from the &quot;hashicorp&quot; chart repository<br>Update Complete. ⎈Happy Helming!⎈</pre><pre>$&gt; helm install --namespace vault-secrets-operator vault hashicorp/vault</pre><pre># Follow the steps from <a href="https://learn.hashicorp.com/tutorials/vault/kubernetes-minikube?in=vault/kubernetes#initialize-and-unseal-vault">https://learn.hashicorp.com/tutorials/vault/kubernetes-minikube?in=vault/kubernetes#initialize-and-unseal-vault</a> to unseal Vault</pre><pre>$&gt; helm install --namespace vault-secrets-operator --set environmentVars\[0\].name=VAULT_TOKEN --set environmentVars\[0\].value=your-vault-token vault-secrets-operator ricoberger/vault-secrets-operator</pre><p>We can now see that the application is installed along with the CRDs.</p><pre>$&gt; kubectl get crd vaultsecrets.ricoberger.de<br>NAME                         CREATED AT<br>vaultsecrets.ricoberger.de   2021-12-19T23:35:48Z</pre><p>If we go back to the host cluster, you can see that the CRD is not installed there.</p><pre>$&gt; unset KUBECONFIG<br>$&gt; kubectl get crd vaultsecrets.ricoberger.de<br>Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io &quot;vaultsecrets.ricoberger.de&quot; not found</pre><h3>A note on security</h3><p>I’m still digging into this but one of the things about vcluster that 
isn’t clear to me is the cluster access permissions. It appears to me that all users, using vcluster connect, have admin access. You can restrict access to the vcluster by restricting who can port-forward or access the kubeconfig secret in the host namespace but that blanket access/restriction is limiting in scope. While for a demo/testing environment this may be acceptable, you’ll want to make sure you have something more robust before using it for anything beyond that. There may be something already in place for this but I’m not seeing it at first glance.</p><p>On the plus side with security, because the virtualized cluster still has pods running within the host cluster, any rules you may have in place with a PodSecurityPolicy (note those are <a href="https://kubernetes.io/blog/2021/04/06/podsecuritypolicy-deprecation-past-present-and-future/">now deprecated</a>) or <a href="https://www.openpolicyagent.org/">Open Policy Agent</a> are still enforced. If you don’t allow a pod to be deployed on a control-plane node via an OPA policy, that pod will fail to launch when vcluster creates it.</p><pre>Events:<br>  Type     Reason     Age                 From        Message<br>  ----     ------     ----                ----        -------<br>  Warning  SyncError  17s (x14 over 59s)  pod-syncer  Error syncing to physical cluster: admission webhook &quot;mutating-webhook.openpolicyagent.org&quot; denied the request: The Pod spec.tolerations[] contains a toleration for a control plane Node taint, but the Pod is not within an approved control plane Namespace</pre><h3>Where to next?</h3><p>In this post I focused on the custom operators use-case but there are a few others that I’m playing around with. I’d really like to be able to deploy, as close as possible, the deployments from my core cluster into a virtual cluster for CI/CD. That would greatly speed up my testing as the cloud resources wouldn’t need to be created each time in order to test with a clean environment. 
It would also be nice to have teams test deploying their full application stack inside a virtual cluster for things like canary or blue/green deployments. Being able to fully isolate and test different versions of the applications could provide a lot of value to the application teams.</p><p>I’ll definitely be watching this project closely over the coming months!</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=8d2de91243d4" width="1" height="1" alt=""><hr><p><a href="https://grepmymind.com/kubernetes-clusters-for-everyone-using-vcluster-8d2de91243d4">Kubernetes clusters for everyone using vcluster</a> was originally published in <a href="https://grepmymind.com">GrepMyMind</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[PagerDuty OncallStatus for MacOS]]></title>
            <link>https://grepmymind.com/pagerduty-oncallstatus-for-macos-4f5b46f85572?source=rss----8def803f7d93---4</link>
            <guid isPermaLink="false">https://medium.com/p/4f5b46f85572</guid>
            <category><![CDATA[pagerduty]]></category>
            <category><![CDATA[macos]]></category>
            <category><![CDATA[go]]></category>
            <category><![CDATA[programming]]></category>
            <dc:creator><![CDATA[Mike Tougeron]]></dc:creator>
            <pubDate>Mon, 16 Aug 2021 16:08:42 GMT</pubDate>
            <atom:updated>2021-08-16T19:42:30.463Z</atom:updated>
            <content:encoded><![CDATA[<p>I’m now happy to introduce the new open source application <a href="https://github.com/mtougeron/oncall-status">OncallStatus for PagerDuty on MacOS</a>!</p><figure><img alt="drawing of an old school pager" src="https://cdn-images-1.medium.com/max/439/1*hX38eE3QPJDWnh2YEdmCLQ.png" /></figure><p>Being oncall is never fun and rarely easy. On top of that it can also be <em>noisy</em>. The phone alerts are always loud as they try to get your attention. Beyond that, when I’m already at the computer the noise from my phone bothers me immensely. Pre-covid when I was in the office it would disturb my teammates and while working from home it creates excess stress that I don’t need. I’d rather have my computer notify me instead. Rather than complain and do nothing, I still bitched but did something about it.</p><p>The OncallStatus app was my solution to this problem. This new app works with your PagerDuty oncall schedules and notifies you when a new incident has been created.</p><figure><img alt="example notification" src="https://cdn-images-1.medium.com/max/351/1*OdQN1b0FQN_UyIZdbkTPzg.png" /></figure><p>You can filter the events based on high &amp; low priority or the escalation level you are assigned to. No need to get that extra visual noise for the low priority stuff. Now I can set my phone to silent when I’m at my desk and the noise is much easier on my mind.</p><h3>Written in Go</h3><p>As I’ve been learning/working with Go lately I decided to try and use that to write this app. Using the cgo bindings it is possible to call Cocoa functions and have native MacOS functionality built into your Go app. I started with using a nice library called <a href="https://github.com/caseymrm/menuet">menuet</a> but unfortunately it hasn’t been kept up-to-date and had a tendency to crash a lot on startup. Looking for another library that could do menu bar updates I came across <a href="https://github.com/getlantern/systray">getlantern/systray</a>. 
While not as feature rich as menuet, it worked AND it promised cross-platform compatibility. While I haven’t been able to work on the Linux &amp; Windows support yet, I’m happy that it should be possible.</p><figure><img alt="Shows the menu options for the application" src="https://cdn-images-1.medium.com/max/405/1*1AXiTmQ9T2Nh2UyBhXSNlQ.png" /></figure><p>Unfortunately when it came to notifications the only cross-platform library <a href="https://github.com/0xAX/notificator">I could find</a> was severely lacking in functionality. So in the spirit of an MVP I decided to stick with just MacOS to start. I took a bit of code from menuet and some from a utility called <a href="https://github.com/keybase/go-notifier">go-notifier</a>. This allowed me to create a notification that was clickable and would open a web browser straight to the incident page. Because most of <a href="https://github.com/mtougeron/oncall-status/blob/main/pkg/notification/notification.m">this code was Cocoa</a> it took me <em>a long time</em> to get it working. It’s not the greatest code but it does get the job done. I know I’m going to have to go back and refactor it at some point.</p><p>A neat idea that I learned from menuet is checking GitHub for new releases of OncallStatus. When the tag used for the current running version doesn’t match the latest release on GitHub it creates a notification letting you know there is a new version. This could be done with just a few lines of code thanks to <a href="https://github.com/google/go-github">google/go-github</a>.</p><h3>Authentication</h3><p>When it comes to authentication, security is important. The login process connects to PagerDuty using <a href="https://oauth.net/2/pkce/">Proof Key for Code Exchange (PKCE)</a>. The <a href="https://developer.pagerduty.com/docs/app-integration-development/oauth-2-pkce/">PagerDuty API PKCE docs</a> did a decent job of explaining how to use it. 
At the moment the app is only requesting read access to PagerDuty but at some point I’d like to expand this to write access as well so that you can acknowledge the alerts directly from the app. Because PKCE is part of an OAuth workflow I never had to touch the user’s username/password. The authentication and authorization all happen on the PagerDuty side.</p><p>Next I had to come up with a way to securely store the resulting API Token. I went with <a href="https://github.com/keybase/go-keychain">keybase/go-keychain</a> which allows me to save the data into the MacOS keychain. For the cross-platform work I eventually want to do it will also allow me to save to a Linux keychain.</p><h3>Configuring the OncallStatus.App bundle</h3><p>Because this is a MacOS application I had to set up an app bundle that can be run instead of a command line tool. I found this article <a href="https://medium.com/@mattholt/packaging-a-go-application-for-macos-f7084b00f6b5">https://medium.com/@mattholt/packaging-a-go-application-for-macos-f7084b00f6b5</a> that walked me through how to set up the OncallStatus.app directory and files. I followed the steps for the icons, plist &amp; folder structure. However I found an easier <a href="https://github.com/create-dmg/create-dmg">tool for creating DMG files</a> so I was able to set up some <a href="https://github.com/mtougeron/oncall-status/blob/main/Makefile#L11:L20">automation</a> for that step.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/1c3e1d0e5aa1f77970d39391d34f5f13/href">https://medium.com/media/1c3e1d0e5aa1f77970d39391d34f5f13/href</a></iframe><p>I was able to get an appropriately licensed image to use as the application icon from <a href="https://stock.adobe.com/">Adobe Stock</a>. At this point I could run OncallStatus as a normal MacOS app. 
Unfortunately the security warnings and restrictions still kept the app from being usable.</p><h3>Build, Signing &amp; Notarizing</h3><p>In order to distribute the application without all the security warnings showing up when someone tries to run OncallStatus I had to <a href="https://developer.apple.com/programs/">pay Apple a $99 yearly fee</a> in order to <a href="https://developer.apple.com/support/code-signing/">sign</a> &amp; <a href="https://developer.apple.com/documentation/security/notarizing_macos_software_before_distribution">notarize</a> it. I really hope I start making more apps like this one in order to make the cost worth it. It’s pretty annoying that Apple makes you pay to join the Developer Program in order to sign &amp; notarize apps.</p><p>Once I paid the toll, I could use <a href="https://github.com/mitchellh/gon">gon</a> to do the signing &amp; notarizing in an automated fashion. First I set up gon to sign the OncallStatus.app and the binary OncallStatus.app/Contents/MacOS/OncallStatus. I had originally tried to do the notarizing at the same time but I had trouble getting the DMG created properly when I did it that way. The binary file kept ending up in the DMG top level instead of just the OncallStatus.app. Instead after signing the app files, I create the DMG file, then send that to Apple to get notarized.</p><p>What’s really cool is that I can do all of this via a few <a href="https://github.com/mtougeron/oncall-status/blob/main/.github/workflows/release.yaml">GitHub Actions</a>. It’s even able to automatically upload &amp; attach the newly created DMG file to the release when I create the new version tag. The more I use GitHub Actions the deeper I fall in love with them.</p><h3>What’s coming next</h3><p>As with many things time is going to be a key factor. Right now my priority is to pass the <a href="https://www.cncf.io/certification/cks/">CKS exam</a>. 
But once that’s out of the way I’d like to expand the functionality so that it works on Linux desktops. Several of my friends &amp; co-workers run Linux as their primary machine and I’d like to be able to support them. I’d also like to clean up the notifications code as it’s using a deprecated library that will eventually go away. If I ever get more comfortable with Cocoa (or whatever language) for MacOS UI development I’d like to allow users to acknowledge the alert directly from the notification.</p><p>For now however, I hope you give it a try and like it. Please don’t hesitate to give feedback here or as a <a href="https://github.com/mtougeron/oncall-status/issues">GitHub issue</a>.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=4f5b46f85572" width="1" height="1" alt=""><hr><p><a href="https://grepmymind.com/pagerduty-oncallstatus-for-macos-4f5b46f85572">PagerDuty OncallStatus for MacOS</a> was originally published in <a href="https://grepmymind.com">GrepMyMind</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[It’s all about the data; a journey into Kubernetes CSI on AWS]]></title>
            <link>https://grepmymind.com/its-all-about-the-data-a-journey-into-kubernetes-csi-on-aws-f2b998676ce9?source=rss----8def803f7d93---4</link>
            <guid isPermaLink="false">https://medium.com/p/f2b998676ce9</guid>
            <category><![CDATA[kubernetes-storage]]></category>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[aws-eks]]></category>
            <category><![CDATA[aws-ebs]]></category>
            <dc:creator><![CDATA[Mike Tougeron]]></dc:creator>
            <pubDate>Mon, 15 Mar 2021 17:10:32 GMT</pubDate>
            <atom:updated>2021-03-15T17:10:32.545Z</atom:updated>
            <content:encoded><![CDATA[<p>Over the last several weeks I’ve taken a trip into the world of Kubernetes storage; both the Container Storage Interface (CSI) and Container Attached Storage (CAS). I’ve talked with folks in the CAS space before but for whatever reason the power of it never really settled into my brain until recently. The idea of this journey started picking up steam when I realized that the <a href="https://kubernetes.io/blog/2019/12/09/kubernetes-1-17-feature-csi-migration-beta/">in-tree storage plugins were deprecated</a> and no new enhancements were being made to them starting with Kubernetes 1.20. When I discovered that simply switching from gp2 to gp3 volumes meant I had to start using the AWS CSI Driver I realized I was behind the times. This desire for a simple change opened the door and the next thing I knew I was on an adventure of potentially significant impact.</p><figure><img alt="photo of containers and trucks, boat, planes carrying them" src="https://cdn-images-1.medium.com/max/1024/1*EsDA3m1FGmCqBWRah7ZEjw.jpeg" /></figure><p>The journey started with the <a href="https://aws.amazon.com/about-aws/whats-new/2020/12/introducing-new-amazon-ebs-general-purpose-volumes-gp3/">new AWS EBS volume types</a>, but then sped into some code trying to fix an <a href="https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/660">open issue</a> in the aws-ebs-csi-driver, jumped up into <a href="https://kubernetes.io/docs/concepts/storage/volume-snapshots/">VolumeSnapshots</a>, spun around to creating <a href="https://github.com/kubernetes-sigs/aws-ebs-csi-driver/tree/master/examples/kubernetes/snapshot/specs/snapshot-import">PVCs from snapshots</a>, and rounded the corner into doing an <a href="https://openebs.io/">OpenEBS</a> proof of concept. 
By the time I was done I was exhausted but full of excitement for the future possibilities.</p><h3>aws-ebs-csi-driver</h3><p><em>Please note that all references are assuming that you are running Kubernetes 1.17+ though I’ve only been running this configuration on a 1.19 cluster.</em></p><h4>What is it?</h4><p>The <a href="https://github.com/kubernetes-sigs/aws-ebs-csi-driver">aws-ebs-csi-driver</a> is a <a href="https://kubernetes.io/blog/2019/01/15/container-storage-interface-ga/">CSI storage plugin</a> that replaces the in-tree storage plugin for AWS EBS volumes. These plugins are what are used when you request a new PVC and the EBS volumes get created behind the scenes. In the beginning the storage plugins were part of the base Kubernetes repo/app but over time that has evolved. The creation of the CSI spec has opened the door not only for new CAS providers but also the cloud providers. The speed at which storage drivers can be iterated on can grow significantly because it is out of band from the core Kubernetes releases.</p><p>If you’re looking at it for the first time it can be daunting to understand what each component of the app does or even why you need to run it at all! At a very high level, you run a Deployment for the controller (<a href="https://kubernetes-csi.github.io/docs/sidecar-containers.html">with its many sidecars</a>), a DaemonSet (also with sidecars) on every node, and optionally the snapshot controller.</p><p>The ebs-csi-controller has several sidecars in its deployment but the main container is the ebs-plugin. This is where the code lives that interacts with the AWS APIs to create, delete, resize, etc. the EBS volumes when a Persistent Volume is created. 
This is the code you’ll see when you go to <a href="https://github.com/kubernetes-sigs/aws-ebs-csi-driver">https://github.com/kubernetes-sigs/aws-ebs-csi-driver</a>.</p><p>The <a href="https://kubernetes-csi.github.io/docs/sidecar-containers.html">other sidecars</a> essentially contain boilerplate logic that handles the communication and coordination between the ebs-plugin and the Kubernetes API. For example, the csi-resizer sidecar watches for PVC edits and notifies the ebs-plugin, over a socket using gRPC, with the necessary data so that it can resize the EBS volume via the AWS API. The sidecars allow the ebs-plugin driver to focus on just the storage volume functionality and not have to re-implement a lot of the same Kubernetes API interactions. I highly recommend reading the <a href="https://kubernetes-csi.github.io/docs/sidecar-containers.html">Kubernetes CSI Sidecar Containers</a> portion of the documentation to get a better idea of what each one does. I don’t think you necessarily need to know them in-depth but a general understanding of what each piece does is really helpful.</p><p>Along with the controller there is the ebs-csi-node that runs as a DaemonSet on each node. It is responsible <a href="https://kubernetes-csi.github.io/docs/deploying.html#node-plugin">for mounting and unmounting</a> a volume from the node when requested by the kubelet. This is what makes the volume available for the Pods to use.</p><h4>How to set it up</h4><p>The aws-ebs-csi-driver requires AWS API access in order to manage the EBS volumes. I recommend something like <a href="https://github.com/jtblin/kube2iam">kube2iam</a> to handle this and to not use access keys. 
The official documentation has an <a href="https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/master/docs/example-iam-policy.json">example IAM policy</a> and it looks like this.</p><pre>{<br>  &quot;Version&quot;: &quot;2012-10-17&quot;,<br>  &quot;Statement&quot;: [<br>    {<br>      &quot;Effect&quot;: &quot;Allow&quot;,<br>      &quot;Action&quot;: [<br>        &quot;ec2:AttachVolume&quot;,<br>        &quot;ec2:CreateSnapshot&quot;,<br>        &quot;ec2:CreateTags&quot;,<br>        &quot;ec2:CreateVolume&quot;,<br>        &quot;ec2:DeleteSnapshot&quot;,<br>        &quot;ec2:DeleteTags&quot;,<br>        &quot;ec2:DeleteVolume&quot;,<br>        &quot;ec2:DescribeAvailabilityZones&quot;,<br>        &quot;ec2:DescribeInstances&quot;,<br>        &quot;ec2:DescribeSnapshots&quot;,<br>        &quot;ec2:DescribeTags&quot;,<br>        &quot;ec2:DescribeVolumes&quot;,<br>        &quot;ec2:DescribeVolumesModifications&quot;,<br>        &quot;ec2:DetachVolume&quot;,<br>        &quot;ec2:ModifyVolume&quot;<br>      ],<br>      &quot;Resource&quot;: &quot;*&quot;<br>    }<br>  ]<br>}</pre><p>Once you have the IAM role configured you can launch the controllers via the Helm chart.</p><pre>helm repo add aws-ebs-csi-driver https://kubernetes-sigs.github.io/aws-ebs-csi-driver</pre><pre>helm repo update</pre><pre>helm upgrade --install aws-ebs-csi-driver \<br>    --namespace kube-system \<br>    --set enableVolumeScheduling=true \<br>    --set enableVolumeResizing=true \<br>    --set &#39;podAnnotations.iam\.amazonaws\.com/role&#39;=ROLE_ARN \<br>    --set &#39;node.podAnnotations.iam\.amazonaws\.com/role&#39;=ROLE_ARN \<br>    aws-ebs-csi-driver/aws-ebs-csi-driver</pre><p>After it has been applied you’ll see the pods running in the kube-system namespace.</p><pre>NAME                                  READY STATUS    RESTARTS AGE<br>ebs-csi-controller-85bc6d8897-lt5xk   6/6   Running   0        3m7s<br>ebs-csi-controller-85bc6d8897-v542j   6/6   Running   0        
3m7s<br>ebs-csi-node-66dt6                    3/3   Running   0        3m7s<br>ebs-csi-node-9424k                    3/3   Running   0        3m7s<br>ebs-csi-node-b9mnd                    3/3   Running   0        3m7s<br>ebs-csi-node-gd6d6                    3/3   Running   0        3m7s<br>ebs-csi-node-hr4qt                    3/3   Running   0        3m7s<br>ebs-csi-node-jjbcj                    3/3   Running   0        3m7s</pre><p>You will also see the CSIDriver installed on your cluster.</p><pre>$&gt; kubectl get csidriver</pre><pre>NAME              ATTACHREQUIRED   PODINFOONMOUNT   MODES        AGE<br>ebs.csi.aws.com   true             false            Persistent   21m</pre><p>The CSIDriver is what you use when creating the StorageClass so that Kubernetes knows which CSI storage plugin should be used. This means that you can have more than one storage plugin running on your cluster at the same time! For example, in my case I have the in-tree storage plugins, the aws-ebs-csi-driver plugin, and OpenEBS (from the POC that I’ll discuss in a future blog post) all running nicely together.</p><h4>How to use it</h4><p>Now that the controller and node pods are running and the CSIDriver is created you can create the StorageClass(es) your users will use.</p><pre>apiVersion: storage.k8s.io/v1<br>kind: StorageClass<br>provisioner: ebs.csi.aws.com # &lt;-- The same name as the CSIDriver<br>metadata:<br>  name: gp3<br>parameters: # &lt;-- parameters for this CSIDriver<br>  encrypted: &quot;true&quot;<br>  type: gp3<br>allowVolumeExpansion: true<br>volumeBindingMode: Immediate<br>---<br>apiVersion: storage.k8s.io/v1<br>kind: StorageClass<br>provisioner: ebs.csi.aws.com<br>metadata:<br>  name: gp3-6000iops<br>parameters:<br>  encrypted: &quot;true&quot;<br>  type: gp3<br>  throughput: &quot;250&quot; # &lt;-- StorageClass parameter values must be strings<br>  iops: &quot;6000&quot; # &lt;-- For volumes 1TB-2TB in size or needing more iops<br>allowVolumeExpansion: true<br>volumeBindingMode: Immediate</pre><p>From an end-user perspective, the new gp3 
storage class is used just like they’ve been used to doing with the in-tree storage plugins.</p><pre>apiVersion: v1<br>kind: PersistentVolumeClaim<br>metadata:<br>  name: touge-pvc<br>spec:<br>  storageClassName: gp3<br>  accessModes:<br>    - ReadWriteOnce<br>  resources:<br>    requests:<br>      storage: 10Gi</pre><p>Let’s follow the process and inspect the results.</p><pre>$&gt; kubectl apply -f touge-pvc.yaml<br>persistentvolumeclaim/touge-pvc created</pre><pre>$&gt; kubectl get pvc touge-pvc<br>NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE<br>touge-pvc   Bound    pvc-a2cc33c6-f5d5-425f-bd1e-0902b82bbcec   10Gi       RWO            gp3            10s</pre><pre>$&gt; kubectl describe pv pvc-a2cc33c6-f5d5-425f-bd1e-0902b82bbcec<br>Name:              pvc-a2cc33c6-f5d5-425f-bd1e-0902b82bbcec<br>Labels:            &lt;none&gt;<br>Annotations:       pv.kubernetes.io/provisioned-by: ebs.csi.aws.com<br>Finalizers:        [kubernetes.io/pv-protection]<br>StorageClass:      gp3<br>Status:            Bound<br>Claim:             default/touge-pvc<br>Reclaim Policy:    Delete<br>Access Modes:      RWO<br>VolumeMode:        Filesystem<br>Capacity:          10Gi<br>Node Affinity:     <br>  Required Terms:  <br>    Term 0:        topology.ebs.csi.aws.com/zone in [us-west-2c]<br>Message:           <br>Source:<br>    Type:              CSI (a Container Storage Interface (CSI) volume source)<br>    Driver:            ebs.csi.aws.com<br>    FSType:            ext4<br>    VolumeHandle:      vol-0f06e363f467b87bd<br>    ReadOnly:          false<br>    VolumeAttributes:      storage.kubernetes.io/csiProvisionerIdentity=1615144050357-8081-ebs.csi.aws.com<br>Events:                &lt;none&gt;</pre><pre>$&gt; aws ec2 describe-volumes --volume-ids vol-0f06e363f467b87bd --region us-west-2<br>{<br>    &quot;Volumes&quot;: [<br>        {<br>            &quot;Attachments&quot;: [],<br>            &quot;AvailabilityZone&quot;: 
&quot;us-west-2c&quot;,<br>            &quot;CreateTime&quot;: &quot;2021-03-07T21:42:30.268Z&quot;,<br>            &quot;Encrypted&quot;: true,<br>            &quot;KmsKeyId&quot;: &quot;arn:aws:kms:us-west-2:REDACTED:key/REDACTED&quot;,<br>            &quot;Size&quot;: 10,<br>            &quot;SnapshotId&quot;: &quot;&quot;,<br>            &quot;State&quot;: &quot;available&quot;,<br>            &quot;VolumeId&quot;: &quot;vol-0f06e363f467b87bd&quot;,<br>            &quot;Iops&quot;: 3000,<br>            &quot;Tags&quot;: [<br>                {<br>                    &quot;Key&quot;: &quot;kubernetes.io/created-for/pv/name&quot;,<br>                    &quot;Value&quot;: &quot;pvc-a2cc33c6-f5d5-425f-bd1e-0902b82bbcec&quot;<br>                },<br>                {<br>                    &quot;Key&quot;: &quot;kubernetes.io/created-for/pvc/namespace&quot;,<br>                    &quot;Value&quot;: &quot;default&quot;<br>                },<br>                {<br>                    &quot;Key&quot;: &quot;kubernetes.io/created-for/pvc/name&quot;,<br>                    &quot;Value&quot;: &quot;touge-pvc&quot;<br>                },<br>                {<br>                    &quot;Key&quot;: &quot;CSIVolumeName&quot;,<br>                    &quot;Value&quot;: &quot;pvc-a2cc33c6-f5d5-425f-bd1e-0902b82bbcec&quot;<br>                }<br>            ],<br>            &quot;VolumeType&quot;: &quot;gp3&quot;,<br>            &quot;MultiAttachEnabled&quot;: false,<br>            &quot;Throughput&quot;: 125<br>        }<br>    ]<br>}</pre><h3>VolumeSnapshots</h3><p>VolumeSnapshots are a pretty cool feature that’s possible with CSI. You can do things like taking a snapshot of a volume and then restoring the PVC with that snapshot if your data becomes corrupt. An interesting use-case would be to create a nightly snapshot of your dev database and allow users to create a PersistentVolumeClaim (PVC) from that snapshot to use in their personal testing. 
The snapshot doesn’t even need to come from inside Kubernetes!</p><h4>Enabling VolumeSnapshots</h4><p>It’s an add-on to the default setup so the first thing you need to do is install the <a href="https://github.com/kubernetes-csi/external-snapshotter/tree/master/client/config/crd">CSI Snapshotter CRDs</a>. After installing the Snapshotter CRDs you can add --set enableVolumeSnapshot=true to the Helm install command from above and a new StatefulSet, ebs-snapshot-controller, will be running.</p><p>It uses a VolumeSnapshotClass to know which CSI Plugin the snapshot requests go to so let’s create one.</p><pre>apiVersion: snapshot.storage.k8s.io/v1beta1<br>kind: VolumeSnapshotClass<br>metadata:<br>  name: ebs-csi-aws<br>driver: ebs.csi.aws.com # &lt;-- The CSIDriver we defined previously<br>deletionPolicy: Delete</pre><h4>Creating VolumeSnapshots</h4><p>To create a new VolumeSnapshot create a resource on the cluster for it.</p><pre>apiVersion: snapshot.storage.k8s.io/v1beta1<br>kind: VolumeSnapshot<br>metadata:<br>  name: touge-snapshot<br>spec:<br>  volumeSnapshotClassName: ebs-csi-aws<br>  source:<br>    persistentVolumeClaimName: touge-pvc</pre><p>This will trigger the snapshotting process, the aws-ebs-csi-driver will be notified, and it will create a snapshot in AWS for the EBS volume that is backing the PVC. 
Once again, let’s follow the process and inspect the results.</p><pre>$&gt; kubectl apply -f touge-snapshot.yaml <br>volumesnapshot.snapshot.storage.k8s.io/touge-snapshot created</pre><pre>$&gt; kubectl describe volumesnapshot touge-snapshot<br>Name:         touge-snapshot<br>Namespace:    default<br>Labels:       &lt;none&gt;<br>Annotations:  API Version:  snapshot.storage.k8s.io/v1beta1<br>Kind:         VolumeSnapshot<br>Metadata:<br>  Creation Timestamp:  2021-03-07T21:50:40Z<br>  Finalizers:<br>    snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection<br>    snapshot.storage.kubernetes.io/volumesnapshot-bound-protection<br>  Generation:  1<br>  Resource Version:  135554<br>  Self Link:         /apis/snapshot.storage.k8s.io/v1beta1/namespaces/default/volumesnapshots/touge-snapshot<br>  UID:               7d32eca6-2015-4a6e-a5b6-3ec48ca68005<br>Spec:<br>  Source:<br>    Persistent Volume Claim Name:  touge-pvc<br>  Volume Snapshot Class Name:      ebs-csi-aws<br>Status:<br>  Bound Volume Snapshot Content Name:  snapcontent-7d32eca6-2015-4a6e-a5b6-3ec48ca68005<br>  Creation Time:                       2021-03-07T21:51:13Z<br>  Ready To Use:                        true<br>  Restore Size:                        10Gi</pre><p>If we look at the Status we will see Bound Volume Snapshot Content Name: snapcontent-7d32eca6-2015-4a6e-a5b6-3ec48ca68005. This tells us which VolumeSnapshotContent is created for our VolumeSnapshot. 
The VolumeSnapshotContent is a resource that is created by the snapshot controller that represents the data the CSI Plugin created.</p><pre>$&gt; kubectl describe volumesnapshotcontents snapcontent-7d32eca6-2015-4a6e-a5b6-3ec48ca68005<br>Name:         snapcontent-7d32eca6-2015-4a6e-a5b6-3ec48ca68005<br>Namespace:    <br>Labels:       &lt;none&gt;<br>Annotations:  &lt;none&gt;<br>API Version:  snapshot.storage.k8s.io/v1beta1<br>Kind:         VolumeSnapshotContent<br>Metadata:<br>  Creation Timestamp:  2021-03-07T21:50:40Z<br>  Finalizers:<br>    snapshot.storage.kubernetes.io/volumesnapshotcontent-bound-protection<br>  Generation:  1<br>  Resource Version:  135553<br>  Self Link:         /apis/snapshot.storage.k8s.io/v1beta1/volumesnapshotcontents/snapcontent-7d32eca6-2015-4a6e-a5b6-3ec48ca68005<br>  UID:               0049e16f-2196-456b-9460-319ee24b3a15<br>Spec:<br>  Deletion Policy:  Delete<br>  Driver:           ebs.csi.aws.com<br>  Source:<br>    Volume Handle:             vol-0f06e363f467b87bd<br>  Volume Snapshot Class Name:  ebs-csi-aws<br>  Volume Snapshot Ref:<br>    API Version:       snapshot.storage.k8s.io/v1beta1<br>    Kind:              VolumeSnapshot<br>    Name:              touge-snapshot<br>    Namespace:         default<br>    Resource Version:  133849<br>    UID:               7d32eca6-2015-4a6e-a5b6-3ec48ca68005<br>Status:<br>  Creation Time:    1615153873000000000<br>  Ready To Use:     true<br>  Restore Size:     10737418240<br>  Snapshot Handle:  snap-05bc0ec2f3a65b7be</pre><p>Here is where we see Snapshot Handle: snap-05bc0ec2f3a65b7be that tells us the SnapshotID in AWS.</p><pre>$&gt; aws ec2 describe-snapshots --snapshot-ids snap-05bc0ec2f3a65b7be --region us-west-2<br>{<br>    &quot;Snapshots&quot;: [<br>        {<br>            &quot;Description&quot;: &quot;Created by AWS EBS CSI driver for volume vol-0f06e363f467b87bd&quot;,<br>            &quot;Encrypted&quot;: true,<br>            &quot;KmsKeyId&quot;: 
&quot;arn:aws:kms:us-west-2:REDACTED:key/REDACTED&quot;,<br>            &quot;OwnerId&quot;: &quot;REDACTED&quot;,<br>            &quot;Progress&quot;: &quot;100%&quot;,<br>            &quot;SnapshotId&quot;: &quot;snap-05bc0ec2f3a65b7be&quot;,<br>            &quot;StartTime&quot;: &quot;2021-03-07T21:51:13.115Z&quot;,<br>            &quot;State&quot;: &quot;completed&quot;,<br>            &quot;VolumeId&quot;: &quot;vol-0f06e363f467b87bd&quot;,<br>            &quot;VolumeSize&quot;: 10,<br>            &quot;Tags&quot;: [<br>                {<br>                    &quot;Key&quot;: &quot;CSIVolumeSnapshotName&quot;,<br>                    &quot;Value&quot;: &quot;snapshot-7d32eca6-2015-4a6e-a5b6-3ec48ca68005&quot;<br>                }<br>            ]<br>        }<br>    ]<br>}</pre><h4>Creating a PVC from an Existing AWS Snapshot</h4><p>Let’s go through the example use-case of taking an existing AWS snapshot and creating a PVC from it for someone to use inside Kubernetes.</p><p>First we need to create the VolumeSnapshotContent that references the AWS snapshot. Using the AWS console I created snap-002e544b538087ec1 from an EBS volume that I had. 
To show the power of this, the volume &amp; snapshot were <em>created outside of Kubernetes</em>.</p><pre>apiVersion: snapshot.storage.k8s.io/v1beta1<br>kind: VolumeSnapshotContent<br>metadata:<br>  name: my-imported-snapshot<br>spec:<br>  volumeSnapshotRef:<br>    kind: VolumeSnapshot<br>    name: my-imported-snapshot<br>    namespace: default <br>  source:<br>    snapshotHandle: snap-002e544b538087ec1 # &lt;-- snapshot to import<br>  driver: ebs.csi.aws.com<br>  deletionPolicy: Delete<br>  volumeSnapshotClassName: ebs-csi-aws</pre><p>Then we need to create the VolumeSnapshot that uses that VolumeSnapshotContent.</p><pre>apiVersion: snapshot.storage.k8s.io/v1beta1<br>kind: VolumeSnapshot<br>metadata:<br>  name: my-imported-snapshot<br>  namespace: default <br>spec:<br>  volumeSnapshotClassName: ebs-csi-aws<br>  source:<br>    volumeSnapshotContentName: my-imported-snapshot</pre><p>And apply it to the cluster.</p><pre>$&gt; kubectl apply -f touge-import-snapshot.yaml <br>volumesnapshotcontent.snapshot.storage.k8s.io/my-imported-snapshot created<br>volumesnapshot.snapshot.storage.k8s.io/my-imported-snapshot created</pre><pre>$&gt; kubectl get volumesnapshotcontent<br>NAME                                               AGE<br>my-imported-snapshot                               4m31s</pre><pre>$&gt; kubectl get volumesnapshot<br>NAME                   AGE<br>my-imported-snapshot   4m56s</pre><p>The VolumeSnapshot is now available to be used to create the PersistentVolumeClaim.</p><pre>apiVersion: v1<br>kind: PersistentVolumeClaim<br>metadata:<br>  name: my-imported-snapshot-pvc<br>spec:<br>  accessModes:<br>    - ReadWriteOnce<br>  storageClassName: gp3<br>  resources:<br>    requests:<br>      storage: 10Gi<br>  dataSource:<br>    name: my-imported-snapshot<br>    kind: VolumeSnapshot<br>    apiGroup: snapshot.storage.k8s.io</pre><p>This is then applied to the cluster and we will have a new PVC that we can mount on our Pod.</p><pre>$&gt; kubectl apply -f 
touge-pvc-from-snapshot.yaml <br>persistentvolumeclaim/my-imported-snapshot-pvc created</pre><pre>$&gt; kubectl get pvc my-imported-snapshot-pvc<br>NAME                       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE<br>my-imported-snapshot-pvc   Bound    pvc-1ff63250-5a4f-442f-9907-171b69569c2b   10Gi       RWO            gp3            26s</pre><pre>$&gt; kubectl describe pv pvc-1ff63250-5a4f-442f-9907-171b69569c2b<br>Name:              pvc-1ff63250-5a4f-442f-9907-171b69569c2b<br>Labels:            &lt;none&gt;<br>Annotations:       pv.kubernetes.io/provisioned-by: ebs.csi.aws.com<br>Finalizers:        [kubernetes.io/pv-protection]<br>StorageClass:      gp3<br>Status:            Bound<br>Claim:             default/my-imported-snapshot-pvc<br>Reclaim Policy:    Delete<br>Access Modes:      RWO<br>VolumeMode:        Filesystem<br>Capacity:          10Gi<br>Node Affinity:     <br>  Required Terms:  <br>    Term 0:        topology.ebs.csi.aws.com/zone in [us-west-2a]<br>Message:           <br>Source:<br>    Type:              CSI (a Container Storage Interface (CSI) volume source)<br>    Driver:            ebs.csi.aws.com<br>    FSType:            ext4<br>    VolumeHandle:      vol-03b42ee74d7fd4f4e<br>    ReadOnly:          false<br>    VolumeAttributes:      storage.kubernetes.io/csiProvisionerIdentity=1615144050357-8081-ebs.csi.aws.com</pre><h3>Where I plan to go from here…</h3><p>Once I had played around with the aws-ebs-csi-driver for a few days I went ahead and implemented it in the production clusters. I haven’t yet gotten around to migrating the in-tree volumes to the new CSI based ones but there’s a bit of time for that.</p><p>Following my excitement I ended up doing a proof of concept using OpenEBS. 
I’m really excited about the possibilities there and will be writing about that setup soon.</p><hr><p><a href="https://grepmymind.com/its-all-about-the-data-a-journey-into-kubernetes-csi-on-aws-f2b998676ce9">It’s all about the data; a journey into Kubernetes CSI on AWS</a> was originally published in <a href="https://grepmymind.com">GrepMyMind</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Introducing the k8s-aws-ebs-tagger]]></title>
            <link>https://grepmymind.com/introducing-the-k8s-aws-ebs-tagger-3ec2502cf40e?source=rss----8def803f7d93---4</link>
            <guid isPermaLink="false">https://medium.com/p/3ec2502cf40e</guid>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[software-development]]></category>
            <category><![CDATA[aws-ebs]]></category>
            <category><![CDATA[aws]]></category>
            <category><![CDATA[infrastructure-as-code]]></category>
            <dc:creator><![CDATA[Mike Tougeron]]></dc:creator>
            <pubDate>Tue, 12 Jan 2021 20:26:04 GMT</pubDate>
            <atom:updated>2022-07-09T20:13:58.220Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="silhouettes of taggers drawing graffiti" src="https://cdn-images-1.medium.com/max/1024/1*Hmd3kCjfb1Rk1lben4r-Lg.jpeg" /><figcaption><a href="https://stock.adobe.com/images/silhouettes-of-taggers-drawing-graffiti/67460951">https://stock.adobe.com/images/silhouettes-of-taggers-drawing-graffiti/67460951</a></figcaption></figure><p>EDIT: The k8s-aws-ebs-tagger was renamed to <a href="https://github.com/mtougeron/k8s-pvc-tagger">k8s-pvc-tagger</a> as the scope of the project was expanded to include more than just aws-ebs volumes. Don’t worry, it’s still backwards compatible.</p><p>The <a href="https://github.com/mtougeron/k8s-aws-ebs-tagger">k8s-aws-ebs-tagger</a> brings tagging to the AWS EBS volumes created by <a href="https://kubernetes.io/docs/concepts/storage/persistent-volumes/">Kubernetes PersistentVolumeClaims</a> (PVC). This new utility enables you to set arbitrary tags on the EBS volume so that you can better categorize and report on the state of your AWS resources. Having proper cost control tags can help you keep a handle on your AWS billing and resource utilization.</p><p>Let’s dive into how to install and use it.</p><h4>Install the k8s-aws-ebs-tagger</h4><p>The container images are released both on <a href="https://hub.docker.com/repository/docker/mtougeron/k8s-aws-ebs-tagger">DockerHub</a> and <a href="https://github.com/users/mtougeron/packages/container/k8s-aws-ebs-tagger/versions">GitHub Container Registry</a> and are built for both linux/amd64 and linux/arm64.</p><p>The first thing needed is an AWS IAM Role that is allowed to add &amp; delete tags from EBS volumes. 
I recommend using <a href="https://github.com/jtblin/kube2iam">kube2iam</a> for assigning the role to the Pod(s) instead of using AWS access key/secrets.</p><pre>{<br>    &quot;Version&quot;: &quot;2012-10-17&quot;,<br>    &quot;Statement&quot;: [<br>        {<br>            &quot;Sid&quot;: &quot;&quot;,<br>            &quot;Effect&quot;: &quot;Allow&quot;,<br>            &quot;Action&quot;: [<br>                &quot;ec2:CreateTags&quot;,<br>                &quot;ec2:DeleteTags&quot;<br>            ],<br>            &quot;Resource&quot;: [<br>                &quot;arn:aws:ec2:*:*:volume/*&quot;<br>            ]<br>        }<br>    ]<br>}</pre><p>Once you have the IAM Role you can get the app running via its Helm chart. Be sure to check the default values and adjust as appropriate for your environment.</p><pre>helm repo add mtougeron https://mtougeron.github.io/helm-charts/<br>helm repo update<br>helm install k8s-aws-ebs-tagger mtougeron/k8s-aws-ebs-tagger</pre><p>If you want it to only watch a single namespace you can set the watchNamespace value for the chart but it still needs a ClusterRole in order to get the volume ID from the PersistentVolume. Currently it only supports watching a single namespace or all namespaces (<a href="https://github.com/mtougeron/k8s-aws-ebs-tagger/issues/9">#9</a>) but I plan on updating this soon.</p><h4>Configuring the tags to set</h4><p>The first approach is to use the (optional) --default-tags command line flag that takes a json encoded string of key/value pairs. It uses these tags as the base set of tags to add to all EBS Volumes when a PersistentVolumeClaim is added or updated. This is useful if you always want to add a fixed tag to all EBS volumes created in the cluster (or namespace). For example, you may want all volumes to have the tag Environment=Production.</p><p>The default tags can be extended by the aws-ebs-tagger/tags annotation on the PVC. 
This annotation also takes a JSON-encoded string of key/value pairs and uses them for tags on the volume. This can be used to extend the list of tags you want set as well as override the default values.</p><p>Take, for example, this Deployment and PVC:</p><pre>apiVersion: apps/v1<br>kind: Deployment<br>metadata:<br>  name: k8s-aws-ebs-tagger<br>spec:<br>...<br>  template:<br>    spec:<br>      containers:<br>        - name: k8s-aws-ebs-tagger<br>          args:<br>            - '--default-tags={&quot;Environment&quot;: &quot;Production&quot;}'<br>...<br>---<br>apiVersion: v1<br>kind: PersistentVolumeClaim<br>metadata:<br>  name: example1<br>  annotations:<br>    aws-ebs-tagger/tags: |<br>      {&quot;Database&quot;: &quot;true&quot;}<br>...</pre><p>The resulting EBS volume will have the tags Environment=Production and Database=true.</p><p>Let’s say that for databases you have a different Environment tag. You could use the same Deployment as above but on the Database PVCs you override the Environment tag in the annotation.</p><pre>apiVersion: v1<br>kind: PersistentVolumeClaim<br>metadata:<br>  name: db1<br>  annotations:<br>    aws-ebs-tagger/tags: |<br>      {&quot;Database&quot;: &quot;true&quot;, &quot;Environment&quot;: &quot;DBProduction&quot;}<br>...</pre><p>And the resulting tags will be Environment=DBProduction and Database=true.</p><p>If you have a PVC that you <em>don’t</em> want tagged at all you can use the aws-ebs-tagger/ignore annotation and no tags will be processed for that volume.</p><p>Currently you can only use fixed values for the tags. However, I’m working on updating that in the near future to allow for templated tag values (<a href="https://github.com/mtougeron/k8s-aws-ebs-tagger/issues/15">#15</a>).</p><h4>JSON vs multiple annotations vs comma delimited values</h4><p>I went back and forth over this a lot when I was first thinking about writing this app. 
There are pros &amp; cons to each approach but the deciding factor was that I needed a tag name with a : in it and I couldn’t use that in an annotation name. I also considered using a comma delimited list of tags but that made it difficult to allow the , in the value of the tag. With those restrictions the greatest flexibility was to use a json string of key/value pairs. That unfortunately leaves the user with the hassle of hand-coding json key/value pairs for their tags (which I really hate) but it’s the best I could think of.</p><h4>Thoughts on the future</h4><p>While the CSI Drivers are the latest &amp; future of storage on Kubernetes there are still a lot of us not using them yet. Even if you are using the aws-ebs-csi-driver it still has an <a href="https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/180">open issue</a> to allow adding arbitrary tags. Until the CSI driver is updated this utility app should provide the desired tagging.</p><p>Completing <a href="https://github.com/mtougeron/k8s-aws-ebs-tagger/issues/9">#9</a> &amp; <a href="https://github.com/mtougeron/k8s-aws-ebs-tagger/issues/15">#15</a> will allow handling of the last scenarios I can think of to make this usable by the masses. However, if you have any suggestions for improvement I’m happy to hear them!</p><hr><p><a href="https://grepmymind.com/introducing-the-k8s-aws-ebs-tagger-3ec2502cf40e">Introducing the k8s-aws-ebs-tagger</a> was originally published in <a href="https://grepmymind.com">GrepMyMind</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[5 ways to handle AWS API rate-limiting]]></title>
            <link>https://grepmymind.com/5-ways-to-handle-aws-api-rate-limiting-863a4f0c55e2?source=rss----8def803f7d93---4</link>
            <guid isPermaLink="false">https://medium.com/p/863a4f0c55e2</guid>
            <category><![CDATA[software-development]]></category>
            <category><![CDATA[tips-and-tricks]]></category>
            <category><![CDATA[aws]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[error-handling]]></category>
            <dc:creator><![CDATA[Mike Tougeron]]></dc:creator>
            <pubDate>Mon, 14 Dec 2020 20:33:31 GMT</pubDate>
            <atom:updated>2020-12-14T20:33:31.545Z</atom:updated>
            <content:encoded><![CDATA[<p>When dealing with <a href="https://docs.aws.amazon.com/AWSEC2/latest/APIReference/throttling.html">AWS API rate-limiting</a> there are a few tips &amp; tricks that I find helpful. If your environment is like mine and you have a lot of code interacting with the AWS APIs, sometimes poorly, handling the default rate-limiting without errors is important.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/708/1*FTev6SkA_I1DBWM1JHhR6w.png" /><figcaption>Top AWS API calls in a typical hour</figcaption></figure><h3>Python’s Tenacity</h3><p>I’ve found that <a href="https://tenacity.readthedocs.io/en/latest/">Tenacity</a> for Python is a lifesaver. Tenacity is a general purpose library that automates retry logic. Once you decorate a function, Tenacity automatically retries it when an exception is raised, with the behavior determined by the decorator. In the code below it retries the API call up to 10 times, waiting with a random exponential backoff between attempts.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/d1ab55cb1b4f1bf352570583f2656564/href">https://medium.com/media/d1ab55cb1b4f1bf352570583f2656564/href</a></iframe><p>The great thing is that it only takes a little bit of effort to refactor your code to take advantage of it. All you have to do is make each AWS API call a function and put the decorator on it. With the reraise=True option your existing error handling will continue to work as it is coded now.</p><h3>AWS Go SDK’s CustomRetryer</h3><p>The AWS Go SDK also has some default retry logic built in. In addition to the defaults, it lets you customize when to retry and how often. 
Once you initialize your session with the CustomRetryer it will automatically be used.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/dae3e47296fd5df1dc1706d0c2c9072a/href">https://medium.com/media/dae3e47296fd5df1dc1706d0c2c9072a/href</a></iframe><p>The nice thing about this approach is that it also lets you set custom logic for when it should do the retries.</p><h3>Caching</h3><p>Sometimes the data you query from AWS can be fairly static. For example, the KubernetesCluster or Environment tags on my Kubernetes EC2 instances never change. Instead of making an API call every time I need to know the tag values I can save them to a local file or to ElastiCache and reference them from there first. If a value isn’t cached, the script can fall back to making the API calls.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/89a02a9c1aca7b9f382b9560a186d2d6/href">https://medium.com/media/89a02a9c1aca7b9f382b9560a186d2d6/href</a></iframe><h3>Instance Metadata API</h3><p>If you haven’t been keeping up with what’s available via the local metadata API features it’s probably time for a look. With the newer instance types (e.g., the c/m/r 5-series) more data is available than there was before. It unfortunately still doesn’t have my most queried resource (tags) but it does have useful information. For example, on a 5-series instance, you no longer have to run aws ec2 describe-instance-status to find out if there are upcoming maintenance events. Instead you can query the metadata API for that information at http://169.254.169.254/latest/meta-data/events/maintenance/scheduled. 
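</p><p>A minimal sketch of that query in Python, assuming IMDSv2 is enabled on the instance. The helper names are made up for illustration, and the Code and NotBefore fields are what I understand the endpoint to return, so check a real response before relying on them.</p><pre>import json
import urllib.request

IMDS = "http://169.254.169.254/latest"
EVENTS_PATH = "/meta-data/events/maintenance/scheduled"


def fetch_token(ttl_seconds=60):
    """Request an IMDSv2 session token (assumes IMDSv2 is enabled)."""
    req = urllib.request.Request(
        IMDS + "/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )
    with urllib.request.urlopen(req, timeout=2) as resp:
        return resp.read().decode()


def fetch_scheduled_events(token):
    """Fetch the raw response body for scheduled maintenance events."""
    req = urllib.request.Request(
        IMDS + EVENTS_PATH,
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(req, timeout=2) as resp:
        return resp.read().decode()


def parse_events(body):
    """Return (code, not_before) pairs; an empty body means no events."""
    return [(e.get("Code"), e.get("NotBefore")) for e in json.loads(body or "[]")]


if __name__ == "__main__":
    # Only works from inside an EC2 instance; no EC2 API call is made.
    for code, not_before in parse_events(fetch_scheduled_events(fetch_token())):
        print(code, "scheduled no earlier than", not_before)
</pre><p>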
This change alone saved many thousands of API calls an hour across the fleet.</p><h3>Requesting a limit increase</h3><p>This one is kind of cheating but it is actually possible and the <a href="https://docs.aws.amazon.com/AWSEC2/latest/APIReference/throttling.html#throttling-increase">docs even say so</a>. In practice though you’ll almost always get a response about doing backoffs and retries first. If you can make a good business case for why you should get a limit increase they can do that. For example, I ran into a situation where the Kubernetes <a href="https://github.com/kubernetes-sigs/external-dns">external-dns provider</a> was making too many requests per second when it was running on all of my clusters. There wasn’t a way for me to adjust it so AWS had to increase the limit (slightly) on the account.</p><h3>In the end…</h3><p>Unfortunately it all boils down to whether or not you should retry or if you should even make the API calls. Thankfully the approaches I described are fairly easy to implement. As you may have guessed from the stats shown we’re still getting rate-limited for some applications. This is currently a whack-an-app process where we are reducing the calls across quite a few applications a little at a time.</p><hr><p><a href="https://grepmymind.com/5-ways-to-handle-aws-api-rate-limiting-863a4f0c55e2">5 ways to handle AWS API rate-limiting</a> was originally published in <a href="https://grepmymind.com">GrepMyMind</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Kubernetes Cluster Autoscaler — In for a penny, in to infinity]]></title>
            <link>https://grepmymind.com/kubernetes-cluster-autoscaler-in-for-a-penny-in-to-infinity-b54a3807dad?source=rss----8def803f7d93---4</link>
            <guid isPermaLink="false">https://medium.com/p/b54a3807dad</guid>
            <category><![CDATA[aws]]></category>
            <category><![CDATA[cloud-computing]]></category>
            <category><![CDATA[cluster-autoscaler]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[kubernetes]]></category>
            <dc:creator><![CDATA[Mike Tougeron]]></dc:creator>
            <pubDate>Wed, 11 Nov 2020 18:01:03 GMT</pubDate>
            <atom:updated>2020-11-11T18:01:03.216Z</atom:updated>
            <content:encoded><![CDATA[<h3>Kubernetes Cluster Autoscaler — In for a penny, in to infinity</h3><p>I had an interesting conversation with a coworker in another business unit the other week where we were talking about instance types and planning for unknown workload sizes in our Kubernetes clusters. They asked what memory-to-cpu ratios my team used to decide the instance types to run for our clusters. I had to call a timeout and talk about why ratios didn’t matter because I was using the <a href="https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler">cluster-autoscaler</a>. I realized we needed to take a step back and go over the philosophy I use when running workloads on Kubernetes. I figured it might make an interesting blog post so here we are.</p><p>Like a lot of people, the Kubernetes clusters I manage look like they were set up by <a href="https://kubernetes.io/docs/reference/setup-tools/kubeadm/">kubeadm</a> for the control-plane with multiple apiservers, etcd, etc. Maybe you run etcd on a separate tier of VMs, maybe you don’t, but in the grand scheme of things we’re all doing things pretty much the same way. Where it starts to get interesting is with the worker nodes. Some people run one instance type, others run multiple types. Some people run a fixed-size cluster and others run the <a href="https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler">cluster-autoscaler</a>. As you probably got from the title, I’m one of those who use the cluster-autoscaler and rely on it heavily for my workloads.</p><p>From an overly simplistic perspective, as a cluster operator I don’t really care what someone requests for memory &amp; cpu for their applications. 
I care a great deal from an SRE and/or business perspective but let’s ignore those hats for the moment.</p><blockquote>What matters is that the Scheduler can schedule the requested workload.</blockquote><p>What matters to me is that the cluster has the resources to run the workloads that rely on it. Obviously we can’t run thousands of nodes just in case something needs it one day; that’s where the cluster-autoscaler comes in. The cluster-autoscaler watches for when Pods fail to schedule due to unavailable resources and scales the cluster nodes so that those resources become available. It can also reschedule workloads from under-utilized nodes so that the cluster scales down to a smaller size. On AWS it can do this scaling via node templates or AWS AutoScalingGroups and in my environments we heavily rely on AutoScalingGroups.</p><figure><img alt="A simple animated diagram showing how new pods can scale up to new nodes before the pods can be scheduled" src="https://cdn-images-1.medium.com/max/822/1*KXoM1nT2Clw3CR8uqADjww.gif" /></figure><p>Much like my colleague, when I first created Kubernetes clusters I thought the ratio for cpu &amp; memory was important. We were not sure what the workload requirements were going to be like so we went with the generic m5d.8xlarge instance type. This of course worked and the cluster ran fine. The problem was that we were having trouble getting the cluster utilization numbers above 10-15%. The <a href="https://en.wikipedia.org/wiki/Bin_packing_problem">binpacking</a> wasn’t fitting the Pods in a way that used most of the resources for a node. I ended up spending a bunch of time working with teams to make sure they were setting appropriate resource requests/limits and educating people about how to figure that out on their own. The education portion was time well spent but overall the effort barely moved the needle. 
We got up to ~23% memory &amp; 7% cpu utilization which means all that work barely made an impact.</p><p>After about a year of running this way we wanted to start running the majority of the development cluster on spot instances. The cluster costs were rising and needed attention before they spiraled out of control. Enabling a pool of spot nodes wasn’t hard as the code already had the concept of multiple worker tiers. I had to add some information about spot pricing but otherwise it was good to go.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/2149d8bc6dd9579a086fca7a569d2c37/href">https://medium.com/media/2149d8bc6dd9579a086fca7a569d2c37/href</a></iframe><p>But a few days later alerts started firing that the cluster was not scaling due to a lack of spot capacity. The cluster-autoscaler was doing its job and trying to scale; there just wasn’t spot capacity for the instance type we were running. The quick &amp; dirty solution was to add another pool of spot instances of a different size. This worked for a day or two before the alerts started again.</p><blockquote>Should I create another node pool to fix it?</blockquote><p>This is when I started to realize I was looking at the design of the node tiers wrong. I didn’t care if the workload ran on a c5d.4xlarge, m5d.4xlarge, or r5d.16xlarge. I just wanted it to run! So I copy/pasted some Terraform code around and I had 12 worker tiers, one for each instance type running spot. I tested this out in dev and the spot capacity problem went away.</p><p>It wasn’t long until someone noticed that the dev cluster’s memory utilization was up to 43%. After discussing it in chat for a bit with my team the light bulb went off. The cluster-autoscaler was picking different instance types based on the size of the workload that needed to be scheduled. 
The least-waste <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-expanders">config option</a> was doing the work by picking whichever instance type would leave the fewest resources idle. Sometimes it went on an m5d.4xlarge and sometimes a c5d.4xlarge based on the <a href="https://github.com/kubernetes/autoscaler/blob/cluster-autoscaler-1.19.0/cluster-autoscaler/expander/waste/waste.go#L37:L74">blended score</a> of CPU &amp; memory.</p><blockquote>Why isn’t this being done in production?</blockquote><p>I don’t remember who first asked the question about the production cluster’s utilization but it was quickly decided that the same approach, except using non-spot, should be done in production too. Less than 24 hours and 36 AutoScalingGroups later (12 instance types * 3 AZs) the production clusters were running the same sort of configuration. We didn’t force workloads to reschedule immediately so it took a bit of time for the impact to be seen. The charts eventually started showing over 57% memory utilization and ~25% cpu utilization. It wasn’t perfect but it was a 2–3x improvement over what was there before!</p><figure><img alt="A simple chart showing memory 57% max memory and 23% max cpu used over an hour timeframe" src="https://cdn-images-1.medium.com/max/643/1*ikk94w_1yizs5e6FrTknMQ.png" /><figcaption>The max memory &amp; CPU usage over an hour</figcaption></figure><p>Once spot workloads were added into production there were 72 AutoScalingGroups being managed by the cluster-autoscaler in each cluster. The switch to using so many AutoScalingGroups made a few shifts in thought necessary in order for it to not be overwhelming.</p><blockquote>Treat the worker nodes as pieces of compute not cattle.</blockquote><p>I found that it can be a different concept to treat something as a piece of compute instead of a <a href="http://cloudscaling.com/blog/cloud-computing/the-history-of-pets-vs-cattle/">herd of cattle</a>. 
Even if each VM can be replaced, destroyed, or auto-remediated on a whim they have a type and are associated with a grouping. This makes them less ephemeral than something abstract like cpu or memory. Once the thought patterns shifted the code shifted as well. Our deployment script doesn’t need to check that the AutoScalingGroup replaced a node; instead, it can check whether or not the cluster has Pods that can’t be scheduled. No longer does the script need to check whether or not there is existing capacity for rescheduling; it can simply evict the Pods, respecting the <a href="https://kubernetes.io/docs/tasks/run-application/configure-pdb/">PodDisruptionBudget</a>, and trust that the cluster-autoscaler will scale up as necessary. As an added bonus the cluster deployment time went from around 6 hours down to 2–3 hours without noticeable impact on the running workloads.</p><p>I can now put back on my other hats and start caring about things like why the cpu usage is so much lower than the cpu requested.</p><figure><img alt="A dashboard showing that cpu usage is only 36% of the requested cpu" src="https://cdn-images-1.medium.com/max/341/1*kIkdebeY76ub1mYPeLYFig.png" /></figure><p>I’m able to focus on the things that can impact the bottom line and <a href="https://medium.com/grep-my-mind/initial-thoughts-on-kubecost-383d93368e7a">costs of running these workloads</a>. I can turn my attention to application performance and spend my time optimizing the way the apps run on Kubernetes.</p><p>Essentially the cluster-autoscaler lets you care about the workloads and not worry about the type of compute they run on. 
In the end, isn’t that what’s really important?</p><hr><p><a href="https://grepmymind.com/kubernetes-cluster-autoscaler-in-for-a-penny-in-to-infinity-b54a3807dad">Kubernetes Cluster Autoscaler — In for a penny, in to infinity</a> was originally published in <a href="https://grepmymind.com">GrepMyMind</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Live migrating a Kubernetes cluster across VPCs without downtime]]></title>
            <link>https://grepmymind.com/live-migrating-a-kubernetes-cluster-across-vpcs-without-downtime-bbccc1a26c9f?source=rss----8def803f7d93---4</link>
            <guid isPermaLink="false">https://medium.com/p/bbccc1a26c9f</guid>
            <category><![CDATA[infrastructure]]></category>
            <category><![CDATA[terraform]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[aws]]></category>
            <category><![CDATA[kubernetes]]></category>
            <dc:creator><![CDATA[Mike Tougeron]]></dc:creator>
            <pubDate>Thu, 22 Oct 2020 18:22:35 GMT</pubDate>
            <atom:updated>2020-10-22T18:22:35.723Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="graphical version of the blog title" src="https://cdn-images-1.medium.com/max/438/1*DFBI4GiD-dUlPxfsK17vBA.jpeg" /></figure><p>Recently I ran into an IP conflict with another team’s Kubernetes cluster: their pod network CIDR block conflicted with the CIDR block of the VPC that my cluster (as well as legacy EC2 instances) was in. My team’s cluster could talk to their cluster over VPC peering but they couldn’t talk to me the same way. We didn’t want to put any of the application ingresses on the public internet and because of internal limitations we couldn’t extend my VPC’s CIDR block. The only solution that could be found was to set up a VPC with a different CIDR block. This is easy enough to handle for the EC2 instances outside the Kubernetes cluster but live migrating a cluster without downtime was a bit of a challenge. Due to the application deployment pipeline the clusters have become pets to the engineering teams. That introduces a set of problems where spinning up &amp; migrating to a different cluster isn’t possible without a significant time investment across many teams. Doing this migration without downtime seemed daunting but as the scope was defined it started to become a reasonable goal.</p><p><strong>NOTE</strong>: At the last minute it was decided to not do this migration. The steps I describe below were executed several times in the lab environment without issue but it never went to the production migration stage. :(</p><p>Not all work ends up in production and this turned out to be one of those times. However, I still think the process is worth sharing. :)</p><h3><strong><em>What the cluster looked like</em></strong></h3><p>I’m a big fan of HashiCorp’s <a href="https://en.wikipedia.org/wiki/Infrastructure_as_code">Infrastructure As Code</a> toolset and it works well for my environment. 
For cluster deployments I build an <a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html">AMI</a> using <a href="https://www.packer.io/">Packer</a>, deploy it to the AWS AutoScalingGroup(s) with <a href="https://terraform.io">Terraform</a>, and then run a custom Python script to cycle the nodes. The Terraform code has 3 main components: a module (worker-common) for shared resources (security groups, LBs, DNS, etc.), a module (control-plane) for the control-plane, and a module (worker-{pool}) for the worker nodes. There are around 72 AutoScalingGroups (one per instance-type per AZ plus on-demand vs spot), about 200 worker nodes, and many thousands of Pods running on the cluster I needed to migrate.</p><p>The Terraform code essentially looked like this:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/b93b68ee259d5b92403e8c842be96a46/href">https://medium.com/media/b93b68ee259d5b92403e8c842be96a46/href</a></iframe><h3><strong>The problems</strong></h3><p>With Terraform the configuration is declarative and, because of assumptions made in the code, I had a few problems to solve.</p><ul><li>Can’t duplicate names for InstanceProfiles, AutoScalingGroups, and LoadBalancers</li><li>Can’t register instances to a TargetGroup that is in a different VPC</li><li>SecurityGroups can’t be used across VPCs; though they can be referenced in rules</li><li>The DNS entries were CNAMEs to the LBs, not A records</li></ul><h3>Walking through the solution steps</h3><p>I broke the problem into pieces and tackled each one individually. The basic order of operations was something like this. 
I’ll be walking through each piece below.</p><ol><li>Create the new VPC</li><li>Set up the shared resources in both VPCs</li><li>Create additional worker nodes in the new VPC</li><li>Route traffic to both VPCs</li><li>Route traffic to just the new VPC</li><li>Migrate the workloads to the nodes in the new VPC</li><li>Migrate the control-plane nodes to the new VPC</li><li>Clean up the old resources</li></ol><p>Again, this seems like a lot to do but it turned out to not be as much as I had originally thought it would be. One of the things that made it easier was a rich set of tags on all our AWS resources. I was able to clearly reference them in the Terraform code and look them up with data sources accordingly.</p><h3>Creating the new VPC</h3><p>First thing was creating the new VPC. This was pretty straightforward except for <a href="https://xkcd.com/910/">coming up with a name</a> that everyone was happy with. Both VPCs were peered with each other so that everything could talk privately to each other.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/e135889b75173d2da312358866f58818/href">https://medium.com/media/e135889b75173d2da312358866f58818/href</a></iframe><h3>Setting up shared resources</h3><p>Once that was created, I started to tackle the resource naming pattern and that too turned out to be pretty easy to solve. I added a new parameter to the worker-common Terraform module that allowed me to add a suffix to each resource name. By default the variable was empty so no suffix was added and the existing resources were not impacted. I could then set up a worker-common-migration module that creates the new resources in the new VPC. 
Because I didn’t need to move the InstanceProfile (it’s not VPC specific) I added a flag for whether or not the code should create it.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/f3f7bfeb4ebf88913b8c398771f98f0e/href">https://medium.com/media/f3f7bfeb4ebf88913b8c398771f98f0e/href</a></iframe><p>I put in a bit of a hack for the SecurityGroups so that during the migration the “shared” SecurityGroup references both the old &amp; the new IDs. This is possible because the new VPC is peered to the old VPC.</p><p>Now I could run Terraform so the LBs &amp; SecurityGroups would be created in the new VPC. Once the Terraform run is completed the new “common” resources exist but nothing is using them yet.</p><h3>The worker nodes</h3><p>Next up was creating worker nodes in the new VPC. Unlike with worker-common, I didn’t need to create a migration module for the workers. I just needed to add a new one (x24!) that referenced the newly created resources.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/40c40b9b0e942795f070e11145a883e2/href">https://medium.com/media/40c40b9b0e942795f070e11145a883e2/href</a></iframe><p>Before I could run terraform apply though I needed to update the Ingress controller. I had to change the Service to externalTrafficPolicy: Cluster so when the ingress controller started running on the nodes in the new VPC the traffic would still be routed to them. Remember, I can’t add the new workers into the same TargetGroup as the existing ones because they are in a different VPC. This created an increase in latency for every request through the ingress controller because the NodePort had to be proxied but it was low enough, and for such a short period of time, that it was considered acceptable. 
After the new workers were created, I added a taint to the nodes in the old AutoScalingGroups so that no new Pods would be scheduled onto them.</p><pre>for node in $(aws ec2 describe-instances --filters &quot;Name=tag:KubernetesCluster,Values=CLUSTER_NAME&quot; &quot;Name=vpc-id,Values=OLD_VPC_ID&quot; | jq -r &#39;.Reservations[].Instances[].PrivateDnsName&#39;); do<br>    kubectl taint nodes &quot;$node&quot; migration=migration:NoSchedule<br>done</pre><p>Now that I had new workers in the new VPC and they were all able to talk to each other in both VPCs, I was ready to cut over DNS and point it to the new LBs. So far, the total migration time <em>has been only ~20 minutes</em> and I’m meeting my goal of zero downtime.</p><h3>Migrating the workloads</h3><p>At this stage, I needed to start shifting the workloads from workers in the old VPC to the workers in the new VPC. I manually evicted the Ingress controller pods which moved them onto workers in the new VPC. This allowed me to switch back to externalTrafficPolicy: Local and get the ingress latency back to normal. A couple minutes later this was done and I could start moving the live workloads. Luckily for me, our Python deployment script allows for cycling nodes based on a label filter. I kicked off the job and it started draining the old worker nodes. As each node was drained the workloads automatically shifted only onto the new worker nodes thanks to the taint I had added. Moving workloads while respecting the PodDisruptionBudgets can be slow and for the live cluster it was expected to take about 4 hours. On the lab cluster this step was done in ~30 minutes.</p><figure><img alt="animated gif of the migration steps" src="https://cdn-images-1.medium.com/max/410/1*ajCbwsG9p_cCg1C_1NUIxA.gif" /></figure><h3>Code cleanup</h3><p>While I still had the control-plane to migrate I wanted to clean up the Terraform code to start removing the dueling modules. 
I flipped the worker-common-migration module to use create_instance_profile = &quot;true&quot; and did the opposite in the original worker-common module. I then moved the resources in the Terraform state from one module to the other.</p><pre>terraform state mv module.worker-common.aws_iam_instance_profile.this module.worker-common-migration.aws_iam_instance_profile.this</pre><pre>terraform state mv module.worker-common.aws_iam_role.readonly module.worker-common-migration.aws_iam_role.readonly</pre><pre>terraform state mv module.worker-common.aws_iam_role.this module.worker-common-migration.aws_iam_role.this</pre><pre>terraform state mv module.worker-common.aws_iam_role_policy.this module.worker-common-migration.aws_iam_role_policy.this</pre><pre># etc, etc</pre><p>I pointed all the worker modules to the new migration module’s output using sed.</p><pre>find . -type f -name &quot;*.tf&quot; -not -path &#39;*/.terraform/*&#39; -exec gsed -i &#39;s/module.worker-common.instance_profile_id/module.worker-common-migration.instance_profile_id/g&#39; {} +</pre><p>I removed the original worker-common.tf and changed the module source that the worker-common-migration.tf was pointing to.</p><pre>module &quot;worker-common-migration&quot; {<br>  source = &quot;../modules/worker-common&quot;<br>  # the rest of the code is the same<br>}</pre><p>Even though the module was now called worker-common-migration the code it is using is now the same as all of the other clusters. The parameters still point to the new VPC but the code used is the same and that’s the important part for future development &amp; maintenance. The next Terraform run removed the old AutoScalingGroups and SecurityGroups as they were no longer needed. Now, all that is left is to do the control-plane.</p><h3>The control-plane</h3><p>With the control-plane things start to get tricky. I can’t have more than one node using the same etcd volume and writing to it at the same time. 
This means that I am going to have to stop one of the nodes, recreate it in the new VPC and then bring it live with the other nodes still running. I can only do one at a time if I don’t want to lose quorum on etcd. Because the control-plane security groups were created inside the same module that creates the control-plane nodes, I wasn’t able to do the same sort of trick as I did with the workers. Instead, I used a variable called extra_security_groups that could be used to attach an extra SecurityGroup to the control-plane nodes. I broke a rule and manually created a SecurityGroup in the existing VPC that had the same rules and manually attached it to each control-plane node. This meant it was now safe for Terraform to delete the original SecurityGroup and recreate it in the new VPC.</p><p>The code calling the module was then updated to point to the new VPC and subnets (e.g., data.aws_subnet.private-migration.*.id). Instead of running a general terraform apply I needed to run each migration step using Terraform’s -target flag for the resources that I wanted to migrate first.</p><pre>terraform apply -target module.control-plane.aws_security_group.control-plane -target module.control-plane.aws_security_group_rule.control-plane-egress -target ... -target ... # etc etc</pre><p>However, I ran into the problem of the LB that runs in front of the control-plane nodes. I needed to be able to balance across 2 different VPCs and that isn’t possible with a TargetGroup. Managing the DNS entries and changing them at precisely the right time during the migration was difficult with our code setup so I decided to break another rule and manually updated the DNS entry. I changed it from a CNAME pointing to the LB to an A record for the first control-plane node in the new VPC. 
This enabled me to keep full uptime on the API calls made from outside the cluster.</p><figure><img alt="animated gif of the control-plane migration steps" src="https://cdn-images-1.medium.com/max/377/1*_2dOPKWHxRbpCnKkJNdCWw.gif" /></figure><p>I removed the extra_security_groups parameter and ran another targeted apply to recreate just a single set of control-plane resources.</p><pre>terraform apply -target module.control-plane.aws_autoscaling_group.this.2 -target module.control-plane.aws_autoscaling_group.etcd.2 -target module.control-plane.aws_launch_template.this.2 -target module.control-plane.data.template_file.user_data.2 # etc etc</pre><p>This left me with one part of the control-plane in the new VPC and the rest in the old. Now that I had part of the control-plane running in the new VPC I could safely have Terraform, through another -target apply command, recreate the LB in the new VPC. It would have just the single apiserver node in it but that’s okay because external calls to the Kubernetes API are pretty low and it could handle the load. The internal calls use the kubernetes.default Service and are unaffected by these changes. Once the LB had been recreated, I was able to switch the DNS back to the configuration that is a CNAME to the LB.</p><p>I ran the terraform apply -target ... -target ... again for the next piece of the control-plane and that was moved as well. Rinse &amp; repeat one more time and the control-plane was running in the new VPC!</p><p>The migration of the control-plane was a lot more manual than I prefer but it got the job done in about 30 minutes. The majority of that time was waiting on the AWS resources to be created and to come online. In general, because of the way we build our AMIs, it takes 4–7 minutes from when a node starts to boot until it becomes ready in the Kubernetes cluster.</p><h3>More cleanup</h3><p>At this stage the cluster was fully migrated but the code was messy. 
I made another pass through the code and got rid of all the data.aws_subnet.private-migration code. The lookups were updated to use only the new VPC and the references were pointed back to the original data.aws_subnet.private. The worker-common-migration name for the module in the app will always be there. Well… unless it starts to bug me too much and I do all the terraform state mv commands to move it but that seems like a lot of risk for no real value.</p><p>However, all the AutoScalingGroups still have names with the -migration suffix and that could be considered confusing if someone was looking at the AWS resources. I set up another set of workers without the asg_suffix and set a taint on the ones with -migration the same way I did during the migration. All new workloads will now go onto these new ASGs and slowly drain off the old over time or whenever the next deployment is released; there’s no benefit to cycling the cluster again now. Once all the workloads are off the -migration tier I’ll remove that code from Terraform as well.</p><h3>Final thoughts</h3><p>In short, this was a huge pain to go through but I’m kind of glad that I had to do it. I’m disappointed that it never went all the way through to production but that’s the way things work sometimes. I think I learned more about the way the code worked than I did writing it in the first place. Sounds strange to say that but it’s true. When I wrote most of the original code I had to think about how things related to each other but in the 2 years since then I’ve never really had to think about it.</p><p>In the future, I’m looking forward to when IPv6 is inside our VPCs and data centers. Assuming you don’t do something … unique … that’ll make IP conflicts a thing of the past. 
I’m not sure when we’ll end up doing this but thankfully Kubernetes has made this possible when the time comes.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=bbccc1a26c9f" width="1" height="1" alt=""><hr><p><a href="https://grepmymind.com/live-migrating-a-kubernetes-cluster-across-vpcs-without-downtime-bbccc1a26c9f">Live migrating a Kubernetes cluster across VPCs without downtime</a> was originally published in <a href="https://grepmymind.com">GrepMyMind</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Mental health & tech]]></title>
            <link>https://grepmymind.com/mental-health-tech-cb88d7e12698?source=rss----8def803f7d93---4</link>
            <guid isPermaLink="false">https://medium.com/p/cb88d7e12698</guid>
            <category><![CDATA[mental-health]]></category>
            <category><![CDATA[technology]]></category>
            <dc:creator><![CDATA[Mike Tougeron]]></dc:creator>
            <pubDate>Sat, 10 Oct 2020 17:32:42 GMT</pubDate>
            <atom:updated>2020-10-10T17:32:41.810Z</atom:updated>
            <content:encoded><![CDATA[<p>For a variety of reasons people in the US don’t like talking about mental health. There’s a fear that there will be negative reactions and impact from talking about it. From the simple “I’m stressed out today” to the complex “I’m feeling super depressed today” it is all something that we don’t talk about. I think that’s a load of BS and I would love to see it changed. Mental health is no different from a broken arm or a twisted knee. It’s all about your body’s overall health and there are ways in which a medical professional can help you deal with it or adapt to it. If there’s no stigma in needing a pill for low blood pressure then there isn’t one for depression.</p><p>Okay, end-rant. :)</p><p>I suffer from depression, bipolar and ADD. My mental health issues were not officially diagnosed until I was in my late 20s but in hindsight they were there all my life. I remember having good days &amp; bad days but I always associated them with being situational and related to something that recently happened. As an adult, I learned that the frequency of good &amp; bad days (or multiple swings in a day) was a sign of being bipolar.</p><p>Frequently when I talk about my mental health in a public forum I get responses like:</p><ul><li>Be careful what you say, it might come back &amp; hurt you.</li><li>Are you sure you should tell people that?</li><li>Just don’t say anything about me when you talk about it.</li></ul><p>But what makes it all worth it when I put my vulnerabilities out there, is when I also hear:</p><ul><li>Thank you, it helped me realize that I could get help too.</li><li>I wanted to let you know that I saw a doctor about it last week because of what you said.</li><li>Can I get your help today?</li></ul><p>It also helps <em>me</em> remember that I’m not alone. It makes me feel good that I might have contributed to someone’s life and possibly even made it better.</p><p>What does this have to do with technology? 
Well, I thought I’d share some of the tech I’ve used over the years to help me manage. It may or may not help others but the more information that’s out there and the more mental health is discussed in general the better off we will all be.</p><p><a href="https://www.rescuetime.com/"><strong>RescueTime</strong></a> (<a href="https://www.rescuetime.com/ref/1175437">referral link</a>): I use RescueTime to help keep track of what I’m spending my time on at work.</p><figure><img alt="An example RescueTime dashboard showing I spent time shopping and on social media" src="https://cdn-images-1.medium.com/max/1011/1*3devOrnrU3icrfh5ccabnA.jpeg" /></figure><p>Am I coding or am I visiting social media? Am I spending my time in meetings this week or did I work on something where I felt a sense of accomplishment? RescueTime’s reports help me get some good visibility into where I’ve been so that I can understand the why of where I’m at.</p><p><a href="https://www.fitbit.com/"><strong>Fitbit</strong></a>: I use Fitbit to help keep track of how I’m sleeping.</p><figure><img alt="My fitbit sleep dashboard" src="https://cdn-images-1.medium.com/max/800/1*FyAY8WKXEQlkW3ATS_hIFQ.jpeg" /></figure><p>A lot of people use Fitbit to track their steps &amp; exercise but it also does a pretty good job of tracking sleep at night. I don’t sleep very well in general but some nights are worse than others. Knowing how well I slept the night before helps me better determine if I’m having a mood swing due to lack of sleep and should take a nap. Am I cranky due to lack of sleep, or is it a fit of depression?</p><p><strong>Alarm Clock</strong>: I have 3 alarms set each day for mental health.</p><figure><img alt="example alarms I set to take my medicine &amp; breaks throughout the day" src="https://cdn-images-1.medium.com/max/496/1*8TT41Wtf6E8QEcxPPsYyDQ.jpeg" /></figure><p>My obsessive nature and ADD make it easy for me to lose track of time. 
Using these basic alarms I remind myself to practice self-care and check in on my current state of well-being.</p><p><strong>Other recommendations</strong>: I haven’t personally used these tools (yet!) but they came highly recommended from people I trust &amp; respect. The tools look useful so I’m sharing them too. The first is <a href="https://mobile.va.gov/app/mindfulness-coach">Mindfulness Coach</a> from the VA and the other is <a href="https://www.talkspace.com/">Talkspace</a>. If you use either of them let me know what you think.</p><p>Hopefully this will help someone feel more comfortable with themselves and feel like they’re not alone. Mental health is important and like anything else it takes care, feeding &amp; attention to maintain a consistent level of health. The more we talk &amp; share our challenges, successes and general state the better the world will be.</p><p>If you ever need to talk or want to know more about my experiences, feel free to reach out to me and I’ll be there to listen.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*_hfydkP_HOVeDmJQrPn46g.jpeg" /></figure><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=cb88d7e12698" width="1" height="1" alt=""><hr><p><a href="https://grepmymind.com/mental-health-tech-cb88d7e12698">Mental health &amp; tech</a> was originally published in <a href="https://grepmymind.com">GrepMyMind</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Initial thoughts on Kubecost]]></title>
            <link>https://grepmymind.com/initial-thoughts-on-kubecost-383d93368e7a?source=rss----8def803f7d93---4</link>
            <guid isPermaLink="false">https://medium.com/p/383d93368e7a</guid>
            <category><![CDATA[cost-savings]]></category>
            <category><![CDATA[kubecost]]></category>
            <category><![CDATA[kubernetes]]></category>
            <dc:creator><![CDATA[Mike Tougeron]]></dc:creator>
            <pubDate>Mon, 05 Oct 2020 15:26:17 GMT</pubDate>
            <atom:updated>2020-10-05T15:26:17.076Z</atom:updated>
            <content:encoded><![CDATA[<p>Recently I had the opportunity to install <a href="https://kubecost.com/">Kubecost</a> on several of the AWS clusters I manage. The TL;DR is that it was a very helpful and useful system. But to be honest, my initial thoughts were leaning towards the negative until I got it all set up. IMO, like a lot of start-up products, the documentation isn’t the greatest. I felt kind of overwhelmed by what needed to be done and the names of the project vs the docs vs the GitHub repo didn’t exactly match up (kubecost vs cost-model vs cost-analyzer).</p><p><strong><em>BUT</em></strong>, and this is pretty huge, the Kubecost team was great to work with. They got on a video call, walked us through what I was doing wrong, and helped bridge the gap around what I didn’t understand. And they have a <a href="http://docs.kubecost.com/support-channels">Slack channel</a> to help as well. Once I understood how the components worked together I was good to go. I sent them my feedback about the documentation and hopefully that’ll help the next person who comes along.</p><p>As part of my setup, I kept Kubecost isolated from the system <a href="https://prometheus.io/">Prometheus</a> data so I had to set up a dedicated Prometheus. This meant that even though Kubecost was set up and running, the data wasn’t very useful until there was a day or two’s worth loaded into the system. I didn’t spend the time to use the exact pricing (<a href="http://docs.kubecost.com/getting-started.html#ri-committed-discount">RIs</a>, <a href="http://docs.kubecost.com/getting-started.html#spot-nodes">spot</a>, <a href="http://docs.kubecost.com/getting-started.html#out-of-cluster">etc</a>); the default pricing model was sufficient for this POC. Once I had the data history the usage patterns instantly became clear. 
I knew within minutes what namespaces I needed to look at in order to cost-optimize.</p><p>At a glance I knew that there was no way that <a href="https://kubernetes.github.io/ingress-nginx/">nginx-ingress</a> should account for 15% of a cluster’s costs.</p><figure><img alt="nginx ingress for public traffic accounts for 15% of cluster costs" src="https://cdn-images-1.medium.com/max/309/1*3T37QQCPRjLmrcP8WfUFfQ.jpeg" /><figcaption>A screenshot of reporting data from the Kubecost dashboard</figcaption></figure><p>When I looked at the deployment I saw it was requesting 10 CPU and 3 pods for an internal cluster that barely gets any traffic. I adjusted the resources requested and the cluster autoscaler quickly reduced the number of workers running. I already had a win and I was just getting started!</p><p>Diving deeper into a different cluster I found that a staging namespace was costing <em>more</em> than the production namespace by more than 3x. Another quick click and I saw that the namespace was using 17TB of disk!</p><figure><img alt="showing storage allocated of 17 terabytes" src="https://cdn-images-1.medium.com/max/305/1*FHLtqbhIQZ5ZgEew1pBrhw.jpeg" /><figcaption>Screenshot from the namespace specific details</figcaption></figure><p>Turns out there was some performance testing done where the number of replicas was increased super high but when the test was stopped the PVCs were overlooked and not cleaned up.</p><p>Some of the applications are pretty dialed-in with resource requests &amp; limits while others are not. Thankfully, Kubecost was able to help with this as well.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/535/1*Ze6znM2VJ2kAdrlkdbhX9Q.jpeg" /></figure><p>Using the namespace details dashboard I was able to see Kubecost’s resource recommendations vs what was set in the manifests. 
In this screenshot I was able to identify that a <a href="https://thanos.io/">Thanos</a> sidecar for <a href="https://prometheus.io/">Prometheus</a> was misconfigured with 30Gi instead of 30Mi of memory. 30Gi of memory for each Prometheus pod across a dozen or more clusters and you’re talking about some real money.</p><p>One of the features I like is the ability to allocate “shared” costs proportionally across all applications. We run an ELK (with fluentd) stack for the container logs as a system service that all teams can use. We wouldn’t be running these Kubernetes-specific ELK stacks otherwise, so we consider them part of the cost of doing business on the platform.</p><figure><img alt="how to share system resources proportionately" src="https://cdn-images-1.medium.com/max/1024/1*MgZxUskCTuVeK_umm5WJ2A.jpeg" /><figcaption>Settings page</figcaption></figure><p>With Kubecost I’m able to distribute the costs of the control-plane and ELK, based on their labels, to each of the applications using the Kubernetes platform. I considered allocating the kube-system namespace the same way but I want to know how much these system components &amp; daemonsets are costing us to run.</p><p>To set up Kubecost, I used <a href="https://helm.sh">Helm</a> and the <a href="https://github.com/kubecost/cost-analyzer-helm-chart">cost-analyzer-helm-chart</a> chart from Kubecost. Because I was planning on running this in several clusters (at least 7) I created a wrapper-chart where I could set custom default values for all my deployments as well as a few custom resources for our setup. I kept the default values for the resource requests &amp; limits but that was a mistake. For my larger clusters the Kubecost pods were constantly being evicted &amp; OOM’d. If you can, deploy it without limits first to see what’s actually being used and then set the requests/limits accordingly. In the wrapper-chart we also create our Ingress and <a href="https://cert-manager.io/docs/">cert-manager certificates</a>. 
We create the Ingress in the wrapper so that we can use a shared <a href="https://helm.sh/docs/topics/library_charts/">library chart</a> we have. Lastly we try to set as many of the kubecostProductConfigs variables as possible; though we found some were not able to be set via the Helm chart. One thing to watch out for with this chart is that the Kubecost team modified the values.yaml for the sub-charts distributed with it.</p><p><strong>The good:</strong></p><ul><li>Easy to use once set up</li><li>Provides quick insights into where you’re spending $$</li><li>Able to distribute shared costs proportionally across applications</li></ul><p><strong>The bad:</strong></p><ul><li>Can be hard to find what you need in the documentation</li></ul><p><strong>Verdict:</strong></p><p>Install it and try it! If your experience is anything like mine, the cost savings will more than make up for the price.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=383d93368e7a" width="1" height="1" alt=""><hr><p><a href="https://grepmymind.com/initial-thoughts-on-kubecost-383d93368e7a">Initial thoughts on Kubecost</a> was originally published in <a href="https://grepmymind.com">GrepMyMind</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>