<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Graffity - Medium]]></title>
        <description><![CDATA[We’re a tech startup based in Southeast Asia. We create an AR Cloud Platform using our VPS technologies. Make city-scaled AR happen in the real world. - Medium]]></description>
        <link>https://medium.com/graffity-technologies?source=rss----65ca799aafca---4</link>
        <image>
            <url>https://cdn-images-1.medium.com/proxy/1*TGH72Nnw24QL3iV9IOm4VA.png</url>
            <title>Graffity - Medium</title>
            <link>https://medium.com/graffity-technologies?source=rss----65ca799aafca---4</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Tue, 02 Jun 2026 00:01:38 GMT</lastBuildDate>
        <atom:link href="https://medium.com/feed/graffity-technologies" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[How to deploy gRPC service to AWS EKS]]></title>
            <link>https://medium.com/graffity-technologies/how-to-deploy-grpc-service-to-aws-eks-8fb48145d987?source=rss----65ca799aafca---4</link>
            <guid isPermaLink="false">https://medium.com/p/8fb48145d987</guid>
            <category><![CDATA[grpc]]></category>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[aws]]></category>
            <dc:creator><![CDATA[Bank Wang]]></dc:creator>
            <pubDate>Sat, 25 Dec 2021 16:29:10 GMT</pubDate>
            <atom:updated>2022-08-22T15:24:33.737Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*oHUsF3bp-Nn_Ux6-QDksBA.png" /><figcaption>gRPC meets AWS EKS</figcaption></figure><p><a href="https://grpc.io/about/">gRPC</a> is a high-performance RPC framework. It runs on top of HTTP/2, with <a href="https://grpc.io/blog/grpc-stacks/#wrapped-languages">QUIC or HTTP/3 in the future</a>. We take advantage of bidirectional streaming, e.g., transferring images between a client and an ML server over a single connection.</p><p>Deploying a gRPC service on K8s to AWS EKS is still new for the community, so we would like to share our process for deploying gRPC to an EKS cluster.</p><h3>Prerequisites</h3><p>The following are required before deployment:</p><ul><li>EKS cluster with <a href="https://docs.aws.amazon.com/eks/latest/userguide/getting-started-eksctl.html"><em>eksctl</em></a><em> &amp; </em><a href="https://docs.aws.amazon.com/eks/latest/userguide/install-kubectl.html"><em>kubectl</em></a></li><li>Helm V3</li><li>AWS IAM Permissions</li><li>Route 53 domain (“test-grpc.com” in this case)</li><li>AWS Certificate Manager</li></ul><h3>Get Started</h3><h4>Create EKS cluster</h4><p>Create an EKS cluster with your own name &amp; region. 
This takes approximately 15 minutes.</p><pre>export AWS_CLUSTER_NAME=grpc-cluster<br>export AWS_REGION=us-west-2<br>export K8S_VERSION=1.21</pre><pre>eksctl create cluster \<br>  --name=${AWS_CLUSTER_NAME} \<br>  --version=${K8S_VERSION} \<br>  --managed --nodes=1 \<br>  --region=${AWS_REGION} \<br>  --node-type t3.small \<br>  --node-labels=&quot;lifecycle=OnDemand&quot;</pre><h4>Set up AWS Load Balancer Controller (ALB)</h4><p>Because the installation procedure is still changing, please follow <a href="https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html">this guideline</a> for the latest instructions from the AWS team on installing the AWS Load Balancer Controller on EKS.</p><h4>Deploy sample gRPC server manifests</h4><p>Deploy all the manifests from GitHub.</p><pre>kubectl apply -f <a href="https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-namespace.yaml">https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-namespace.yaml</a> </pre><pre>kubectl apply -f <a href="https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-service.yaml">https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-service.yaml</a><br> <br>kubectl apply -f <a href="https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-deployment.yaml">https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-deployment.yaml</a></pre><p>Confirm that all resources were created by checking the READY and STATUS columns.</p><pre>kubectl get -n grpcserver all</pre><h4>Customize the ingress</h4><p>Download the ingress manifest.</p><pre>wget <a 
href="https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-ingress.yaml">https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-ingress.yaml</a></pre><p><strong>An important step</strong> is to customize the ingress manifest we just downloaded:</p><ol><li>spec &gt; rules &gt; host — Change the host from <em>grpcserver.example.com</em> to <em>test-grpc.com</em> in this case.</li><li>metadata &gt; annotations — Add your cluster’s <strong>public subnets</strong> to the <em>alb.ingress.kubernetes.io/subnets </em>key, for example:</li></ol><pre>metadata:<br>  annotations:<br>    alb.ingress.kubernetes.io/subnets: subnet-xx,subnet-xx,subnet-xx<br>    alb.ingress.kubernetes.io/actions.ssl-redirect: …<br>…</pre><p>Deploy the ingress we just customized.</p><pre>kubectl apply -f grpcserver-ingress.yaml</pre><p>Wait a few minutes for provisioning, then check the ALB address with</p><pre>kubectl get ingress -n grpcserver grpcserver</pre><pre><br># sample output<br>NAME       CLASS  HOSTS  <strong>ADDRESS</strong>                     <br>grpcserver &lt;none&gt; *      k8s-grpcserv-xx.us-west-2.elb.amazonaws.com</pre><pre>PORTS   AGE<br>80      2m32s</pre><h4>Add an A record in Route 53</h4><p><strong>Important step:</strong></p><ol><li>Copy the ALB address (k8s-grpcserv-xxxx) from the previous step.</li><li>Add the ALB address to the <strong>A record</strong> for “test-grpc.com” in Route 53.</li><li>Do not choose “dualstack.k8s-grpcserv-xxxx”; paste the plain address from the previous output instead, for example:</li></ol><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*HdHYIrK1pAv7oSynDOfxWg.png" /><figcaption>Add ALB address to A record</figcaption></figure><p>Finally, test this sample gRPC server (you may need to wait for the DNS record to propagate).</p><pre>docker run --rm -it --env BACKEND=test-grpc.com placeexchange/grpc-demo:latest python greeter_client.py</pre><pre># sample response<br>Greeter client received: Hello, 
you!</pre><h3>The End</h3><p>If you run into any problems, feel free to leave a comment below. You can also apply this concept to a Spot node group that autoscales from zero to match your workload and optimize your cost, as we describe in this post:</p><p><a href="https://medium.com/graffity-technologies/how-we-reduce-60-cost-for-ml-cluster-with-k8s-60317eae5c9b">How we reduce 60% cost for ML cluster with K8s</a></p><h3>About us</h3><p>We’re a tech startup based in Southeast Asia. We create an AR Cloud Platform using our VPS technologies. Make The Metaverse happen in the real world.</p><p><a href="https://graffity.tech">Graffity Technologies</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=8fb48145d987" width="1" height="1" alt=""><hr><p><a href="https://medium.com/graffity-technologies/how-to-deploy-grpc-service-to-aws-eks-8fb48145d987">How to deploy gRPC service to AWS EKS</a> was originally published in <a href="https://medium.com/graffity-technologies">Graffity</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[How we reduce 60% cost for ML cluster with K8s]]></title>
            <link>https://medium.com/graffity-technologies/how-we-reduce-60-cost-for-ml-cluster-with-k8s-60317eae5c9b?source=rss----65ca799aafca---4</link>
            <guid isPermaLink="false">https://medium.com/p/60317eae5c9b</guid>
            <category><![CDATA[aws]]></category>
            <category><![CDATA[kubernetes]]></category>
            <category><![CDATA[kubeflow]]></category>
            <category><![CDATA[machine-learning]]></category>
            <dc:creator><![CDATA[Bank Wang]]></dc:creator>
            <pubDate>Mon, 18 Oct 2021 11:33:11 GMT</pubDate>
            <atom:updated>2022-04-13T17:19:20.642Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*NbhXsGerY0EBitDyPadzRg.png" /></figure><h3><strong>Intro</strong></h3><blockquote>Running a GPU cluster for ML jobs with On-Demand instances can burn all your funding, especially for a seed-scale startup like us. So, we need to optimize every dollar we pay but still serve our needs.</blockquote><p>In this post, we’ll talk about the concepts and guidelines behind how we run our Machine Learning cluster cost-effectively.</p><h3>Concepts</h3><p>The concept is to set up K8s with an On-Demand instance for the <strong>managed node group</strong> and GPU Spot instances (Preemptible VMs on GCP) as <strong>worker nodes</strong>. So, your computing nodes stay in Spot pools, saving up to 90% of your cost compared to On-Demand.</p><p>However, your Spot instances can be terminated at any time. That’s where K8s comes in: it automates interruption handling, provisioning, and autoscaling of Spot instances for your workload. Once you complete this setup, you can leave K8s to automate the Ops work for you.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*89Ssvwd8Uccd3IaLd4WD8g.png" /><figcaption>At least one On-Demand instance for the <strong>managed node group</strong> and GPU Spot instances as <strong>worker nodes</strong>.</figcaption></figure><p>The key component is <strong>Cluster Autoscaler</strong>, which can be provisioned as a single-pod deployment on an On-Demand instance. It manages scaling activities by changing the Auto Scaling group’s DesiredCapacity and directly terminating instances.</p><h4>About Spot Instances (or Preemptible VM)</h4><p>Spot instances are spare, unused standard VMs suited to stateless, fault-tolerant applications. When compared to On-Demand instances, Spots are usually available at a 60–90% discount. 
However, if the provider wants to reclaim those resources for other use, these instances can be terminated on short notice (for example, two minutes on AWS and 30 seconds on GCP).</p><p>Read more: <a href="https://aws.amazon.com/ec2/spot/?cards.sort-by=item.additionalFields.startDateTime&amp;cards.sort-order=asc">AWS EC2 Spot</a> and <a href="https://cloud.google.com/compute/docs/instances/preemptible">GCP Preemptible VM</a></p><h3>Guidelines</h3><p>The main goal of this concept is to make sure you <strong>reserve enough Spot capacity (Spot Pools)</strong> to reduce interruptions and shorten provisioning time.</p><blockquote>Spot Pools = (Availability Zones) * (Instance Types)</blockquote><p>AWS recommends picking instances of the same size for each node group. For example, with a 1:4 vCPU-to-memory ratio:</p><ul><li><strong><em>4vCPU / 16GB Node Group</em></strong> : m5.xlarge, m5d.xlarge, m5n.xlarge, m5dn.xlarge</li><li><strong><em>8vCPU / 32GB Node Group</em></strong> : m5.2xlarge, m5d.2xlarge, m5n.2xlarge, m5dn.2xlarge, m5a.2xlarge, m4.2xlarge</li></ul><p>But for a GPU cluster, there aren’t many instance types to choose from. So, based on our observations, we found that picking instances with the same GPU type works best for us. For example:</p><ul><li><strong><em>NVIDIA T4 Node Group</em></strong> : g4dn.xlarge, g4dn.2xlarge, g4dn.4xlarge, g4dn.8xlarge</li><li><strong><em>NVIDIA V100 Node Group</em></strong> : p3.2xlarge, p3.8xlarge, p3.16xlarge</li></ul><p>For hands-on practice, you can try this concept yourself by following the “<strong>Walkthrough</strong>” step of the article below.</p><p><a href="https://aws.amazon.com/blogs/compute/cost-optimization-and-resilience-eks-with-spot-instances/">Building for Cost optimization and Resilience for EKS with Spot Instances | Amazon Web Services</a></p><h4>Fault-Tolerant vs. 
High Availability</h4><p>Last but not least, you can trade off lower cost against higher cluster availability depending on your requirements, which gives you a Fault-Tolerant or a High Availability cluster, respectively.</p><p>To do so, adjust the three variables that control how many instances a node group reserves: <strong><em>minSize, maxSize, and desiredCapacity.</em></strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*C58VH2pR0O9E_Di5YdOwvQ.png" /><figcaption>Example of reserving Spot instances for each node group</figcaption></figure><p><em>Our observation: We found that GPU Spot instances on AWS are not terminated as often as CPU Spots.</em></p><p>This post covers the ideas and guidelines only briefly to keep it readable, so feel free to contact me for further information.</p><h3>About us</h3><p>We’re a tech startup based in Southeast Asia. We create an AR Cloud Platform using our VPS technologies. Make The Metaverse happen in the real world.</p><p><a href="https://graffity.in">Graffity Technologies</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=60317eae5c9b" width="1" height="1" alt=""><hr><p><a href="https://medium.com/graffity-technologies/how-we-reduce-60-cost-for-ml-cluster-with-k8s-60317eae5c9b">How we reduce 60% cost for ML cluster with K8s</a> was originally published in <a href="https://medium.com/graffity-technologies">Graffity</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>