How to deploy gRPC service to AWS EKS

Bank Wang — Sat, 25 Dec 2021 16:29:10 GMT

gRPC meets AWS EKS

gRPC plays a major role in high-performance service communication. It runs on top of HTTP/2, and possibly over QUIC (HTTP/3) in the future. We take advantage of bidirectional streaming, e.g., transferring images between a client and an ML server over a single connection.

Deploying a gRPC service on K8s to AWS EKS is still new for the community, so we would like to share the process of deploying gRPC to an EKS cluster.

Prerequisites

The following bits of knowledge are required before deployment:

EKS cluster with eksctl & kubectl
Helm V3
AWS IAM Permissions
Route 53 domain (“test-grpc.com” in this case)
AWS Certificate Manager

Get Started

Create EKS cluster

Create an EKS cluster with your own name and region. This takes approximately 15 minutes.

export AWS_CLUSTER_NAME=grpc-cluster
export AWS_REGION=us-west-2
export K8S_VERSION=1.21

eksctl create cluster \
  --name=${AWS_CLUSTER_NAME} \
  --version=${K8S_VERSION} \
  --managed --nodes=1 \
  --region=${AWS_REGION} \
  --node-type t3.small \
  --node-labels="lifecycle=OnDemand"

Set up AWS Load Balancer Controller (ALB)

Because the installation procedure has not settled yet, please follow the latest guideline from the AWS team to install the AWS Load Balancer Controller on EKS.

Deploy sample gRPC server manifests

Deploy all the manifests from GitHub.

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-namespace.yaml

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-service.yaml
 
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-deployment.yaml

Confirm that all resources were created by checking the READY and STATUS columns.

kubectl get -n grpcserver all

Customize the ingress

Download the ingress manifest.

wget https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/examples/grpc/grpcserver-ingress.yaml

An important step: customize the ingress manifest you just downloaded.

spec > rules > host — Change host from grpcserver.example.com to test-grpc.com in this case.
metadata > annotations — Add your cluster public subnets to alb.ingress.kubernetes.io/subnets key for example:

metadata:
  annotations:
    alb.ingress.kubernetes.io/subnets: subnet-xx,subnet-xx,subnet-xx
    alb.ingress.kubernetes.io/actions.ssl-redirect: …
…
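The subnets annotation takes a single comma-separated string. As a minimal sketch, the helper below joins subnet IDs into that form; `join_subnets` and the subnet IDs are hypothetical names used only for illustration, so substitute the public subnet IDs of your own cluster.

```shell
# Join subnet IDs into the comma-separated value that the
# alb.ingress.kubernetes.io/subnets annotation expects.
# The body runs in a subshell, so changing IFS does not leak out.
join_subnets() (
  IFS=,
  printf '%s\n' "$*"
)

# Hypothetical subnet IDs, for illustration only.
join_subnets subnet-aaa subnet-bbb subnet-ccc
```

The printed value can be pasted after the alb.ingress.kubernetes.io/subnets key in the manifest.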

Deploy the customized ingress.

kubectl apply -f grpcserver-ingress.yaml

Wait a few minutes for provisioning, then check the ALB address with:

kubectl get ingress -n grpcserver grpcserver


# sample output
NAME        CLASS  HOSTS  ADDRESS                                      PORTS  AGE
grpcserver         *      k8s-grpcserv-xx.us-west-2.elb.amazonaws.com  80     2m32s
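If you prefer to capture only the ALB address rather than reading it from the table, kubectl's jsonpath output can extract it. This is a sketch: the `alb_address` function name is ours, and it assumes the ingress reports its load balancer under status.loadBalancer.ingress, which is where Kubernetes publishes it.

```shell
# Print the DNS name that the ALB assigned to an ingress.
# Arguments: namespace, ingress name.
alb_address() {
  kubectl get ingress -n "$1" "$2" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Usage (needs kubectl access to the cluster):
#   ALB_ADDRESS=$(alb_address grpcserver grpcserver)
```

An empty result usually means the ALB is still being provisioned.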

Add A record in Route 53

Important step:

Copy the ALB address (k8s-grpcserv-xxxx) from the previous step.
Add the ALB address to the A record for “test-grpc.com” in Route 53.
Do not choose “dualstack.k8s-grpcserv-xxxx”; enter the plain address from the previous output, for example:

Add ALB address to A record
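The same record can also be created from the command line instead of the console. The sketch below only prints a Route 53 change batch for an alias A record. The function name `alias_change_batch` and the `ZEXAMPLE` zone ID are placeholders, and it assumes you have looked up the ALB's own canonical hosted zone ID (for example with `aws elbv2 describe-load-balancers`).

```shell
# Print a Route 53 change batch that points an alias A record at an ALB.
# Arguments: record name, ALB DNS name, ALB canonical hosted zone ID.
alias_change_batch() {
  cat <<EOF
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "$1",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "$3",
          "DNSName": "$2",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}
EOF
}

# ZEXAMPLE is a placeholder, not a real hosted zone ID.
alias_change_batch test-grpc.com \
  k8s-grpcserv-xx.us-west-2.elb.amazonaws.com ZEXAMPLE
```

Saving the output to a file and passing it to `aws route53 change-resource-record-sets` with `--change-batch file://<file>` and the hosted zone ID of your own domain should apply the record.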

Finally, test the sample gRPC server (you may need to wait for DNS propagation).

docker run --rm -it --env BACKEND=test-grpc.com placeexchange/grpc-demo:latest python greeter_client.py

# sample response
Greeter client received: Hello, you!

The End

If you run into any problems, feel free to leave a comment below. You can also apply this setup to a Spot node group that autoscales from zero to match your workload and reduce cost, as we describe in this post:

How we reduce 60% cost for ML cluster with K8s

About us

We’re a tech startup based in Southeast Asia. We create an AR Cloud Platform using our VPS technologies. Make The Metaverse happen in the real world.

Graffity Technologies

How to deploy gRPC service to AWS EKS was originally published in Graffity on Medium, where people are continuing the conversation by highlighting and responding to this story.

How we reduce 60% cost for ML cluster with K8s

Bank Wang — Mon, 18 Oct 2021 11:33:11 GMT

Intro

Running a GPU cluster for ML jobs with On-Demand instances can burn all your funding, especially for a seed-scale startup like us. So, we need to optimize every dollar we pay but still serve our needs.

In this post, we’ll talk about concepts and guidelines on how we ran the Machine Learning cluster cost-effectively.

Concepts

The concept is to set up K8s with an On-Demand instance for the managed node group and GPU Spot instances (Preemptible VMs on GCP) as worker nodes. Your computing nodes then stay in Spot pools, saving up to 90% of your cost compared to On-Demand.

However, your Spot instances can be terminated at any time. That is where K8s comes in: it automates interruption handling, provisioning, and autoscaling of Spot instances for your workload. Once you complete this setup, you can leave the Ops work to K8s.

At least one On-Demand instance for the managed node group, and GPU Spot instances as worker nodes.

The key component is Cluster Autoscaler, which can be provisioned as a single-pod deployment on an On-Demand instance. It manages scaling activities by changing the Auto Scaling group’s DesiredCapacity and by terminating instances directly.

About Spot Instances (or Preemptible VM)

Spot instances are spare, unused standard VMs suited for stateless, fault-tolerant applications. Compared to On-Demand instances, Spots are usually available at a 60–90% discount. However, if the provider wants to reclaim those resources for other use, these instances can be terminated on very short notice (two minutes on AWS, 30 seconds on GCP).

Read more: AWS EC2 Spot and GCP Preemptible VM

Guidelines

The main goal of this concept is to make sure you reserve enough Spot capacity (Spot Pools) to reduce interruption and decrease provisioning time.

Spot Pools = (Availability Zones) * (Instance Types)

AWS recommends picking instances of the same size for each node group, for example with a 1:4 vCPU-to-memory ratio:

4vCPU / 16GB Node Group : m5.xlarge, m5d.xlarge, m5n.xlarge, m5dn.xlarge
8vCPU / 32GB Node Group : m5.2xlarge, m5d.2xlarge, m5n.2xlarge, m5dn.2xlarge, m5a.2xlarge, m4.2xlarge

But for a GPU cluster, there are not many instance types to shop around for. From our observation, picking instances with the same GPU type works best for us. For example:

NVIDIA T4 Node Group : g4dn.xlarge, g4dn.2xlarge, g4dn.4xlarge, g4dn.8xlarge
NVIDIA V100 Node Group : p3.2xlarge, p3.8xlarge, p3.16xlarge
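To make the Spot Pools formula concrete, here is a small sketch that counts the pools for the T4 node group above. The helper names `spot_pools` and `count_types` are ours, and the three Availability Zones are an assumption; use the number of zones your node group actually spans.

```shell
# Spot Pools = (Availability Zones) * (Instance Types)
spot_pools() {
  echo $(( $1 * $2 ))
}

# Count the arguments passed, used here to count instance types.
count_types() {
  echo $#
}

# The T4 node group listed above; three Availability Zones is an assumption.
T4_TYPES="g4dn.xlarge g4dn.2xlarge g4dn.4xlarge g4dn.8xlarge"
AZ_COUNT=3

# $T4_TYPES is left unquoted on purpose so it splits into one word per type.
spot_pools "$AZ_COUNT" "$(count_types $T4_TYPES)"   # prints 12
```

With fewer instance types available for GPUs, spreading a node group over more Availability Zones is the main way left to raise the pool count.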

For hands-on practice, you can try this concept on your own by following the “Walkthrough” step of the article below.

Building for Cost optimization and Resilience for EKS with Spot Instances | Amazon Web Services

Fault-Tolerant vs. High Availability

Last but not least, you can trade lower cost against higher cluster availability depending on your requirements, which gives you a Fault-Tolerant or a High Availability cluster, respectively.

To tune this trade-off, adjust the three settings that control how many instances a node group reserves: minSize, maxSize, and desiredCapacity.

Example of reserving Spot instances for each node group
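As a rough sketch of how these three settings might be written down, the helper below prints one Spot node group entry in the form eksctl accepts under `managedNodeGroups` in a ClusterConfig file. It assumes the cluster is managed with eksctl. The function name, node group names, and all numbers are hypothetical and are not the values we run in production.

```shell
# Print an eksctl managed node group entry for Spot instances.
# Arguments: name, minSize, desiredCapacity, maxSize,
#            comma-separated instance types.
# Returns 1 unless minSize <= desiredCapacity <= maxSize.
spot_nodegroup() {
  if [ "$2" -gt "$3" ] || [ "$3" -gt "$4" ]; then
    echo "need minSize <= desiredCapacity <= maxSize" >&2
    return 1
  fi
  cat <<EOF
  - name: $1
    spot: true
    instanceTypes: [$5]
    minSize: $2
    desiredCapacity: $3
    maxSize: $4
EOF
}

# Hypothetical values: scale from zero for lower cost ...
spot_nodegroup t4-fault-tolerant 0 0 4 "g4dn.xlarge,g4dn.2xlarge"
# ... or keep two nodes warm for higher availability.
spot_nodegroup t4-high-availability 2 2 4 "g4dn.xlarge,g4dn.2xlarge"
```

A minSize of zero lets the group scale down completely when idle, while a higher minSize keeps capacity reserved at the price of paying for idle nodes.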

Our observation: we found that GPU Spot instances on AWS are not terminated as often as CPU Spots.

This post covers ideas and guidelines only, and is kept brief to stay readable. Feel free to contact me for further information.

About us

We’re a tech startup based in Southeast Asia. We create an AR Cloud Platform using our VPS technologies. Make The Metaverse happen in the real world.

Graffity Technologies

How we reduce 60% cost for ML cluster with K8s was originally published in Graffity on Medium, where people are continuing the conversation by highlighting and responding to this story.
