Topo workflows are run on an AWS EKS cluster using Argo Workflows. The detailed configuration is available in this repo.
To get set up you need access to the Argo Workflows user role inside the EKS cluster. Contact someone from the Geospatial Data Engineering team to get access; all Imagery maintainers already have access.
If you are creating your own workflow, or are interested in the details of a current workflow, please also read CONFIGURATION.md.
You will need
Ensure you have kubectl aliased to k
alias k=kubectl
To connect to the EKS cluster you need to be logged into AWS:
aws-azure-login
Then set up the cluster. You only need to run this the first time you use the cluster:
aws --region=ap-southeast-2 eks update-kubeconfig --name=Workflows
To validate that the cluster is connected:
k get nodes
NAME STATUS ROLES AGE VERSION
ip-255-100-38-100.ap-southeast-2.compute.internal Ready <none> 7d v1.21.12-eks-5308cf7
ip-255-100-39-100.ap-southeast-2.compute.internal Ready <none> 7d v1.21.12-eks-5308cf7
To make CLI access easier you can set the default namespace to argo:
k config set-context --current --namespace=argo
Once the cluster connection is set up, a job can be submitted with the CLI or accessed via the running argo-server:
argo submit --watch workflows/raster/standardising.yaml
To open the web interface:
# Create a connection to the Argo server
k port-forward deployment/argo-workflows-server 2746:2746
xdg-open http://localhost:2746
In the Workflows page:
SUBMIT NEW WORKFLOW
Edit using full workflow options
UPLOAD FILE - (Locate File -> Open)
+ CREATE
Elasticsearch is an analytics engine that allows us to store, search and analyse AWS logs.
Elasticsearch can be accessed through https://myapplications.microsoft.com/.
Select the workflow data view and set the correct time filter.
All Logs for a Workflow:
kubernetes.labels.workflows.argoproj.io/workflow : "imagery-standardising-v0.2.0-60-9b7dq"
All Logs for a pod:
Click on the pod in the Argo UI and scroll through the summary table to find the pod name.
kubernetes.annotations.workflows.argoproj.io/node-name.keyword : "imagery-standardising-v0.2.0-60-9b7dq.create-config"
List Failed Stac Validation Logs:
kubernetes.labels.workflows.argoproj.io/workflow : "imagery-standardising-v0.2.0-60-9b7dq" and data.valid : False
Find a Basemaps URL:
kubernetes.labels.workflows.argoproj.io/workflow : "imagery-standardising-v0.2.0-60-9b7dq" and data.url : *
or
data.title : "Wellington Urban Aerial Photos (1987-1988) SN8790" and data.url : *
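The search strings above differ only in the workflow name, so they can be generated rather than edited by hand. The following is a minimal sketch; the function names `workflow_logs_query` and `failed_validation_query` are hypothetical helpers, not part of this repo, and they only print the text to paste into the Elasticsearch search bar.

```shell
# Hypothetical helpers (not part of the repo): print the search strings
# shown above for a given workflow name.

# All logs for a workflow.
workflow_logs_query() {
  printf 'kubernetes.labels.workflows.argoproj.io/workflow : "%s"\n' "$1"
}

# Failed STAC validation logs for a workflow.
failed_validation_query() {
  printf '%s and data.valid : False\n' "$(workflow_logs_query "$1")"
}
```

For example, `failed_validation_query imagery-standardising-v0.2.0-60-9b7dq` prints the same filter as the "List Failed Stac Validation Logs" entry.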
The kubernetes.container_hash field, available in Elasticsearch, gives the hash of the container that was used to run the task. It can be used to look up the version in the container registry for further investigation.
All workflow outputs and logs are stored in the artifacts bucket, linz-workflows-scratch, on the li-topo-prod account.
All outputs follow the same naming convention:
s3://linz-workflows-scratch/YYYY-mm/dd-workflow.name/pod.name/
For each pod the logs are saved as a main.log file within the related pod.name prefix.
Unless a different location is specified within the workflow code, output files will be uploaded to the corresponding pod.name prefix.
Note: This bucket has a 90 day expiration lifecycle.
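As an illustration of the naming convention, the sketch below builds the prefix for a pod from a date, a workflow name and a pod name. `scratch_prefix` is a hypothetical helper, not something provided by this repo, and it assumes you already know which date the workflow's outputs were written under.

```shell
# Hypothetical helper: build the scratch bucket prefix for a pod.
# Arguments: date (YYYY-mm-dd), workflow name, pod name.
scratch_prefix() {
  prefix_date="$1"
  prefix_month="${prefix_date%-*}"   # YYYY-mm
  prefix_day="${prefix_date##*-}"    # dd
  printf 's3://linz-workflows-scratch/%s/%s-%s/%s/\n' \
    "$prefix_month" "$prefix_day" "$2" "$3"
}
```

Appending `main.log` to the printed prefix gives the location of that pod's log file, which can then be listed or copied with `aws s3 ls` or `aws s3 cp`.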
kubectl get node -n argo
ip-12-345-67-890.ap-southeast-2.compute.internal Ready <none> 227d v1.30.1-eks-e564799
ip-98-765-43-210.ap-southeast-2.compute.internal Ready <none> 227d v1.30.1-eks-e564799
In the template, use the nodeSelector to specify a node to run in:
- name: my-template
  nodeSelector:
    kubernetes.io/hostname: ip-98-765-43-210.ap-southeast-2.compute.internal
See the workflows/test/sleep.yml workflow for an example.
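To confirm that a pod was scheduled on the selected node, one option is to list the pods running on that node. This is a sketch only: `pods_on_node` is a hypothetical helper, and it assumes the node name shown by `kubectl get node` matches the `kubernetes.io/hostname` value used in the nodeSelector.

```shell
# Hypothetical helper: list pods in the argo namespace scheduled on one node.
# Argument: node name as shown by `kubectl get node`.
pods_on_node() {
  kubectl get pods --namespace=argo --field-selector "spec.nodeName=$1"
}
```

For example, `pods_on_node ip-98-765-43-210.ap-southeast-2.compute.internal` should include the pod created from `my-template` while it is running.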
List pods:
k get pods --namespace=argo
# note: if the default namespace is set to argo, `--namespace=argo` is not required.
In the output, next to the NAME of the pod, the READY column shows how many of the pod's containers are ready out of the total. For example, 1/1 indicates the pod has one Docker container and it is ready.
The output of the following command includes a Containers section. The first line in this section is the container name, for example, argo-server.
k describe pods *pod_name* --namespace=argo
To access a container in a pod run:
k exec --namespace=argo --stdin=true --tty=true *pod_name* -- bash
Once inside the container you can run a number of commands. For example, if troubleshooting network issues, you could run the following:
mtr linz-workflows-scratch.s3.ap-southeast-2.amazonaws.com
mtr sts.ap-southeast-2.amazonaws.com
watch --errexit nslookup linz-workflows-scratch.s3.ap-southeast-2.amazonaws.com
See Concurrency for details on how to set limits on how many workflow instances can be run concurrently.
error: exec plugin: invalid apiVersion "client.authentication.k8s.io/v1alpha1"
Upgrade aws cli to > 2.7.x
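Before upgrading, it can help to check which version is installed. The sketch below parses the first line of `aws --version` output, which is assumed to start with `aws-cli/X.Y.Z`. The helper name `aws_cli_at_least_2_7` is hypothetical, and treating 2.7.0 as the minimum is an assumption about what "> 2.7.x" means here.

```shell
# Hypothetical helper: succeed if the given `aws --version` output reports
# version 2.7 or later (assumed threshold), fail otherwise.
aws_cli_at_least_2_7() {
  cli_version="${1#aws-cli/}"        # drop the "aws-cli/" prefix
  cli_version="${cli_version%% *}"   # keep only "X.Y.Z"
  cli_major="${cli_version%%.*}"
  cli_rest="${cli_version#*.}"
  cli_minor="${cli_rest%%.*}"
  [ "$cli_major" -gt 2 ] || { [ "$cli_major" -eq 2 ] && [ "$cli_minor" -ge 7 ]; }
}
```

A possible use is `aws_cli_at_least_2_7 "$(aws --version 2>&1)" || echo "upgrade the AWS CLI"`.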
Some tasks in the Workflows or WorkflowTemplates run in a container. These containers are built from other repositories, such as https://github.com/linz/topo-imagery, https://github.com/linz/argo-tasks or https://github.com/linz/basemaps.
Different tags are published for each of these containers:
latest
vX.Y.Z
vX.Y
vX
The container versions are managed by a workflow parameter that is set when submitting the workflow. The default value is the latest major version of the container.
Using the major version tag (vX) with imagePullPolicy: Always ensures that a workflow using these containers always pulls the most recent minor and patch release within that major version.
The latest tag should never be used in production as it points to the latest build of the container, which could be an unstable version. We reserve this tag for testing purposes.
The vX.Y.Z, vX.Y and vX tags are intended to be used in production as they are published for each stable release of the container.
:vX.Y will change dynamically as Z is incremented.
:vX will change dynamically as Y and Z are incremented.
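To make the relationship between the tags concrete, this small sketch derives the three version tags that a single stable release would be published under. `container_tags` is a hypothetical helper for illustration only; it does not query any container registry.

```shell
# Hypothetical helper: print the tags that cover a release vX.Y.Z,
# from most specific to least specific.
container_tags() {
  tag_full="$1"                # e.g. v1.2.3
  tag_minor="${tag_full%.*}"   # v1.2
  tag_major="${tag_minor%.*}"  # v1
  printf '%s\n' "$tag_full" "$tag_minor" "$tag_major"
}
```

Running `container_tags v1.2.3` prints `v1.2.3`, `v1.2` and `v1` on separate lines. Only the first stays fixed; the other two move on to later releases.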
For each Workflow and WorkflowTemplate, there is a version_* parameter that specifies the version of the LINZ container to use.
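A version parameter can be overridden at submission time with the `--parameter` option of `argo submit`. The wrapper below is a sketch: `submit_with_version` is a hypothetical helper, and the parameter name passed to it must be one of the version_* parameters actually defined by the workflow you are submitting.

```shell
# Hypothetical helper: submit a workflow with one version_* parameter set.
# Arguments: workflow file, parameter name, container tag.
submit_with_version() {
  argo submit "$1" --parameter "$2=$3"
}
```

For example, `submit_with_version workflows/raster/standardising.yaml version_example v1` pins the container to the `v1` tag, where `version_example` stands in for a real parameter name.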

