Tuesday, March 7, 2023

Reading downloaded logs from Quay.io

Quay is the container registry service sponsored by Red Hat, Inc., based on the projectquay.io free/open source project.

It also supports building images. One issue, though, is that build logs are downloaded in a custom JSON format.

A simple example is:

> {"logs":[{"data":{"datetime":"2023-03-07 15:19:36.159268"},"message":"build-scheduled","type":"phase"}]}


In this short post I give you a very simple way to read those logs in the terminal:

> jq -r '.logs[].message' < /tmp/98afc879-8ef1-4cc6-9425-cf5e77712a5f.json
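
If you also want the timestamp next to each message, a slightly richer jq filter works as well (a minimal sketch based on the sample record above):

> jq -r '.logs[] | "\(.data.datetime) \(.message)"' < /tmp/98afc879-8ef1-4cc6-9425-cf5e77712a5f.json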

Monday, April 26, 2021

Rsync between volumes on two different OpenShift clusters

This is a short HOWTO about rsync-ing data between 2 distinct OpenShift clusters.

You always have the option to oc rsync the data from the source OpenShift cluster to your local workstation and then oc rsync from your workstation to the target cluster. But if you have half a terabyte of data, you may not have enough space, or it may take several days because of network bandwidth limitations.

The method I describe below avoids such inefficiencies, and the rsync process is restarted automatically in case some network or system glitch kills it.

It basically works by having:

  • a kubeconfig file with access to the target OpenShift cluster stored inside a secret on the source OpenShift cluster
  • a pod on the target OpenShift cluster with the target volume mounted
  • a pod on the source OpenShift cluster with the source volume and the kubeconfig secret mounted, and an entrypoint running oc rsync

So let's start with generating a proper kubeconfig secret.

$ touch /tmp/kubeconfig
$ chmod 600 /tmp/kubeconfig
$ oc login --config=/tmp/kubeconfig # make sure to use target cluster API endpoint
$ oc project my-target-cluster-namespace --config=/tmp/kubeconfig 
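As an optional sanity check, you can verify the file really talks to the target cluster, for example:
$ oc get pods --config=/tmp/kubeconfig # should list pods in my-target-cluster-namespace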
Note that the commands below will run against the source OpenShift cluster.
$ oc login # use source cluster API endpoint
$ oc create secret generic kubeconfig --from-file=config=/tmp/kubeconfig

I will assume that you already have your target pod running inside the target cluster. Otherwise you can create one similar to the pod in the source cluster below; just use some entrypoint command to keep it running permanently, for example /bin/sleep 1000000000000000000.
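
If you do need to create such a target pod, a minimal sketch could look like the one below (run against the target cluster via the kubeconfig file; the image, PVC name and paths are assumptions, adjust them to your environment):

$ oc create --config=/tmp/kubeconfig -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: target-pod-name
  namespace: my-target-cluster-namespace
spec:
  containers:
    - image: quay.io/openshift/origin-cli:4.6
      name: idle
      # keep the container running forever so oc rsync has a target to copy into
      command: ["/bin/sleep", "1000000000000000000"]
      volumeMounts:
        - mountPath: /path/to/data/dir
          name: target-data-volume
  volumes:
    - name: target-data-volume
      persistentVolumeClaim:
        claimName: target-persistent-volume-claim-name
EOF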

Now all we need to do is run a proper pod in source cluster to do the rsync task. Here is an example pod YAML with comments to make clear what to use in your situation:

apiVersion: v1
kind: Pod
metadata:
  name: rsync-pod
  namespace: my-namespace-on-source-cluster
spec:
  containers:
    # use client version ±1 of target OpenShift cluster version
    - image: quay.io/openshift/origin-cli:4.6
      name: rsync
      command:
      - "oc"
      args:
      - "--namespace=my-target-cluster-namespace"
      - "--kubeconfig=/run/secrets/kube/config"
      # insecure TLS is not recommended but is a quick hack to get you going
      - "--insecure-skip-tls-verify=true"
      - "rsync"
      - "--compress=true"
      - "--progress=true"
      - "--strategy=rsync"
      - "/path/to/data/dir/"
      - "target-pod-name:/path/to/data/dir/"
      volumeMounts:
        - mountPath: /path/to/data/dir
          name: source-data-volume
        - mountPath: /run/secrets/kube
          name: kubeconfig
          readOnly: true
  # restart policy will keep restarting your pod until rsync completes successfully
  restartPolicy: OnFailure
  terminationGracePeriodSeconds: 30
  volumes:
    - name: source-data-volume
      persistentVolumeClaim:
        claimName: source-persistent-volume-claim-name
    - name: kubeconfig
      secret:
        defaultMode: 420
        secretName: kubeconfig
The last command needed is to create this pod inside the source cluster:
$ oc create -f rsync-pod.yaml
Now check what state your pod is in:
$ oc describe pod rsync-pod
If it started properly, then monitor your progress:
$ oc logs -f rsync-pod
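Once rsync completes successfully, the pod should stop being restarted and end up in the Succeeded phase; an easy way to confirm that is:
$ oc get pod rsync-pod -o jsonpath='{.status.phase}{"\n"}'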

Friday, November 23, 2018

Running Logstash container under OpenShift

What is the issue?

The main problem with running arbitrary images under OpenShift is that OpenShift starts containers as a random user. This is done for security reasons (isolation of workloads). A user can be given permission to run `privileged` containers, but this is not recommended if it can be avoided.
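
For example, you can see which UID range a given project assigns to its pods by looking at the namespace annotations (a hedged example, assuming a project called my-project):
$ oc get namespace my-project -o yaml | grep sa.scc
The openshift.io/sa.scc.uid-range annotation in the output shows the range the random user ID is picked from.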

You can check my earlier blog post Creating docker images suitable for OpenShift (ssh-git image HowTo) for more information and a more complicated example.

Logstash official container image

The official Logstash image can be found on Docker Hub and is built from the logstash-docker GitHub project. It is not specifically built to run on OpenShift, but it is still straightforward to run it unmodified. There are only two issues:
  • it tries to run as user 1000 and expects to find the Logstash code in that user's home directory
  • some configuration files lack the permissions needed to be modified by a random user ID (see the quick check below)
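A quick way to see the second issue for yourself, assuming you have docker available locally (the exact listing will vary between Logstash versions):
$ docker run --rm --entrypoint ls logstash:6.5.0 -l /usr/share/logstash/config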

Getting it running

Depending on what you're trying to do, you may approach this somewhat differently. I will give a specific example, mostly retaining the original configuration (beats input and stdout output) but adding a `config` file with the Kubernetes audit setup and disabling Elasticsearch monitoring, as I don't have an Elasticsearch backend. I hope this provides enough of an example so that you can set up your instance the way you desire.

Creating configuration

To store our custom configuration files, we will create a config map with the file content.
$ cat logstash-cfgmap.yml
apiVersion: v1
data:
  logstash-wrapper.sh: |-
      set -x -e
      rm -vf "/usr/share/logstash/config/logstash.yml"
      echo "xpack.monitoring.enabled: false" > "/usr/share/logstash/config/logstash.yml"
      exec /usr/local/bin/docker-entrypoint "$@"
  config: |-
    input{
        http{
            #TODO, figure out a way to use kubeconfig file to authenticate to logstash
            #https://www.elastic.co/guide/en/logstash/current/plugins-inputs-http.html#plugins-inputs-http-ssl
            port=>8888
            host=>"0.0.0.0"
        }
    }
    filter{
        split{
            # Webhook audit backend sends several events together with EventList
            # split each event here.
            field=>[items]
            # We only need event subelement, remove others.
            remove_field=>[headers, metadata, apiVersion, "@timestamp", kind, "@version", host]
        }
        mutate{
            rename => {items=>event}
        }
    }
    output{
        file{
            # Audit events from different users will be saved into different files.
            path=>"/var/log/kube-audit-%{[event][user][username]}/audit"
        }
    }
kind: ConfigMap
metadata:
  name: logstash
$ oc create -f logstash-cfgmap.yml
configmap/logstash created

With the above config map we define two files that will be available inside the container.
  • logstash-wrapper.sh - we need this to run a few custom commands before delegating back to the image's original entry point: namely, to replace the original `logstash.yml`, which lacks group write permissions, and to disable Elasticsearch monitoring, which is enabled by default. The write permissions are needed in case the Logstash image startup script notices environment variables that need to be converted into configuration entries and written into that file. See env2yaml.go and docker-config docs.
  • config - this file contains the Logstash pipeline configuration and is a copy of what I presently see in the Kubernetes auditing docs.
Note: at this step you can create the full Logstash configuration inside the config map, together with `logstash.yml`, `log4j2.properties`, `pipelines.yml`, etc. If you choose to do so, the default configuration files from the official image can be ignored.

Creating deployment config

$ oc run logstash  --image=logstash:6.5.0 --env=LOGSTASH_HOME\=/usr/share/logstash --command=true bash -- /etc/logstash/logstash-wrapper.sh -f /etc/logstash/config
deploymentconfig.apps.openshift.io/logstash created

A few things to explain:
  • we are setting the LOGSTASH_HOME environment variable to `/usr/share/logstash` because we are running as a random user, thus the user's home directory will not work
  • we override the container start command with our wrapper script
    • we add `-f /etc/logstash/config` to point at our custom config
    • in case we wanted to put all our configuration in the config map, we could instead set `--path.settings /etc/logstash/` (see the sketch after this list)
    • once pull/113 is merged, the custom startup script wrapper will not be needed, but we may still want to provide additional arguments like `-f` and `--path.settings`
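For example, the config-map-only variant would swap the final arguments roughly like this (a sketch under the same assumptions as the command above, not a tested recipe):
$ oc run logstash --image=logstash:6.5.0 --env=LOGSTASH_HOME=/usr/share/logstash --command=true bash -- /etc/logstash/logstash-wrapper.sh --path.settings /etc/logstash/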
Further, we need to make sure our custom configuration is mounted under `/etc/logstash`:
$ oc set volume --add=true --configmap-name=logstash --mount-path=/etc/logstash dc/logstash
deploymentconfig.apps.openshift.io/logstash volume updated

Finally, because our custom config selects /var/log for writing logs, we need to mount a volume at that path.
$ oc set volume --add=true --mount-path=/var/log dc/logstash

What we did is create an emptyDir volume that will go away when the pod dies. If you want to persist these logs, then a Persistent Volume needs to be used instead.
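
If you do want persistence, instead of the emptyDir command above you could attach an existing claim, roughly like this (logstash-logs is just an assumed PVC name):
$ oc set volume dc/logstash --add --type=persistentVolumeClaim --claim-name=logstash-logs --mount-path=/var/log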

Exposing logstash service to the world

First we need to create a service that will allow other project pods and Kubernetes to reach Logstash.
$ oc expose dc logstash --port=8888
service/logstash exposed
Port 8888 is what we set as the HTTP endpoint in `config`. If you expose other ports, then you'd have to create one service for each port you care about.
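For example, if you also kept the beats input enabled (default port 5044), a second service could be created like this (a hedged example):
$ oc expose dc logstash --port=5044 --name=logstash-beats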

We can easily expose HTTP endpoints to the Internet so that we can collect logs from services external to the OpenShift environment. We can also expose non-HTTP endpoints to the Internet with the node port service type, but there are more limitations. Alternatively, on OpenShift 4.x the Ingress Controller can be used to expose non-HTTP endpoints. Below you can see how to do it with HTTP traffic.
$ oc expose service logstash --name=logstash-http-input
route.route.openshift.io/logstash-http-input exposed
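To quickly verify the route works end to end, you can post a test event to it (a hedged example; the http input should answer with HTTP 200):
$ curl -i "http://$(oc get route logstash-http-input -o jsonpath='{.spec.host}')/" -H 'Content-Type: application/json' -d '{"test": "event"}'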

Important: Only expose secured endpoints to the Internet! In the above example the endpoint is insecure and no authentication is required. Thus somebody can DoS your Logstash service easily.

That's all.