From the course: Data Pipeline Automation with GitHub Actions Using R and Python

Introduction to GitHub Actions

- [Instructor] In this video, we will review the core functionality and features of GitHub Actions. Let's start by explaining what GitHub Actions is. GitHub Actions, or Actions, is a CI/CD tool for software workflow automation. It is an integrated part of GitHub, and it is available by default on any GitHub repository. A workflow defines a process that we automate on Actions. It can range from a simple bash script with a list of commands to a complex software build.

There are two main types of automation in Actions. The first is a triggered workflow, which defines a job that is set to start when some event takes place. For example, this workflow is triggered to run a unit test whenever a git commit or pull request takes place. The second type is a scheduled workflow, which runs a job on a timer, using a cron schedule. For example, this workflow on the right side is set to run every hour. In this course, we will focus on scheduling jobs with GitHub Actions.

Let's review GitHub Actions' key features. First and foremost, it's fully integrated with GitHub and its features. It supports multiple operating systems, such as Windows, macOS, and Ubuntu. It fully supports deployment with Docker, which is the method we will use in this course. It provides logs, and it also provides a service for storing sensitive data such as API keys and credentials. We will use this service to store our EIA API key. Last but not least, the service has both free and enterprise versions. We will use the free version for the deployment of our data pipeline.

Let's now review the general requirements for setting up a workflow on Actions. First, you need a YAML file. A workflow is set and configured with a YAML file, and we will build our workflows using this method. Actions enables you to select which OS, or OS version, to run your workflow on. Before you get started with defining your environment or Docker container settings, please make sure Actions supports this version.
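As a rough sketch of the two trigger types described above, a workflow file might declare its events as follows. The workflow name, job name, runner label, and test script path are illustrative assumptions and are not taken from the course files. Both trigger types are combined in one file only for brevity; in practice they would usually live in separate workflows.

```yaml
# Hypothetical workflow file, for illustration only
name: example-triggers

on:
  # Triggered workflow: start when an event takes place
  push:
  pull_request:
  # Scheduled workflow: start on a cron schedule (minute 0 of every hour, UTC)
  schedule:
    - cron: "0 * * * *"

jobs:
  unit-test:
    # Placeholder runner label; pick an OS version that Actions currently supports
    runs-on: ubuntu-latest
    steps:
      # Check out the repository so the script below is available
      - uses: actions/checkout@v4
      # Assumed script path
      - name: Run unit tests
        run: bash ./tests/run_tests.sh
```

The `on` key is where the distinction between the two automation types lives: event names such as `push` and `pull_request` make a workflow triggered, while a `schedule` entry with a cron expression makes it scheduled.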
We will use Ubuntu 20.04. The best practice for code deployment in a scheduling system, particularly with Actions, is to use a Docker container. Generally, you can deploy your code without an image by setting up your environment when the job launches, but this is not a recommended practice. You will need to define the actions that you want to deploy. Typically, this is a script or multiple scripts. If your workflow uses credentials or any other sensitive inputs, you should set them as secrets. You can also set environment variables if needed. Last but not least, workflows are stored in the workflows folder under the .github folder. In the next video, we will review the Docker container settings.
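To make the requirements above concrete, here is a minimal sketch of a scheduled workflow that runs a script inside a Docker container and reads an API key from a secret. The file name, image name, script path, secret name, and environment variable are all hypothetical placeholders, not the course's actual settings.

```yaml
# Assumed location: .github/workflows/data_refresh.yml (hypothetical file name)
name: data-refresh

on:
  schedule:
    # Run at minute 0 of every hour (UTC)
    - cron: "0 * * * *"

jobs:
  refresh:
    # Placeholder runner label; confirm that Actions supports the OS version you choose
    runs-on: ubuntu-latest
    # Run the job steps inside a Docker container (hypothetical image name)
    container:
      image: docker.io/example-user/example-pipeline:0.0.1
    steps:
      # Check out the repository so the script below is available
      - uses: actions/checkout@v4
      - name: Run the pipeline script
        # Assumed script path
        run: bash ./scripts/data_refresh.sh
        env:
          # Sensitive input, read from a repository secret (assumed secret name)
          EIA_API_KEY: ${{ secrets.EIA_API_KEY }}
          # Plain environment variable (assumed name and value)
          PIPELINE_MODE: "production"
```

The secret itself is defined in the repository settings rather than in the file, so the key never appears in the workflow or in the logs in plain text.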
