
This is a follow-up to this question. Most comments have been addressed or were not applicable. Feel free to point out any bad practice. My comments on the previous review:

Command chaining and subprocesses: is it good coding practice to bunch up commands in subshells just to chain them?

echo "$1" | ts is unsafe: Can you please clarify why ?

Git isn't a backup system / the DB should not be in git: if I lose the DB, I lose my backups, because I don't know by heart the FTP credentials needed to get them back (nor the encryption key). Hence I send the DB somewhere it is always accessible. And in case local data corruption is followed by a sync/backup, having it in a VCS guarantees I can recover. Is there a better way to do this?

#!/bin/bash

# Compares local and remote copies of the keepass db. If there are any diff, the local replaces remote, as local is the
# master.
# KeepassXC tends to make some meta-data changes (DB preferences, last opened group...)
# which will be picked up by this script. Therefore, a sync might happen even if no entry has been
# added/modified/deleted
#
# This script is run weekly by anacron (see ~/.anacron/cron.weekly). Output is sent by mail if MAIL_TO is defined
# in anacrontab.

# It requires:
# - ts (apt install moreutils) for timestamps in the logs
#
# It should be placed in the bin directory of the user so that it automatically appears in $PATH
#
# Usage:
#    backupKeepassDB.sh

log () {
   echo "$1" | ts '[%F %H:%M:%.S]' 2>&1 | tee -a "$log_file"
}

clone_remote_repo () {
   ( cd "$temp_dir" 2>> "$log_file" &&
      log "INFO - Cloning remote repo" &&
      git clone [email protected]:notfound/notfound.git 2>> "$log_file" ) ||
      ( log "ERROR - Failed to clone remote repository" && false )
}

update_and_push () {
   ( rm "$remote_keepassdb_file" 2>> "$log_file" &&
      cp "$local_keepassdb_file" "$local_repository_dir" 2>> "$log_file" ) ||
   ( log "ERROR - Failed to remove $remote_keepassdb_file or copy $local_keepassdb_file to $local_repository_dir" && false )

   ( cd "$local_repository_dir" 2>> "$log_file" &&
      git add . 2>> "$log_file" &&
      git commit -m "Update from $HOSTNAME" 2>> "$log_file" &&
      git push origin main 2>> "$log_file" ) ||
   ( log "ERROR - Failed to push remote repository" && false )
}

cleanup() {
    ( [ -d "$temp_dir" ] && rm -rf "$temp_dir" ) || log "WARN - Failed to clean up temp directory"
}

check_files_are_readable () {
   [ -r "$local_keepassdb_file" ] || ( log "ERROR - $local_keepassdb_file not found or not readable" && false )
   [ -r "$remote_keepassdb_file" ] || ( log "ERROR - $remote_keepassdb_file not found or not readable" && false )
}

trap cleanup EXIT
temp_dir=$(mktemp -d)
log_file="$HOME/Logs/backupkeepassdb.log"
local_keepassdb_file="$HOME/Documents/Secret/Passwords/KeepassXC/Passwords.kdbx"
local_repository_dir="$temp_dir/backup"
remote_keepassdb_file="$local_repository_dir/Passwords.kdbx"
repo_identity_file="$HOME/.ssh/notfoundToGitlab_id_ed25519"
export GIT_SSH_COMMAND="SSH_AUTH_SOCK='/run/user/1000/keyring/ssh' ssh -i $repo_identity_file -o IdentitiesOnly=yes -F /dev/null"

log "INFO - Starting Password db backup"
clone_remote_repo || ( echo "ERROR - Clone failed." && exit 1 )
check_files_are_readable || ( log "Either or both DB are not readable" && exit 1 )
cmp "$local_keepassdb_file" "$remote_keepassdb_file" &&
    log "INFO - Local Keepass DB and Remote Keepass DB are identical. No update needed" &&
    exit 0
( log "INFO - DB files are different. Updating remote..." &&
     update_and_push &&
     log "INFO - Remote has been updated." ) || exit 1
\$\endgroup\$
0

2 Answers

Answer 1

  1. That's a LOT of unnecessary subshells! You might want to use more {...}s than (...)s; see https://www.gnu.org/software/bash/manual/html_node/Command-Grouping.html and the sketch at the end of this answer.
  2. The output of log goes to stdout but it should go to stderr instead.
  3. You have at least 3 different ways of reporting error messages, e.g.:
git clone [email protected]:notfound/notfound.git 2>> "$log_file"
log "ERROR - Failed to clone remote repository"
echo "ERROR - Clone failed."

where the first one writes to $log_file without a timestamp, the second one writes to $log_file with a timestamp, and the third one doesn't write to $log_file at all. Come up with one method of error reporting (e.g. calling your log function but making sure its output goes to stderr) and then use it consistently, maybe something like (untested):

log() {
    {
        if (( $# > 0 )); then
            printf '%s\n' "$*"
        else
            cat
        fi
    } |
    ts '[%F %H:%M:%.S]' 2>&1 |
    tee -a "$log_file" >&2
}
...
git clone [email protected]:notfound/notfound.git 2> >(log); wait
log 'ERROR - Failed to clone remote repository'
log 'ERROR - Clone failed.'

The wait after >(log) is necessary because log runs in a subprocess (and, courtesy of the pipes, three more subprocesses run inside log). Without it, the line after the call could execute before log completes, and output lines could appear in the wrong order in the log file.
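To illustrate point 1 concretely, here is a sketch of the clone step using { ... } grouping instead of a subshell (untested; git clone's optional target-directory argument replaces the cd, and "$temp_dir/backup" just mirrors the script's local_repository_dir):

clone_remote_repo () {
    log "INFO - Cloning remote repo"
    # No cd needed: clone straight into the target directory. A cd inside
    # braces would persist into the caller, unlike the subshell version.
    git clone [email protected]:notfound/notfound.git "$temp_dir/backup" 2>> "$log_file" ||
        { log "ERROR - Failed to clone remote repository"; return 1; }
}

Braces run the grouped commands in the current shell, so no extra process is forked; the only syntactic differences are the mandatory space after { and the ; (or newline) before }.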

Answer 2

Overall, I find the process not very efficient, because you clone the whole remote repository just to determine whether the file differs from your local copy. This would not scale well with large files.

You could instead push the file based on a different criterion, for example the last-modified timestamp. Yes, you've written in the comments that the file can change without entries actually being added or removed, but those changes might still be meaningful. The file should be very small anyway, so you don't have to think hard about trade-offs.
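A sketch of such a criterion (untested; the stamp file under ~/.cache is a hypothetical marker, not something the script currently has): only push when the DB has been modified since the last successful backup.

stamp_file="$HOME/.cache/keepass-backup.stamp"   # hypothetical marker file

if [ ! -e "$stamp_file" ] || [ "$local_keepassdb_file" -nt "$stamp_file" ]; then
    log "INFO - DB modified since last backup, pushing..."
    update_and_push && touch "$stamp_file"
else
    log "INFO - DB unchanged since last backup, nothing to do"
fi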

My approach:

  • From time to time, I back up my home directory to an external hard drive, which includes my KeepassXC file (I am the old-school type)
  • I also use Rclone for remote backups, and Rclone can encrypt data on the fly, so in theory the data is useless to anybody who might come into possession of it.

Also, Rclone skips files that are unchanged:

Copy the source to the destination. Does not transfer files that are identical on source and destination, testing by size and modification time or MD5SUM.
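As a rough sketch of that approach (the keepass-crypt: remote is an assumption; a crypt remote would be set up beforehand with rclone config):

# Copy the local DB to an encrypted rclone remote; unchanged files are skipped.
rclone copy "$HOME/Documents/Secret/Passwords/KeepassXC/Passwords.kdbx" \
    keepass-crypt:backups/keepass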

Using a git-based mechanism provides you with a full history, but do you really need that, and for what? A rolling backup should be sufficient, keeping only the most recent versions of the file.
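If plain rolling copies are enough, a minimal sketch (the destination directory and the choice of keeping five copies are assumptions):

backup_dir="$HOME/Backups/keepass"   # hypothetical destination
mkdir -p "$backup_dir"
cp "$local_keepassdb_file" "$backup_dir/Passwords-$(date +%F).kdbx"
# Keep only the five newest dated copies (the generated names contain no spaces).
ls -1t "$backup_dir"/Passwords-*.kdbx | tail -n +6 | xargs -r rm --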

KeePassXC also has a sync mechanism called KeeShare. I don't use it at the moment, but maybe it could be useful in some use cases?

Variables

You have a number of variables (file paths etc.); I would move them to the top of your script, so they are set as early as possible. This can otherwise be a source of nasty bugs: if a variable is not yet initialized when a function needs it, it is treated as empty.
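For example (values taken from the script; readonly is optional but makes accidental reassignment an error):

# Configuration first, before any function is defined or called.
readonly log_file="$HOME/Logs/backupkeepassdb.log"
readonly local_keepassdb_file="$HOME/Documents/Secret/Passwords/KeepassXC/Passwords.kdbx"
readonly repo_identity_file="$HOME/.ssh/notfoundToGitlab_id_ed25519"

temp_dir=$(mktemp -d)                              # runtime value
readonly local_repository_dir="$temp_dir/backup"   # paths derived from it follow
readonly remote_keepassdb_file="$local_repository_dir/Passwords.kdbx"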

Git/ssh

Your custom Git SSH command looks like this:

export GIT_SSH_COMMAND="SSH_AUTH_SOCK='/run/user/1000/keyring/ssh' ssh -i $repo_identity_file -o IdentitiesOnly=yes -F /dev/null"

What man ssh says:

-F configfile Specifies an alternative per-user configuration file. If a configuration file is given on the command line, the system-wide configuration file (/etc/ssh/ssh_config) will be ignored. The default for the per-user configuration file is ~/.ssh/config. If set to “none”, no configuration files will be read.

Accordingly, you could write -F none instead of /dev/null. Or you could simply add an entry for gitlab.com to your SSH config file, specifying which key to use and any other options. If you are thinking about the portability of the script: you still need the appropriate private key anyway, so the script cannot work without certain user-specific prerequisites.
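A sketch of such an entry in ~/.ssh/config (key path taken from the script; adjust as needed):

Host gitlab.com
    IdentityFile ~/.ssh/notfoundToGitlab_id_ed25519
    IdentitiesOnly yes

With that in place, the GIT_SSH_COMMAND export is no longer needed for this host.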

Return codes

Your functions could use the return statement to provide an explicit success/failure code. I noticed that you use false instead; I find that unusual, though I am not the expert.
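For illustration, a sketch of check_files_are_readable using return (untested): return ends the function immediately with an explicit status, whereas in the current version a failing first check does not stop the function, and only the last command's status reaches the caller.

check_files_are_readable () {
    if [ ! -r "$local_keepassdb_file" ]; then
        log "ERROR - $local_keepassdb_file not found or not readable"
        return 1
    fi
    if [ ! -r "$remote_keepassdb_file" ]; then
        log "ERROR - $remote_keepassdb_file not found or not readable"
        return 1
    fi
}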

Logging

You can use the Linux logger command as an alternative. It also lets you attach a level (severity) to each entry, which can then be used for filtering. This can be an interesting option on systems that have a log-monitoring stack in place, e.g. syslog-ng, ELK, or Loki; such systems can trigger alerts when anomalies are found in the logs (because nobody reads them anyway).

If you are writing a systemd service, then you can use journalctl to query your logs.
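A sketch of both (the backupKeepassDB tag is an arbitrary identifier):

# Send a tagged message with an explicit severity to syslog/journald.
logger -t backupKeepassDB -p user.err "Failed to clone remote repository"

# Later, query the journal for that tag on systemd-based systems.
journalctl -t backupKeepassDB --since "1 week ago"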

