7
\$\begingroup\$

Can you please comment on the script below? It backs up my local password database to a remote repository if a change is detected. It works as intended. I'd like some comments in terms of syntax, security, portability, readability, etc.

#!/bin/bash

# Compares local and remote copies of the keepass db. If there are any diff, the local replaces remote, as local is the
# master.
# KeepassXC tends to make some meta-data changes (DB preferences, last opened group...)
# which will be picked up by this script. Therefore, a sync might happen even if no entry has been
# added/modified/deleted
#
# This script is run periodically by cron (crontab -l to view the schedule). Below shows it runs Mondays at 10am
# 0 10 * * * /home/notfound/bin/backupKeepassdb.sh

# It requires:
# - bash as shell (bash initialises $HOSTANME)
# - ts from moreutils package for timestamps in the logs
#
# It should be placed in the bin directory of the user so that it automatically appears in $PATH
#
# Usage:
#    backupKeepassDB.sh

log () {
   echo $1 | ts '[%F %H:%M:%.S]' >> /home/notfound/Logs/backupkeepassdb.log
}

log_and_mail () {
   log "$2"
   echo "$2" | mailx -s "$HOSTNAME - $(basename "$0") - $1" $notification_recipient
}

log_and_mail_and_exit () {
   log_and_mail "$1" "$2"
   exit
}

clone_remote_repo_or_exit () {
   cd $temp_dir

   export GIT_SSH_COMMAND="SSH_AUTH_SOCK='/run/user/1000/keyring/ssh' ssh -i $repo_identity_file_path -o IdentitiesOnly=yes -F /dev/null"
   git clone [email protected]:notfound/notfound.git &> /dev/null
   if [ "$?" != 0 ]; then
       log_and_mail_and_exit "$email_subject_failure" "Failed to clone remote repository"
   fi
}

check_db_is_readable_or_exit () {
   if [ ! -f "$1" ]; then
       log_and_mail_and_exit "$email_subject_failure" "$1 not found or not readable"
   fi
}

push_to_remote () {
   rm -rf "$remote_keepassdb_path"
   cp "$local_keepassdb_path" "$local_repository_path"
   cd "$local_repository_path"
   git add . &> /dev/null
   git commit -m "Update from $HOSTNAME" &> /dev/null
   git push origin main &> /dev/null
}

temp_dir=`mktemp -d`
local_keepassdb_path=/home/notfound/Documents/Secret/Passwords/KeepassXC/Passwords.kdbx
local_repository_path=$temp_dir/backup
remote_keepassdb_path=$local_repository_path/Passwords.kdbx
[email protected]
repo_identity_file_path=/home/notfound/.ssh/notfoundToGitlab_id_ed25519
email_subject_failure="Failed Password backup"

log "Starting Password db backup"
clone_remote_repo_or_exit
check_db_is_readable_or_exit "$local_keepassdb_path"
check_db_is_readable_or_exit "$remote_keepassdb_path"
remote_db_hash=($(sha256sum $remote_keepassdb_path))
local_db_hash=($(sha256sum $local_keepassdb_path))

if [ "$remote_db_hash" != "$local_db_hash" ]; then
    (push_to_remote &&
      log_and_mail "Successfully Updated Remote Keepass DB" "Local Keepass DB different from Remote. Remote has been updated.") ||
      log_and_mail_and_exit "$email_subject_failure" "Failed to push remote repository"
else
    log "Local Keepass DB and Remote Keepass DB are identical. No update needed"
fi
\$\endgroup\$
2
  • 4
    \$\begingroup\$ Welcome to Code Review! Please don't modify the code in your question once it has been answered. You could post improved code as a new question, as an answer, or as a link to an external site - as described in I improved my code based on the reviews. What next?. I have rolled back the edit, so that it's clear exactly what version has been reviewed. \$\endgroup\$ Commented Dec 2 at 16:51
  • \$\begingroup\$ I have to remind that git is not a backup system \$\endgroup\$ Commented Dec 2 at 23:22

3 Answers 3

10
\$\begingroup\$

quotes

log () {
   echo $1 | ts ...

You want "$1" there, and at some other unquoted places.

shellcheck

You didn't use it to lint this code, and you should.

repeated constant

The /home/notfound directory is important to this script, repeated several times. It might change, for example when another team member does some testing. Consider defining a variable for it, or just use ${HOME}.

Public API

log_and_mail () {
   log "$2"
   ...

Give us a hint, please, about the formal parameters -- a comment, or:

log_and_mail () {
   subj="${1:-email subject}"
   msg="${2:-email message}"
   log "${msg}"
   ...

trivial

log_and_mail_and_exit () {

Certainly this does what it says. But maybe caller could be responsible for invoking exit? It seems like we have very little value-add here.

boolean OR

ifs are nice enough, but consider taking advantage of the OR operator:

    git clone ... || log_...
    ...
    test -f "$1"  || log_...

gratuitous -r

   rm -rf "$remote_keepassdb_path"

Near as I can tell, Passwords.kdbx is a file rather than a folder. So no need to recurse.

The worry is that after maintenance perhaps the variable doesn't quite hold what you intended, and rm -rf runs amok where it shouldn't. I'm just trying to limit the blast radius.

clarity

You have a bunch of ..._path variables, which are a mix of folders and files. Consider using a ..._dir suffix on some of them.

suppressing errors and error messages

   cd "$local_repository_path"
   git add . &> /dev/null

Some folks are fond of set -e to bail on first error, and some folks aren't. We're not using that setting here.

Suppose the cd failed. We can't plausibly continue. Standard idiom would be

   cd "$local_repository_path" || exit 1

Or verbosely log it or whatever. Just don't go on to the git add.

Now suppose the cd worked great, but we're not within a repo. It seems a little dicey to suppress stderr here. Better to stick with just
git add . > /dev/null, and similarly for push and commit. Or filter "expected" diagnostics, e.g.
git add . 2>&1 | egrep -v 'already added' or whatever you anticipate.

BTW, maybe that repo has no branches, and during debugging we never git checkout SHA1 some hash commit? In the FS we cd foo; in version control we might want to git checkout main before add, commit, push.

If multiple hosts push, then maybe do a pull before the push?

local diff testing

Consider consulting git status, and suppressing redundant network push operations when there's nothing new to push.

\$\endgroup\$
2
  • \$\begingroup\$ Thanks for your detailed review. regarding the last one "local diff testing", this is not an option at this stage. \$\endgroup\$ Commented Dec 2 at 16:51
  • 1
    \$\begingroup\$ echo "$1" | ts ... is still unsafe. It is better to use printf %s\\n "$1" | ts ... \$\endgroup\$ Commented Dec 3 at 13:21
7
\$\begingroup\$

Remote commands

Some SSH servers permit the execution of remote shell commands. So, rather than clone the repository locally, you could execute the sha256sum command remotely, provided that the server allows it. This would be a better approach when dealing with somewhat large files.

Keepass files are normally very small, so there is no penalty for doing more backups than necessary.

History

Also, it should be noted that Keepass (or at least KeepassXC, the fork that I use) can keep a history of password changes, and also has a recycle bin for deleted entries. So I find limited value in a git workflow.

Security

Using Gitlab to back up sensitive data is worrying. In theory, the file should be useless in the wrong hands, as long as it is protected by strong password, but it is still highly sensitive.

Personally, I use Rclone for backups, with an appropriate hosting plan. Because there are plenty of other files that should be backed up too.

I suggest that you clean up the temp dir when you are done. The trap command can be used for this purpose. On Linux systems, the files present in the /tmp directory are normally accessible to other users, maybe in read-only mode at least depending on the permissions. So I find it more appropriate to use your home directory as a base directory.

Mail

Crontab can send mail if the MAILTO variable is set, see: https://www.cyberciti.biz/faq/linux-unix-crontab-change-mailto-settings/

shellcheck

I have run shellcheck on the script, there is a lot of useful comments. The first is:

^-- SC1128 (error): The shebang must be on the first line. Delete blanks and move comments.

I would encourage you to use shellcheck to review and improve your scripts, and learn a few things.

Error handling

You should check the exit codes of the commands being invoked, especially git push... to make sure that the process indeed succeeded.

After all, the most critical feature for a backup application is reliability. There could be network issues when the script runs, or other unexpected issues not handled by your script.

Regardless of what your script reports, there is no proof that the backup even took place. The /dev/null redirection may discard useful output, which is essential for troubleshooting unattended scripts.

Ironically, you do some checks when cloning the repo, but the push is more critical.

In any event, your script will probably always return an exit code of 0, because the last executed command succeeds ie log_and_mail_and_exit. So it does not reliably provide us with an exit code we could check at the end of the process.

SSH authentication

I am not the expert here, but I assume that your private key may be protected by a passphrase, otherwise SSH_AUTH_SOCK should not be required.

\$\endgroup\$
6
\$\begingroup\$

Testing $? is a common antipattern, seen here:

   git clone [email protected]:notfound/notfound.git &> /dev/null
   if [ "$?" != 0 ]

That can be replaced by

if ! git clone [email protected]:notfound/notfound.git &> /dev/null

Or even by git clone … ||


There's a severe shortage of error checking. For example:

   cd "$local_repository_path"
   git add . &> /dev/null
   git commit -m "Update from $HOSTNAME" &> /dev/null

If any of these commands fail, does it really make sense to execute the remainder? Consider joining with && and/or adding || exit to specific commands.


Instead of comparing SHA-256 hashes, why not simply compare content with cmp? That's more efficient, and completely eliminates the (admittedly tiny) chance of hash collisions.


Instead of sending errors by email, why not write them to standard error stream? We can then use this script (e.g. from cron) in the usual way and let those environments choose how to deal with the error stream. That way, we do one thing well, rather than adding multiple responsibilities to the code.


I have no comment on the advisability of copying sensitive data to hosts you don't control. But it's not something I would do.

\$\endgroup\$

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.