
Stale commands in htop?

MannDude

Just a dude
vpsBoard Founder
Moderator
First off, pardon my ignorance. I was working on a little project last night and noticed this today.

I made a bash script that runs on a cron schedule, and it works properly and does what it needs to do (as of the last run, anyway). However, when viewing htop, I see a ton of instances of the script in the command column, as well as portions of the script (commands within it). The script hasn't run for hours, yet these are still shown as processes.

Most of these are stale commands, ones that were issued 12+ hours ago. I know this because I can see that the command is either no longer part of the script, or contains something that has been changed since the last run.

1.) Is this normal?
2.) Should I add something to the bash script that clears this after it runs?

For example, right now in htop I see:

2 instances of 'sudo sh scriptname.sh'

3 instances of 'sudo sh scriptname_1.sh' (a revision; the file no longer exists, as it was removed)

7 instances that contain sections of the script/commands

Now, the script was not run by root, but in htop these instances are shown as belonging to root. It was initially run manually by a different user on the system (sudo sh scriptname.sh), and is now run without a password prompt by adding "user  ALL= NOPASSWD: /home/user/scriptname.sh" to the sudoers file. (Which seems to work as of the last cron run.)

So, yeah. Now what? Any tips?
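In case it helps, here's a rough way to list them outside of htop with owner, parent PID, and elapsed time (scriptname.sh is just a stand-in for the real filename):

# List lingering instances of the script; the [s] keeps grep from matching itself
ps -eo pid,ppid,user,etime,args | grep '[s]criptname.sh'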
 

drmike

100% Tier-1 Gogent
No, this isn't normal.

I run a bunch of cron jobs, but am a novice still :)

The biggest issue I have, and it likely relates, is that the underlying script can/will fail. For instance, I have a cron job that records streams. That is accomplished via a popular third-party command-line tool. At times, remote servers will fail massively or be offline, resulting in the script going bonkers. I've found multiple jobs stuck and still running, but doing nothing, days later.

Obviously, the solution is to wrap the script with something else where you can pass a timeout to the job, meaning it gets closed after a reasonable set period of time. If a job takes 30 seconds, that might mean giving it 1 minute and having it auto-terminate.
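A rough sketch of that in the crontab itself, using coreutils' timeout (the schedule and path here are just placeholders, not the actual job):

# Kill the job if it runs longer than 60 seconds; -k 10 follows the initial
# SIGTERM with a SIGKILL 10 seconds later if the process won't exit on its own
*/30 * * * * timeout -k 10 60 /path/to/job.sh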

Everything I do can be handled with such a timeout (different based on each cron job).

The permissions stuff you need to look at more closely (which user is spawning the job, and when/where/if it changes user downstream).
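A quick way to check that, assuming Debian/Ubuntu-style logging (the log paths and script name are placeholders for whatever your distro and job use):

# Which user did cron actually run the job as?
grep CRON /var/log/syslog | grep scriptname
# Did sudo get involved and switch users?
grep sudo /var/log/auth.log | grep scriptname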
 

MannDude

Just a dude
vpsBoard Founder
Moderator
#!/bin/bash
# Date/time stamps; $date names the remote directories, $time is set but unused
date=`date +"%m_%d_%Y"`
time=`date +"%H:%M"`

# Clear out the previous local backups
rm -rf /the/backup/path/DB*
rm -rf /the/backup/path/www*

# Dump and compress the database
mysqldump -hlocalhost -uDBuser -pDBpass DB | gzip -9 > /the/backup/path/DB/website_$(date +"%m_%d-%Y_%H:%M").sql.gz

# Archive the site files
tar -zcf /the/backup/path/www/website_www_$(date +"%m_%d-%Y_%H:%M").tar.gz /home/user/website/public_html

# Create dated directories on the remote box
ssh -p 1234 -i '/home/user/.ssh/id_rsa' remote@server-IP mkdir -p /home/user/backups/website/DB/$date
ssh -p 1234 -i '/home/user/.ssh/id_rsa' remote@server-IP mkdir -p /home/user/backups/website/www/$date

# Push the local backups to the remote box
rsync -auvz -e "ssh -p 1234 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i /home/user/.ssh/id_rsa" /the/backup/path/DB/ remote@server-IP:/home/user/backups/website/DB/$date
rsync -auvz -e "ssh -p 1234 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i /home/user/.ssh/id_rsa" /the/backup/path/www/ remote@server-IP:/home/user/backups/website/www/$date


It is run via cron (sudo crontab -e shows):


0 0-23/2 * * * sudo /home/user/backup.sh
Visudo shows:


#includedir /etc/sudoers.d
user ALL= NOPASSWD: /home/user/backup.sh

So, the script actually works and runs fine; it does what it's intended to do (I'm sure it could be written better, but it is what it is).

It seems like it's not a big deal, but I was curious why they were lingering around and owned by 'root' despite the script never being executed by that user.
 

drmike

100% Tier-1 Gogent
Backup script there looks fine. Pretty straightforward.

What process is left lingering?  rsync?
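If it's not obvious from the htop command column, something like pstree against one of the stale entries should show what's still hanging off of it (rough sketch; grab the PID from htop or ps first):

# Show the process tree under one stale entry, with PIDs and command arguments
pstree -ap <PID-of-stale-entry>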
 