I have been using a DNS-321 to pull data for a backup of my work system for over a year now. Everything was working fine. In the last month or two the operation stopped working from crontab.
Here is my crontab -l:
*/10 * * * * /usr/sbin/offl_chk two &
3 * * * * /usr/sbin/sntp -r -P no us.pool.ntp.org &
30 19 * * 0-5 /ffp/bin/dailybackup.sh
0 19 * * 0 /ffp/bin/rsnapshot weekly
30 16 1 * * /ffp/bin/rsnapshot monthly
*/10 * * * * /usr/sbin/ddns-start&
Now the unit runs dailybackup.sh every night, but I'm getting "Error: Returned 12 while processing..."
If I execute the same command when logged in the operation completes successfully.
As a side note I also noticed that my clock is drifting, which it shouldn't be if I run sntp every hour. If I execute the sntp command then it updates the clock properly.
I can't figure out why it would work under a login, but not automatically from crontab.
Can anyone suggest a solution?
bozob
Offline
That's almost always caused by a difference in the PATH variable used in your login shell compared to the cron's process.
If I put in my crontab:
<time spec> env > /tmp/envvars.txt
I get:
USER=root
HOME=/home/root
TERM=vt102
PATH=/usr/bin:/bin:/usr/sbin:/sbin
SHELL=/bin/sh
PWD=/
But the path for my login shell has
PATH=/ffp/sbin:/usr/sbin:/sbin:/ffp/bin:/usr/bin:/bin
This PATH got set by the /ffp/etc/profile script.
The cron process doesn't execute /ffp/etc/profile.
I am still running the crond from /sbin, not from /ffp/sbin. I don't know if /ffp/sbin/crond runs /ffp/etc/profile.
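A quick way to compare the two dumps is sketched below. It assumes the login environment was saved to a file as well (the name /tmp/login_env.txt is made up for illustration), next to the /tmp/envvars.txt written by cron. env_diff is a hypothetical helper, not something that ships with ffp.

```shell
# env_diff LOGIN_DUMP CRON_DUMP
# Print the names of variables whose values differ between two `env` dumps,
# or that appear in only one of them. Both files hold NAME=value lines.
env_diff() {
    awk -F= '
        NR == FNR { login[$1] = $0; next }   # first file: index lines by name
                  { cron[$1]  = $0 }         # second file
        END {
            for (name in login)
                if (login[name] != cron[name]) print name
            for (name in cron)
                if (!(name in login)) print name
        }' "$1" "$2" | sort
}

# Example (file names are assumptions):
#   env > /tmp/login_env.txt
#   env_diff /tmp/login_env.txt /tmp/envvars.txt
```

Anything it lists (PATH and SHELL are the usual suspects) is a candidate for why a script behaves differently under cron.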
There are various ways to fix this:
1) Fully qualify the commands you use in your dailybackup.sh script. For example, use /ffp/bin/rsync instead of rsync.
2) Try putting a PATH=<path> line at the top of your crontab. The man page for crontab has the details.
3) Try putting a PATH=<path> line in your dailybackup.sh file
The choice is one of style and preference. In my case, I only have one such script in my crontab, so I just used option (1).
However, I cannot explain why this all-of-a-sudden stopped working. Something must have changed. Nor do I have any idea about sntp.
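For option (3), a minimal sketch of what the top of a script like dailybackup.sh could look like. The PATH value is simply the login PATH quoted above; resolve is a hypothetical helper added only to show where a command is being picked up from.

```shell
#!/bin/sh
# Pin PATH at the top of the script so cron and a login shell resolve
# commands identically. The value is the login PATH quoted above.
PATH=/ffp/sbin:/usr/sbin:/sbin:/ffp/bin:/usr/bin:/bin
export PATH

# Hypothetical helper: report where a command resolves from, which is
# handy to log once at the start of a cron run.
resolve() {
    command -v "$1" || echo "$1: not found in $PATH"
}

resolve sh
```

Logging the output of resolve for rsync and ssh from both a cron run and a login run shows quickly whether the two are using the same binaries.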
Offline
"That's almost always caused by a difference in the PATH variable used in your login shell compared to the cron's process."
I don't think that's the problem.
This morning I reviewed last night's results and it ran 4 out of 9 directories (each is a separate rsync command). Those were the last 4 in rsnapshot.conf, while the first 5 generated an error 12. Since these are run from rsnapshot, they are run consecutively using identical commands in an identical environment.
This only seems to happen when running from crontab and hasn't yet happened when I manually execute the rsnapshot script through the "dailybackup.sh" script.
I also reviewed the basic logs from the remote server from which the files are getting pulled. It will show a login for every successful rsync login. In this case it doesn't show any logins for the first 5 rsync attempts, but does show the logins for the subsequent 4 that succeeded.
Could this be a ssh problem of some sort? (rsync is running through an ssh connection established with remote keys.)
Last edited by bozob (2010-03-08 18:21:12)
Offline
Thought 1:
One thing I do in my /ffp/etc/fun_plug.local file is to copy my ssh-related files into /home/root/.ssh/. These files include authorized_keys, id_rsa, and id_rsa.pub. Remember that /home/root/.ssh/ goes away on every reboot and so I use the fun_plug.local script to put them back on each reboot.
I believe that this will allow the crontab task to find the keys.
If you are using some way other than placing keys in $HOME/.ssh/ to make ssh locate your keyfiles in your login shell, then the cron task probably won't find the keys. It is possible that you had copied the key files to /home/root/.ssh/ at some point and things worked fine until the next reboot.
It is also possible that the key files that the crontab is accessing are stale. For example, if the machine you are connecting to changed IP addresses or regenerated its keys, you would need to update authorized_keys. Your login shell may be using the correct version, and the crontab might be using old ones.
It also seems that you might be able to turn on more verbose logging in rsnapshot. Try changing the "cmd_ssh" to include turning on debug/verbose, etc.
Thought 2:
If you installed rsnapshot from optware, then you may have picked up ssh from optware as well via a dependency, leaving you with two ssh installations on your system. If you have the keys set up so that they work when using ONE of the ssh installations, you may not have it set up for the other. Again, depending on the PATH in effect at the time, your login shell might be using a different ssh when running rsnapshot/rsync as compared to running it from the crontab shell. IOW, you may be running ffp ssh in your login shell and optware ssh in crontab or vice-versa. Depending on how you set up the keys, this could make a difference.
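One way to check for a second ssh is to look in every directory either environment might search. The sketch below assumes optware lives under /opt/bin, which may not match your install; find_all is a hypothetical helper name.

```shell
# find_all NAME DIRLIST
# List every executable called NAME found in the colon-separated DIRLIST,
# in search order. The first hit is the one a shell with that PATH would run.
find_all() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -x "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

# Login PATH from this thread plus /opt/bin (an assumption for optware):
find_all ssh "/ffp/sbin:/usr/sbin:/sbin:/ffp/bin:/usr/bin:/bin:/opt/bin"
```

If this prints more than one line, the login shell and the cron job may well be running different ssh binaries.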
Related Thought, but not the answer:
Hmm, I had a very similar problem recently.
I use rsnapshot also to rsync the contents of a remote account to my DNS-323. That part of the rsnapshot also quit working all of a sudden. Turns out that the admin of the remote account makes the password expire on some periodic basis, blocking logins until the password is changed. This even keeps ssh logins using RSA keypairs from succeeding. But this doesn't sound like your case, since you can run it from the login shell. But I thought I'd toss the idea in there.
Offline
karlrado wrote:
Thought 1:
Thought 2:
If you installed rsnapshot from optware, then you may have picked up ssh from optware as well via a dependency, leaving you with two ssh installations on your system. If you have the keys set up so that they work when using ONE of the ssh installations, you may not have it set up for the other. Again, depending on the PATH in effect at the time, your login shell might be using a different ssh when running rsnapshot/rsync as compared to running it from the crontab shell. IOW, you may be running ffp ssh in your login shell and optware ssh in crontab or vice-versa. Depending on how you set up the keys, this could make a difference.
This may be the problem. Last night I remembered that I had to reset the unit and reload fun_plug onto it in order to regain root access. So I loaded fun_plug after I had loaded optware. So it could be that the system is getting confused.
How do I correct this? Do I have to reload optware?
bozob
Offline
You probably won't need to reinstall optware. But if you had a /ffp/etc/fun_plug.local file, you may want to see if that is still intact after the ffp reinstall.
One of the things I have in my fun_plug.local file is:
# Copy SSH keyfiles for passwordless login
mkdir /home/root/.ssh
cp /opt/etc/openssh/authorized_keys /home/root/.ssh
cp /opt/etc/openssh/ssh_host_rsa_key /home/root/.ssh/id_rsa
cp /opt/etc/openssh/ssh_host_rsa_key.pub /home/root/.ssh/id_rsa.pub
I chose to use the optware ssh, so I am getting the keys from there. I suppose another option is to copy them to /etc/ssh, which is where they are in debian, but I did this instead. And it seems to let both my login shell and the cron task find the keys.
Are you using ssh-agent? If you are using ssh-agent to load keys in your login shell, that is something that the cron task is not doing. Basically, how are you getting your keys in the login shell? Where are they stored, etc?
Can you post your script, or at least part of it? Can you dump your env from both your login shell and the cron shell?
Last edited by karlrado (2010-03-09 21:28:03)
Offline
karlrado wrote:
Can you post your script, or at least part of it? Can you dump your env from both your login shell and the cron shell?
I did some poking around and it seems that the problem may be with the environment variables.
Here is ENV from the root login:
INPUTRC=/ffp/etc/inputrc
MAIL=/var/mail/root
USER=root
OLDPWD=/ffp
LESS=-M
HOME=/mnt/HD_a2/root
SSH_TTY=/dev/pts/0
PS1=\u@\h:\w\$
PS2=>
LOGNAME=root
TERM=xterm
PATH=/ffp/sbin:/usr/sbin:/sbin:/ffp/bin:/usr/bin:/bin
LANG=en_US
SHELL=/ffp/bin/sh
PWD=/mnt/HD_a2/root
Here is ENV from crontab:
USER=root
HOME=/mnt/HD_a2/root
TERM=vt102
PATH=/usr/bin:/bin:/usr/sbin:/sbin
SHELL=/bin/sh
PWD=/
I immediately notice a difference in the PATH, PWD, and SHELL.
PATH seems to me to be the main area to look at. How do I change the PATH for crontab?
bozob
Last edited by bozob (2010-03-10 18:06:54)
Offline
Wow, this was what I said back on 3/4 and you said that wasn't the cause in the 3/8 post on this thread. I also suggested several ways to fix it.
One way I suggested was to fix your nightly backup script to use full path names for all commands. This is also what I have done with many of my cron scripts.
I found this on the nets (http://lists.freebsd.org/pipermail/free … 13641.html):
The standard solution would be to use a full path to the command in the script (/usr/sbin/chown). If it's used in multiple locations, defining it as a shell variable makes it maintainable:
CHOWN="/usr/sbin/chown"
${CHOWN} somefile
...
${CHOWN} anotherfile
In this case, the 'chown' command was not on root's crontab PATH, so the script failed with lines that look like
chown somefile
But this script worked from the user's shell because chown was in the user's PATH.
Changing the script to
/usr/sbin/chown somefile
fixes it. Or use the technique above.
There are other approaches such as changing the "command" in the crontab to set the PATH before the actual command. Also some versions of cron allow you to put a "PATH=somepath" line in the crontab itself, but my quick test with the 323 cron suggests that is not supported.
You can also set the PATH in the top of your script if you do not want to fully qualify the path for every command in the script.
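The "set the PATH before the actual command" approach could look roughly like the sketch below. The crontab line reuses the schedule and script name from the first post; the runnable part underneath only demonstrates that the prefix applies to that one command. Treat it as an illustration rather than something tested on the 323's crond.

```shell
# Crontab form (schedule and script name taken from the first post):
#   30 19 * * 0-5 PATH=/ffp/sbin:/ffp/bin:/usr/sbin:/sbin:/usr/bin:/bin /ffp/bin/dailybackup.sh
#
# cron hands the command field to a shell (sh -c), so the assignment prefix
# is ordinary shell syntax. The lines below show that it changes PATH for
# that one command only, not for the shell that launched it.
cron_path=/usr/bin:/bin:/usr/sbin:/sbin      # PATH reported by the cron env dump
seen=$(PATH=/ffp/bin:$cron_path sh -c 'echo "$PATH"')
echo "PATH inside the command: $seen"
```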
Offline
I've done some testing and narrowed down the problem. It seems that while crontab is executing the shell script, rsnapshot is not working properly.
Executing rsnapshot -t daily generates an rsync command that looks like this:
/ffp/bin/rsync -a --stats --delete --numeric-ids --relative \
--delete-excluded --modify-window=3 --exclude=/data/Trash/ \
--exclude=/data/USERS/TAMI/ --exclude=/jertam/Trash/ \
--exclude=/jertam/Joe/ --rsh=/ffp/bin/ssh -p YYYY \
root@xx.xx.xx.xx:/etc /mnt/HD_a2/.snapshots/daily.0/mirrotek/
The result is :
Unexpected remote arg: root@xx.xx.xx.xx:/etc
rsync error: syntax or usage error (code 1) at main.c(1202) [sender=3.0.5]
If I put quotes around the --rsh argument ("/ffp/bin/ssh -p YYYY") then it works. The new command looks like this:
/ffp/bin/rsync -a --stats --delete --numeric-ids --relative \
--delete-excluded --modify-window=3 --exclude=/data/Trash/ \
--exclude=/data/USERS/TAMI/ --exclude=/jertam/Trash/ \
--exclude=/jertam/Joe/ --rsh="/ffp/bin/ssh -p YYYY" \
root@xx.xx.xx.xx:/etc /mnt/HD_a2/.snapshots/daily.0/mirrotek/
Reading through some posts regarding this error it seems that there is a difference in how the shell interprets the output from rsync.
Now neither the original shell script nor this command works from within the login. Something has changed both the way crontab handled the rsync command and now how the root login handles the command.
I'm not sure what changed so I don't know how to change it back.
For reference here is my current env:
INPUTRC=/ffp/etc/inputrc
MAIL=/var/mail/root
USER=root
OLDPWD=/ffp/bin
LESS=-M
HOME=/mnt/HD_a2/root
SSH_TTY=/dev/pts/0
PS1=\u@\h:\w\$
PS2=>
LOGNAME=root
TERM=xterm
PATH=/ffp/sbin:/usr/sbin:/sbin:/ffp/bin:/usr/bin:/bin
LANG=en_US
SHELL=/ffp/bin/sh
PWD=/ffp/var/log
bozob
Offline
In your script that launches rsnapshot, do you have a line like:
#!/bin/sh
for the first line?
I am guessing not, since the behavior is different depending on the env from which the script was run. When crontab runs it, /bin/sh is used to interpret scripts. In your login shell, it is /ffp/bin/sh.
Since your script works better from the crontab env, I would suggest trying #!/bin/sh at the top of your script. Hopefully it will then run from your login session.
Yes, you needed the "" around the rsh parms, else rsync would try to interpret the -p YYYY as rsync options. It probably took the -p as an option (preserve permissions) and then thought YYYY was the Source.
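The splitting is easy to see with a small stand-in for rsync that just prints what it receives. show_args is a made-up helper; the arguments are the ones from the command posted earlier in the thread.

```shell
# Print each argument on its own line, bracketed, as the program receives it.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Unquoted: the shell splits on spaces, so -p and YYYY become separate arguments.
show_args --rsh=/ffp/bin/ssh -p YYYY root@xx.xx.xx.xx:/etc

# Quoted: the whole ssh command line stays attached to --rsh as one argument.
show_args --rsh="/ffp/bin/ssh -p YYYY" root@xx.xx.xx.xx:/etc
```

The first call shows four arguments, the second shows two, which is why rsync only sees the intended ssh command when the quotes are present.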
Reading through some posts regarding this error it seems that there is a difference in how the shell interprets the output from rsync.
I think you mean how the shell interprets rsync commands and parms or options. And this can be true.
Now neither the original shell script nor this command works from within the login. Something has changed both the way crontab handled the rsync command and now how the root login handles the command.
If adding #!/bin/sh does not work, what was the nature of this failure?
Offline
karlrado wrote:
In your script that launches rsnapshot, do you have a line like:
#!/bin/sh
for the first line?
I am guessing not, since the behavior is different depending on the env from which the script was run. When crontab runs it, /bin/sh is used to interpret scripts. In your login shell, it is /ffp/bin/sh.
It is #!/bin/sh. I've tried both this and #!/ffp/bin/sh and neither interprets the rsnapshot-generated command correctly without the quotes even though this was accepted by the login just a week ago.
Yes, you needed the "" around the rsh parms, else rsync would try to interpret the -p YYYY as rsync options. It probably took the -p as an option (preserve permissions) and then thought YYYY was the Source.
The quotes aren't supposed to be around the command, at least according to the posts I've read regarding rsnapshot and rsync. The problem does seem to be shell related.
Either way I cannot get rsnapshot to generate the command with the quotes.
Reading through some posts regarding this error it seems that there is a difference in how the shell interprets the output from rsync.
I think you mean how the shell interprets rsync commands and parms or options. And this can be true.
Now neither the original shell script nor this command works from within the login. Something has changed both the way crontab handled the rsync command and now how the root login handles the command.
If adding #!/bin/sh does not work, what was the nature of this failure?
I get the following error from rsync:
Unexpected remote arg: root@75.127.196.251:/etc
rsync error: syntax or usage error (code 1) at main.c(1202) [sender=3.0.5]
I don't know why this was working for over a year and isn't working now.
Offline
I see.
In your rsnapshot.conf you probably have:
cmd_ssh /ffp/bin/ssh -p YYYY
Perhaps you should have
cmd_ssh /ffp/bin/ssh
ssh_args -p YYYY
(Make sure to use a tab between ssh_args and the -p.)
If ssh_args is not specified, rsnapshot probably just emits (in your case)
--rsh=/ffp/bin/ssh -p YYYY
If they are specified, I bet it will emit:
--rsh="/ffp/bin/ssh -p YYYY"
I just tried this on my installation and it did exactly what I said.
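Since the tab requirement is easy to get wrong in an editor, here is a rough check for it. check_tabs is a hypothetical helper, and the pattern assumes parameter names are lowercase letters and underscores, as with cmd_ssh and ssh_args.

```shell
# check_tabs FILE
# List lines (with line numbers) where a parameter name is followed by a
# space instead of a tab. Comment lines start with "#" and are not matched.
check_tabs() {
    grep -nE '^[a-z_]+ ' "$1"
}

# Example (config path is an assumption):
#   check_tabs /ffp/etc/rsnapshot.conf
```

No output means no parameter name is followed by a space. Running rsnapshot configtest afterwards is a reasonable second check.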
I can't explain why it stopped working all of a sudden for you. Something must have changed.
Last edited by karlrado (2010-03-19 00:23:25)
Offline
BTW, I am using rsnapshot 1.3.1 from optware.
Offline
karlrado wrote:
I see.
In your rsnapshot.conf you probably have:
cmd_ssh /ffp/bin/ssh -p YYYY
Perhaps you should have
cmd_ssh /ffp/bin/ssh
ssh_args -p YYYY
(Make sure to use a tab between ssh_args and the -p.)
That is what I've had up until now.
If ssh_args is not specified, rsnapshot probably just emits (in your case)
--rsh=/ffp/bin/ssh -p YYYY
If I don't specify the '-p YYYY' in the arguments and try to do it in cmd_ssh then rsnapshot spits out an error.
If they are specified, I bet it will emit:
--rsh="/ffp/bin/ssh -p YYYY"
I just tried this on my installation and it did exactly what I said.
Not happening for me, but I just checked and I have 1.3.0, and I remember reading in one of the rsnapshot posts that this was an error that could occur in 1.3.0 and was corrected in 1.3.1.
I now realize that I didn't install Optware and just installed the extra packages from Funz. I'll go and look at Optware now.
I'm not sure why it worked before and stopped working now, but I've used the .ssh/config file to specify the port for this host as a workaround for now.
Thanks for the help.
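For anyone wanting to use the same workaround, a sketch of the kind of .ssh/config entry involved. write_ssh_host_entry is a made-up helper; the host and port are the same placeholders used elsewhere in this thread.

```shell
# write_ssh_host_entry FILE HOST PORT
# Append a Host block so that plain "ssh HOST" uses PORT without a -p option.
write_ssh_host_entry() {
    cat >> "$1" <<EOF
Host $2
    Port $3
EOF
    chmod 600 "$1"
}

# Example (address and port are placeholders):
#   write_ssh_host_entry "$HOME/.ssh/config" xx.xx.xx.xx YYYY
```

With the port coming from the config file, rsnapshot no longer has to pass -p on the rsync command line, so the quoting problem does not come up.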
Offline
Yeah, using .ssh/config is a good way to work around it.
Link to the patch made to fix this.
Note that all they did was add "" around the parms. This patch was in the change log for 1.3.1.
You could apply this change to your version if you wanted.
Offline
Something is really screwy.
I haven't changed anything since my last successful connection and now I can't SSH into the remote machine at all. I get a "connection timed out" error.
I tried renaming the .ssh/config file and ssh with -p YYYY (something that always worked before), but no luck.
I am able to ssh from my PC on the same local network as the DNS-321 using PuTTY and can connect fine. I can also ssh from the DNS-321 to itself (ssh localhost), so ssh should be working OK. But I can't get through to the remote machine from it.
Could my funz fun_plug be screwed up? How would I properly reinstall it?
Last edited by bozob (2010-03-21 20:40:07)
Offline
I don't know what happened differently, but this morning the login is working again.
I'm going to have to try it later with crontab.
I hate it when weird things happen and I can't nail down the problem.
Last edited by bozob (2010-03-22 15:39:08)
Offline
Getting confusing here.
One thing to keep in mind is that a lot of changes you might make to fix things do not survive a reboot. For example, adding a .ssh/config file to the root home directory will not survive a reboot. You need to take care to make sure that all changes you might make like this one are implemented in one of the fun_plug startup scripts (if using ffp, ffp/etc/fun_plug.local). For example, in this case you need to put a copy of that config file on HD_a2 someplace and then have fun_plug.local create root's .ssh dir and copy the config file there.
This *may* explain some of the variability you have been seeing.
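A sketch of what that fun_plug.local step might look like. restore_ssh_files is a hypothetical helper and the backup directory in the example is an assumption; the permissions are tightened because ssh refuses private keys that other users can read.

```shell
# restore_ssh_files SRC_DIR DEST_DIR
# Copy saved ssh files (keys, config, known_hosts) into a fresh .ssh
# directory and restrict access to the owner.
restore_ssh_files() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    chmod 700 "$dest"
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        cp "$f" "$dest/" && chmod 600 "$dest/$(basename "$f")"
    done
}

# In fun_plug.local this might be called as (source path is an assumption):
#   restore_ssh_files /mnt/HD_a2/ffp/etc/ssh_backup /home/root/.ssh
```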
Troubleshooting stuff like this starts with simplification. If you are not using optware, disable it and get it out of the way. Disable and get anything else out of the way that you can.
When you say that logins work or don't work, is this just by ssh'ing or telnet'ing into the 323 and then trying to login to the remote machine with plain ssh? If so, good. This needs to work first before getting rsync or rsnapshot to work.
You can start by rebooting to get a fresh start.
Telnet or ssh into the 323 and ssh to that remote machine.
If that doesn't work, troubleshoot.
If it does, add something like this temporarily to the crontab on the 323:
<timespec> ssh <parms> <remotehost> ls > /mnt/HD_a2/crontest.log 2>&1
Instead of ls, use whatever command you like that generates some simple output. Set the timespec to about 2 mins in the future and wait for cron to run it. Ideally, you won't need any parms, unless you need to add the ones you intend to use with rsnapshot. Depends on what your current approach is regarding using .ssh/config vs parms.
If that doesn't work, try replacing ssh with /ffp/bin/ssh. If that works, then review the discussion we already had in this thread about crontab's path and the need to specify the full path to commands in your backup script. Fix your backup script and anything it calls to use full paths.
If it does work, then, at this moment, your 323 is probably OK, and it is time to look at your network and that remote machine for any problems. This effort is supported somewhat by your claim that the 323 did the backups for a year and all of a sudden stopped working without changing anything on the 323.
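For the temporary cron test, it can help to record the exit status along with the output. A minimal sketch, with run_logged as a made-up wrapper name; the log path matches the crontab line suggested above and the host name in the example is a placeholder.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Append a timestamp, the command's output (stdout and stderr) and its
# exit status to LOGFILE.
run_logged() {
    log=$1
    shift
    {
        echo "== $(date) running: $*"
        status=0
        "$@" || status=$?
        echo "== exit status: $status"
    } >> "$log" 2>&1
}

# Example, from a small script called by cron (host is a placeholder):
#   run_logged /mnt/HD_a2/crontest.log /ffp/bin/ssh remotehost ls
```

A status of 255 from ssh points at the connection itself, while anything else is the exit status of the remote command.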
Did anything on your network change?
Does your 323 get the same IP every time it boots? Think about the ssh known_hosts file on the remote machine. If the 323 IP addr keeps changing, the remote machine won't have it in the known_hosts file, causing ssh to fail in non-attended operation - you'll get a prompt if interactive. I don't know if the remote machine is on the same private network, or it is on a completely different domain. This is more likely to happen if the remote machine is on the same private network and only knows your 323 by IP. If the remote machine knows your 323 by an internet domain name, then less likely, if the domain name does not change.
Can there be an IP conflict on the network?
Is it possible that something changed on the remote machine?
Offline
OK, I nailed down the main problem.
When I moved I installed Verizon Fios instead of going with Cablevision. With Cablevision I had my own router connected to their cable modem. Fios provides its own router, an Actiontec with Verizon's software on it. This worked fine at first and then I started getting intermittent problems.
Last night, while playing around, I started looking at whether the router was blocking the connection in any way. Since I always disliked the way the router worked (it gets all screwy with passwords and my cellphone can only ever get a DNS lookup to work the first time it connects to the router) I was thinking of bypassing it. Looking up how to do it, I found out that a lot of people dislike the software on the router because of its limited NAT address space. Apparently, as you add devices, problems start to occur. I then remembered that the night previous to the morning it worked, I had turned off the second computer and that last night, when I couldn't get the connection running, it was on again.
(Looking back at it I realize that I didn't set up the second computer until a few months after I moved and the problems seem to have started about then.)
So I looked at the connections on the router and noticed that some devices were listed as "active" while others were "inactive". The NAS was listed as "inactive" (even though I was trying to connect). I clicked a "Test Connectivity" button for this device and the link was changed to "bridged" and the status was now listed as "active". When I tried the connection it immediately worked.
So now I either bypass the router for the home network or I figure out how to activate or maintain the connection to the NAS.
For example, adding a .ssh/config file to the root home directory will not survive a reboot.
Funz automatically changes the root home directory to a directory on the hard drive so that is taken care of.
I checked all the other items you listed and they weren't the problem.
I will be trying a few more tests to verify that I really nailed down the issue. If that's it I know where to focus my attention.
Thanks for your effort.
Offline