Data Backup in the AWS Cloud with rsync

16 11 2011

After admitting that Microsoft, of all companies, offers 25GB of cloud storage to its Windows Live subscribers, I will walk through my latest preliminary experiments with backing up important data using Amazon Web Services (AWS). The storage is not free but quite cheap at around $0.10 per GB per month.

If you use Windows and MS Office a lot, use SkyDrive and don’t read on ;) There are posts which describe how to map SkyDrive like a local hard disk using MS Word.

In the long run I would like to mount an EBS volume like a local file tree, probably using WebDAV, but this is my first successful preliminary solution. s3cmd does not work for me.

On Ubuntu/Linux, rsync is a well-established, reliable and easy-to-use tool for keeping data in sync between locations. This post marries rsync with an Elastic Compute Cloud (EC2) server instance that runs for an hour or so. One has to set up the so-called rsync daemon and attach a persistent Elastic Block Store (EBS) volume.

That part is the subject of another post, which I will link to later. There will also be a small script. There are some holes in this tutorial: only the direct configuration of the rsync daemon (including the script) is complete and working. I have filled in some hints on how to get to this stage and will write follow-ups on that.

System Out provided a nice tutorial on how to set up rsync in daemon mode on a server which listens for clients that want to sync their data.

Here is my version of it, with a short script at the end which should do the job.

Prerequisites

Of course you need to have rsync on both machines (the server and the client); since both are Ubuntu this is the case.

I will write another post on how to start the server. It is completely possible and quite intuitive to achieve it in the Amazon web interface. When the server is running and an extra EBS harddisk is attached you have to connect to the server using ssh
ssh -i PATH/TO/YOUR/PEM-KEY-FILE ubuntu@ec2-xxx-xx-xxx-xxx.compute-1.amazonaws.com

Mount the persistent drive

There are some posts about the advantages of the xfs filesystem, so I stuck with it. Alestic recommends it for all persistent EC2 cloud disks and I trust they know what they are doing. But xfs is not included by default in the Ubuntu micro instance I use for my backups. That said, in the SSH shell:

sudo apt-get install -y xfsprogs
sudo modprobe xfs

If the backup volume is newly created then format it:
sudo mkfs.xfs /dev/xvdb
Note: Format the volume only the first time; otherwise you wipe your data, of course. Note also the device name: I attached the volume as /dev/sdb, but it showed up in the Ubuntu Oneiric i386 t1.micro instance as /dev/xvdb.

Now mount the volume
echo "/dev/xvdb /media/backup xfs noatime 0 0" | sudo tee -a /etc/fstab
sudo mkdir /media/backup
sudo mount /media/backup
sudo chown ubuntu:ubuntu /media/backup
sudo chmod 777 /media/backup
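
Appending to /etc/fstab with tee -a adds a duplicate line every time the commands are repeated. Here is a minimal sketch of a guard against that, assuming a POSIX shell and awk. The function name add_fstab_entry and its optional third argument are my own invention, added for illustration and so that it can be tried on a copy of the file:

```shell
# Append an xfs entry to fstab only if the mount point is not listed yet.
# Usage: add_fstab_entry DEVICE MOUNTPOINT [FSTAB_FILE]
add_fstab_entry() {
    dev=$1
    mnt=$2
    fstab=${3:-/etc/fstab}
    # The mount point is the second whitespace-separated field of an entry.
    if awk -v m="$mnt" '$2 == m { found = 1 } END { exit !found }' "$fstab"; then
        echo "entry for $mnt already present in $fstab"
    else
        echo "$dev $mnt xfs noatime 0 0" >> "$fstab"
        echo "added $mnt to $fstab"
    fi
}
```

Try it on a copy first, e.g. add_fstab_entry /dev/xvdb /media/backup /tmp/fstab.copy. Writing to the real /etc/fstab requires a root shell.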

Configuration files

On the server machine you need to set up a daemon to run in the background and host the rsync services.

Before you start the daemon you need to create some rsync daemon configuration files in the /etc directory.

Three files are necessary:

  1. /etc/rsyncd.conf, the actual configuration file,
  2. /etc/rsyncd.motd, Message Of The Day file (the contents of this file will be displayed by the server when a client machine connects) and
  3. /etc/rsyncd.scrt, the username and password pairs.

To create the files on the server:
sudo nano /etc/rsyncd.conf

Now enter the following information into the rsyncd.conf file:

motd file = /etc/rsyncd.motd
[backup]
path = /media/backup
comment = the path to the backup directory on the server
uid = ubuntu
gid = ubuntu
read only = false
auth users = ubuntu
secrets file = /etc/rsyncd.scrt

Hit Ctrl-o to save and Ctrl-x to close nano.

The uid, gid, auth users are the users on the server. In the ssh session on the ec2 instance the user is ubuntu.

The format for the /etc/rsyncd.scrt file is
username:whatever_password_you_want
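
Since this file holds a plain-text password, it should not be readable by other users. As far as I understand, the rsync daemon with its default "strict modes" setting refuses a secrets file that others can access. A minimal sketch for creating the file with tight permissions follows; write_secrets is a hypothetical helper name, and the path and password in the usage line are placeholders:

```shell
# Write a USER:PASSWORD line to FILE, readable by the owner only.
# Usage: write_secrets FILE USER PASSWORD
write_secrets() {
    file=$1
    user=$2
    pass=$3
    # The subshell keeps the restrictive umask from leaking into the session.
    ( umask 077 && printf '%s:%s\n' "$user" "$pass" > "$file" )
    # Also tighten the mode in case FILE existed before with looser permissions.
    chmod 600 "$file"
}
```

For example, in a root shell: write_secrets /etc/rsyncd.scrt ubuntu YourSecretPassword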

Use nano to put some arbitrary text into /etc/rsyncd.motd.

Now you should have all the configuration information necessary, all that’s left to do is open the rsync port and start the daemon.

To enable the daemon, open the /etc/default/rsync file, i.e.,

sudo nano /etc/default/rsync

and set RSYNC_ENABLE=true.

Here you might also specify a port other than the default 873. Remember to open the port in the security group, either with the AWS web interface in your browser or in the shell using the ec2-api-tools:
ec2-authorize default -p 873

Now to start the daemon,
sudo /etc/init.d/rsync restart
and exit the SSH session.

Syncing a folder

Now you can use your local shell to push some folders or files to the server. Update the server side from the client machine with ec2-api-tools installed:
EXIP=`ec2din | grep INSTANCE | grep -v terminated |awk '{print $4}'`
rsync -auv /home/rforge/articles ubuntu@$EXIP::backup/

This gets the address of the server from the ec2-api-tools, stores it in $EXIP and passes it to rsync.

Otherwise you have to look up the IP address of your instance in the web interface and substitute it for xxx.xxx.xxx.xxx:
rsync -auv /PATH/TO/FOLDER/ ubuntu@xxx.xxx.xxx.xxx::backup/

::backup has to match [backup] in the /etc/rsyncd.conf file. You will see the rsyncd.motd message and get prompted for the password in the rsyncd.scrt file. Then rsync starts the upload.
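
To run the sync unattended, for example from cron, the password prompt has to go. rsync's --password-file option reads the daemon password from a file instead. The sketch below wraps that in two small functions. The names rsync_target and backup_push and the file ~/.rsync-pass are assumptions made for this example; the password file should contain only the password and must not be readable by others:

```shell
# Build the daemon-style destination USER@HOST::MODULE/
rsync_target() {
    printf '%s@%s::%s/\n' "$1" "$2" "$3"
}

# Push SRC to the [backup] module on HOST without a password prompt.
# Usage: backup_push SRC HOST
backup_push() {
    src=$1
    host=$2
    # ~/.rsync-pass is a hypothetical file holding just the daemon password.
    rsync -auv --password-file="$HOME/.rsync-pass" \
        "$src" "$(rsync_target ubuntu "$host" backup)"
}
```

It would be called as backup_push /home/rforge/articles "$EXIP". Note that a source without a trailing slash copies the directory itself into the module, while /PATH/TO/FOLDER/ copies only its contents.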

A Script

The following script should do the daemon setup after connecting to the server via ssh and mounting the volume. Keep me posted if something does not work.

echo "motd file = /etc/rsyncd.motd
[backup]
path = /media/backup
comment = the path to the backup directory on the server
uid = ubuntu
gid = ubuntu
read only = false
auth users = ubuntu
secrets file = /etc/rsyncd.scrt" > rsyncd.conf
sudo mv rsyncd.conf /etc/
#
sudo echo "Greetings! Give me the right password! Me want's it!" > rsyncd.motd
sudo mv rsyncd.motd /etc/
#
sudo echo "ubuntu:YourSecretPassword" > rsyncd.scrt
sudo mv rsyncd.scrt /etc/
#
sudo chmod 640 /etc/rsyncd.*
sudo chown root:root /etc/rsyncd.*
#
## enable daemon mode in the /etc/default/rsync file
sudo sed -i 's/RSYNC_ENABLE=false/RSYNC_ENABLE=true/' /etc/default/rsync
sudo chown root:root /etc/default/rsync
sudo chmod 644 /etc/default/rsync
#
sudo /etc/init.d/rsync restart # start the daemon
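
After the script has run, a quick sanity check is to list the module names defined in the configuration, since the name after :: on the client has to match one of them. Here is a small sketch using sed. list_modules is a hypothetical helper, and because the script sets /etc/rsyncd.conf to mode 640 it needs a root shell (or a readable copy of the file):

```shell
# Print the module names (the [name] section headers) of an rsyncd.conf.
# Usage: list_modules [CONFIG_FILE]
list_modules() {
    conf=${1:-/etc/rsyncd.conf}
    # Keep only lines of the form [name] and strip the brackets.
    sed -n 's/^[[:space:]]*\[\(.*\)\][[:space:]]*$/\1/p' "$conf"
}
```

For the configuration written above it should print a single line, backup.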





Find BIOS version in Ubuntu

4 10 2011

The dmidecode command line utility dumps a list of SMBIOS specifications to the standard output. In order to get the version number of the currently installed BIOS, open a shell and run
sudo dmidecode --type 0 | grep Revision

The --type 0 option restricts the output to BIOS-specific information and grep fishes for the revision number.

On my X61s Thinkpad the resulting output is
BIOS Revision: 2.19
Firmware Revision: 1.3
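
If only the number itself is needed, e.g. in a script, awk can cut it out of the dmidecode output. This is a minimal sketch: bios_revision is a name I made up, and it assumes the "BIOS Revision: x.y" line format shown above:

```shell
# Read dmidecode output on stdin and print only the BIOS revision number.
bios_revision() {
    # Split each line at the colon (plus any following spaces) and print
    # the value of the line that mentions "BIOS Revision".
    awk -F': *' '/BIOS Revision/ { print $2 }'
}
```

Usage: sudo dmidecode --type 0 | bios_revision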





BASH: Convert Uppercase to Lowercase letters

12 09 2011

Vivek Gite on nixCraft suggests tr to transform the uppercase letters in the text file input.txt to lowercase and write the transformed text to output.txt.

tr '[:upper:]' '[:lower:]' < input.txt > output.txt
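
tr reads from standard input and writes to standard output, so it works in a pipe just as well as with files. A small sketch; the function name to_lower is only for illustration:

```shell
# Convert everything arriving on stdin to lowercase.
to_lower() {
    tr '[:upper:]' '[:lower:]'
}

printf 'MyVariable_NAME\n' | to_lower   # prints: myvariable_name
```

When working with files, input and output must be different: redirecting the output to the file being read truncates it before tr sees its content.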

I needed to clean up a messy old scriptfile where I lost track of my variable naming convention.

Very useful indeed :)





Add public key behind a firewall in Ubuntu Shell

7 09 2011

In short: Use
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80/ --recv-key E084DAB9
instead of
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-key E084DAB9
This way you force port 80 which is usually clear.

I got the idea from Phil Bradley's answer on the superuser.com forum. He claimed that this would be fixed in Natty, but it isn’t: although the configuration file he mentions has the port 80 specification added by default, apt-key does not use it. The above snippet solves that.

For those Ubuntu users who have no idea what I am talking about:

Installing the newest R version in Ubuntu requires appending the CRAN repository to your /etc/apt/sources.list. One might hit Alt+F2 and enter
gksu gedit /etc/apt/sources.list

With Xubuntu you would use mousepad instead of gedit. In any distro you can use
sudo nano /etc/apt/sources.list
in a terminal.

Usually I add the line
deb http://cran.uib.no/bin/linux/ubuntu natty/
at the end of the file and update with
sudo apt-get update

CRAN at University of Bergen is closest to me. You might want another one (check the r-project.org site for mirrors).

apt-get update answers with a warning
GPG error: http://cran.uib.no natty/ Release: The following signatures couldn't be verified because the public key is not available

That is not a problem. One can install R and packages anyway, but it is better to have the public key.

Behind a firewall (and many public and open hotspots block several ports) it is not possible to use

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-key E084DAB9

since the port through which the keyserver is contacted is blocked on most firewalls. You have to force port 80 by:
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80/ --recv-key E084DAB9

After the key is added
sudo apt-get update
sudo apt-get install r-recommended emacs ess

proceed without warning or error.





Setup FTP on Amazon EC2

19 09 2010

Once your Amazon EC2 instance is finally running (which is another story), you will want FTP access to upload files and documents.

I got the inspiration to use vsftpd from curiousdeveloper.blogspot.com

  1. Open port 21 for ftp access on your running instances:
    ec2-authorize default -p 21
  2. Connect to your instance via ssh
    ip=`ec2din | grep I | cut -f17`
    ssh -i /path/to/yourkey.pem ubuntu@$ip
  3. Install vsftpd
    sudo aptitude install vsftpd
  4. Start the daemon
    sudo /etc/init.d/vsftpd start




Replacing capitals with normal letters using SED

30 05 2010

Needed to change CAPITAL variable names to lowercase in a long SQL syntax file. Sourceforge had a way to do it with SED.

GNU sed (gsed) 4.x supports the \L and \U escapes in the replacement; \L& and \U& convert the whole match to lowercase or uppercase respectively:

sed 's/MyRegExp/\L&/g'

converts every match of the regular expression MyRegExp to lowercase.
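
As a concrete case, here is a sketch that lowercases every identifier written entirely in capitals and leaves mixed-case names alone. It assumes GNU sed (for \L and \b); the function name lower_caps and the sample SQL line are made up for the example:

```shell
# Lowercase every word that consists only of capitals, digits and underscores.
lower_caps() {
    # \b anchors the match at word boundaries, so "Orders" is not touched.
    sed 's/\b[[:upper:]_][[:upper:][:digit:]_]*\b/\L&/g'
}

printf 'SELECT CUSTOMER_ID FROM Orders\n' | lower_caps
# prints: select customer_id from Orders
```

Note that this pattern lowercases SQL keywords as well; to restrict it to certain variables, MyRegExp has to be narrowed accordingly.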





Using gammu to connect to a mobile phone

28 05 2010

After switching between mobile phones several times I always lost some data: contacts, media and so on…

There is a Linux tool called Gammu which allows connecting to a selection of mobile phones. It seems that Gammu's functionality is most complete for Nokia and Siemens, but I will give it a try on my Sony Ericsson…

The configuration is not trivial, and I found some hints on JohnMcClumpha.org:

Install Gammu

Installing gammu is surprisingly easy (once again thanks to the wonders of apt-get), just use the following command:

sudo apt-get install gammu

Hard wasn’t it? ;)

OK now it’s time to plug your phone in and see if we can get things talking. With the phone connected, type the following command:

lsusb

you should now see your phone listed as a device – for example:

Bus 001 Device 002: ID 0421:0802 Nokia Mobile Phones

if not – make sure your cables and power are all good and try again.

The gammu installation comes with some example configuration files which are worth using as a starting point – if nothing else they help you to understand how gammu can be configured so that you can tailor a solution for your needs. These are located in
/usr/share/doc/gammu/examples
(in gzip archives).

Copy the gammurc file to /etc/gammurc :

sudo cp /usr/share/doc/gammu/examples/config/gammurc /etc/gammurc

Now edit /etc/gammurc to specify your port and connection type (this will vary based upon where/how you have things plugged in and what sort of cable/interface your phone is using). The settings for mine are:

port = /dev/ttyACM0
connection = dku5

Save this config and from the shell type:

gammu --identify

you should now be presented with some information regarding your phone such as:

Manufacturer : Nokia
Model : 7200 (RH-23)
Firmware : 3.110 T (18-03-04)
Hardware : 0903
IMEI : 353363000813894
Original IMEI : 353363/00/081389/4
Manufactured : 04/2004
Product code : 0514143
UEM : 16

If this is the case then you have got gammu up and running and can send yourself a test message with the following command:

 echo "boo" | gammu --sendsms TEXT [recipient mobile number]




Restructuring the filetree – moving files from multiple directories

29 04 2010

I am considering changing the structure of my /home directory, maybe completely changing my data organisation habits.

I found this post on ubuntuforums very useful:

You need to use find

For this example I'm going to assume that all the .txt files are located in directories and subdirectories of your Documents folder and you want to move them to a directory in your home called scripts
Code:
find ~/Documents -name '*.txt' -exec mv '{}' ~/scripts \;
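
One caveat with the command above: two files with the same name in different subdirectories end up at the same target path, and the one moved later silently overwrites the earlier one. Here is a hedged variant, assuming GNU find and GNU mv (for the -n and -t options); collect_files is a hypothetical name:

```shell
# Move all files below SRC whose name matches PATTERN into DEST.
# Usage: collect_files SRC PATTERN DEST
collect_files() {
    src=$1
    pattern=$2
    dest=$3
    mkdir -p "$dest"
    # -type f skips directories; mv -n never overwrites an existing file;
    # -t names the target directory so find can batch the file names with +.
    find "$src" -type f -name "$pattern" -exec mv -n -t "$dest" '{}' +
}
```

Called as collect_files ~/Documents '*.txt' ~/scripts. Files that are skipped because of a name clash stay where they were, so they can be renamed and moved by hand afterwards.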





Remove U3 System from SanDisk

16 04 2010

Bought a SanDisk Cruzer 16GB and found some “smart” software preinstalled which I did not consider smart at all. Every time I inserted the drive into any computer, a CD drive labelled “U3 System” was mounted, containing some funny .exe files. The whole “CD drive” took several MB of disk space.

I wanted to get rid of it. Fortunately, I was not the first one being disturbed by it.

Sourceforge has a u3-tool which did the job:

  1. Download the tool to a place where you remember it
  2. Unpack the .tar.gz archive (I just rightclicked it and chose “extract here”). This creates a folder like /MyPathTo/u3-tool-0.3/
  3. open a terminal and type: cd /MyPathTo/u3-tool-0.3/
    ./configure
    make
    sudo make install

    Now u3-tool is installed and can be used.
  4. To remove the CD-like partition containing the firmware crap you need the device name of the USB disk: sudo fdisk -l gives the answer. In my case it is /dev/sdb1. Make sure you remember the right one.
  5. Remove the U3 partition with u3-tool -p 0 /dev/sdb1 where /dev/sdb1 is the device name remembered from the previous step and the option -p is followed by a zero.

Done.





Passing an external variable to AWK

9 02 2010

Confronted with a heap of semicolon-separated text files which had to be merged and cleaned of unrelated lines and columns, I tried my luck inside Excel and spent a lot of time doing it manually, but finally got fed up.

So I decided to use AWK on the task.

A FOR-loop lists the files in the folder into the UNIX pipe.

AWK selects the non-empty observations and adds the name of the file as a classifier to the beginning of the line (the result is a repeated measure dataset).

This is the code:

for CSV in `ls`
do
cat $CSV | awk -F ";" '{
if ($2 ~ /[0-9]+/) {print CSV , FS , $0;}}'
done

Remark: The -F ";" option specifies the separator between the columns/fields of the lines/records in the file (the default is whitespace).

BUT: The shell variable CSV is not passed to AWK by default; it has to be fed into AWK explicitly.

Solution:
The
-v CATEGORY=$CSV
option feeds the external variable CSV into the AWK-variable CATEGORY.

This gives:
for CSV in `ls`
do
cat "$CSV" | awk -F ";" -v CATEGORY="$CSV" '{
if ($2 ~ /[0-9]+/) {print CATEGORY , FS , $0;}}'
done

.. and works :)
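
As an aside, when the only value needed is the name of the file being read, awk already provides it in its built-in variable FILENAME, so neither the loop nor cat is required. Here is a minimal sketch of that alternative (not the approach used above); tag_rows is a made-up name:

```shell
# Prefix every row whose second field contains a digit with its file name.
# Usage: tag_rows FILE...
tag_rows() {
    # FILENAME is set by awk to the name of the file currently being read.
    awk -F ";" '$2 ~ /[0-9]+/ { print FILENAME, FS, $0 }' "$@"
}
```

Running tag_rows * in the data folder should produce the same lines as the loop. The -v option remains the way to go for any other shell variable.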

Hat tip: fpmurphy







