Monday, September 20, 2010

Switching to EC2 Hosting for my Personal Website

I had been using a GoDaddy shared hosting account for my personal website since I created it, but recently changed things up. With Amazon announcing their EC2 Micro instances, cloud-based hosting came within my price range. The cost of running the VM is $0.02 per hour, which totals around $14.40 per month. There are some other costs, including storage and bandwidth, but they will likely come to less than $2.00 per month. What it comes down to is that I have my own personal install of Linux to host my website!

This does come with some downsides. The VM is slow, specifically the CPU. It comes with 613 MB of RAM, which is plenty for my purposes, and since my website doesn't receive that many visits the speed is not a major concern of mine. However, I have determined that the Micro instance isn't powerful enough to handle my install of ThinkUp because the database is just too large. This is disappointing, but I will create a new install on my Linux box and run it locally.

The reason I can justify paying twice as much for hosting is the benefit I get from having root on the Linux box. While it means I have to do more Linux administration, it also means I can run whatever software I want! Specifically, I have set up a personal SVN server that I am using for class projects. I use both Google Code and GitHub for open source projects, but my class projects never had a home until now. So far it has worked with no problems, and WebSVN gives me a useful way to browse and analyze my code.
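Getting a basic SVN server going is only a couple of commands. A minimal sketch (the repository path and hostname here are placeholders, not my actual setup):

# Create a repository for class projects
svnadmin create /srv/svn/classprojects

# Serve everything under /srv/svn over the svn:// protocol as a daemon
svnserve -d -r /srv/svn

# From another machine, check out a working copy
svn checkout svn://myserver.example.com/classprojects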

I have migrated my backup script to the new setup and switched to using s3cmd to transfer files to Amazon S3. With the additional services running on the server, the script now also backs up the SVN repositories and the server configuration files.
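The new pieces of the backup are straightforward. A rough sketch (the repository path, staging directory, and bucket name are placeholders):

# Dump each SVN repository to a portable, compressed file
svnadmin dump /srv/svn/classprojects | gzip > /tmp/backup/classprojects.svndump.gz

# Grab the server configuration files
tar -czf /tmp/backup/etc.tar.gz /etc

# Push the results to S3 with s3cmd
s3cmd put /tmp/backup/classprojects.svndump.gz s3://my-backup-bucket/
s3cmd put /tmp/backup/etc.tar.gz s3://my-backup-bucket/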

While this does mean that I will be paying more, the additional features should justify the cost.

Monday, August 30, 2010

Almost there...

This is my last fall semester as a college student.

Wow.

Wednesday, August 4, 2010

Automated Generation of Javadocs for Open Source Android Applications

While Java is not my favorite language, it has its benefits. With one Android application already posted on the market and another application in development, I decided to start using Javadocs a little more seriously. The process of generating Javadocs is not that complicated using Eclipse, but that is not the solution I wanted. My goal was to automate the generation and posting of the docs as the source code changed.


My shared web host was suitable for hosting the HTML files, but not for running javadoc to actually generate new documents. The solution I came up with was to automatically download the latest code from my public repository, generate the javadocs, and then upload the HTML files to my shared host. The code runs on a Linux virtual server that I have sitting around. The process is actually very simple:
  1. Clean up any files from the previous run of the script.
  2. Download the latest source code from the SVN repository using svn export. Notably, you can use svn export with GitHub as well, since they support accessing repositories over the SVN protocol. Awesome!
  3. Generate the javadocs based on the freshly downloaded code using the desired parameters.
  4. Copy the newly generated javadocs to the desired server. For my purposes, secure copy was the best solution. With my server's public key installed on the shared host, I am able to log into the remote box without being prompted for a username and password (the one-time key setup is sketched below).
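Setting up that key-based login is a one-time step, roughly like this (the remote user and host match the placeholders in the script below):

# Generate a key pair on the server that runs the script (use an empty passphrase so cron can use it)
ssh-keygen -t rsa

# Install the public key on the shared host so scp works non-interactively
ssh-copy-id remoteuser@example.com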
The final step in the process is simply to run the script nightly with a cron job, and the javadocs will always be up to date. Since I was generating documentation for Android applications, it was important that the Android jar file be located on the server and that the javadoc command be made aware of its location; without this jar file, the generated javadocs would be incomplete. In general, the Java code must actually be compilable on the computer where the javadoc command is run.
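For example, the nightly run is just a crontab entry along these lines (the script name is a placeholder):

# Regenerate and upload the javadocs every night at 3:00 AM
0 3 * * * /path/to/files/generate-javadocs.sh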

Here is the bash script with the file paths changed to protect the innocent:

#!/bin/bash
cd /path/to/files/docs/

# Clean up the files from the previous run
rm -rf ampted.svn
rm -rf ampted

# Export the latest source from the repository (no .svn metadata)
svn export http://ampted.googlecode.com/svn/trunk/
mv trunk ampted.svn

# Generate the javadocs, including private members, with a custom header and footer
JAVADOCHEADER='<a target="_top" href="http://www.amptedapp.com/">Android Mobile Physical Therapy Exercise Documenter</a>'
JAVADOCFOOTER="Generated on `date`"
javadoc -private -header "$JAVADOCHEADER" -footer "$JAVADOCFOOTER" -d /path/to/files/docs/ampted/ -sourcepath /path/to/files/docs/ampted.svn/android/src/ -subpackages com.AMPTedApp -classpath /path/to/files/lib/android.jar

# Copy the generated docs to the shared host over SSH
scp -r /path/to/files/docs/ampted remoteuser@example.com:/path/to/remote/files/docs/

This process has one point where it could definitely be improved: it always regenerates and overwrites the javadocs, even if the source code did not change. A check that compares the repository's latest revision number against the revision of the previously generated documentation would allow the script to generate and upload a new copy only when there is something new. This wasted effort is not a major concern for small projects, but may need to be fixed as my projects grow in size.
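A minimal sketch of that check, assuming the last-built revision is kept in a plain text file alongside the docs (the file name is a placeholder):

# Ask the repository for its latest revision number
CURRENT=`svn info http://ampted.googlecode.com/svn/trunk/ | grep '^Revision:' | awk '{print $2}'`

# Compare against the revision recorded after the previous run
LAST=`cat /path/to/files/docs/last-revision 2>/dev/null`
if [ "$CURRENT" = "$LAST" ]; then
    exit 0   # nothing changed; skip generating and uploading
fi

# ... export, generate, and upload as above, then record the revision ...
echo "$CURRENT" > /path/to/files/docs/last-revision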

You can see the docs for AMPTed, a project in the very early stages of development, at http://javadocs.amptedapp.com/

Nothing to do, so what will I accomplish?

There are two and a half weeks before my last fall semester starts, and I have very little to do. I have a few things on my calendar, but it is generally empty. So, what am I going to do with all of this free time? Simple: write lots and lots of code. Actually, my plan is to work on several projects while I still have the time; it just happens that most of these projects involve writing code.


The main projects that I will be working on include creating DPX Answers for DyKnow Panel Extractor, creating the foundation for AMPTed App, fixing bugs and making small improvements to OpenNoteSecure, and implementing the NAESC Conference registration website. These are the high-level items on my to-do list. I plan on making a low-level to-do list that will help me get my ideas organized.

It is rare for me to have this much free time, so I plan on putting myself to work and making some major progress. I've already managed to code quite a bit, including some major improvements to other projects that are not on the above list. I just need to find a nice quiet place to sit down and start working, and not move until I have a plan of action.

Monday, July 26, 2010

Some Web Server Management and a Plan for Backups

It has been quite a while since I spent some time administering my personal websites. My sites are hosted on GoDaddy's shared hosting, which isn't as bad as some of the reviews make it out to be. The big thing that I have been putting off is implementing a reliable and automated backup system. My previous strategy for backups was simply to dump the databases and copy down all of the files once a month, if I remembered. The content of my websites would not be easy to replace if it were lost.


The first step in developing my backup strategy was to clean up my content and current installs. I deleted some web applications and code that I had been playing around with but no longer used. Once that was done, I made a backup of everything by hand and upgraded all of my web apps to the latest versions. I was then ready to develop my automated system.

The next step was to get a local backup on the web server itself. Using shell access to the server over SSH, I developed a shell script that performs all of the necessary steps. The two things that need to be backed up are the databases and the actual files. The MySQL databases are simple to back up by running the mysqldump command, compressing the output, and storing it in a file. The files can be backed up with a simple tar command, which can also compress them down to a reasonable size.

Once all of my databases and files were compressed and organized, I packaged them all up in another tar archive, a single file that was the final backup. This script was set to run as a cron job, and with that the automated backup process was halfway complete. The only thing left to do was to find a way to transfer the backup off-site.
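The local backup stage looks roughly like this (database names, credentials, and paths are placeholders, not my actual setup):

#!/bin/bash
# Date stamp used to label the final archive
STAMP=`date +%Y-%m-%d`

# Dump and compress the MySQL database
mysqldump -u backupuser -pPASSWORD mydatabase | gzip > /tmp/backup/mydatabase.sql.gz

# Archive and compress the website files
tar -czf /tmp/backup/files.tar.gz /path/to/htdocs/

# Package everything into the final backup: a single file
tar -cf /tmp/backup-$STAMP.tar /tmp/backup/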

My first thought was to copy the backup to my personal Linux server with scp. That would have been fairly simple to automate, but it just didn't seem to be the solution I was looking for. The solution I went with was to store the backups on Amazon S3 using s3-bash. S3 provides very cheap storage and is easily accessed using open source tools that make transferring files painless. My estimates place the total cost of the backups stored on S3 at less than $0.40 a month!

Deciding to use a paid service meant that it would not be logical to store all of my backups indefinitely, so I needed a plan for how long to keep each backup and some way to delete backups once they were no longer needed. The solution I came up with was extremely simple. The backup script runs every night and generates and transfers the complete backup, about 45 MB, to the S3 servers. The backup created on the first of each month is kept for a year, so I can recover from a long-term problem that goes unnoticed for a while. Additionally, I keep a backup for each day of the week, protecting against data loss in the short term. After 12 months of operation I will have a total of 19 backup files (12 monthly plus 7 daily) that continue to be replaced as time goes on. The old backups never need to be explicitly deleted: uploading a file with the same key (or file name) overwrites the older version, thereby deleting the old backup.
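The rotation falls out of the key names themselves. A minimal sketch, assuming s3-bash's s3-put with its curl-style -T upload flag (the exact flags, bucket name, and key file paths here are assumptions, not a copy of my script):

# Seven rotating daily slots and twelve rotating monthly slots
DAYKEY=`date +%A`      # e.g. "Monday"
MONTHKEY=`date +%B`    # e.g. "July"

# Daily upload; next week's backup on the same weekday overwrites this key
s3-put -k MY_ACCESS_KEY_ID -s /path/to/secret-key -T /tmp/backup.tar /mybackupbucket/daily-$DAYKEY.tar

# On the first of the month, also store a monthly copy
if [ "`date +%d`" = "01" ]; then
    s3-put -k MY_ACCESS_KEY_ID -s /path/to/secret-key -T /tmp/backup.tar /mybackupbucket/monthly-$MONTHKEY.tar
fi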

My backup script has only been running for a few days, but I am very pleased with the results. I still want to do some testing to ensure that my backups are comprehensive, but initial inspection reveals no problems. This set-it-and-forget-it approach is exactly what I was hoping to implement.
