Importance of Legacy URL Support

Posted Aug 19, 2009 12:31:46 AM

I've received several notifications over the past few weeks about dead URLs on my website, notably from places where I have linked to my own articles, such as my Best of 2003 roundup. All the links in that article died suddenly because I'd migrated my blog software from one system to another. Given the date on that post, I'd guess I migrated from pyBlosxom to Blosxonomy back then.

Anyway, I'm making an effort to resurrect those dead URLs using some Apache rewrite rules. Specifically, all the links in my Best of 2003 roundup now work again.

Note to web site maintainers: it's important to pick a URL scheme and stick with it. If you don't, it will bite you in the butt later.

The Best Of ...

Posted May 1, 2006 5:50:18 PM

I checked my Google Analytics data this afternoon to get a feel for what my readers are reading... here's a list of the most popular posts on my site -- interestingly, they were some of my favorites too.

Also ranking pretty highly in my popular content is my JBoss tag. Back in the day when I was earning my Master's at Clarkson, I was posting a lot about JBoss configuration and customization.

So, according to Google, that's the best of Tim Fanelli (dot com)

Frameworks and Build Scripts and POMs, Oh My!

Posted Dec 8, 2005 10:22:38 PM

It's been a while since I've started a new J2EE project from scratch, and a lot has changed since the last time I did. I've been working in the J2EE arena for quite a few years now, but with the exception of jETIA, which I started in 2002, I've only been involved in existing projects. jETIA's build system was Ant, revision control was Subversion, and project management and dependencies were managed by hand.

I learned about Maven about a year ago, when my responsibilities at my previous job suddenly included maintenance and development of a large, complicated Ant-based build system. I had hoped that Maven would be the golden solution to my dependency maintenance nightmares and to the complicated Ant scripts integrating our source control mechanisms. However, I quickly learned (the hard way) that Maven was not flexible enough to adapt to an existing project structure; I'd have to make my project structure meet its idea of a good layout. Not that I objected to that layout, but the discombobulated state of the project I was working on had taken years to screw up so badly, and fixing it would only further confuse the developers who held a deep understanding of how the project ran.

Which brings us to today, with my new Maven book (Maven: A Developer's Notebook) and the need to start a project from scratch. Sadly, the book I bought not 4 weeks ago is officially out of date: Maven 2 is now the standard release, and my book only briefly mentions that, while the concepts are the same in version 2, it's a complete rewrite of the system. Many commands are the same, but the one thing I wanted to do - create a new project - has changed :-\.

Additionally, I discovered today that my MVC framework of choice, Struts, has drastically changed as well! Struts is now broken into two core projects: the Struts Action Framework - the "original" Struts, so to speak - and Shale, a JavaServer Faces based MVC framework being billed as "what Struts would have been if we'd known then what we know now".

How fast things change. I feel like, from a technology standpoint, I'm starting all over again... While initially discouraging for me, this should be exciting for you, because it means that I'll be posting many, many how-tos on getting started with these exciting new versions of technologies designed to make your life in J2EE easier!

So my question to you, then, is aside from Maven for project management, Struts and/or Shale for a strong MVC framework, and Spring for a solid IoC driven dependency injection mechanism - what should I be reading up on during my "technology review" stages, and what are your opinions on them?

Protecting Content with SSL and mod_rewrite

Posted Dec 5, 2005 9:21:46 PM

Now that I've migrated my blog to a SQL database, Blosxonomy has lost some of its Blosxom-like ease of use, so I decided it was necessary to create a web-based interface for posting to my blog. I had originally thought I'd just use my entry-conversion utility and continue to write Blosxom style posts, but quickly decided that was absurd. For those of you keeping track of Blosxonomy, this feature will be included in 0.7.3, which is in the final testing stages now.

In any event, I needed to protect the page that posts to my website from being accessed, and I used mod_ssl and mod_rewrite to do it.

In particular, I needed to hide /post from general access - to do this, I added a simple rewrite rule to redirect it to my SSL secured site:

RewriteRule ^/post(.*) https://www.timfanelli.com/post$1

Then, in my :443 virtual host, I added two rewrite rules: one keeps /post on the SSL site, and the other sends everything else back to the main site:

NameVirtualHost *:443

<VirtualHost *:443>
  ServerName www.myhostname.com

  # SSL Engine options go here
  # Directory authentication options go here

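  RewriteEngine On
  # Requests for /post stay on the SSL site
  RewriteRule ^/post(.*)$ /post$1 [PT,L]
  # Everything else is redirected back to the main, non-SSL site
  RewriteRule ^(.*)$ http://www.timfanelli.com$1 [R=301,L]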
</VirtualHost>

That way, any request for /post stays on the SSL protected site, and any other request goes back to the main, non-SSL site. The SSL Engine options section enables SSL and directs Apache to use my self-signed certificate (see how to create one here). I also copied the <Directory> element from my main site into the virtual host and added a Require valid-user statement using Digest authentication.
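The routing decision made by the SSL virtual host can be modelled in a few lines of Python. This is only a sketch for reasoning about the rules, not anything Apache runs: the function name route_ssl_request is made up for illustration, and the model assumes the /post pattern is anchored at the start of the path.

```python
import re

# Host taken from the redirect rule in the virtual host above.
MAIN_SITE = "http://www.timfanelli.com"


def route_ssl_request(path):
    """Model the :443 virtual host: keep /post, bounce everything else.

    Returns an (action, target) tuple. This is a simplified stand-in for
    the two RewriteRule lines, not a reimplementation of mod_rewrite.
    """
    match = re.match(r"^/post(.*)$", path)
    if match:
        # First rule: pass the request through unchanged on the SSL site.
        return ("serve", "/post" + match.group(1))
    # Second rule: permanent redirect back to the non-SSL site.
    return ("redirect 301", MAIN_SITE + path)


if __name__ == "__main__":
    for example in ("/post", "/post/new", "/blog/"):
        print(example, "->", route_ssl_request(example))
```

Running it shows /post and anything beneath it being served over SSL, while a path like /blog/ is sent back to the main site.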

This provides a secure place for me to make entries to my blog, and prevents general viewing of my site via SSL to minimize the performance overhead (while I love my mac and the G4 processor, SSL is not its strong point).

Windows Domain Authentication with SVN and Apache

Posted Dec 1, 2005 5:47:16 PM

I just finished setting up domain-based authentication on our new SVN server here at work, so I thought I'd post my notes on the process and links to what you'll need, since I found that the information was pretty scattered.

First, a list of what I used and what you'll need:

  • Subversion - obviously. I used version 1.2.3
  • Apache Web Server - I used version 2.0.55
  • mod_auth_sspi 1.0.3 - This has always been hard to come by, and there have always been various patches of it floating around. This place is a unified attempt to bring all the patches together, and it works very well. Grab the one for the appropriate version of Apache2.

I will assume that you've already installed both Apache 2.0.55 and SVN 1.2.3. If you haven't, please do so and then come back -- the installation for both of them is very simple and will only take you a few minutes to complete.

Step 1: mod_dav and mod_dav_svn

The first step to accessing SVN via Apache is to set up WebDAV. To do this, copy

C:\Program Files\Subversion\bin\mod_dav_svn.so

to:

C:\Program Files\Apache Group\Apache2\modules

Next edit your httpd.conf file, and add the following content:

LoadModule dav_module         	modules/mod_dav.so
LoadModule dav_svn_module 	modules/mod_dav_svn.so

<Location /svn>
	DAV svn
	SVNParentPath "/path/to/repositories"		
</Location>

This example uses SVNParentPath to point to the parent folder of multiple SVN repositories. If you set it to C:\repositories, then any repository you create under it, such as C:\repositories\ProjectA, is accessible under the /svn URL, like so: http://localhost/svn/ProjectA. If you have only one repository and do not plan to add more, you could use the SVNPath directive instead and point it directly at your SVN repository. The SVNParentPath approach is more flexible, though, and allows for expansion without changing your configuration files.
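To make the URL-to-directory mapping concrete, here is a small Python sketch of the idea. It is only an illustration of how SVNParentPath lines up URLs with folders, not how mod_dav_svn is implemented; the function name repository_dir and its location parameter are made up for this example.

```python
from pathlib import PureWindowsPath


def repository_dir(parent_path, url_path, location="/svn"):
    """Return the repository directory a URL path would map to, or None.

    parent_path plays the role of SVNParentPath, and location plays the
    role of the <Location> block. The first path segment after the
    location is taken as the repository name.
    """
    prefix = location + "/"
    if not url_path.startswith(prefix):
        return None
    repo_name = url_path[len(prefix):].split("/", 1)[0]
    if not repo_name:
        return None
    return PureWindowsPath(parent_path) / repo_name


if __name__ == "__main__":
    print(repository_dir(r"C:\repositories", "/svn/ProjectA"))
    print(repository_dir(r"C:\repositories", "/svn/ProjectA/trunk/build.xml"))
```

Both calls resolve to the same ProjectA directory, because everything after the repository name is a path inside that repository.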

Step 2: mod_auth_sspi and mod_authz_svn

The next step is to enable domain based authentication and access control to your SVN repositories. Copy:

C:\Program Files\Subversion\bin\mod_authz_svn.so

to

C:\Program Files\Apache Group\Apache2\modules

And edit your httpd.conf file again to look like this:

LoadModule dav_module 		modules/mod_dav.so
LoadModule dav_svn_module 	modules/mod_dav_svn.so
LoadModule authz_svn_module 	modules/mod_authz_svn.so 
LoadModule sspi_auth_module 	modules/mod_auth_sspi.so

<Location /svn>
	DAV svn
	SVNParentPath "D:/Engineering/svn/repos"	

	AuthName "My SVN Server"
	
	AuthType SSPI
	SSPIAuth On
	SSPIOmitDomain On
	SSPIAuthoritative On
	SSPIDomain DOMAINNAME

	Require valid-user
	AuthzSVNAccessFile "C:/repositories/svnaccess.txt"	
</Location>

You can see that we've added two modules and several lines to our Location /svn element. Set SSPIDomain to the domain you want to authenticate against. SSPIOmitDomain On strips the domain prefix from authenticated user names, so users appear as plain names such as Bill rather than DOMAINNAME\Bill; you can turn that off if you like, but it's simpler to just leave it on.

We also add an AuthzSVNAccessFile directive, which names the file we store our authorization information in, which leads us to:

Step 3: AuthzSVNAccessFile

The AuthzSVNAccessFile directive specifies a plain text file that identifies which repositories users have access to. It's simple to set up; here's an example:

[groups]
developers=Tom,Dick,Harry,Sally,Sue
managers=Bill,Jean,Marry,Bob,Dave

[repositoryname:path]
@developers = rw
@managers = r
Bill = rw

Replace repositoryname with the name of your repository (a subdirectory under the path you specified in the SVNParentPath directive) and path with the path you're controlling, such as / for the whole repository or /branches/Bill for a specific branch. In this example, we've given the developers group read-write access, the managers group read-only access, and explicitly given Bill read-write access (he's a manager).
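To see how the group and user lines combine, here is a rough Python sketch that reads an access file like the one above and works out what a given user ends up with. It is a simplification for illustration only: the function name access_for and the repository name ProjectA are assumptions, the sketch assumes that all matching lines in a section are combined, and it ignores features of the real file format such as path inheritance and the * wildcard.

```python
import configparser

# Same content as the example above, with a hypothetical repository name.
ACCESS_FILE = """\
[groups]
developers=Tom,Dick,Harry,Sally,Sue
managers=Bill,Jean,Marry,Bob,Dave

[ProjectA:/]
@developers = rw
@managers = r
Bill = rw
"""


def access_for(user, section, text=ACCESS_FILE):
    """Return "rw", "r" or "" for a user in one [repository:path] section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep user and group names case-sensitive
    parser.read_string(text)

    # Expand each group into a set of member names.
    groups = {
        name: {member.strip() for member in members.split(",")}
        for name, members in parser["groups"].items()
    }

    granted = set()
    for who, rights in parser[section].items():
        if who.startswith("@"):
            matches = user in groups.get(who[1:], set())
        else:
            matches = who == user
        if matches:
            granted |= set(rights)  # assumption: matching lines are combined
    return "".join(flag for flag in "rw" if flag in granted)


if __name__ == "__main__":
    for name in ("Tom", "Jean", "Bill", "Eve"):
        print(name, repr(access_for(name, "ProjectA:/")))
```

Bill comes out with read-write access even though the managers group is read-only, which matches the intent of the explicit Bill = rw line.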

Conclusion

You should now have web-based access to your SVN repository using domain based authentication! It's a good idea at this point to further protect the repository using an SSL configuration, which I won't cover here. I have some notes on it for an Apple platform that may be useful here and here - I'll cover it explicitly for a Windows installation in another post though, hopefully sometime soon.

Installing Apache2 and SVN on OS X using Fink

Posted Nov 20, 2005 3:15:30 PM

This guide will walk you through installing Apache2 and SVN using Fink on OS X 10.4. You should be able to follow the same instructions for 10.3; however, there is not currently a stable release of SVN in the fink repository for OS X 10.2 and earlier.

Fink is a package management system for OS X based on Debian Linux's apt-get system. Since it compiles packages from source, you'll need to have Apple's Developer Tools installed. The latest version of the developer tools will install the necessary compilers - GCC 4.0 and GCC 3.3.

Installation

Fink

You can download Fink at fink.sourceforge.net. Currently, the latest stable release is 0.8.0 for OS X 10.4, 0.7.2 for OS X 10.3 and 0.6.4 for OS X 10.2. You should download the latest available stable release for your platform.

Once you've installed Fink you should update it. Open a Terminal window and run:

sudo fink self-update

This will cause Fink to check for updates to itself, as well as download the latest package information. You may be asked to provide information about how Fink should be configured, and in most circumstances you'll be fine to just accept the defaults.

Apache2

We'll install Apache 2 with SSL support first. This will allow us to configure Subversion to work through secure http connections.

sudo fink install apache2-ssl

You'll be prompted by Fink to satisfy a virtual dependency:

fink needs help picking an alternative to satisfy a virtual dependency. The candidates:
(1)     apache2-ssl-mpm-worker: Apache2 Server Binary - [MPM WORKER]
(2)     apache2-ssl-mpm-perchild: Apache2 Server Binary - [MPM PERCHILD *EXPERIMENTAL*]
(3)     apache2-ssl-mpm-prefork: Apache2 Server Binary - [MPM PREFORK]
(4)     apache2-ssl-mpm-leader: Apache2 Server Binary - [MPM LEADER *EXPERIMENTAL*]
(5)     apache2-ssl-mpm-threadpool: Apache2 Server Binary - [MPM THREADPOOL *EXPERIMENTAL*]

Unless you have a preference in mind already, choose the default (1) to install MPM Worker. You may be prompted again to satisfy a second dependency:

fink needs help picking an alternative to satisfy a virtual dependency. The candidates:
(1)     db43-ssl: Berkeley DB embedded database
(2)     db43: Berkeley DB embedded database - non crypto

And again, unless you have a preference in mind, choose the default.

Fink will prompt you with the list of dependent packages that will be installed. Simply press enter to accept, and let Fink work its magic.

Next we'll install the mod_ssl module for Apache2, by executing:

sudo fink install libapache2-ssl-mod-ssl

When it's done, you'll be able to start Apache2 by executing:

/sw/sbin/apachectl start

And stop it using:

/sw/sbin/apachectl stop

Subversion

Installing Subversion with Fink is equally simple. svn-ssl installs the SVN server utilities, and svn-client installs the SVN client software; we'll install the SSL enabled versions of both these packages.

sudo fink install svn-ssl
sudo fink install svn-client-ssl

If you're prompted to satisfy dependencies, the defaults will usually do. Simply sit back and let Fink work its magic.

WebDAV

The final package we'll install is libapache2-ssl-mod-svn, which enables serving repositories using WebDAV.

sudo fink install libapache2-ssl-mod-svn

Configuration

SSL

Now that we have everything installed, we'll configure Apache2 for SSL support. Most of the work has already been done for us by Fink, but we still need to create and install our own self signed RSA certificate. See my guide to creating an apache2 SSL certificate to create a private key file and self signed public key certificate, and then do the following to install it into Apache2:

sudo mkdir /sw/etc/apache2/ssl.key
sudo mkdir /sw/etc/apache2/ssl.crt 
sudo cp ~/server.key /sw/etc/apache2/ssl.key/
sudo cp ~/server.crt /sw/etc/apache2/ssl.crt/

sudo chmod 0400 /sw/etc/apache2/ssl.key/server.key
sudo chmod 0400 /sw/etc/apache2/ssl.crt/server.crt

Now, when you start Apache, you'll be prompted for your private key's password; this is because the key is encrypted for security reasons. This can be a nuisance, but it's recommended that you keep it this way. If you decide not to, however, here are the steps to decrypt it so you're not prompted anymore:

cd /sw/etc/apache2/ssl.key
sudo cp server.key server.key.orig
sudo openssl rsa -in server.key.orig -out server.key
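If you're not sure whether a key file is still encrypted, the PEM header tells you. The following is a minimal Python sketch rather than part of the setup itself: the function name key_is_encrypted is made up for illustration, and it assumes the key is PEM text in either the traditional format (which carries a Proc-Type: 4,ENCRYPTED header) or the PKCS#8 format (which begins with BEGIN ENCRYPTED PRIVATE KEY).

```python
def key_is_encrypted(pem_text):
    """Return True if a PEM private key looks passphrase-protected.

    Only the textual markers are checked; the key itself is never parsed.
    """
    traditional = "Proc-Type: 4,ENCRYPTED" in pem_text
    pkcs8 = "-----BEGIN ENCRYPTED PRIVATE KEY-----" in pem_text
    return traditional or pkcs8


if __name__ == "__main__":
    # Shortened stand-ins for real key files; the base64 body is omitted.
    encrypted = (
        "-----BEGIN RSA PRIVATE KEY-----\n"
        "Proc-Type: 4,ENCRYPTED\n"
        "DEK-Info: DES-EDE3-CBC,0123456789ABCDEF\n"
    )
    plain = "-----BEGIN RSA PRIVATE KEY-----\nMIIC...\n"
    print(key_is_encrypted(encrypted), key_is_encrypted(plain))
```

To check a real file, read its contents and pass the text to the function; you'll need enough privileges to read the key.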

Creating SVN Repositories

Choose a location on your hard drive under which all your SVN repositories will reside. I'll use /opt/repositories, but the location really doesn't matter. We'll create a new "test" repository in this directory:

sudo mkdir -p /opt/repositories/test

sudo svnadmin create /opt/repositories/test

I like to set the file system permissions on it such that only Apache2 can write to it:

sudo chown -R www /opt/repositories/test
sudo chmod -R 0700 /opt/repositories/test

You should substitute the name of the user you run Apache2 as for "www".

WebDAV Access and Authentication

Finally, we'll enable WebDAV access to your SVN repository in Apache and set up user authentication. Add the following to your /sw/etc/apache2/ssl.conf file:

<Location /svn>
        DAV svn
        SVNParentPath /opt/repositories

        AuthType BASIC
        AuthName "Subversion Repository"
        AuthUserFile /sw/etc/apache2/svn-auth-file

        Require valid-user
</Location>

We'll then create the /sw/etc/apache2/svn-auth-file using htpasswd. You'll use this file to maintain the list of users and passwords that can access your repositories.

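sudo htpasswd -cm /sw/etc/apache2/svn-auth-file username

This will create a new user file and add the specified user to it; replace username with the account name you want, and you'll be prompted for its password. The -c flag creates the file (overwriting any existing one), so leave it off when adding more users later. You can use htpasswd to add, remove and edit users in this file as you see fit.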

Conclusion

You'll now have a secure SVN server accessible through Apache2 using WebDAV.

Creating an SSL Certificate using OpenSSL

Posted Nov 20, 2005 3:11:45 PM

I just wanted to post some quick instructions on creating a self signed certificate that you can install into Apache 2 for use with mod_ssl. It seems that these instructions are hard to come by, and I thought it would be useful to just show how to do it without the messy explanations:

mkdir ~/sslcert
cd ~/sslcert

openssl genrsa -des3 -out server.key 1024
openssl req -new -key server.key -out server.csr
openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt

server.key and server.crt are now your private key and self-signed public certificate pair. If you install them into Apache2, you'll notice that you're now prompted for your private key's password every time you start the server. This is because your private key is stored in an encrypted format for security. It's recommended that you leave it this way, but if you really hate that password prompt when starting Apache, here's how you can decrypt your private key file:

cd ~/sslcert
cp server.key server.key.orig
openssl rsa -in server.key.orig -out server.key

Apache Rewrite Rules

Posted Oct 18, 2005 8:04:00 PM

The Apache module mod_rewrite provides a powerful mechanism for hiding, redirecting, and reformatting request URLs. I just finished implementing a mod_rewrite scheme for timfanelli.com to accomplish 3 things:

  1. Redirect old URLs with a 301 redirect code
  2. Hide certain parts of the URL from my readers.
  3. Optimize my Google pagerank.

My first goal was to redirect old URLs using 301 redirect codes. I migrated to pyBlosxom a long time ago, and it recently came to my attention that not only were there links to my old URLs on other people's blogs, but Google was also turning up search results pointing to my old URLs! All of these references resulted in 404s, driving my pagerank down towards 0.

Using a simple rewrite rule, I was able to redirect my previous URLs, http://www.timfanelli.com/index.cgi, to a static page, old.html, which provides links to my new URL, http://www.timfanelli.com/cgi-bin/blog.cgi:

RewriteEngine on
RewriteRule ^/index.cgi(.*) /old.html [R=301]

Now, any link followed to my page that starts with "/index.cgi" is redirected, and a 301 is issued to the requesting client indicating that the resource has been permanently relocated.

My second goal was to hide the /cgi-bin/blog.cgi portion of my URL. It's ugly and it's hard to remember. I wanted any request sent to http://www.timfanelli.com/blog/ to go directly to that CGI script. Using a passthrough rule and a 301 redirect accomplished this nicely:

RewriteRule ^/blog/(.*)$ /cgi-bin/blog.cgi/$1 [PT]
RewriteRule ^/$ /blog/ [R=301]
RewriteRule ^/blog$ /blog/ [R=301]

The first rule passes any request sent to /blog/ through to /cgi-bin/blog.cgi/ without the client seeing the change. Any extra characters in the URL string are copied into the rewritten URL using a regular expression group. The second rule causes a 301 redirect from my base URL to the blog, and the third causes a 301 redirect if the URL is missing the trailing / character. We use a 301 redirect here instead of another passthrough rule to prevent having multiple "valid" URLs with the same content.

Having multiple "valid" URLs with the same content isn't in and of itself a problem. Your website would work just fine, but I also wanted to optimize my site for Google pagerank. To this end, the astute reader will have noticed that there are now two ways to access my site: http://www.timfanelli.com/blog and http://www.timfanelli.com/cgi-bin/blog.cgi. We need to hide the /cgi-bin/blog.cgi URL from the outside world. This gets a little tricky, because we can't just redirect /cgi-bin/blog.cgi to /blog/ -- this would cause an infinitely recursive rewrite, because /blog/ rewrites to /cgi-bin/blog.cgi! We'll still use this rewrite rule, but we'll protect it with a RewriteCond clause so it's only evaluated when it matches the original request URL:

RewriteCond %{IS_SUBREQ} false
RewriteRule ^/cgi-bin/blog.cgi(.*)$ /blog/$1 [R=301]

IS_SUBREQ is "true" if the rule is being processed as a sub request of the original, and "false" otherwise. So when it's matching the user-entered URL, it is not a sub request, and the rewrite rule replaces /cgi-bin/blog.cgi with /blog/. This is done with a 301 redirect, so Google won't see it as a valid URL. Later, when the rewrite engine rewrites /blog/ to /cgi-bin/blog.cgi, IS_SUBREQ is going to be "true", and this rule won't be executed again.

So now the only valid way to access my site from the "outside" is via the URL http://www.timfanelli.com/blog/, even though all of the following URLs will appear to work as well:

  • http://www.timfanelli.com/ (no "/blog/")
  • http://www.timfanelli.com/blog (no trailing slash)
  • http://www.timfanelli.com/cgi-bin/blog.cgi
  • http://www.timfanelli.com/cgi-bin/blog.cgi/
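The whole flow can be traced with a small Python model. Treat it as a sketch of the behaviour described above rather than of mod_rewrite itself: the function names handle and follow are made up, the first matching rule is treated as final, and a pass-through is modelled as the internal step during which IS_SUBREQ is "true". The model's last pattern also allows an optional slash after blog.cgi, so that /cgi-bin/blog.cgi/ maps to /blog/ rather than /blog//.

```python
import re


def handle(path, is_subrequest=False):
    """Apply the rules once and return an (action, target) tuple."""
    if re.match(r"^/index\.cgi(.*)", path):
        return ("redirect 301", "/old.html")
    match = re.match(r"^/blog/(.*)$", path)
    if match:
        return ("passthrough", "/cgi-bin/blog.cgi/" + match.group(1))
    if re.match(r"^/$", path) or re.match(r"^/blog$", path):
        return ("redirect 301", "/blog/")
    match = re.match(r"^/cgi-bin/blog\.cgi/?(.*)$", path)
    if match and not is_subrequest:
        # Guarded rule: only fires for URLs the client actually asked for.
        return ("redirect 301", "/blog/" + match.group(1))
    return ("serve", path)


def follow(path):
    """Follow redirects and pass-throughs until something is served."""
    steps = []
    is_subrequest = False
    for _ in range(10):  # safety net against an accidental rewrite loop
        action, target = handle(path, is_subrequest)
        steps.append((action, target))
        if action == "serve":
            return steps
        # A 301 makes the client send a fresh request; a pass-through
        # stays inside the server.
        is_subrequest = action == "passthrough"
        path = target
    raise RuntimeError("rewrite loop detected")


if __name__ == "__main__":
    for url in ("/", "/blog", "/cgi-bin/blog.cgi", "/cgi-bin/blog.cgi/"):
        print(url, "->", follow(url))
```

Each of the four URLs in the list first gets a 301 to /blog/, and only then is the CGI script served through the internal pass-through. Dropping the is_subrequest check from handle makes follow raise its loop error, which is the recursion the RewriteCond is there to prevent.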

Many thanks to Pete for all his help!

About

My name is Tim Fanelli, and I am a software engineer in Northern NY. I spend most of my time working, and when I can, I try to post interesting things here.
