Time flies by!

I’m sorry about the lack of updates lately. Nothing much has happened… But that doesn’t mean that I don’t have anything to write!

I’m still playing with FreeNet, and it’s getting better and better all the time. I’ve now been running my computer as a node in the network for about three weeks, and I’m pretty well connected by now.

Apart from FreeNet, I’ve spent some time fixing [PhpWiki][] so that I can export a whole WikiWikiWeb, complete with pages, images, and style-sheets. There was already support for exporting the pages as XHTML, but the links to the images and style-sheets were left as-is, that is, pointing to the installation directory. My version of PhpWiki uses relative links to a files directory which contains all the external images and style-sheets needed for the site to function. It’s a bit of a hack, but it works: take a look at the website for The Danish National Research Foundation: Center for Catalysis, which is a site I’ve made using PhpWiki. You’ll recognize the RecentChanges page and all the PhpWiki documentation if you look around a little :-)

One of my plans is to export GimpsterDotCom as a set of static XHTML pages, and then insert it into FreeNet. I think getting a relatively big site like GimpsterDotCom, with its nearly 298 pages (see AllPages for a list), into FreeNet would be a good thing for the network, since most FreeSites out there consist of just a single (often very big) page. The downside of having such a large number of small pages is that many of them will drop out of FreeNet if they’re not requested often enough. The LeastPopular pages are simply the ones that are removed first to make room for new content. So FreeNet is not about permanent storage; it’s a more democratic system where everybody can publish everything, and where the popular stuff (whatever that might be) spreads to many nodes.

But I’ll let you know when you can find GimpsterDotCom in FreeNet. Last time I tried to insert it with Fishtools (which is mirrored on the normal Internet), it couldn’t verify the inserted pages. I think I’ve narrowed it down to my use of = in filenames. The = characters appear when I encode the names of the pages as Quoted-Printable when exporting from PhpWiki. The pagenames have to be encoded in some way, as they can contain all sorts of strange characters, or at least the full ISO-8859-1 (also known as Latin-1) character set, which includes all the normal accented characters we use in Western Europe.
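To make the = problem concrete, here is a small sketch in Python (not the PHP code PhpWiki actually uses) of what Quoted-Printable does to a Latin-1 pagename. The helper name `encode_pagename` and the sample pagename are made up for illustration, and I assume names short enough that no soft line breaks are inserted.

```python
import quopri

def encode_pagename(pagename: str) -> str:
    """Encode a pagename as Quoted-Printable (hypothetical helper).

    Assumes the name fits in Latin-1 and is shorter than 76 characters,
    so the encoder never inserts a soft line break.
    """
    raw = pagename.encode("latin-1")
    return quopri.encodestring(raw).decode("ascii")

# Plain ASCII names pass through untouched...
print(encode_pagename("RecentChanges") + ".html")  # RecentChanges.html
# ...but every accented character becomes =XX, which puts = in the filename.
print(encode_pagename("Smørrebrød") + ".html")     # Sm=F8rrebr=F8d.html
```

So any page with an accented character in its name ends up as a file with one or more = signs in it.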

At first the pagenames were encoded by PHP’s urlencode function, but this gave problems when viewing exported pages on a webserver. It’s perfectly fine to have a file called foo%2Fbar.html on the server; this is a valid filename in Linux. But when you ask the webserver for it, using a browser, it will (correctly) interpret the %2F in the URL given by the browser as the character with ASCII value 0x2F, which is a /, and therefore look for bar.html in the foo directory. There’s no such file, so it returns the dreaded “404 Not Found” error to the user. And even if we created a foo directory and moved foo%2Fbar.html to foo/bar.html, all relative links in the new bar.html would have to be changed, because bar.html has moved relative to the other pages.
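The mismatch can be reproduced in a few lines of Python. This is only a sketch: `urllib.parse.quote` stands in for PHP’s urlencode (the two differ in details such as how spaces are handled), and `unquote` stands in for the decoding a webserver does before it touches the filesystem.

```python
from urllib.parse import quote, unquote

# Exporting: the pagename "foo/bar" is percent-encoded into a filename.
filename = quote("foo/bar", safe="") + ".html"
print(filename)   # foo%2Fbar.html

# Serving: the webserver decodes the request path before looking on disk,
# so it ends up searching for a different path than the one we wrote.
requested = unquote(filename)
print(requested)  # foo/bar.html
```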

The links themselves could instead be rewritten, so that the browser would use the URL foo%252Fbar.html in the request. The webserver will now decode the URL into foo%2Fbar.html and find this file, because %25 is interpreted as %. The problem with this is that the links no longer work when you’re viewing the site offline from your hard disk, because then no one will translate the %25 into the required % :-(
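As a sketch of the link rewriting, again using Python’s `urllib.parse` as a stand-in for whatever the exporter and the webserver really do: escaping the name a second time turns the % itself into %25, and a single round of decoding then gives back the name of the file on disk.

```python
from urllib.parse import quote, unquote

on_disk = "foo%2Fbar.html"     # the file as it exists on the server

# Escape the already-encoded name once more to build the link target.
link = quote(on_disk, safe="")
print(link)                    # foo%252Fbar.html

# One round of decoding (what the webserver does) recovers the filename.
print(unquote(link))           # foo%2Fbar.html
```

Anything that opens the link without that one round of decoding will look for a file literally called foo%252Fbar.html, which doesn’t exist.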

The net result is that we shouldn’t use PHP’s urlencode function to encode the pagenames. This function is meant for encoding arguments passed in a GET request, and it’s a mess to use it for filenames. Using another encoding like Quoted-Printable works around this problem, for no webserver will do Quoted-Printable decoding on the URL before it looks for the file in the filesystem. But then there’s the problem with some tools that cannot handle = in filenames… I think I’ll just replace the = with another character like _ or -, but I haven’t done this yet…
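Here is one way the replacement could look, sketched in Python rather than in PhpWiki’s PHP. The function names are hypothetical, and escaping literal underscores as =5F first is my own assumption: without it, a pagename that already contains _ could not be told apart from an encoded one.

```python
import quopri

def pagename_to_filename(pagename: str) -> str:
    """Quoted-Printable with _ instead of = (hypothetical helper).

    Assumes Latin-1 names shorter than 76 characters (no soft line breaks).
    """
    qp = quopri.encodestring(pagename.encode("latin-1")).decode("ascii")
    # Escape literal underscores first, so every _ in the result is a marker.
    return qp.replace("_", "=5F").replace("=", "_")

def filename_to_pagename(filename: str) -> str:
    """Reverse the mapping: turn _ back into = and decode."""
    qp = filename.replace("_", "=")
    return quopri.decodestring(qp.encode("ascii")).decode("latin-1")

print(pagename_to_filename("Smørrebrød"))      # Sm_F8rrebr_F8d
print(pagename_to_filename("a_b"))             # a_5Fb
print(filename_to_pagename("Sm_F8rrebr_F8d"))  # Smørrebrød
```

The resulting filenames contain neither % nor =, so neither the webserver nor the picky tools should have anything to trip over.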
