Time flies by!
I’m sorry about the lack of updates lately. Nothing much has happened… But that doesn’t mean that I don’t have anything to write!
I’m still playing with FreeNet, and it’s getting better and better all the time. I’ve now been running my computer as a node in the network for about three weeks, and I’m pretty well connected by now.
Apart from FreeNet, I’ve spent some time on fixing [PhpWiki][] so that I
can export a whole WikiWikiWeb, complete with pages, images, and
style-sheets. There was already support for exporting the pages as XHTML
pages, but the links to the images and stylesheets were left as-is, that
is, pointing to the installation directory. My version of PhpWiki uses
relative links to a files directory which contains all external images
and style-sheets needed for the site to function. It’s a bit of a hack,
but it works — take a look at the website for The Danish National
Research Foundation: Center for Catalysis which is a site I’ve made
using PhpWiki. You’ll recognize the RecentChanges page and all the PhpWiki
documentation if you look around a little :-)
One of my plans is to export GimpsterDotCom as a set of static XHTML pages, and then insert it into FreeNet. I think getting a relatively big site like GimpsterDotCom, with its nearly 298 pages (see AllPages for a list), into the network would be a good thing, since most FreeSites out there consist of just a single (often very big) page. The downside of having such a large number of small pages is that many of them will drop out of FreeNet if they’re not requested often enough: the LeastPopular pages are simply the ones that are removed first to make room for new content. So FreeNet is not about permanent storage. It’s a more democratic system where everybody can publish anything, and where the popular stuff (whatever that might be) spreads to many nodes.
But I’ll let you know when you can find GimpsterDotCom in FreeNet. Last
time I tried to insert it with Fishtools (which is mirrored on the normal
Internet), it couldn’t verify the inserted pages. I think I’ve narrowed
it down to my use of = in filenames. The =’s appear when I
encode the names of the pages as Quoted-Printable when exporting from
PhpWiki. The page names have to be encoded in some way, as they can contain
all sorts of strange characters, or at least the full ISO-8859-1 (also
known as Latin-1) character set, which includes all the normal accented
characters we use in Western Europe.
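To show where the = comes from, here is a small sketch in Python rather than PHP. It uses the standard quopri module, and the page name is a made-up example, not one from the real site. PhpWiki's own export code may well do this differently.

```python
import quopri

# Hypothetical page name with Latin-1 characters (not from the real site)
page_name = "Blåbærgrød"

# Quoted-Printable works on bytes, so encode the name as ISO-8859-1 first
raw = page_name.encode("iso-8859-1")
encoded = quopri.encodestring(raw).decode("ascii")
print(encoded)  # each non-ASCII byte shows up as "=" plus two hex digits

# Decoding gives back the original page name
decoded = quopri.decodestring(encoded.encode("ascii")).decode("iso-8859-1")
print(decoded == page_name)
```

Every accented character turns into a = followed by two hex digits, which is how the = ends up in the exported filenames.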
At first the page names were encoded with PHP’s urlencode function, but this
gave problems when viewing exported pages on a webserver. It’s perfectly
fine to have a file called foo%2Fbar.html on the server; this is a valid
filename in Linux. But when you ask the webserver for it using a browser,
it will (correctly) interpret the %2F in the URL given by the browser as
the character with ASCII value 0x2F, which is a /, and therefore look for
bar.html in the foo directory. There’s no such file, so it returns the
dreaded “404 Not Found” error to the user. And even if we created a foo
directory and moved foo%2Fbar.html to foo/bar.html, all relative
links in the new bar.html would have to be changed, because bar.html
has moved relative to the other pages.
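The mismatch can be reproduced in a few lines of Python. This is only a sketch: urllib.parse.quote with safe="" stands in for PHP's urlencode (the two agree on how they treat /), and unquote stands in for the decoding a webserver does on the request path.

```python
from urllib.parse import quote, unquote

# Hypothetical page name containing a slash
page_name = "foo/bar"

# Percent-encode every reserved character, including "/"
file_name = quote(page_name, safe="") + ".html"
print(file_name)  # foo%2Fbar.html

# Stand-in for the webserver decoding the request path before the lookup
requested_path = unquote(file_name)
print(requested_path)  # foo/bar.html
```

The file on disk is called foo%2Fbar.html, but the server ends up looking for foo/bar.html, hence the 404.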
The links themselves could instead be rewritten, so that the browser would
use the URL foo%252Fbar.html in the request. The webserver will then
decode the URL into foo%2Fbar.html and find this file, because %25 is
interpreted as %. The problem with this is that the links no longer
work when you’re viewing the site offline from your hard disk, because
then no one will translate the %25 into the required % :-(
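Here is the same idea as a Python sketch. Again quote and unquote are only stand-ins for what the export script and the webserver would do, and the offline case simply assumes (as described above) that nothing decodes the link before the file is looked up.

```python
from urllib.parse import quote, unquote

# Name of the exported file as it exists on disk
file_name = "foo%2Fbar.html"

# Encode the "%" itself when writing the link into the page
link = quote(file_name, safe="")
print(link)  # foo%252Fbar.html

# Online: the webserver decodes the link once and finds the file
server_lookup = unquote(link)
print(server_lookup == file_name)  # True

# Offline (assumption: no decoding at all): the raw link is used as the name
offline_lookup = link
print(offline_lookup == file_name)  # False
```

So the doubly encoded link only works when exactly one layer of decoding happens between the link and the filesystem.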
The net result is that we shouldn’t use PHP’s urlencode function to encode
the page names. This function is meant for encoding arguments passed in a GET
request, but it’s a mess to use it for filenames. Using another
encoding like Quoted-Printable works around this problem, for no webserver
will do Quoted-Printable decoding on the URL before it looks for the file
in the filesystem. But then there’s the problem with some tools that
cannot handle = in filenames… I think I’ll just replace the = with
another character like _ or -, but I haven’t done this
yet…
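A rough sketch of what that replacement could look like, in Python. The function encode_page_name and its set of safe characters are my own invention for illustration, not anything from PhpWiki, and the encoding is only Quoted-Printable-like: every byte outside the safe set becomes the escape character plus two hex digits.

```python
import string

# Bytes that are kept as-is (an assumption: letters, digits, "-" and ".")
SAFE = set((string.ascii_letters + string.digits + "-.").encode("ascii"))

def encode_page_name(name: str, escape: str = "=") -> str:
    """Encode a page name for use as a filename, Quoted-Printable style."""
    out = []
    for byte in name.encode("iso-8859-1"):
        # The escape character itself is always encoded, so that the
        # result can still be decoded unambiguously
        if byte in SAFE and chr(byte) != escape:
            out.append(chr(byte))
        else:
            out.append(f"{escape}{byte:02X}")
    return "".join(out)

print(encode_page_name("foo/bar"))          # foo=2Fbar
print(encode_page_name("foo/bar", "_"))     # foo_2Fbar
print(encode_page_name("Blåbærgrød", "_"))  # Bl_E5b_E6rgr_F8d
```

Since neither = nor _ has any special meaning in a URL path, a webserver leaves such names alone. Whichever escape character is chosen has to be encoded itself when it occurs in a page name, which is why the sketch never treats it as safe.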