Archive for the ‘Computing’ Category.

Securing my data

My machine is now running with /home on my RAID-1 mirror. When I booted my machine with one of the drives turned off (by removing the power cable) it didn’t make a fuss. Putting the “faulty” disk back online was a simple matter of adding it back to the array. The RAID then resynched and was back to normal status after 40 minutes. Pretty cool!

I ran bonnie++ to test the read performance on my regular /dev/sda drive and on the /dev/md0 RAID mirror. The results from my normal disk:

------Sequential Output------
-Per Chr- --Block-- -Rewrite-
K/sec %CP K/sec %CP K/sec %CP
33601  94 49338  15 17551   4
--Sequential Input- --Random-
-Per Chr- --Block-- --Seeks--
K/sec %CP K/sec %CP  /sec %CP
16533  46 42633   6 184.3   0
------Sequential Create------
-Create-- --Read--- -Delete--
 /sec %CP  /sec %CP  /sec %CP
22150  87 +++++ +++ 22595  99
--------Random Create--------
-Create-- --Read--- -Delete--
 /sec %CP  /sec %CP  /sec %CP
21485  87 +++++ +++ 21210  99

compared to the results from my RAID mirror:

------Sequential Output------
-Per Chr- --Block-- -Rewrite-
K/sec %CP K/sec %CP K/sec %CP
32071  91 54533  16 24885   6
--Sequential Input- --Random-
-Per Chr- --Block-- --Seeks--
K/sec %CP K/sec %CP  /sec %CP
22946  62 52236   6 382.1   0
------Sequential Create------
-Create-- --Read--- -Delete--
 /sec %CP  /sec %CP  /sec %CP
26386  92 +++++ +++ 24380  99
--------Random Create--------
-Create-- --Read--- -Delete--
 /sec %CP  /sec %CP  /sec %CP
27755  99 +++++ +++ 22607  99

The performance is better in almost all areas; the exceptions are per-character output, which dipped slightly, and CPU utilization, which is a tad higher. The block read performance went up from 42 MiB/s to 52 MiB/s, an increase of roughly 23%. I expected an increase, but it could have been bigger considering that the read requests are balanced over the two drives. But then again, the main goal of the RAID was to make sure that my data will be kept safe, so the increase in performance is just an added bonus!

Even the write performance went up, from 49 MiB/s to 54 MiB/s. This is a bit strange, since every RAID Howto I’ve read explains how the write performance should drop using RAID-1. This is because each block is put on the bus twice, once for each disk. But who am I to complain? :-)
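To double-check the percentages, here is a small Python sketch that recomputes the relative changes from the block-level and seek figures in the two bonnie++ tables above. The helper name `percent_change` and the dictionary labels are made up for this illustration; only the numbers come from the tables.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures copied from the bonnie++ tables above
# (K/sec for block I/O, seeks per second for random seeks).
single_disk = {"block write": 49338, "block read": 42633, "random seeks": 184.3}
raid_mirror = {"block write": 54533, "block read": 52236, "random seeks": 382.1}

for name in single_disk:
    change = percent_change(single_disk[name], raid_mirror[name])
    print(f"{name}: {change:+.1f}%")
```

Run against the unrounded figures, the block read gain comes out a little lower than what the rounded MiB/s values suggest, while random seeks roughly double, which is probably where the balancing of reads over two drives shows most clearly.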

With my data backed by two disks I’m feeling fairly safe on that front. Of course my computer could still be stolen, hit by lightning, or the data could simply be deleted. To protect against the latter I’ve installed dirvish to take backups of /home to my normal disk.

These backups are made daily, and rotated so that I have images for the last two weeks. The nice feature of dirvish is that the backups are live — they exist on my disk as a normal filesystem tree.

Normally it would require a hideous amount of space to keep two weeks’ worth of full backups, but since dirvish only uses space for new and changed files it should do just fine with my 80 GB disk. The trick is using hard links for files which haven’t changed between backups. Hard links take up almost no space: each one is just another directory entry pointing at the same inode. The directories in every snapshot do need inodes of their own, but ReiserFS (which is the filesystem I use) allocates inodes dynamically as needed, so I won’t suddenly run out of them.
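The hard-link trick can be illustrated with a short Python sketch. This is not dirvish's code (dirvish hands the real work to rsync); the `snapshot` and `demo` functions, the file names, and the flat single-directory layout are all assumptions made to keep the example small.

```python
import filecmp
import os
import shutil
import tempfile


def snapshot(source, previous, target):
    """Back up the regular files in `source` into a new directory `target`.

    Files identical to the copy in `previous` are hard-linked instead of
    copied, so they use no extra data blocks. Flat directories only.
    """
    os.makedirs(target)
    for name in sorted(os.listdir(source)):
        src = os.path.join(source, name)
        new = os.path.join(target, name)
        old = os.path.join(previous, name) if previous else None
        if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
            os.link(old, new)       # unchanged: another name for the same inode
        else:
            shutil.copy2(src, new)  # new or changed: a real copy


def demo():
    """Take two snapshots of a toy home directory; report what is shared."""
    with tempfile.TemporaryDirectory() as root:
        home = os.path.join(root, "home")
        os.makedirs(home)
        for name, text in [("a.txt", "unchanged"), ("b.txt", "version 1")]:
            with open(os.path.join(home, name), "w") as fh:
                fh.write(text)

        day1 = os.path.join(root, "day1")
        day2 = os.path.join(root, "day2")
        snapshot(home, None, day1)

        # Edit one file between the two backups.
        with open(os.path.join(home, "b.txt"), "w") as fh:
            fh.write("version 2 (edited)")
        snapshot(home, day1, day2)

        shared = os.path.samefile(os.path.join(day1, "a.txt"),
                                  os.path.join(day2, "a.txt"))
        separate = not os.path.samefile(os.path.join(day1, "b.txt"),
                                        os.path.join(day2, "b.txt"))
        return shared, separate


print(demo())
```

Both snapshots look like complete directory trees, yet the unchanged file exists only once on disk; deleting the older snapshot simply drops one of its names.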

So with my data spread over no less than three disks I can sleep with ease at night :-)

The RAID is resyncing…

The hard drives have arrived and are now in my machine as a RAID-1 mirror. The mirror is currently resyncing at 50 MB/sec, but I can still work on it — pretty cool!

The original plan was to use the two new disks in a mirror and reinstall Debian on it. But now I’m thinking about simply moving /home to the RAID and leaving the rest of the system alone.

That would make it extremely easy for me to upgrade and reinstall the system since I would have my important data stored safely away on the RAID. And I don’t have to fiddle around with reinstalling my system right now; I had hoped that Debian Sarge would have been released by now, but they are still working on it.

Unfortunately I won’t see any speed gains when I start, say, Emacs, or any other program that isn’t located on the RAID. So I’m thinking about moving more than just /home to it. According to the FHS (which Debian follows), I should be able to move at least /usr to it since it is read-only data.

Begone evil spam!

Just three days after installing WordPress my blog has been spammed. I guess that shows how quickly the blog has been integrated into the bigger blogging community, thanks to the links from Janus and Kristian and the automatic pinging of several other sites by Ping-O-Matic.

So the world is tuned in… that would be all very well if it weren’t for those pesky spammers! I’ve had seven spam comments already, and it didn’t look like it was going to stop. But with help from Spam Karma I hope to get rid of them. It checks each comment in several ways and deletes it if it determines that it is definitely spam. Good comments are let through immediately. Comments where Spam Karma is in doubt will be subjected to a captcha test where you have to recognize some letters in an image. That is a really cool feature which should ensure that no automated program can post comments, while still allowing access to real people.

Provided that the commenter gave a valid email address when submitting the comment, he or she can also verify themselves using that. That also means that people with bad sight can get their comments approved even if they cannot use the captcha test.

Visiting Aalborg

I’m visiting my family in Aalborg for a couple of days. When I go to Switzerland next month they won’t be just an hour’s drive away anymore :-)

The plan for this week is to pack up the stuff in my room and move it up here to my parents’ place. I’ll bring some of it with me to Switzerland, but only the most important stuff. Of course that includes my computer!

Speaking of my computer… I’ve ordered two Seagate 7200.7 SATA NCQ disks, each with 120 GB. The plan is to combine them into a RAID 1 mirror so that I will have a reliable place for my photos and other data. In Skejbygård I could just copy my data to my friend Svend’s computer, but that won’t work anymore when I’m in Switzerland. And besides: I’ve wanted a RAID system for quite some time now, and I believe that the Debian installer can install on a RAID 1 system now.

Parse errors…

While trying to put my PHP tutorial back online I have just spent at least half an hour fighting the markup language! The problem is the lack of tables in [Markdown][].

I wanted a table listing the operators in one column and their meaning in another column. A quite simple task, one would think, and something which I was able to do in PhpWiki with only a minor problem: I could not use || when describing the or operator, since || is also used in the definition of the table itself.

With Markdown I had to write the table myself. No problem, I’ve written lots of tables by hand! But no, every time I saved the page the closing table tag somehow disappeared. This messed the page up quite severely.

Next idea: make the table in plain ASCII. It won’t be as nice, but it ought to work. But no, even within a code block, where the lines are interpreted literally, I got into trouble. The line with the < ate the following spaces.

These things are what I hate the most with all those PHP content-management systems: they are all so fragile! The parsing done in [Markdown][] is based on regular expressions, and so is the parsing in all other systems I’ve seen. This just doesn’t work reliably — my experience is that you either get strange results like I did, or that you get silly limitations, or you end up with both.

The limitation I’m talking about is that in PhpWiki you cannot apply formatting markup to a link. A quick test shows that it does work with Markdown.

I believe that using stronger tools for the parsing would help with these problems. In particular, defining a proper grammar and writing a lexer and parser would make things more robust. When people submit a comment with parse errors it would be up to the compiler to flag them as such. It won’t be easy to make a compiler with good error recovery for such a system.

But if it were done, then we would in effect have a system of writing valid [XHTML][] without all the tags — that is a worthy goal! And given such a precise understanding of the structure of the text, one could easily convert it into all sorts of interesting formats such as [LaTeX][] (for later conversion into good-looking PDFs) and ASCII (for inclusion in README files and such).
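To make the grammar idea a bit more concrete, here is a minimal sketch in Python (used purely for illustration, since it is not one of the systems discussed here). It assumes a toy markup with only two constructs, `*emphasis*` and `[link text]`, which is far smaller than Markdown or PhpWiki syntax. The grammar, the token names, and the functions `tokenize`, `parse` and `to_xhtml` are all invented for this example, and link targets are left out entirely.

```python
import html
import re

# Toy grammar (an assumption for this sketch, not Markdown's real rules):
#   inlines := (TEXT | emph | link)*
#   emph    := '*' inlines '*'
#   link    := '[' inlines ']'
TOKEN_RE = re.compile(
    r"(?P<STAR>\*)|(?P<LBRACKET>\[)|(?P<RBRACKET>\])|(?P<TEXT>[^*\[\]]+)"
)


class ParseError(Exception):
    """Raised when the input does not match the toy grammar."""


def tokenize(source):
    """Lexer: turn the source into (kind, text, offset) tokens."""
    return [(m.lastgroup, m.group(), m.start()) for m in TOKEN_RE.finditer(source)]


class Parser:
    def __init__(self, source):
        self.tokens = tokenize(source)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expect(self, kind, opened_at):
        if self.peek() != kind:
            raise ParseError(f"expected {kind} to close construct opened at offset {opened_at}")
        self.advance()

    def parse_inlines(self, stop):
        """Parse inline nodes until the next token kind is in `stop`."""
        nodes = []
        while self.peek() not in stop:
            kind, text, offset = self.advance()
            if kind == "TEXT":
                nodes.append(("text", text))
            elif kind == "STAR":
                children = self.parse_inlines({"STAR", "EOF"})
                self.expect("STAR", offset)
                nodes.append(("emph", children))
            elif kind == "LBRACKET":
                children = self.parse_inlines({"RBRACKET", "EOF"})
                self.expect("RBRACKET", offset)
                nodes.append(("link", children))
            else:
                raise ParseError(f"unexpected ']' at offset {offset}")
        return nodes


def parse(source):
    return Parser(source).parse_inlines({"EOF"})


def to_xhtml(nodes):
    """Render the parse tree; text is escaped, link targets are omitted."""
    tags = {"emph": "em", "link": "a"}
    parts = []
    for kind, value in nodes:
        if kind == "text":
            parts.append(html.escape(value))
        else:
            parts.append(f"<{tags[kind]}>{to_xhtml(value)}</{tags[kind]}>")
    return "".join(parts)


print(to_xhtml(parse("see *[the docs]* for details")))
```

Because the output is generated from a tree, a closing tag cannot silently go missing, and an input such as `an *unclosed emphasis` raises a `ParseError` instead of producing broken markup. A second renderer over the same tree could emit LaTeX or plain ASCII.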

Looking at the source code for the [GNU][] Flex and Bison tools one sees that they are not exactly trivial to reimplement in PHP — far from it. But I still hope that they will some day either support PHP natively, or that we get another lexer/parser framework for PHP.