Archives for May 2008

Implementing HAT-Tries in Java

As I said in an earlier post detailing the CCHashTable, it formed a part of a larger data structure, the HAT-Trie. The HAT-Trie is a recent variant on the trie data structure, published by Askitis and Sinha in last year’s Australasian conference on Computer science (the original paper is here). The problem they were addressing was that while the trie is very fast, it’s also very expansive in memory; and while some variants of it like the radix trie (or Patricia trie) and the related burst-trie, are less expensive in terms of memory space, they aren’t as fast as they could be because they don’t take into account caches in their design.

Consider for a moment, the standard trie node, with its array of pointers to all possible following characters. Think of a string of these nodes, representing a character string of some type, and how the terminal node then stores a link to a piece of information which the overall character string is indexing to (there are obviously a lot of applications for such a data structure). The obvious problem is that the longer the string, the more nodes, and the nodes are large – even for standard ascii, that’s 128 pointers per node, which means that to store ten characters, you’re looking at 512 bytes. And few strings used in such applications are that short – certainly in the application I was looking at originally, we were looking at strings ten times that size at least.

The burst-trie, the radix trie (and it’s predecessor the compact trie) all attempt to solve this problem through similar methods involving compressing parts of the trie. The compact trie looks at all the branches of the trie that lead straight to a terminal node and represents it with a single … Read the rest

Ubuntu upgrades and fundamental problems

New job, new machine. So the last two days I’ve been setting up hardware platforms for work (one linux desktop machine, one win32 laptop, and some cool toys like an iPod Touch and a Nokia N800 – there are more to come, but it’s more than enough to start with). I’ve been using Kubuntu for work for a few years, and Xubuntu on my ancient (and now breaking) laptop at home (an Inspiron 1100, which should tell you why it’s xubuntu and not kubuntu there as well). So, no-brainer, other people have had no problems so just download the latest ISO and do a quick install, right?

Well, yes and no.

 … Read the rest