Archive for the ‘Programs’ Category

dudders and reliable DNS zone updates

Monday, August 23rd, 2010

I’ve released a new version of dudders, 1.04, and finally submitted it as a package to OpenWRT. The focus of this release was on making the update more robust to network failure as the result of an email correspondence with Peter Holik. I am of the opinion that DNS UPDATE is a strong candidate for being TCP by default (along with zone-transfers).

In RFC 1123 it is stipulated that:

a DNS resolver or server that is sending a non-zone-transfer query MUST send a UDP query first.

However, if you are doing a DNS UPDATE you really want the reliability that TCP offers, even if you don’t expect truncation to be an issue. The update is sent to the relevant authority server, so the arguments about load on root servers in the RFC aren’t applicable.

I’ve made the UDP implementation retry by default, but I think if you need more than 2 retries, you should be considering using TCP with its (much more advanced) retransmission algorithms.

Peter also found a bug in glibc’s res_send (actually in their send_dg function) whereby the resolver interprets the lack of the DNS “recursion available” flag in the header as an error. However, that flag isn’t even meaningful for DNS UPDATE responses; according to RFC 2136, those bits:

Should be zero (0) in all requests and responses. A non-zero Z field should be ignored by implementations of this specification.

As a result, glibc was setting errno to ECONNREFUSED or ETIMEDOUT even when the update was successful. I’ve hacked dudders to double-check after res_send, but it’s making me question the wisdom of using res_send at all, given that I’m constantly working around it.

Update: submitted glibc bug report #11950

To get dudders-1.04 on OpenWRT, simply update the official package feed and select dudders from the Net > DNS > dudders menu in the buildroot config. For systems other than OpenWRT, you can grab the source from sourceforge, or even github.

Improved Aho-Corasick in Haskell

Sunday, August 1st, 2010

I was disappointed at the poor performance of my last attempt at string matching in Haskell, so I decided to rewrite it using the high-performance Data.Map and Data.Array data structures. This makes the semantics a lot closer to an imperative C implementation. The result isn’t so elegant, but it is predictable due to guarantees on the asymptotic performance of the array and map functions. It’s also fast.

This highlights one of the big problems in writing fast Haskell: the elegant, recursive, list-based style you learnt is unpredictable in terms of time complexity. You know not to use ++ where cons would do, and more generally aim for tail recursion; but when you need to decide what should be boxed/unboxed/strict/lazy it’s very frustrating. Without being very familiar with what reduction strategy the compiler used, it was unclear what was actually going on with my previous code until I pulled out the profiler.

(more…)

Aho-Corasick string matching in Haskell

Sunday, July 18th, 2010

The Aho-Corasick string matching algorithm constructs an automaton for matching a dictionary of patterns. When applied to an input string, the automaton’s time complexity is linear in the length of the input, plus the number of matches (so at worst quadratic in the input). It’s been around since 1975, but it isn’t implemented in the Haskell stringsearch library and I couldn’t even find a general trie data structure from google. So I implemented the Aho-Corasick algorithm myself: take a look at the full Aho-Corasick module.

There was an interesting paper on deriving the algorithm as a result of applying fully-lazy evaluation and memoization on a more naive algorithm. Unfortunately, applying fully-lazy evaluation and memoization to a function in Haskell is non-trivial (despite it being theoretically possible for the compiler to do so!).

It’s always interesting trying to find the functional equivalent to an imperative algorithm. I ended up using some cute Haskell tricks.

Update: I’ve written an improved version of Aho-Corasick implemented with Data.Array and Data.Map

(more…)

Muxic Beta 2, XMMS2 0.6 Macport

Saturday, June 13th, 2009

I finally got around to updating Muxic to the new xmms2-0.6DrMattDestruction API. There were a few changes to the xmms2 clientlib that needed attention.

While I was doing this, I noticed that the xmms2 CF clientlib isn’t just superficially unusable, but will rev up your CPU something awful if the xmms2d drops its connection after registration of the clientlib CF runloop event source. It needs a shutdown function, and that shutdown function needs some connection-specific CF variables from the init function. Hence we can’t use the API in its current state (unless there’s some way to inject mainloop-specific state into the xmmsc_connection_t that I don’t know about).

I’ve reported it and presented a patch as XMMS2 bug #2232, but essentially it means that the libxmmsclient-cf that currently comes with xmms2 is very bugged.

Update: patch accepted into xmms2-devel.git.

Since macports haven’t updated their port of xmms2, here is my xmms2 @0.6DrMattDestruction portfile. I would have liked to support variants, but whenever I add my own –with-optionals or –with-plugins options to waf configure, the build breaks with link errors. This happens even when I copy the exact list that waf configure automagically selects without arguments. I haven’t figured out why this is, and trying to trace the internals of waf is a bitch.

Anyway, presenting Muxic Beta 2, source code available in the Muxic git repository as usual.