<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:lj="http://www.livejournal.com">
  <id>urn:lj:livejournal.com:atom1:evan_tech</id>
  <title>evan_tech</title>
  <subtitle>Evan writes about technology</subtitle>
  <author>
    <name>Evan Martin</name>
  </author>
  <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/"/>
  <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom"/>
  <updated>2008-07-22T15:57:31Z</updated>
  <lj:journal username="evan_tech" type="community"/>
  <link rel="service.feed" type="application/x.atom+xml" href="http://community.livejournal.com/evan_tech/data/atom" title="evan_tech"/>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:251804</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/251804.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=251804"/>
    <title>dns attack of doom</title>
    <published>2008-07-22T15:57:31Z</published>
    <updated>2008-07-22T15:57:31Z</updated>
    <category term="grumpy"/>
    <content type="html">If I've learned anything from the new Kaminsky DNS attack, it's that if you want to keep something a secret while disclosing to a trusted subset of vendors, you do &lt;em&gt;not&lt;/em&gt; include &lt;a href="http://www.matasano.com/log/1105/regarding-the-post-on-chargen-earlier-today/"&gt;publicity-hungry overeager bloggers&lt;/a&gt; in the list of people who can keep their mouths shut.</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:251565</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/251565.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=251565"/>
    <title>heavy protocols add up</title>
    <published>2008-07-19T17:39:37Z</published>
    <updated>2008-07-19T17:45:16Z</updated>
    <content type="html">I found &lt;a href="http://mail.jabber.org/pipermail/standards/2008-February/018015.html"&gt;this discussion&lt;/a&gt; of Android's dropping XMPP interesting.  (Disclosure: I have no insider knowledge about any of this.)  In particular, this remark about compression in the context of the notoriously fat XMPP protocol:&lt;blockquote&gt;Bandwidth used means radio transmissions sent, and overhead means more work done by the processor, both of which take battery power and reduce battery life.  Meanwhile, compression turned out to not be very helpful.  Since it's negotiated during connection startup, it doesn't help with startup overhead.  It does help somewhat with steady-state bandwidth, but at the expense of additional CPU cycles.  The result is that enabling compression actually reduced battery life in our tests -- it took more power for the CPU to do compression than we saved on radio power.&lt;/blockquote&gt;(I wonder though: perhaps they could've used a simpler form of compression?  XML ought to be "easy" to compress.  Maybe the spec doesn't allow it?)&lt;br /&gt;&lt;br /&gt;You see a similar phenomenon with HTTP on a heavy page.  Like CNN.com: Firebug says it took 135 HTTP requests to load the page.  Many of them are to a CDN and only have ~400 bytes of request headers but all the ones to cnn.com (including the ads, apparently!) include cookies pushing them up to nearly a kilobyte.  The net result is that the latency of the page starts getting affected by the end-user's &lt;em&gt;upstream&lt;/em&gt; bandwidth, which is usually terrible.  (But now, having typed that out, I wonder: does it really matter?  Even those heavier requests fit within a packet anyway...)</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:251355</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/251355.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=251355"/>
    <title>dolt libtool wrapper</title>
    <published>2008-06-10T23:05:51Z</published>
    <updated>2008-06-11T01:41:57Z</updated>
    <content type="html">&lt;a href="http://lists.debian.org/debian-devel/2008/04/msg00286.html"&gt;dolt&lt;/a&gt;: double the build speed of libtool-dependent software.  Examples of such software: kdelibs, gtk, libx11, libxml2, dbus.&lt;br /&gt;&lt;br /&gt;So many CS problems are significantly improved by caching at the right place...&lt;br /&gt;&lt;br /&gt;[&lt;span class='ljuser' lj:user='hober' style='white-space: nowrap;'&gt;&lt;a href='http://hober.livejournal.com/profile'&gt;&lt;img src='http://p-stat.livejournal.com/img/userinfo.gif' alt='[info]' width='17' height='17' style='vertical-align: bottom; border: 0; padding-right: 1px;' /&gt;&lt;/a&gt;&lt;a href='http://hober.livejournal.com/'&gt;&lt;b&gt;hober&lt;/b&gt;&lt;/a&gt;&lt;/span&gt; put this on reddit.  I had it first! :P]</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:251031</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/251031.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=251031"/>
    <title>utf-8 is hard</title>
    <published>2008-05-25T21:02:02Z</published>
    <updated>2008-05-25T21:02:24Z</updated>
    <content type="html">&lt;img src="http://neugierig.org/pics/livejournal/2008/05/25-utf8.png"&gt;&lt;br /&gt;&lt;br /&gt;Evite trying to do Mårtensson.</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:250810</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/250810.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=250810"/>
    <title>what you need to do in response to the openssl fiasco</title>
    <published>2008-05-14T22:10:45Z</published>
    <updated>2008-05-14T22:13:29Z</updated>
    <content type="html">If you have used a Debian-based system to generate SSH keys in the past two years, your keys are likely no good.  &lt;a href="http://www.debian.org/security/2008/dsa-1576"&gt;This document has instructions&lt;/a&gt;.  In brief:&lt;br /&gt;&lt;br /&gt;1) Delete your bad keys: &lt;code&gt;.ssh/id_*&lt;/code&gt;.  Fix all systems where you're trusting those keys (think &lt;code&gt;.ssh/authorized_keys&lt;/code&gt;); someone has already published a table of all private keys, so it's just a matter of time before your system is brute-forced.&lt;br /&gt;&lt;br /&gt;2) Update your systems.  I see an "openssl-blacklist" package show up on both my Debian stable and my Ubuntu whateverletterthey'reon one.  You'll get some debconf prompts about it clobbering stuff, including potentially your host keys, which means the next time you connect to the machine you'll get the "host keys have changed" message.&lt;br /&gt;&lt;br /&gt;3) To make yourself feel less anxious, try running &lt;code&gt;ssh-vulnkey&lt;/code&gt; to print an analysis of keys in standard paths on your system.  (Run it as &lt;code&gt;sudo ssh-vulnkey -a&lt;/code&gt; to check all users on your system.)</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:250400</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/250400.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=250400"/>
    <title>type-safe printf</title>
    <published>2008-05-14T05:08:46Z</published>
    <updated>2008-05-14T17:07:35Z</updated>
    <category term="go read"/>
    <category term="programming languages"/>
    <content type="html">printf is a function with a complicated type.  In C we used to just give up and tell the compiler "this function takes some other stuff that you shouldn't worry about" with the amusing "..." builtin. These days compilers have special support for annotating printf-like functions to provide type-checking.  The other side of this is that an implementation of printf necessarily has a little tokenizer/parser for run-time processing of the format string, along with the associated performance penalty*.&lt;br /&gt;&lt;br /&gt;Yet pretty much all programs that involve format strings ought to have the format strings known statically.  Even a mini-language like printf turns out to have enough power to not be able to safely process untrusted input, as the "poke" instruction (named &lt;code&gt;%n&lt;/code&gt;) demonstrated by creating a completely new class of security vulnerabilities.  And without the compiler to help with type coercions, it's easy to write something invalid, especially when you're playing fast and loose with integer and pointer sizes across platforms.&lt;br /&gt;&lt;br /&gt;Perl and Ruby neatly sidestep this problem by using string interpolation: at parse/compile time, the compiler scans the strings for bits to be expanded and just rewrites the "format" string to the equivalent concatenation of literal strings and variable values, which then uses the normal language's support for pasting strings together. (Prove it to yourself: &lt;code&gt;perl -e 'use strict; print "$x";'&lt;/code&gt; aborts with a "compilation error".) But sometimes you really just want something like printf, and both those languages fall back on "figure it out at runtime" for that.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Supporting printf at all proves to be pretty difficult in more strict languages which generally require all types to be known.  
OCaml's compiler does some crazy hacks where sometimes a quoted string is interpreted as a &lt;a href="http://caml.inria.fr/pub/docs/manual-ocaml/libref/Pervasives.html#TYPEformat"&gt;&lt;code&gt;format&lt;/code&gt;, a six-parameter type&lt;/a&gt; that, for example, needs its own concatenation operator.  Haskell at least encodes it in the user-available language with some typeclass magic that gets you more or less to feature parity with dynamic languages -- failure at runtime if the parameters don't properly line up.&lt;br /&gt;&lt;br /&gt;But it turns out there's a &lt;a href="http://www.brics.dk/RS/98/12/"&gt;nice paper&lt;/a&gt; that provides a type-safe encoding of printf that doesn't rely on any fancy language features.  The paper is structured like this: (1) "wouldn't it be nice if printf worked like this?" (2) "oh wow check it out, here are the functions!"  I've been staring at it for a week and though I can sorta see how it works, it's unclear to me how anyone would come up with it.  Here's an overview from a person who lacks sufficient brain to say anything much smarter (me).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;To start with, you don't use a string for the format string.  This sorta seems like you've already given up, but you could imagine a macro expanding a format string into the proper expression here, much like how Perl/Ruby's interpolation works.  Since this is functional code we're talking about, it ends up being an expression involving some functions. Format string concatenation ends up being encodable as composition, which means you end up with the same operator as Perl (&lt;code&gt;.&lt;/code&gt;) for pasting them together.&lt;br /&gt;&lt;br /&gt;The basic task then is that you need &lt;code&gt;print (some magic here)&lt;/code&gt; to be able to give you a function of varying types, depending on what the magic is, so that &lt;code&gt;print format1 3 "foo"&lt;/code&gt; can be type-checked against a format that expects an int and a string.  
So the type of printf must be &lt;code&gt;(some magic type involving an a) -&amp;gt; a&lt;/code&gt;, where the polymorphic a is a function produced by the magic.  And here's where the magic drops, painful in its simplicity:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;lit :: String -&amp;gt; (String -&amp;gt; a) -&amp;gt; String -&amp;gt; a&lt;br /&gt;lit text k s = k (s ++ text)&lt;br /&gt;&lt;br /&gt;int :: (String -&amp;gt; a) -&amp;gt; String -&amp;gt; Int -&amp;gt; a&lt;br /&gt;int k s val = k (s ++ show val)&lt;br /&gt;&lt;br /&gt;printf :: ((String -&amp;gt; String) -&amp;gt; String -&amp;gt; a) -&amp;gt; a&lt;br /&gt;printf fmt = fmt id ""&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;And that's it.   &lt;code&gt;lit&lt;/code&gt; is what converts a literal string into a format, while &lt;code&gt;int&lt;/code&gt; is the placeholder for an int.  So &lt;code&gt;"my int is %d\n"&lt;/code&gt; would be expressed as &lt;code&gt;lit "my int is " . int . lit "\n"&lt;/code&gt;.  If you drop this into GHC you'll see that the type of &lt;code&gt;printf (lit "my int is " . int . lit "\n")&lt;/code&gt; really is, as you'd expect, &lt;code&gt;Int -&amp;gt; String&lt;/code&gt; -- it's waiting for you to give it an int so it can dump out the formatted string.  The result of &lt;code&gt;printf&lt;/code&gt; is just a plain function, so you can do all the normal sorts of things you'd want, like partially apply it or pass it to &lt;code&gt;map&lt;/code&gt;.  The formatters are plain functions, too, so you can add your own formatter that, for example, can accept a list (as he does in the paper).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So how's it work?&lt;br /&gt;&lt;br /&gt;Look at the two formatters, &lt;code&gt;int&lt;/code&gt; and &lt;code&gt;lit&lt;/code&gt;.  The &lt;code&gt;k&lt;/code&gt; parameter to the formatters is a continuation: it's where each formatter should pass the string constructed so far when the formatter is done.  
Then the &lt;code&gt;s&lt;/code&gt; input parameter is the string as constructed so far.  You can mentally expand &lt;code&gt;printf (lit "foo" .  int)&lt;/code&gt; as &lt;code&gt;(lit "foo" (int (id))) ""&lt;/code&gt;, where that empty string is your starting string and &lt;code&gt;id&lt;/code&gt; is the innermost continuation which just gives you back the string that's been constructed.&lt;br /&gt;&lt;br /&gt;You can also look at &lt;code&gt;int&lt;/code&gt; like this, just with some added parens for clarity: &lt;code&gt;(String -&amp;gt; a) -&amp;gt; (String -&amp;gt; (Int -&amp;gt; a))&lt;/code&gt;.  It takes the continuation as a parameter, and the function it returns is sorta the same shape as the continuation but with &lt;code&gt;Int -&amp;gt; a&lt;/code&gt; in place of &lt;code&gt;a&lt;/code&gt; -- that's how it tacks on its need for an int to the greater formatting requirement.&lt;br /&gt;&lt;br /&gt;But from there... uh, the types just work out.  I don't know.  It's pretty much magic.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;[Edit: a commenter on reddit linked to &lt;a href="http://mlton.org/Printf"&gt;this discussion of the general technique&lt;/a&gt;, which I haven't read yet but looks promising.]&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;* Though I'd argue you're in a pretty bad place if printf performance is your bottleneck.</content>
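As an illustration of the type-safe printf post above, here is a minimal sketch that gathers the three functions from the post into one loadable module. The `str` formatter and the `example` binding are not in the original post; they are hypothetical additions, written by analogy with `int`, to show how a second placeholder type slots in.

```haskell
-- Continuation-passing format combinators, as given in the post.

-- | A literal piece of the format: append it, then continue with k.
lit :: String -> (String -> a) -> String -> a
lit text k s = k (s ++ text)

-- | A placeholder that demands an Int argument before continuing.
int :: (String -> a) -> String -> Int -> a
int k s val = k (s ++ show val)

-- | Hypothetical addition (not in the post): a String placeholder,
-- built the same way as int.
str :: (String -> a) -> String -> String -> a
str k s val = k (s ++ val)

-- | Run a format: start from the empty string and finish with id.
printf :: ((String -> String) -> String -> a) -> a
printf fmt = fmt id ""

-- | The argument types fall out of the format itself:
-- one Int placeholder, then one String placeholder.
example :: Int -> String -> String
example = printf (lit "my int is " . int . lit " and my string is " . str . lit "\n")
```

Loaded into GHCi, `example 3 "foo"` gives back `"my int is 3 and my string is foo\n"`, while swapping the arguments is rejected by the type checker rather than failing at runtime.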
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:250307</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/250307.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=250307"/>
    <title>related posts</title>
    <published>2008-05-12T16:23:54Z</published>
    <updated>2008-05-12T16:23:54Z</updated>
    <category term="grumpy"/>
    <content type="html">Dear Wordpress,&lt;br /&gt;&lt;br /&gt;The links added to posts via your "related posts" feature are rarely (perhaps never?) actually "related" to the post you add the links from.  This is harmful in two ways: one, every time I click one of those links thinking the page author had more information for me to read (like &lt;a href="http://glinden.blogspot.com/"&gt;Greg Linden's blog&lt;/a&gt;, which often has great related links), I find unuseful content and it frustrates me.  Two, and maybe more worrying for you, you're training me to ignore those links; if you ever do improve the quality of the matching system later I'll never discover it.</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:249887</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/249887.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=249887"/>
    <title>concurrent editing</title>
    <published>2008-05-10T22:29:43Z</published>
    <updated>2008-05-10T22:33:28Z</updated>
    <category term="go read"/>
    <category term="projects"/>
    <content type="html">While I'm on the subject of concurrent editing:&lt;br /&gt;&lt;br /&gt;My reading group read &lt;a href="http://www.google.com/search?q=%22Designing+a+commutative+replicated+data+type%22+filetype%3Apdf"&gt;Designing a commutative replicated data type&lt;/a&gt; a few months ago.  The basic idea I've retained from the paper (it has been some months) is that one way to avoid conflicts is to design your data representation such that conflicts are impossible by making all operations commute, demonstrating the theory by presenting a design for a multiuser simultaneous editor.  ("Like SubEthaEdit" is what I kept saying to people, but apparently few people I know have heard of it.)  By representing positions within the buffer as addresses related to characters you currently know about, and having a globally-defined resolution strategy for two edits to the same position, you can safely allow edits to come in from clients in any order and maintain consistent state.&lt;br /&gt;&lt;br /&gt;(Commuting operations sounds like darcs, doesn't it?  In fact, &lt;a href="http://byorgey.wordpress.com/2008/02/13/patch-theory-part-ii-some-basics/"&gt;this fellow&lt;/a&gt; was discussing darcs's patch theory in connection to concurrent editing, though I suspect ultimately it's the wrong model...)&lt;br /&gt;&lt;br /&gt;The paper's really pretty clever in a bunch of ways (like: how do you make a globally-consistent addressing scheme in the presence of simultaneous edits?) and a friend and I sat down to implement it as a web app with a bunch of Javascript.  I had planned to target the release of App Engine but the lack of Comet (again, that bites me) sorta turned me off to the whole idea.  And I have so many other projects I ought to be finishing first...  
He, however, powered on to get something that seems to mostly work but is a bit on the inefficient side.&lt;br /&gt;&lt;br /&gt;PS If you're curious, the model Gobby uses (Gobby is another simultaneous editor) is &lt;a href="http://gobby.0x539.de/trac/wiki/ObbyInternals"&gt;described here&lt;/a&gt;.  It's again the "hope we update often enough that there's no conflict" design I mentioned in the previous post.</content>
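To make the commuting-operations idea from the concurrent editing post a bit more concrete, here is a toy sketch. It is not the paper's addressing scheme: it assumes each character gets a made-up address consisting of a rational position plus a site id as tiebreaker, that no site ever reuses an address, and that deletes are recorded as tombstones. Every name in it is hypothetical.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- | Id of the client that made an edit (assumed unique per client).
type Site = Int

-- | Assumed address: a position between neighbours, with the site id
-- breaking ties when two clients pick the same position.
type Addr = (Rational, Site)

data Op = Insert Addr Char | Delete Addr

-- | Characters by address, plus the set of deleted addresses.
data Doc = Doc (Map.Map Addr Char) (Set.Set Addr)

emptyDoc :: Doc
emptyDoc = Doc Map.empty Set.empty

-- | Apply one edit. Inserts touch distinct keys and deletes only grow
-- a set, so any two edits give the same result in either order.
apply :: Doc -> Op -> Doc
apply (Doc chars dead) (Insert a c) = Doc (Map.insert a c chars) dead
apply (Doc chars dead) (Delete a) = Doc chars (Set.insert a dead)

-- | The visible text: live characters in address order.
render :: Doc -> String
render (Doc chars dead) = Map.elems (Map.filterWithKey alive chars)
  where
    alive a _ = Set.notMember a dead
```

Folding the same list of edits over `emptyDoc` in any order should render the same text, which is the property the post describes; the tombstone set is what lets a delete arrive before the insert it refers to.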
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:249715</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/249715.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=249715"/>
    <title>more on bug tracking; distributed editing</title>
    <published>2008-05-10T22:09:54Z</published>
    <updated>2008-05-10T23:32:01Z</updated>
    <category term="bugs"/>
    <category term="dvcs"/>
    <content type="html">A few separate posts, all in the same area.&lt;br /&gt;&lt;br /&gt;1) Most (all?) of the distributed bug tracking software I've glanced at stores bugs in a directory, one file per bug.  This seemed like poor design to me.  I confirmed by showing Brad the output of &lt;code&gt;ls&lt;/code&gt; on one; his full response was "doesn't scale" before turning back to what he was working on.&lt;br /&gt;&lt;br /&gt;&lt;hr&gt;&lt;br /&gt;2) Having thought more about the relationship between code and bug state, I have concluded I was thinking about it the wrong way.  Going back and reviewing your comments, I see a bunch of you figured this out before me.  Here's the critical piece I was missing.  Code has history, which is tracked by the graph of related versions.  Bug state both refers to the code history and also has its own history, in that new bugs are opened and old ones are closed.  Those two histories related to bugs are not the same: even when examining old code, you generally care about the newest bug state.  (This is why most modern bug systems only let you use the newest bug state; making changes to it permanently clobbers the old state.  However, note that most do care about showing you the history of modifications to a bug; the interesting view is the most recent copy of the bug's entire history.)&lt;br /&gt;&lt;br /&gt;As Aristotle and Lee pointed out &lt;a href="http://community.livejournal.com/evan_tech/248736.html"&gt;on my older post&lt;/a&gt;, connecting the code history graph and the bug state could be modeled as annotations pointing at commits.  The state of bugs present in a given version is the collection of all bug states that have been attached to an ancestor of that version.  This means discovering a bug in a previous release "infects" (to use his term, which is a good one) all branches derived from that release, and a given branch is only fixed once it merges the code that fixes the bug.  
(Making that work efficiently is an exercise for the reader; I have some ideas that aren't worth sharing yet.)&lt;br /&gt;&lt;br /&gt;&lt;hr&gt;&lt;br /&gt;3) Part of the reason I got thinking about all of this is because I wanted a separate feature: a command-line interface to bug tracking.  I hate using web apps both because web sites become inaccessible, get slow, or go down (a problem addressed by making it distributed) and just because I hate clicking around on web forms (a website can't, for example, query my current checkout for which branch I'm claiming the bug is fixed on).  You could make a CLI-based interface to Trac -- maybe one exists already -- and it would at least address the second half of that.&lt;br /&gt;&lt;br /&gt;At a superficial level, the command-line problem isn't really at all the same thing as a distributed system.  But there also is a connection at a deep level.  I like to say (and here by "say" I mean "think" because nobody ever wants to listen to me jabber on about this stuff) that even centralized projects have distributed branches every time someone edits a file in their own checkout; it's just that the tools we have for those branches are weaker than the tools typically used for "real" branches.  On most systems you typically can't record your changes until you've verified they would merge cleanly with the master branch (though monotone/mercurial/fossil fix this implicitly and git does if you're using the proper workflow); on most systems you can't examine what happened upstream in the same way you examine changes that happened locally.&lt;br /&gt;&lt;br /&gt;This same problem -- that forks happen on every checkout -- is true with any web-based database; it's just that when you're using a website you tend to commit more frequently so the forks don't get a chance to conflict.  
And for that reason the tools for managing conflicts are usually pretty weak, as anyone who's encountered a conflict on a bug tracker has probably experienced (my impression is that most just say "click back and type in what you were saying again").  My favorite bad conflict-resolution system is probably Google Docs, which, upon a conflict, pops up a window saying something to the effect of, "This paragraph was edited while you were editing it.  Here is your text; please copy and paste it back into the document and figure out how to fix it."&lt;br /&gt;&lt;br /&gt;So back to the command line: you need to solve this conflict problem anyway for a distributed system to work, but you also probably need to at least improve it for a command-line-editable centralized system to work, because with a command-line app you don't really get a "back button".  You could implement back-button-like functionality, but if you're going to implement new functionality to handle this case, perhaps you could implement a more sane model instead.</content>
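The annotation model from point 2 of the post above can be sketched in a few lines. This is only an illustration under stated assumptions: the history graph is a hypothetical in-memory map from each commit to its parents, bug notes are a plain list, and a bug counts as open at a version when some ancestor carries a "found" note and no ancestor carries a "fixed" note. It ignores regressions and makes no attempt at the efficiency question the post leaves open.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type Commit = String
type Bug = String

-- | An annotation attached to a commit (names are hypothetical).
data Note = Found Bug | Fixed Bug

-- | Assumed history graph: each commit mapped to its parents.
type History = Map.Map Commit [Commit]

-- | Every commit reachable through parent links, including the start.
ancestors :: History -> Commit -> Set.Set Commit
ancestors hist c = go Set.empty [c]
  where
    go seen [] = seen
    go seen (x : xs)
      | x `Set.member` seen = go seen xs
      | otherwise = go (Set.insert x seen) (Map.findWithDefault [] x hist ++ xs)

-- | Bugs open at a version: found on some ancestor, fixed on none.
openBugs :: History -> [(Commit, Note)] -> Commit -> Set.Set Bug
openBugs hist notes c = Set.difference found fixed
  where
    anc = ancestors hist c
    found = Set.fromList (concatMap foundAt notes)
    fixed = Set.fromList (concatMap fixedAt notes)
    foundAt (k, Found b) | k `Set.member` anc = [b]
    foundAt _ = []
    fixedAt (k, Fixed b) | k `Set.member` anc = [b]
    fixedAt _ = []
```

With a bug noted on a release commit and its fix noted on one branch, a sibling branch off that release still reports the bug until it merges the fixing commit, which matches the "infects" behaviour described in the post.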
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:249588</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/249588.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=249588"/>
    <title>terminal emulation</title>
    <published>2008-05-07T06:06:57Z</published>
    <updated>2008-05-07T06:06:57Z</updated>
    <content type="html">When you run &lt;code&gt;script&lt;/code&gt; (see &lt;code&gt;man 1 script&lt;/code&gt; if you're not familiar with it) it logs every keystroke you type, including backspacing over typos.  Val asked today: how can you format that output so that it appears as the result?  I found it interesting to think about.  You'd imagine there'd be some way to run terminal emulation (?) and then somehow pipe that through cat.  It's an interesting mixing of levels; I started running my mouth off about screen and vte but didn't come to a good conclusion.&lt;br /&gt;&lt;br /&gt;Along those lines, it made me think of &lt;a href="http://groups.google.com/group/msysgit/browse_thread/thread/43046778c549605b/c8080791412c5a2f"&gt;this interesting thread&lt;/a&gt; in which "cmd --color" produces escape codes while "cmd --color | cat" produces colored output.  (Try to guess the circumstances before you read the other side of that link!)</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:249156</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/249156.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=249156"/>
    <title>your animation is too slow</title>
    <published>2008-04-23T15:39:34Z</published>
    <updated>2008-04-23T15:40:33Z</updated>
    <content type="html">Animation in software has a real use: by showing intermediate states between two end states you let the user instinctually see how they're related.  But as soon as your animation takes more than around 100ms it's gratuitous.  Movement should be so fast I don't notice unless I'm looking at it, because otherwise it's happening so slow I'm waiting for it.  Unfortunately, the difficulty of implementing animation tends to make programmers too proud of it (I personally have fallen for this) and you end up with these terrible 500ms fades so common in Flash or Javascript-heavy apps.&lt;br /&gt;&lt;br /&gt;Here's a personal request from me to you: knock that shit off.  It looks amateurish.</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:248853</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/248853.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=248853"/>
    <title>dvcs and offline</title>
    <published>2008-04-22T16:41:07Z</published>
    <updated>2008-04-22T16:41:07Z</updated>
    <category term="dvcs"/>
    <content type="html">I got a couple of comments on that previous post that betray a bit of a misunderstanding about how collaborative projects work in the presence of distributed version control.&lt;br /&gt;&lt;br /&gt;Two possibilities: you're offline or online.&lt;br /&gt;&lt;br /&gt;For circumstances when you really are offline (and again, I note that I really am without internet access for a surprising fraction of my hacking time) there is fundamentally no way a bug tracker that requires me to be online can work.  For many, the answer to this problem is "wait until you're online" or "take notes in a text file", which is well and good if you're the sort to accept a sub-par situation as unsolvable.  (I repeat from before: any time you're resorting to manually shuffling source code, your source control system is failing you*.  You could make the same claim about any other tool.)  So yes, bug tracking is importantly a communication tool, but if the situation is that I'm physically unable to communicate you can't cite non-communication as a negative of any proposed alternative.&lt;br /&gt;&lt;br /&gt;Otherwise, even when you have internet access, no technology solves the communication problem if people aren't willing to communicate.  "&lt;a href="http://community.livejournal.com/evan_tech/248736.html?replyto=1541280"&gt;Private repositories&lt;/a&gt;" are a problem, but they're a problem no different from contributors who won't email or contributors who sit on a patch.&lt;br /&gt;&lt;br /&gt;When I'm online collaborating with someone else, the workflows between an "offline" and "online" system are equivalent except for running the sync step after some operations.  So I thought anon-commenter Daniel's point that email also has this offline-online hybrid behavior (where Outlook has the huge "Send / Receive" button on the toolbar) was right on -- I'd much rather use an offline email client that could sync than one that forced me to be online to use it**.  
And as he points out, email would be terrible for real-time communication if people only synced once every few days -- that's why you don't do that, with email or dvcs, if you intend to participate in a project.  (Though in circumstances where I want to &lt;a href="http://www-cs-faculty.stanford.edu/~knuth/email.html"&gt;avoid email&lt;/a&gt; or hack on some code that doesn't yet compile, that workflow remains available.)&lt;br /&gt;&lt;br /&gt;&lt;small&gt;* By this metric the tools I use continue to fail me occasionally; they just fail less than the tools I used before.&lt;br /&gt;** I remain hopeful such a technology will one day be invented.  I gave up on mutt when I realized it can't scale beyond a few thousand messages and didn't provide any useful search.  Maybe &lt;a href="http://sup.rubyforge.org/"&gt;sup&lt;/a&gt; will get there.&lt;/small&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:248736</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/248736.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=248736"/>
    <title>distributed bug tracking</title>
    <published>2008-04-21T00:39:41Z</published>
    <updated>2008-05-10T22:10:16Z</updated>
    <category term="bugs"/>
    <category term="dvcs"/>
    <content type="html">Distributed bug tracking is the natural extension of distributed version control.  Aside from the normal benefits of distributed version control, like being able to interact with the bug database while offline, there also seems to be a trend of making the interface to these trackers work via the command line instead of annoying web-based systems.  And, like with Trac, the integration of issue tracking with the source is pretty natural: when you've fixed a bug on a branch you can mark the bug as fixed in that branch, and when that branch lands on your "main" tree that tree's bug state is also merged as a natural consequence of how merges work.&lt;br /&gt;&lt;br /&gt;It seems like there isn't a dominant piece of software for this yet.  Here's my five-minute take on the software I can find:&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.fossil-scm.org/"&gt;Fossil&lt;/a&gt; is its own full version control system that integrates a bug tracker as well&lt;/li&gt;&lt;li&gt;&lt;a href="http://bugseverywhere.org/"&gt;bugs everywhere&lt;/a&gt; -- perhaps abandonware, last commit was July 2007&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.ditrack.org/"&gt;DITrack&lt;/a&gt; -- subversion only, "planning to be backend agnostic" (not sure how svn matches up with distributed, but ok)&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.distract.wellquite.org/"&gt;DisTract&lt;/a&gt; -- only a web interface using Firefox-specific Javascript to write to disk(?), requires monotone, latest news August 2007&lt;/li&gt;&lt;li&gt;&lt;a href="http://github.com/schacon/ticgit/wikis"&gt;TicGit&lt;/a&gt; -- just learned about it five minutes ago so not sure yet; seems a bit janky to keep bugs in a separate branch&lt;/li&gt;&lt;li&gt;&lt;a href="http://ditz.rubyforge.org/"&gt;Ditz&lt;/a&gt; -- seems the most appealing to me except that it's all of three weeks old, has &lt;a href="http://d.hatena.ne.jp/antipop/20080412/1208010913"&gt;emacs integration&lt;/a&gt;*, last commit last week&lt;/li&gt;&lt;/ul&gt;For 
whatever reason these all seem to involve the most obscure technologies available: in the above list I see fossil, bazaar, monotone, and even Haskell.&lt;br /&gt;&lt;br /&gt;From reading through these I find a surprising variety of models.  Here's what seems to me to be the simplest and sanest model: the bugs live in a normal top-level directory in your tree alongside "src" or whatever other directories you have; each issue is in its own file; comments are modifications of the per-issue file.&lt;br /&gt;&lt;br /&gt;But more generally, I'm not even sure if distributed is the appropriate model.  The action of recording a new bug modifies the current version of a branch but the bug's existence usually is older than the most recent commit (for example, it often belongs in older branches that have branched off before the bug was added).  So if a new bug is fixed in an older branch, there's no way to merge that new bug into the older branch without merging the entire state of the newer branch in.  Is that sensible?  I'm not sure.  The alternatives all seem to involve tracking bugs separately from branches and trying to match them up after you commit (like when commit messages mention bug numbers) which always feels like a failure of technology.&lt;br /&gt;&lt;br /&gt;The other issue that I'm unsure about is how to integrate a sane web-based frontend for casual users who want to be able to query and report bugs without checking out the code.  Some systems have web frontends but it seems to me conflicts could be hard to resolve.  Maybe if you make sure a modification to an issue is always an append, and then add some smarts that auto-merge simultaneous adds by some textual timestamp included in the add.&lt;br /&gt;&lt;br /&gt;Needs more thought.  Sorry for the braindump.&lt;br /&gt;&lt;br /&gt;&lt;small&gt;* I'm a vim user, and don't really care about editor wars, so I mention it only to note that emacs integration isn't as useful for me.&lt;/small&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:248465</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/248465.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=248465"/>
    <title>google app engine limitations</title>
    <published>2008-04-13T19:55:19Z</published>
    <updated>2008-04-13T20:20:05Z</updated>
    <category term="google"/>
    <content type="html">It's weird to see people talk about Google App Engine online because I think many people focus on minor details.  Like, to make apps scale horizontally you do need a "shared-nothing" infrastructure, so that's not really novel.  The BigTable aspects are sorta interesting except there's nothing there (in terms of application design) that you couldn't have gotten out of the paper, and the App Engine API is so high-level it's not that close to BigTable.  It's more like any other high-level flat-address object database, like maybe CouchDB.  The Python thing is also pretty irrelevant; they just picked a language they have experience with (having Guido around helped), and it's easier to launch supporting one API than &lt;i&gt;n&lt;/i&gt; languages' worth.  A good engineer just solves the problem with the tools available, and Python is a pretty good tool to start with.&lt;br /&gt;&lt;br /&gt;As for evil plans to steal ideas or code, that's between you and your skepticism.  Big companies are surprisingly good at doing shitty things, and Google is definitely big, but it's also true that within Google people really try to do the right thing.  I was touched to see a privacy-concerned friend of mine start using Gmail after he was hired, saying that only after he saw how seriously they take privacy inside the company could he feel confident about using it.  But I can't tell you anything that will change your mind about this subject.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I developed an internal application using Google App Engine on and off over a period of months (during its development I kept trying it out) and then finally rewrote it a few weeks before launch (after the APIs had all settled).&lt;br /&gt;&lt;br /&gt;Here are some real problems I've encountered:&lt;br /&gt;&lt;br /&gt;1) All code runs only in response to HTTP fetches.  So that means no cron jobs, and no persistent server-side processes.  
I know I just wrote above that you can't really have persistent jobs if you want to scale, but ultimately real apps do occasionally need these.  For example, imagine a timed test app that needs a consistent view of time no matter which server (or datacenter!) the user hits.  A time server becomes a single point of failure but when it's critical for your app it can be engineered around.&lt;br /&gt;&lt;br /&gt;2) No long connections means no "comet" (server-push messaging).&lt;br /&gt;My first thought on hearing about App Engine was to port &lt;a href="http://neugierig.org/software/lmnopuz/"&gt;lmnopuz&lt;/a&gt; but I can't.&lt;br /&gt;&lt;br /&gt;3) Playing around with your data is hard.  Since there's no way to perform operations on your data except by uploading code to the server, you're often left creating a new URL per operation you want to perform.  Hacks like the &lt;a href="http://shell.appspot.com/"&gt;shell&lt;/a&gt; help with this, but a lot of the time I want to be able to just run a local script and see the output.  (For my project I found a decent workaround: make a URL that accepts Python code as a POST and runs it.  Then your scripts just need to know to serialize themselves into strings and send them over the wire.)  But see the next point.&lt;br /&gt;&lt;br /&gt;4) Slow table scans.  My app had ~1200 rows that it performs various analyses on and produces graphs.  I can appreciate that such a query is labor-intensive, and so I had written it to cache the results of the graph generation (the rows only change once a day).  But I can't even seed the cache once because fetching 1200 rows is too slow to happen within a single query.&lt;br /&gt;&lt;br /&gt;5) Bulk operations are hard.  Say you want to delete all objects in a table (or class, I forget the App Engine term).  The "delete" operation requires you fetch the object first, and then you're back into slow table scans land.  
The best you can do is batch up your processing into multiple smaller stages, each of which writes its intermediate output into the data store: either make a page that auto-refreshes itself with Javascript and leave a browser pointed at it, or make a command-line script that repeatedly hits a URL on your app.&lt;br /&gt;&lt;br /&gt;6) No arbitrary queries.  (If you haven't read the docs in detail, you wouldn't know this, but any query that involves multiple attributes [columns, if you're still thinking SQL] of an object must have an index exactly matching the query.  They make index creation and maintenance trivial, and even automatic in most cases.)&lt;br /&gt;Though everyone's repeatedly shoehorned SQL underneath object-relational mappers, App Engine (and others) demonstrate that you can provide an object storage API and gain performance by not using SQL underneath.  I argue the real utility of SQL is that it lets you quickly (in terms of programmer time, not machine time) perform queries that you haven't done before and won't do again.  Say I learn about a bug where I built all of March's data with the word "none" in place of where a column should really be null (None in Python terms) -- that's a line of SQL to fix but it's a world of pain with App Engine due to the bulk operations thing.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;With all that said, it's still pretty good.  When I was looking to switch projects about a year ago, it came down to basically three projects and App Engine was one of them, because the guys who work on it are some of the best hackers I know at the company.  All of the above bullet points (and minor stuff like the languages thing) aren't fundamental limitations of the design, they're temporary flaws that can be solved by good engineering and are surely being prioritized by the team.  I'm pretty confident it'll improve rapidly.</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:248123</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/248123.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=248123"/>
    <title>the barbarians are at the gates</title>
    <published>2008-04-08T18:29:35Z</published>
    <updated>2008-04-08T18:29:35Z</updated>
    <category term="haskell"/>
    <content type="html">I enjoyed this introduction to the &lt;a href="http://www.haskell.org/sitewiki/images/0/0a/TMR-Issue10.pdf"&gt;tenth issue of "The Monad.Reader"&lt;/a&gt;:&lt;blockquote&gt;The barbarians are at the gates. Hordes of Java programmers are being exposed to generics and delegates; hundreds of packages have been uploaded to Hackage; the Haskell IRC channel has nearly hit 500 users; and it’s only a matter of time before Microsoft seals that multi-billion dollar bid for Hayoo.&lt;br /&gt;&lt;br /&gt;The time has come to retreat and climb higher into our ivory tower: we need to design a language that is so devious, so confusing, and so bizarre, it will take donkey’s years for mainstream languages to catch up. Agda, Coq, and Epigram are some approximation of what functional programming might become, but why stop there? I want strict data, lazy codata, quotient types, and a wackier  underlying type theory.&lt;/blockquote&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:248030</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/248030.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=248030"/>
    <title>app engine invites</title>
    <published>2008-04-08T17:35:42Z</published>
    <updated>2008-04-08T17:35:42Z</updated>
    <category term="google"/>
    <content type="html">If I know you and you want a cut-in-line invite to &lt;a href="http://code.google.com/appengine/"&gt;Google App Engine&lt;/a&gt;, drop me an email.&lt;br /&gt;&lt;br /&gt;For an internal contest I wrote an app using this system, so I have a bunch of thoughts on it that I'll maybe write up later.</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:247590</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/247590.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=247590"/>
    <title>haskell syntax idea</title>
    <published>2008-04-07T15:44:19Z</published>
    <updated>2008-04-07T15:44:19Z</updated>
    <category term="haskell"/>
    <content type="html">I often end up with nested parens, when I need binding on the right:&lt;br /&gt;&lt;code&gt;foo (bar (baz (etc x)))&lt;/code&gt;&lt;br /&gt;That's ok in Lisp where the parens become invisible, but parens are infrequent enough in Haskell that they're ugly.  So they provide an "apply" operator that just has different binding rules, and the above can be written:&lt;br /&gt;&lt;code&gt;foo $ bar $ baz $ etc x&lt;/code&gt;&lt;br /&gt;but the dollar signs are huge!  If it were colon, it could be written like this&lt;br /&gt;&lt;code&gt;foo: bar: baz: etc x&lt;/code&gt;&lt;br /&gt;Where the colon instinctively means: "I'm done supplying arguments to this function; everything to the right of here is its final argument."</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:247524</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/247524.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=247524"/>
    <title>on my continuing love for dvcs</title>
    <published>2008-04-07T15:34:19Z</published>
    <updated>2008-04-07T15:46:34Z</updated>
    <category term="vcs"/>
    <category term="git"/>
    <content type="html">I spent the weekend hacking on a project that involved a bunch of trying things out -- maybe this object needed to carry that field, or maybe it belonged over here, and each change required a cascade of changes throughout the code.  I managed it all with local branches in git.  I could write commit #1 with "this is the structural change I intended, but only the unit test compiles", commit #2 with "this other module was no good in the first place and should be another way",  commit #3 with "restructure a third module to now use the new interface provided in #1 and 2".  Then, when I decided it was no good, I could rewind the branch, start a new one, and transplant patches (like #2 above) from the dead branch and keep on.  Right now I have eight branches in my local repo, and I've cleaned up some of the unused ones.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I was attracted to distributed version control initially because I really actually do a significant fraction of my programming offline: on planes, on the shuttle to work (which used to not have wifi).  But since then I've come to see it in two new ways, the latter of which took me a while to settle into.&lt;br /&gt;&lt;br /&gt;One (I think this is due to &lt;a href="http://www-cs-students.stanford.edu/~blynn/"&gt;Ben&lt;/a&gt;) is that it's just &lt;em&gt;locally cached&lt;/em&gt; version control: you can develop in the traditional centralized model if you like, where every commit immediately goes out to the server, but your local disk serves as a cache of data so any read-only commands are faster.  (This was one of svn's innovations over CVS: they keep a copy of the tree in your .svn directory so "svn diff" is fast.  The distributed model is the natural evolution of this.)&lt;br /&gt;&lt;br /&gt;The other is that these systems are designed to &lt;em&gt;manage your source code&lt;/em&gt;.  
Any time you're making a temporary backup of a file, or copying a directory into another place*, or making a patch file, or even holding off on checkpointing where you're at until you just get this one other thing working, your code management system is failing you.  (I find I make liberal use of git rebase --interactive to squash crazy "this doesn't work yet" commits to make the tree make sense once I've decided the code is right.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Last Friday on my project at work I ran into this sort of thing, like I'm perpetually running into it.  I'm working on a kind of fundamental change to my project and, of course, one of the 120 high-level regression tests fails.  In examining that code I find there's a bug in the testing code; the existing design masks the race condition.  So now I'd like to fix that bug: what to do?  I end up doing a second checkout and rebuilding (~30 minutes), fixing the bug there, rerunning all the tests to verify that bugfix didn't break anything.  Then I manually create a patch file and apply it to my first tree.  Now, after sending that bug fix for review, I learn that I should've done it a different way -- ok, fix tree #2, use patch -R to unpatch tree #1 and create a new patch to bring it back in ... yuck!&lt;br /&gt;&lt;br /&gt;If I were more confident in my patch-management skills, you could imagine I could skip the second tree: I'd save my current work in a patch file, revert, fix the bug, test the fix, then reapply the patch.  And at the code-review point I'd pop two patches off the stack to reshuffle them.   (This is, in fact, exactly what &lt;a href="http://en.wikipedia.org/wiki/Quilt_(software)"&gt;Quilt&lt;/a&gt; manages.)&lt;br /&gt;&lt;br /&gt;But this sort of problem has been solved already in a general way.  What I should be able to do is: commit my work in progress in a local branch, rewind, write the bugfix, rebase on top of that.  When the review comes, I can rewind and rebase again if necessary.  
And if I used two checkouts (see the footnote) I could just sync the branches between them, just as if I were making temporary commits server-side.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I still don't love git -- so clunky in so many ways! -- but using it makes me see how obviously useful this stuff is and how silly it will be if we're not all using it in a few years.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;small&gt;* Sometimes this makes sense in circumstances where you want to keep a separate copy of all the build output around.  This is arguably the reason you want builddir != srcdir, though in my experience nobody cares about keeping that working.&lt;/small&gt;</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:247278</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/247278.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=247278"/>
    <title>tinymenu for firefox 3</title>
    <published>2008-04-04T23:48:16Z</published>
    <updated>2008-04-04T23:49:02Z</updated>
    <category term="project"/>
    <content type="html">&lt;a href="http://trac.arantius.com/wiki/Extensions/TinyMenu"&gt;TinyMenu&lt;/a&gt; lets you put all your Firefox menus into a single button.  I almost never use the menus so this lets me recover another row of screen real estate.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://neugierig.org/software/misc/tinymenu.xpi"&gt;Here's a hacked-up .xpi&lt;/a&gt; I made that Works For Me in Firefox 3.  &lt;a href="http://neugierig.org/software/git/?r=tinymenu"&gt;Here's a repo&lt;/a&gt; where you can see the changes I made.&lt;br /&gt;&lt;br /&gt;To install my non-secure extension (which will be refused by Firefox "because it does not provide secure updates"), you have to go to &lt;code&gt;about:config&lt;/code&gt; and add a boolean pref &lt;code&gt;extensions.checkUpdateSecurity&lt;/code&gt; set to &lt;code&gt;false&lt;/code&gt;.</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:246871</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/246871.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=246871"/>
    <title>c-repl lives again</title>
    <published>2008-03-30T05:28:24Z</published>
    <updated>2008-03-30T05:28:45Z</updated>
    <category term="c-repl"/>
    <category term="debian"/>
    <category term="project"/>
    <content type="html">Seeing my &lt;a href="http://neugierig.org/software/c-repl/"&gt;c-repl&lt;/a&gt; show up &lt;a href="http://packages.debian.org/c-repl"&gt;in Debian&lt;/a&gt; inspired me to rewrite and extend it.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://neugierig.org/software/c-repl/"&gt;See the new home page&lt;/a&gt;, with a more extended session.  An extra especially cute feature this time around: &lt;code&gt;#include&lt;/code&gt;-ing a header brings all of its functions into tab-complete scope.&lt;br /&gt;&lt;br /&gt;It's not solid yet, so feedback welcome.</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:246692</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/246692.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=246692"/>
    <title>haskell trick #3: use the pretty-printer</title>
    <published>2008-03-27T16:12:51Z</published>
    <updated>2008-03-27T16:14:19Z</updated>
    <category term="haskell"/>
    <category term="howto"/>
    <content type="html">Here's an underappreciated Haskell standard library.&lt;br /&gt;&lt;br /&gt;&lt;a name="cutid1"&gt;&lt;/a&gt;&lt;a href="http://haskell.org/ghc/docs/latest/html/libraries/pretty/Text-PrettyPrint-HughesPJ.html"&gt;&lt;code&gt;Text.PrettyPrint&lt;/code&gt;&lt;/a&gt; provides combinators for pretty-printing.  It's kinda like a super-simplified HTML or TeX: you give it a bunch of boxes and how they relate to one another and it manages output.  It's trivial to use.&lt;br /&gt;&lt;br /&gt;A common case I have is printing something that's tree-like, where children are indented below their parents.  The code for printing a node is straightforward: print this node's content, then map the print function over all its children &lt;em&gt;with a bit more indentation&lt;/em&gt;.  But the indentation bit is annoying!  You have to remember to properly indent every line, and then that complicates wrapping, etc.&lt;br /&gt;&lt;br /&gt;Instead, use the pretty-printer.  A &lt;code&gt;Doc&lt;/code&gt; is basically a box of text.  &lt;code&gt;text "foo"&lt;/code&gt; produces a &lt;code&gt;Doc&lt;/code&gt; with that string.  Then there are higher-level functions to compose &lt;code&gt;Doc&lt;/code&gt;s: &lt;code&gt;nest&lt;/code&gt; indents a &lt;code&gt;Doc&lt;/code&gt;, &lt;code&gt;vcat&lt;/code&gt; puts a sequence of docs together, one above the other... and finally &lt;code&gt;render&lt;/code&gt; converts a &lt;code&gt;Doc&lt;/code&gt; to a string.&lt;br /&gt;&lt;br /&gt;For my &lt;a href="http://evan.livejournal.com/958091.html"&gt;expenses chart&lt;/a&gt; I pretty-printed the parsed version of my expenses data.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;showDays :: [DayLog] -&amp;gt; String
showDays days = render (vcat (map showDay days)) where
  showDay (DayLog day exps) = text (show day) &amp;lt;+&amp;gt; vcat (map showExp exps)
  showExp (Expense cost desc tags) = text (show cost) &amp;lt;+&amp;gt; text desc
                                 &amp;lt;+&amp;gt; showTags tags
  showTags tags = text "[" &amp;lt;&amp;gt; hsep (map text tags) &amp;lt;&amp;gt; text "]"&lt;/pre&gt;&lt;br /&gt;&lt;code&gt;showDays&lt;/code&gt; outputs a sequence of days by vcatting them together.&lt;br /&gt;&lt;code&gt;showDay&lt;/code&gt; shows a single day by writing out the date on the left and then all the day's expenses listed vertically on the right.&lt;br /&gt;&lt;code&gt;showExp&lt;/code&gt; and &lt;code&gt;showTags&lt;/code&gt; just write out the fields in sequence, more or less.&lt;br /&gt;&lt;br /&gt;The output looks like this:&lt;pre&gt;2003-03-15 16 airport food [food]
2003-03-23 10 bart [transport]
           9 burger [food]
           10 ikea teapot [home]
           1 ikea water [drink]&lt;/pre&gt;All the expenses properly indented over on the right.  (I didn't track expenses on my vacation, so there's a gap in the dates there...)</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:246296</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/246296.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=246296"/>
    <title>haskell trick #2: using list monad fail</title>
    <published>2008-03-27T15:55:07Z</published>
    <updated>2008-03-27T15:59:29Z</updated>
    <category term="haskell"/>
    <category term="howto"/>
    <content type="html">Here's a Haskell trick that I use infrequently but that is still nice.&lt;br /&gt;&lt;br /&gt;&lt;a name="cutid1"&gt;&lt;/a&gt;Sometimes you want to filter a list on something that can be pattern-matched on.  For example, say you want to grab all the &lt;code&gt;Just&lt;/code&gt;s out of a list of &lt;code&gt;Maybe&lt;/code&gt;s.  (That's &lt;a href="http://www.haskell.org/hoogle/?q=catmaybes"&gt;&lt;code&gt;Data.Maybe.catMaybes&lt;/code&gt;&lt;/a&gt;, by the way.  Note that &lt;code&gt;Control.Monad.msum&lt;/code&gt; is not a substitute: &lt;code&gt;Maybe&lt;/code&gt; is an instance of &lt;code&gt;MonadPlus&lt;/code&gt;, but &lt;code&gt;msum&lt;/code&gt; over a list of &lt;code&gt;Maybe&lt;/code&gt;s returns only the first &lt;code&gt;Just&lt;/code&gt;.)  One way is something like:&lt;br /&gt;&lt;code&gt;filter (\x -&amp;gt; case x of Just _ -&amp;gt; True; _ -&amp;gt; False) xs&lt;/code&gt;&lt;br /&gt;But that's ugly.  It turns out that pattern match failure (&lt;em&gt;only within a do block or comprehension&lt;/em&gt;) calls &lt;code&gt;fail&lt;/code&gt;, which can be defined by the monad, and within the list monad it just skips over that item.  So you could write the above as:&lt;br /&gt;&lt;code&gt;[x | Just x &amp;larr; xs]&lt;/code&gt;&lt;br /&gt;or even:&lt;br /&gt;&lt;code&gt;do Just x &amp;larr; xs; return x&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The latter sort of pattern comes in handy sometimes when your filtering expression is more complicated than a simple pattern-match, or than something that can be tucked into a one-line comprehension.</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:246130</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/246130.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=246130"/>
    <title>haskell trick #1: using ErrorT</title>
    <published>2008-03-27T15:47:54Z</published>
    <updated>2008-03-27T15:47:54Z</updated>
    <category term="haskell"/>
    <category term="howto"/>
    <content type="html">Here's a Haskell trick that took me a while to figure out but that I use all the time.&lt;br /&gt;&lt;br /&gt;&lt;a name="cutid1"&gt;&lt;/a&gt;First, the error monad: lets you sequence operations that may fail with error messages.  There are a bunch of different types and type classes available so it's pretty general, but the one I always end up using is &lt;code&gt;String&lt;/code&gt; as my &lt;code&gt;Error&lt;/code&gt; type and &lt;code&gt;Either&lt;/code&gt; as my &lt;code&gt;MonadError&lt;/code&gt; instance.  Here's a contrived example:&lt;br /&gt;&lt;pre&gt;import Control.Monad.Error

-- suppose you have a function like:
parseRomanNumeral :: String &amp;rarr; Either String Int
parseRomanNumeral input =
  -- code here; result in either (Left "parse error message") or (Right somenumber)
  -- equivalently, either (throwError "parse error message") or (return somenumber)

-- then you can use the monad like this:
addRoman :: String &amp;rarr; String &amp;rarr; Either String Int
addRoman a b = do
  inta &amp;larr; parseRomanNumeral a
  intb &amp;larr; parseRomanNumeral b
  return (inta + intb)&lt;/pre&gt;If either of those parses fails, then the result of &lt;code&gt;addRoman&lt;/code&gt; is &lt;code&gt;Left&lt;/code&gt; with the error.  Otherwise, success is again &lt;code&gt;Right&lt;/code&gt;.  Hopefully you can see this composes easily.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Ok, more background: you'll often mix this with IO.  For example:&lt;pre&gt;loadConfigFile :: FilePath &amp;rarr; IO (Either String Config)
loadConfigFile path = do
  contents &amp;larr; readFile path
  -- parse parse parse.
  -- errors are:  return (Left err)
  -- success is:  return (Right ok)
  -- (again can use "throwError", but the success case then looks like
  --  return (return ok), which is sorta crazy)

-- now imagine loading two config files:
loadTwo :: IO (Either String (Config, Config))
loadTwo = do
  maybeconfig &amp;larr; loadConfigFile "foo"
  case maybeconfig of
    Left err &amp;rarr; return (Left err)
    Right foo &amp;rarr; do
      maybeconfig &amp;larr; loadConfigFile "bar"
      case maybeconfig of
        -- yuck! I won't even finish this
&lt;/pre&gt;It'd be nice to again compose these like I did in &lt;code&gt;addRoman&lt;/code&gt;.  That's what the &lt;code&gt;ErrorT&lt;/code&gt; monad transformer is for.  Rewriting the above to use it, with the new bits underlined:&lt;pre&gt;loadConfigFile' :: FilePath &amp;rarr; &lt;u&gt;ErrorT String IO Config&lt;/u&gt;
loadConfigFile' path = do
  contents &amp;larr; &lt;u&gt;liftIO $&lt;/u&gt; readFile path
  -- plain "throwError" and "return ok" now work here.

loadTwo' :: IO (Either String (Config, Config))
loadTwo' = &lt;u&gt;runErrorT $ do
  foo &amp;larr; loadConfigFile' "foo"
  bar &amp;larr; loadConfigFile' "bar"
  return (foo, bar)&lt;/u&gt;&lt;/pre&gt;This is much closer to what you're trying to express.  The monad is letting you say: "first load the file foo, and stop here and return if there's an error.  then, ...".&lt;br /&gt;&lt;br /&gt;All that was background.  Here's the trick.&lt;br /&gt;&lt;br /&gt;I had to change the type of &lt;code&gt;loadConfigFile&lt;/code&gt; so it could be fed into &lt;code&gt;runErrorT&lt;/code&gt;.  That was ok in the above example, maybe, because the code ended up clearer.  But what about code that you don't control, that has the old type with an &lt;code&gt;Either&lt;/code&gt; nested inside an &lt;code&gt;IO&lt;/code&gt;?  Simple: the &lt;code&gt;ErrorT&lt;/code&gt; constructor exactly converts &lt;code&gt;IO (Either String a)&lt;/code&gt; into &lt;code&gt;ErrorT String IO a&lt;/code&gt;.  So I could've used the original &lt;code&gt;loadConfigFile&lt;/code&gt;  and just written the second function like this:&lt;pre&gt;loadTwo'' :: IO (Either String (Config, Config))
loadTwo'' = runErrorT $ do
  foo &amp;larr; &lt;u&gt;ErrorT $&lt;/u&gt; loadConfigFile "foo"
  bar &amp;larr; &lt;u&gt;ErrorT $&lt;/u&gt; loadConfigFile "bar"
  return (foo, bar)&lt;/pre&gt;&lt;br /&gt;In summary, the basic pattern is:&lt;ol&gt;&lt;li&gt;In circumstances where it makes sense, you can leave functions out of the &lt;code&gt;ErrorT&lt;/code&gt; monad.&lt;/li&gt;&lt;li&gt;Then, when combining them, wrap plain IO calls with &lt;code&gt;liftIO&lt;/code&gt;, the &lt;code&gt;IO (Either ...)&lt;/code&gt; calls with &lt;code&gt;ErrorT&lt;/code&gt;, and the whole block in &lt;code&gt;runErrorT&lt;/code&gt;.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;Final trick: you can even use plain, non-IO &lt;code&gt;Either&lt;/code&gt;s here, too.  Within &lt;code&gt;loadTwo''&lt;/code&gt;:&lt;pre&gt;  num &amp;larr; ErrorT $ return $ parseRomanNumeral "iix"&lt;/pre&gt;The &lt;code&gt;return&lt;/code&gt; brings the &lt;code&gt;Either&lt;/code&gt; into &lt;code&gt;IO (Either ...)&lt;/code&gt;, then the &lt;code&gt;ErrorT&lt;/code&gt; constructor wraps that into an &lt;code&gt;ErrorT String IO&lt;/code&gt; value.&lt;br /&gt;&lt;br /&gt;Whenever code gets this hairy, though, it's usually a sign that you're composing layers at the wrong level of abstraction and should restructure it.</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:245719</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/245719.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=245719"/>
    <title>dear ubuntu: wtf?</title>
    <published>2008-03-24T15:41:26Z</published>
    <updated>2008-03-24T15:41:26Z</updated>
    <category term="ubuntu"/>
    <content type="html">&lt;code&gt;$ apt-cache show ubufox | egrep '^(Desc| )'&lt;br /&gt;Description: Ubuntu Firefox specific configuration defaults and apt support&lt;br /&gt; Extension package for Firefox provides ubuntu specific configuration defaults&lt;br /&gt; as well as apt support for firefox plugins/extensions.&lt;br /&gt; .&lt;br /&gt; You can uninstall this package if you prefer to use a pristine firefox&lt;br /&gt; install.&lt;br /&gt;$ dpkg -L ubufox | grep searchplug&lt;br /&gt;/usr/share/ubufox/searchplugins&lt;br /&gt;/usr/share/ubufox/searchplugins/ask.xml&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;What does an Ask-specific search plugin have to do with Ubuntu-specific configuration?&lt;br /&gt;&lt;br /&gt;(Before someone asks: I don't care about this from a Google-vs-the-world perspective -- please, go use Ask if you like it.)</content>
  </entry>
  <entry>
    <id>urn:lj:livejournal.com:atom1:evan_tech:245495</id>
    <author>
      <email>evan@livejournal.com</email>
      <name>Evan Martin</name>
    </author>
    <lj:poster user="evan"/>
    <link rel="alternate" type="text/html" href="http://community.livejournal.com/evan_tech/245495.html"/>
    <link rel="self" type="text/xml" href="http://community.livejournal.com/evan_tech/data/atom/?itemid=245495"/>
    <title>deflate compression in googlebot</title>
    <published>2008-03-11T21:58:53Z</published>
    <updated>2008-03-11T21:58:53Z</updated>
    <category term="google"/>
    <content type="html">Random bit of Google trivia: Googlebot recently added support for "deflate" compression, despite no real demand for it.  Why?  Go ahead and guess; the answer follows.  &lt;a name="cutid1"&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This allowed changing the &lt;code&gt;Accept-Encoding&lt;/code&gt; header to &lt;code&gt;gzip,deflate&lt;/code&gt;, which makes it better match the &lt;code&gt;Accept-Encoding&lt;/code&gt; sent by web browsers.  That in turn allows proxies like Squid to share cache entries between requests from browsers and requests from Googlebot.  &lt;a href="http://www.squid-cache.org/mail-archive/squid-dev/200208/0198.html"&gt;Here's an interesting thread&lt;/a&gt; on how the &lt;code&gt;Accept-Encoding&lt;/code&gt; and &lt;code&gt;Vary&lt;/code&gt; headers interact to ruin things for proxies.  (&lt;a href="http://www.schroepl.net/projekte/mod_gzip/cache.htm"&gt;Here's a page I skimmed that goes into a bunch more detail&lt;/a&gt;.)  You could argue the blame lies with Squid, which apparently(?) treats the &lt;code&gt;Accept-Encoding&lt;/code&gt; header value as an opaque string rather than a list of encodings.  On the other hand, doing something smarter depends on Squid magically knowing what circumstances caused a server to choose which &lt;code&gt;Accept-Encoding&lt;/code&gt; combination.&lt;br /&gt;&lt;br /&gt;[N.B.: I'm not an expert on any of this stuff, so feel free to correct me.  Thankfully, I was also not involved in the implementation details, so I can excuse myself from blame!]&lt;br /&gt;&lt;br /&gt;This was all prompted by &lt;a href="http://dammit.lt/2007/12/29/rant-on-search-crawlers/"&gt;an observation by one of the Wikipedia folks&lt;/a&gt;, who emailed me about it.  However, I asked him if this Google-side change helped and he never replied.  :~(</content>
  </entry>
</feed>
