Wednesday, July 30, 2008

Prolog Instantiation Modes (and Python exit contexts)

I wrote a piece of Python code that does a funny little transform on the syntax tree of a function, and creates a new function that returns all the local variables defined at the exit point of the original function. When I explain this to someone, they ask "Why?" and I don't have a good answer, but usually I say that it's related to predicates in Prolog that have multiple calling modes. Which I'm going to try explaining here:

So, Prolog doesn't have functions, but it has something called "predicates" which are just as good. Predicates don't return a value, but any argument to a predicate can be an "output" variable, like a C reference parameter. Unlike C though, Prolog predicates often treat all of their parameters as outputs. For example, the function append can be used in (no less than) three different ways.

/* Appending, mode = input,input,output */
?- append([1,2,3],[4,5], X).
X = [1,2,3,4,5]

/* Trimming off a shared starting sequence, mode = input,output,input */
?- append([1,2,3],X,[1,2,3,4,5]).
X = [4,5]

/* Trimming off a shared ending sequence, mode = output,input,input */
?- append(X,[4,5],[1,2,3,4,5]).
X = [1,2,3]


Through these three operations seem very different from the perspective of an imperative language, in Prolog append can be defined in two simple lines. The intuition about Prolog is that the runtime doesn't think of variables so much as inputs and outputs, but as "things I already know" and "thinks I don't know yet".

We can translate Prolog's append into Python (at some loss of conciseness and functionality), and then use the Python macro mentioned above to check that the variable bindings at the end of the function maintain the invariant that we expect:

context = append_exit_context(None,[4,5],[1,2,3,4,5])
assert context['head'] + context['tail'] == context['result']


So, this is pretty cool, we can pretend that we're writing a predicate instead of a function, assigning computed values to variables when we're able to discover them, and then examine those bindings after running the predicate.

Of course, Prolog's append can also be called with the first two arguments uninstantiated, but that sort of magic is much harder to fit into Python.

Saturday, July 19, 2008

Slow Sphinx Indexing

Tejus and I are building a Ruby on Rails site that needs both structured and unstructured search. Hamed suggested that we use Ultrasphinx, a Rails plugin that provides an interface to the Sphinx search backend.

I got everything downloaded and compiled, and had figured out how to debug the nil:NilClass errors that Ultrasphinx's configuration mini-language was generating, and then when I went to build the index for our database of seven documents... it seemed to hang. I was patient though, and let it run in the background for 10 minutes. This might be acceptable on a huge database, but... it was clear that something was wrong.

Several hours of debugging led me to the root cause: sphinx was assuming that the primary keys of the indexed table were sequential, and was creating a query for every 5000 rows between the min and max id of that table. With an auto_increment primary key, this is a valid assumption, but our data was being loaded by an ActiveRecord fixture which was generating random primary keys, so the range between min and max was nearly a billion, thus the number of queries was in the hundreds of thousands, all but seven of them returning nothing.

The solution of course, is to put explicit id's on your fixtures.

Thursday, July 17, 2008

ICFP 2008

Last weekend, I went over to the Monroe Drive House and wrote code for the International Conference on Functional Programming Contest. Thankfully, the ICFP Contest doesn't require that your implementation language be purely functional (or even mostly functional), so we (Mark, Martin, and Erik) wrote our entry in python, using twisted to handle network events. As we hacked on procedural code downstairs, Alex and Lindsey wrote very pure Scheme code upstairs.

We didn't do anything particularly fancy, just used a PID controller to adjust the driver's angle towards the goal, and wrote some geometry routines to detect when we were on a collision course with an obstacle, and then plotted a course in whichever direction around the obstacle looked shorter. We also moved away from Martians if they were too close to us (and facing us). We talked about several more complex tactics, but didn't wind up with the extra time (or brainpower) to implement them.

We had an awful version control experience with mercurial: constant permissions errors in the remote repository, the need to manually "hg up" on the remote server, and flukey merges. I've had good luck in the past with mercurial on (a PyWeek entry) stochasm, and some of the trouble this time was because we were using an SSH repository, rather than the svnserve style that Drew setup on stochasm.

Finally, our entry, the code.