Tuesday, December 30, 2008
PHP's strtotime
The closest Python equivalent to strtotime seems to be dateutil.parser.parse, though at a cursory glance it looks much less flexible. Of course, if a person is only trying to compute "common moveable Christian feasts that can be deduced from the date of Easter Sunday" they need look no further than the mx.DateTime.Feasts library (which includes translations into German and French). Weird. And awesome.
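For the curious, here is a rough, stdlib-only sketch of the kind of computation such a feasts library performs. It is not the code or API of mx.DateTime.Feasts: the function names are made up for illustration, Easter is computed with the widely published anonymous Gregorian algorithm, and only three feasts are covered, using the usual Western offsets from Easter Sunday.

```python
from datetime import date, timedelta


def easter_sunday(year):
    """Gregorian Easter Sunday via the anonymous (Meeus/Jones/Butcher) algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


def moveable_feasts(year):
    """A few feasts expressed as day offsets from Easter (hypothetical helper)."""
    easter = easter_sunday(year)
    offsets = {
        "Ash Wednesday": -46,
        "Easter Sunday": 0,
        "Ascension": 39,
        "Pentecost": 49,
    }
    return {name: easter + timedelta(days=n) for name, n in offsets.items()}


if __name__ == "__main__":
    for name, day in moveable_feasts(2008).items():
        print(name, day.isoformat())
```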
Friday, December 5, 2008
Not Lazy Enough
One reason for being non-lazy (some say: strict) is to ensure consistency across your dataset. If exact consistency is not a requirement, object/document stores like CouchDB or AppEngine's datastore can provide inexpensive "ALTER TABLE" behavior. With these systems, outdated/unaltered rows only need to be upgraded/altered when they're fetched from the datastore. Here's a case where deferring the "ALTER TABLE" code allows us to write more flexible/dynamic programs because our data management is lazier.
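The upgrade-on-fetch idea can be sketched in a few lines. This is only an illustration under made-up assumptions: the store is a plain dict rather than CouchDB or the AppEngine datastore, and the schema versions and field names ("name", "first_name", "last_name") are hypothetical.

```python
# Hypothetical schema version; documents without a "version" field count as v1.
CURRENT_VERSION = 2


def upgrade_v1_to_v2(doc):
    """The deferred "ALTER TABLE": split 'name' into first and last name."""
    first, _, last = doc.pop("name", "").partition(" ")
    doc["first_name"] = first
    doc["last_name"] = last
    doc["version"] = 2
    return doc


# Maps a document's current version to the function that upgrades it one step.
UPGRADES = {1: upgrade_v1_to_v2}


class LazyStore:
    """A dict-backed stand-in for a document store."""

    def __init__(self):
        self._docs = {}

    def put(self, key, doc):
        self._docs[key] = dict(doc)

    def get(self, key):
        doc = self._docs[key]
        # Outdated documents are upgraded only when somebody asks for them.
        while doc.get("version", 1) < CURRENT_VERSION:
            doc = UPGRADES[doc.get("version", 1)](doc)
        self._docs[key] = doc  # write the upgraded form back
        return doc


if __name__ == "__main__":
    store = LazyStore()
    store.put("ada", {"name": "Ada Lovelace"})  # written before the schema change
    print(store.get("ada"))
```

Documents that are never fetched are never touched, which is where the savings (and the loss of exact consistency) come from.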
Code can be too lazy though, resulting in bad caching behavior and sometimes huge memory space growth.
Any other examples of cases where laziness/demand-driven-evaluation would be a performance win?
Friday, October 10, 2008
Ten Years of Progress
Hello from October 10th 2008. The Dow Jones Industrial Average closed yesterday at 8,579 which is a little bit under its close of 8,643 on March 10th 1998.
Does this mean that the global economy could have been put on hold ten years ago and life would be just the same today?
Probably not. People have to eat.
Does it mean that ten years' worth of free market profit motive could have been thrown out the window, that executives could have been told to just keep 'er runnin' and life would be just the same as today?
Probably not. There have been winners and losers, and markets have adjusted for changing consumer preferences over the decade.
I don't know what it means.
But it does seem like an appropriate time to ask:
"Have we been making progress?"
I think that the answer is: Yes!
Though we may have to scale down our retirement plans, and start buying the less expensive brand of veggie burgers, our Wikipedia Content Index (WCI) just keeps growing:
And lest we ignore the importance of emerging global cute funds, think of how far we've come since 1998: a world without lolcats.
Thursday, September 25, 2008
Compiling x264 with MP4 support on OS X
wget http://downloads.sourceforge.net/gpac/gpac-0.4.4.tar.gz
open gpac-0.4.4.tar.gz # lazy-man's untar
git clone git://git.videolan.org/x264.git x264
cd gpac
find . -name os_net.c | xargs perl -pi -e 's/u_long/unsigned long/g'
./configure
make lib
make install-lib
cd ../x264
./configure --enable-mp4-output
make
Then, to test that everything worked, run this:
wget notlime.com/2008/h264/foreman_part_qcif.yuv
./x264 -v -q 20 -o foreman.mp4 foreman_part_qcif.yuv 176x144
open foreman.mp4
If you don't have git installed, you can find a nightly tarball of the x264 source code at ftp://ftp.videolan.org/pub/videolan/x264/snapshots/. Hope this helps someone!
Tuesday, September 2, 2008
Growing Triangle Vines
The first is a visualization of a geometric computation that we used in the ICFP contest: Arcsin of R / D
And the second comes from a pen-and-paper sketch made on a legal pad during my WTC tenure: Iconic Growing Triangles
Tuesday, August 26, 2008
Persistent Refactorings
Imagine for a moment though, that your revision control system was aware of refactorings, and recorded the refactoring command itself, not just the source text changes. You (or your IDE) might run this command:
svn refactor pullup myFunction
This would record the refactoring, which could then be used to inform merges with other branches. Everyone loves easier merges of course, but if your automatic merge of two significantly-differently-structured branches were nearly flawless, it would be possible to support two functionally equivalent, but architecturally different branches over a long timespan. In practice you could commit to either branch, and then pull from the other branch, relying on your version control system to refactor your changes so that the merge succeeded.
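To make the idea a little more concrete, here is a toy sketch of a recorded refactoring being replayed over text that arrives from another branch. Everything in it is an assumption for illustration: no real version control system exposes this, a simple rename stands in for the harder "pullup" case, and a regular expression stands in for a real language-aware refactoring engine. The names Rename, replay, and computeTotal are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rename:
    """A recorded refactoring command: rename one identifier everywhere."""
    old: str
    new: str

    def apply(self, source):
        # Whole-word match only, so longer identifiers are left alone.
        return re.sub(r"\b%s\b" % re.escape(self.old), self.new, source)


def replay(refactorings, source):
    """Apply every recorded refactoring, in order, to incoming source text."""
    for refactoring in refactorings:
        source = refactoring.apply(source)
    return source


if __name__ == "__main__":
    # The refactored branch recorded this command instead of a bare text diff.
    log = [Rename("myFunction", "computeTotal")]
    # Meanwhile, the other branch added a new call using the old name.
    incoming = "total = myFunction(cart)\nprint(myFunctionality)\n"
    print(replay(log, incoming))
```

A plain text merge would happily accept the new call to the old name and leave the build broken; replaying the recorded command rewrites the incoming change before it lands.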
Why do this?
Perhaps you have a refactoring that reduces code-duplication but makes your codebase less readable, scannable, coherent. In this case you can branch and refactor, ending up with one easy-to-read branch, and one safe-to-modify-without-fear-of-missing-a-copy-pasted-version-of-the-same-function-in-another-module branch. Because the refactorings that we're considering here don't change the code functionally, these aren't really even branches, but simply two views on the same trunk.
What other cases can you see this being handy for?
Wednesday, July 30, 2008
Prolog Instantiation Modes (and Python exit contexts)
So, Prolog doesn't have functions, but it has something called "predicates" which are just as good. Predicates don't return a value, but any argument to a predicate can be an "output" variable, like a C reference parameter. Unlike C though, Prolog predicates often treat all of their parameters as outputs. For example, the predicate append can be used in (no fewer than) three different ways.
/* Appending, mode = input,input,output */
?- append([1,2,3],[4,5], X).
X = [1,2,3,4,5]
/* Trimming off a shared starting sequence, mode = input,output,input */
?- append([1,2,3],X,[1,2,3,4,5]).
X = [4,5]
/* Trimming off a shared ending sequence, mode = output,input,input */
?- append(X,[4,5],[1,2,3,4,5]).
X = [1,2,3]
Though these three operations seem very different from the perspective of an imperative language, in Prolog append can be defined in two simple lines. The intuition about Prolog is that the runtime thinks of variables not so much as inputs and outputs, but as "things I already know" and "things I don't know yet".
We can translate Prolog's append into Python (at some loss of conciseness and functionality), and then use the Python macro mentioned above to check that the variable bindings at the end of the function maintain the invariant that we expect:
context = append_exit_context(None,[4,5],[1,2,3,4,5])
assert context['head'] + context['tail'] == context['result']
So, this is pretty cool, we can pretend that we're writing a predicate instead of a function, assigning computed values to variables when we're able to discover them, and then examine those bindings after running the predicate.
Of course, Prolog's append can also be called with the first two arguments uninstantiated, but that sort of magic is much harder to fit into Python.
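One way to approximate that behavior, as a sketch only, is a generator that yields every consistent set of bindings, with None standing in for an uninstantiated variable. The name append_solutions is made up here and is not the macro or the translation discussed in this post. It also cheats compared to Prolog: it can only enumerate solutions when the result list is known.

```python
def append_solutions(head, tail, result):
    """Yield each (head, tail, result) with head + tail == result.

    None marks an argument that is not yet known.
    """
    if head is not None and tail is not None:
        combined = head + tail
        if result is None or result == combined:
            yield head, tail, combined
        return
    if result is None:
        # Prolog would enumerate open-ended lists here; this sketch gives up.
        raise ValueError("result must be known when head or tail is unknown")
    # Try every split point and keep the ones that agree with what we know.
    for i in range(len(result) + 1):
        h, t = result[:i], result[i:]
        if (head is None or head == h) and (tail is None or tail == t):
            yield h, t, result


if __name__ == "__main__":
    print(list(append_solutions([1, 2, 3], [4, 5], None)))
    print(list(append_solutions([1, 2, 3], None, [1, 2, 3, 4, 5])))
    print(list(append_solutions(None, [4, 5], [1, 2, 3, 4, 5])))
    # Both inputs unknown: every way of splitting the result.
    print(list(append_solutions(None, None, [1, 2])))
```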
Saturday, July 19, 2008
Slow Sphinx Indexing
I got everything downloaded and compiled, and had figured out how to debug the nil:NilClass errors that Ultrasphinx's configuration mini-language was generating, and then when I went to build the index for our database of seven documents... it seemed to hang. I was patient though, and let it run in the background for 10 minutes. This might be acceptable on a huge database, but... it was clear that something was wrong.
Several hours of debugging led me to the root cause: Sphinx was assuming that the primary keys of the indexed table were sequential, and was creating a query for every 5000 rows between the min and max id of that table. With an auto_increment primary key, this is a valid assumption, but our data was being loaded by an ActiveRecord fixture which was generating random primary keys. The range between min and max was therefore nearly a billion, and the number of queries was in the hundreds of thousands, all but seven of them returning nothing.
The solution, of course, is to put explicit ids on your fixtures.
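The arithmetic behind the hang is easy to check with a small sketch. It only models the id windows described above; the function names are made up, the example ids are invented, and this is not Sphinx's actual code.

```python
def id_ranges(min_id, max_id, step=5000):
    """Yield the (start, end) id windows a ranged indexer would query."""
    start = min_id
    while start <= max_id:
        end = min(start + step - 1, max_id)
        yield start, end
        start = end + 1


def count_ranges(min_id, max_id, step=5000):
    """Number of windows, without generating them."""
    return (max_id - min_id) // step + 1


if __name__ == "__main__":
    # Seven rows with sequential ids: one query covers everything.
    print(count_ranges(1, 7))
    # Seven rows with random ids spread across roughly a billion values.
    print(count_ranges(1, 1_000_000_000))
```

With sequential ids the seven documents fit in a single window; with ids scattered across a billion values the same seven documents cost about two hundred thousand queries, nearly all of them empty.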
Thursday, July 17, 2008
ICFP 2008
We didn't do anything particularly fancy, just used a PID controller to adjust the driver's angle towards the goal, and wrote some geometry routines to detect when we were on a collision course with an obstacle, and then plotted a course in whichever direction around the obstacle looked shorter. We also moved away from Martians if they were too close to us (and facing us). We talked about several more complex tactics, but didn't wind up with the extra time (or brainpower) to implement them.
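For readers who haven't met these pieces before, here is a minimal sketch of a PID steering loop and a collision-course test. It is not our contest entry: the class and function names and the gains are invented for illustration, and the collision test is one plausible reading of the "arcsin of R / D" computation mentioned in the Growing Triangle Vines post, where an obstacle of radius R at distance D blocks every heading within asin(R / D) of the direction to its center.

```python
import math


def angle_error(heading, bearing):
    """Smallest signed turn from heading to bearing, in radians."""
    diff = (bearing - heading) % (2 * math.pi)
    if diff > math.pi:
        diff -= 2 * math.pi
    return diff


class PID:
    """Textbook PID controller; the gains passed in are assumptions."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no history yet on the first update
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def on_collision_course(heading, obstacle_bearing, radius, distance):
    """True when the heading lies inside the cone the obstacle subtends."""
    if distance <= radius:
        return True  # already touching or inside the obstacle
    half_angle = math.asin(radius / distance)
    return abs(angle_error(heading, obstacle_bearing)) < half_angle


if __name__ == "__main__":
    pid = PID(kp=2.0, ki=0.0, kd=0.1)
    error = angle_error(heading=0.0, bearing=math.radians(30))
    print("steering command:", pid.update(error, dt=0.1))
    print("blocked:", on_collision_course(0.0, math.radians(10), 1.0, 4.0))
```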
We had an awful version control experience with mercurial: constant permissions errors in the remote repository, the need to manually "hg up" on the remote server, and flukey merges. I've had good luck in the past with mercurial on stochasm (a PyWeek entry), and some of the trouble this time was because we were using an SSH repository, rather than the svnserve-style server that Drew set up on stochasm.
Finally, our entry, the code.
Friday, May 23, 2008
Kitten Naming
We named momma-cat "Olestra" because it has a classical ring to it, and gave the four kittens the names: Sprint, Cingular, (T-) Mobile, and Verizon. We dropped the "T dash" from Mobile's name because it was confusing - what does the "T" mean anyways? Both Sprint and Mobile are grey-tan tabbies; we're not sure which is which yet. Cingular is the fuzzy brown-black one, and Verizon is skinny and all-black.
Naming critter-litters is a bit like naming servers: you want to pick your names from a category that is roughly the same size as your things-to-be-named.
Tuesday, March 11, 2008
Avoiding ThreadDeath with env.js
At Appcelerator we use John Resig's "simulated browser environment for Rhino" as part of our IDE. I'd had intermittent problems with a ThreadDeath error being thrown, which would then cause any other thread running Rhino to hang.
I wasn't sure if the problem was in my Java code, my JavaScript, Rhino's Java code, or Aptana's Java code (we built our IDE atop their HTML/CSS/JS editor). Turns out it wasn't any of those! It was in that "simulated browser environment" code, in window.clearInterval, where the thread spawned with setInterval is killed. I puzzled for a moment over why Mr. Resig was using multiple threads rather than a single one for setIntervals (expediency I assume), and then changed two lines so that it doesn't kill the thread, but allows it to die of natural causes.
window.setInterval = function(fn, time){
    var num = timers.length;
    timers[num] = new java.lang.Thread(new java.lang.Runnable({
        run: function(){
            // Changed line 1: was "while (true) {".
            // The loop now ends once clearInterval deletes timers[num].
            while (timers[num]) {
                java.lang.Thread.currentThread().sleep(time);
                fn();
            }
        }
    }));
    timers[num].start();
    return num;
};
window.clearInterval = function(num){
    if ( timers[num] ) {
        // Changed line 2: removed "timers[num].stop();",
        // the call that threw ThreadDeath.
        delete timers[num];
    }
};
Another example of a day of debugging yielding a two line fix. Ugh.
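The general pattern, asking a thread to finish rather than killing it, shows up in most threaded runtimes. Here is a rough Python analogue using threading.Event. It is a sketch of the same idea and not a port of env.js; the names set_interval and clear_interval are made up for illustration.

```python
import threading

_timers = {}   # timer id -> (thread, stop event)
_next_id = 0


def set_interval(fn, seconds):
    """Call fn every `seconds` on a background thread; return a timer id."""
    global _next_id
    stop = threading.Event()

    def run():
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop exits by itself after clear_interval is called.
        while not stop.wait(seconds):
            fn()

    thread = threading.Thread(target=run, daemon=True)
    timer_id = _next_id
    _next_id += 1
    _timers[timer_id] = (thread, stop)
    thread.start()
    return timer_id


def clear_interval(timer_id):
    """Ask the timer thread to finish; return it so callers may join it."""
    entry = _timers.pop(timer_id, None)
    if entry is None:
        return None
    thread, stop = entry
    stop.set()  # no forced kill: the thread dies of natural causes
    return thread


if __name__ == "__main__":
    import time
    ticks = []
    timer = set_interval(lambda: ticks.append(time.time()), 0.05)
    time.sleep(0.3)
    clear_interval(timer).join()
    print("ticks recorded:", len(ticks))
```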
Friday, February 29, 2008
Dodging the DVCS Trainwreck
So what happens when the darcs or mercurial project goes the way of CVS (that is: to bit-heaven)? How do you rescue your distributed repository from legacy-software-land?
Well, the Tailor project provides converters between different repository formats, even some of the more obscure distributed VCSes. It seems to work in an all-at-once mode, rather than mirroring one live repository in other formats, but it preserves full history, so...
There's no harm in picking a bizarre distributed version control system today: if the project stalls, you can always upgrade your repository to whichever system becomes mainstream!
Wednesday, February 27, 2008
Atlhack distributed version control roundup
Alex writes about the advantages of git, Erik shows us how to configure git to serve over http, Lindsey is using darcs, Drew has been using mercurial for his personal projects (Miru, Yue), and I've been using darcs.