What's That Noise?! [Ian Kallen's Weblog]


20060228 Tuesday February 28, 2006

Winter Fights Back

Last week we had the early taste of spring, with cherry blossoms blooming. Warm weather was asserting itself but clearly winter won't leave, at least not without a fight. Right now, we have sunny skies in the east, cracking thunder, torrential rain and hail. The rain gutters are overflowing, with rainwater shooting off of the roof. The dog is freaked out.

They fight. And fight. And fight. And fight. And fight.

Minutes later: it stops. All is quiet. The dog is sleeping, again.

It's a lot like life.

( Feb 28 2006, 08:08:09 AM PST ) Permalink

20060226 Sunday February 26, 2006

Hug Yours

I don't know these folks, but having read the Dear Elena postings, I know them well enough. I'm hugging my little ones a little extra.

( Feb 26 2006, 05:35:13 PM PST ) Permalink

hReview: Nobody Knows (Dare mo shiranai)

This evening, the kids were off having fun with their friends at a birthday party, so we watched Nobody Knows. I thought I'd seize an opportunity to try a quick hReview:

Nobody Knows (Dare mo shiranai)

Feb 25, 2006 by Ian Kallen Nobody Knows (Dare mo shiranai)

A moving, disturbing and excellent film (Japanese, English subtitles)

Based on a true story in Japan, this makes the New Year's Home Alone Case look like a blip of a parental lapse. A young mother leaves her four kids alone for months on end in a small urban apartment. The eldest son, Akira, in an intense portrayal by adolescent Yûya Yagira, grows up real fast to look after his younger siblings. Despite a deliberate pace, the predicament that the kids were in, the sweetness of young Yuki (Momoko Shimizu) and the gritty inner-city ambience of Japan keep you riveted to the story.


I've been meaning to take it for a spin for a while; this hReview was built with hReview Creator and then fiddled with a bit. I think I'll have to muck with my stylesheets to give hReviews any kind of a reasonable display.

( Feb 26 2006, 12:11:35 AM PST ) Permalink

20060222 Wednesday February 22, 2006

Making Velocity Be Quiet About Resource Loading

I'd never dug into where velocity's annoying messages were coming from but I decided enough is enough already. These tiresome messages from velocity were showing up on every page load:

2006-02-22 12:08:02 StandardContext[/webapp] Velocity   [info] ResourceManager : found /path/to/resource.vm with loader org.apache.velocity.tools.view.servlet.WebappLoader
Such messages might be good for debugging your setup but once you're up and running, they're just obnoxious. They definitely weren't coming from the log4j.properties in the webapp. So I took a look at velocity's defaults. The logging properties that velocity ships with in velocity.properties concern display of stacktraces but the constant chatter in Tomcat's logs wasn't in there either. So I unwrapped the velocity source and found it in org.apache.velocity.runtime.RuntimeConstants -- all I had to do was add this to velocity.properties and there was peace:
resource.manager.logwhenfound = false
Ah, much better!

They shoulda named that property resource.manager.cmon.feel.the.noise, seriously.
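
If you want to double-check that the override really is in the properties text your webapp ends up reading, a few lines of plain java.util.Properties will do. This is only a sketch: the class name and the inline properties text are made up for illustration, and it never touches Velocity itself. The only thing taken from the post is the property key.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class QuietVelocityCheck {

    // The property key quoted in the post (from RuntimeConstants).
    static final String LOG_WHEN_FOUND = "resource.manager.logwhenfound";

    // Parses properties text; the text stands in for the contents of
    // velocity.properties (an assumption, no file is read here).
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // True only when the key is present and explicitly set to false.
    static boolean isQuiet(Properties props) {
        return "false".equalsIgnoreCase(props.getProperty(LOG_WHEN_FOUND, "").trim());
    }

    public static void main(String[] args) {
        Properties props = parse("resource.manager.logwhenfound = false\n");
        System.out.println(LOG_WHEN_FOUND + " quiet? " + isQuiet(props));
    }
}
```

A missing key counts as "not quiet" here, on the assumption that the chatter is on unless you turn it off, which matches what the logs above were showing.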

( Feb 22 2006, 12:40:20 PM PST ) Permalink

20060221 Tuesday February 21, 2006

Whiskers on Kittens

Today, Technorati launched a feature that enables you to identify and collect a few of your favorite things. You can select items as you browse search results, tags and so forth or you can identify them en masse by uploading an OPML file. I started mine by grabbing sixteen blogs that variously talk about the San Francisco Giants (it's almost the Good time of year, baseball time), here they are. As the baseball season gets underway, I expect I'll be refining this a bit. When opening day comes, I hope to have a truly browsable, searchable Giants blog portal of my very own. Hum baby!

Go to Technorati Favorites to start your own collection and put one of these on your blog:

( Feb 21 2006, 09:02:46 PM PST ) Permalink

20060220 Monday February 20, 2006

Code Search Engines

There's been much ado recently about Krugle that I just don't get. The website isn't even open, yet folks are salivating as if they've never heard of such a thing as a code search engine. Not that I'm invested in any way in Koders, Codase or Codefetch but they're providing a credible story already for that kind of specialized search. I'll be happy to try Krugle when they open their doors but the level of excitement in the posts about them seems really odd. Wired has some screenshots in their article about it. Let's see the web site live and then get all frothy about it, please! This just smells like more web 2.0 over-hype; people: control yourselves!

What I'd really find useful is code search built into proprietary projects. After reading Using Lucene to Search Java Source Code, I'm imagining building a Lucene full text index of a project at build time, as a maven plug-in or an ant task. The current tools out there to make source code and docs browsable would benefit so much from making them searchable; there's already a Jetty plugin: run your build, start your webserver and search the code base whenever you want to find a particular method call (or something). That would be cool!
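
To make the build-time index idea a little more concrete, here is a toy sketch in plain Java. It is not Lucene and not the approach from the article: it is a tiny in-memory inverted index mapping identifier-like tokens to the files that contain them, which is roughly the shape of thing a real full text index would give you. The class name, the tokenizing rule and the sample file names are all made up for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ToyCodeIndex {

    // Identifier-like tokens: a letter or underscore followed by word characters.
    private static final Pattern TOKEN = Pattern.compile("[A-Za-z_]\\w*");

    // token -> names of the files that contain it
    private final Map<String, Set<String>> postings = new TreeMap<>();

    // Index one source file's text under the given file name.
    void add(String fileName, String source) {
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            postings.computeIfAbsent(m.group(), k -> new TreeSet<>()).add(fileName);
        }
    }

    // Files containing the exact (case-sensitive) token; empty if there are none.
    Set<String> search(String token) {
        return postings.getOrDefault(token, Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyCodeIndex index = new ToyCodeIndex();
        index.add("Foo.java", "class Foo { void run() { bar.hashCode(); } }");
        index.add("Bar.java", "class Bar { int size() { return 0; } }");
        System.out.println(index.search("hashCode"));
    }
}
```

A real build step would walk the source tree and persist the index to disk; this sketch only shows the lookup structure, so "find every file that calls hashCode" becomes a single map lookup.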

( Feb 20 2006, 03:27:44 PM PST ) Permalink

20060219 Sunday February 19, 2006

The Dick Cheney Humor Train

For once in a great while we have a political to-do with a real smoking gun and it has been some of the most entertaining political hoopla the blogosphere has ever seen. It's like we're under an attack of laugh bombs from the QLGF (doncha know about the Quail Liberation Guerilla Front?). I posted about the I got shot by Dick Cheney page but the conversation fodder keeps on a-coming:

In association with Zazzle.com
Cheney's Got A Gun (Aerosmith spoof) provides the video and soundtrack
The Bob Rivers Show - 102.5 KZOK Seattle (grr... packaging flash as a windows executable)
Dick Cheney Quail Hunt action packed game
Canned Hunting at its Cannedest
Dave Letterman's Top Ten
And no shortage of hardy hars from Jon Stewart, Bill Maher, Jay Leno, Conan O'Brien and so forth.

I've been following this Technorati search for the blogosphere's yucks and the posts from About.com's Political Humor feed for most everything else:

And of course, there are the products:

In addition to the Ready... Fire... Aim t-shirt above, there's Dick Cheney: Reducing the burden on social security one old bird at a time and other stuff
I'd Rather Hunt With Dick Cheney Than Ride With Ted Kennedy and such bumper stickers, t-shirts, yadda yadda
Yee haw!

( Feb 19 2006, 08:25:33 AM PST ) Permalink

20060218 Saturday February 18, 2006

Is There Anybody Who Hasn't Been Shot By Dick Cheney?

The man is on a veritable rampage. It seems as though everybody is getting "mistaken" for quails, republican donors and other kinds of terrorists and paying for it big time. Here's the news on me:

Beware, Dick might be after you next. Check it out at WHAT SUCKER GOT SHOT BY THE VEEP? Fill in the form to get a keepsake of your very own.

( Feb 18 2006, 01:13:54 AM PST ) Permalink

20060216 Thursday February 16, 2006

What Does Yahoo! Have Against Baseball?

Yahoo! has feeds, lots of them. And they have a lot of sports feeds: NFL, NBA, NHL ... they have NASCAR but they don't have MLB? Sure, they have NCAA baseball but that doesn't count. Full coverage of baseball requires major league news. It's time for spring training to resume and they don't have MLB. What's up with that? Sure, too many hot dogs and Cracker Jack are bad for you but c'mon. Are they communists or something? I bet they don't like apple pie, either.

Well, they weren't apparent from those pages I looked on. But as some kind readers have advised me, there is a feed for MLB at [XML]. Thanks!

( Feb 16 2006, 11:03:34 AM PST ) Permalink

20060214 Tuesday February 14, 2006

Google Buys Measure Map

Wow, Jeffrey Veen posted to Google Blog, the GOOG is now the proud owner of Measure Map. Congrats to the Adaptive Path folks!

( Feb 14 2006, 04:29:44 PM PST ) Permalink

Oracle's "Resistance is Futile" Message to MySQL?

Five months after scooping up InnoDB, a major technology provider to MySQL AB, it looks like Larry is borging them further. MySQL users who depend on InnoDB for transaction support were no doubt shaken by that announcement but, since MySQL has other backends, there's at least some assurance there that transactional capabilities won't be completely chopped into little pieces, wrapped in a carpet and tossed into a Redwood Shores swamp; there's always other vendors, like Sleepycat and their BerkeleyDB product, right?

Bwah hah hah! Larry's got a Bloody Valentine for you now! Seems as though an undisclosed sum has been passed and another one bites the dust. This article suggests that Oracle's also set its sights on JBoss and Zend (the latter of which currently has DB2 support front and center on their home page). Marc Fleury and Larry Ellison ... that has a ring to it!

I think it's time to solve the PostgreSQL database replication problem (no, Slony is not a good answer) for once and for all, lest Larry's bloodthirst vaporize MySQL.

( Feb 14 2006, 03:55:40 PM PST ) Permalink

20060213 Monday February 13, 2006

URL.hashCode() Busted?

I did a double take on this:

        HashSet set = new HashSet();
        set.add(new URL("http://postsecret.blogspot.com"));
        set.add(new URL("http://dorion.blogspot.com"));
        for (Iterator it = set.iterator(); it.hasNext();) {
            System.out.println(it.next());
        }
I was expecting to get output listing both URLs, but all that I got was a single one.


The java.net.URL javadoc says what I'd expect: "Creates an integer suitable for hash table indexing." So I tried this:

        URL url1 = new URL("http://postsecret.blogspot.com");
        URL url2 = new URL("http://dorion.blogspot.com");
        System.out.println(url1.hashCode() + " " + url1);
        System.out.println(url2.hashCode() + " " + url2); 
and got this
1117198397 http://postsecret.blogspot.com
1117198397 http://dorion.blogspot.com
I was expecting different hashCodes. Either java.net.URL is busted or I'm blowing it and my understanding of the contract with java.lang.Object and its hashCode() method is busted.
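
One likely explanation, going by the java.net.URL javadoc for equals(): two URLs count as having the same host when both host names resolve to the same IP address, and hashCode() is kept consistent with that. If both blogspot.com host names resolved to one address, identical hash codes and a one-element set would be exactly the result. That this is what happened here is an assumption, not something verified. A minimal sketch of a workaround is to key the set on java.net.URI, which compares the parsed components as strings and never does a DNS lookup. The class name below is made up for illustration.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

final class UriSetDemo {

    // Collects the given addresses as URIs. URI.equals() compares the parsed
    // components as strings, so distinct host names stay distinct.
    static Set<URI> distinct(String... addresses) {
        Set<URI> set = new LinkedHashSet<>();
        for (String address : addresses) {
            set.add(URI.create(address));
        }
        return set;
    }

    public static void main(String[] args) {
        Set<URI> set = distinct("http://postsecret.blogspot.com",
                                "http://dorion.blogspot.com");
        for (URI uri : set) {
            System.out.println(uri);
        }
    }
}
```

When a java.net.URL is needed to actually open a connection, uri.toURL() converts at the last moment, which keeps the name resolution out of the collection logic.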

( Feb 13 2006, 07:37:29 PM PST ) Permalink

Pretty Fly For A White Guy

A common topic of discussion within Technorati is the audience that we serve. We strive to be of service to both the authors (bloggers) and non-bloggers (the blogosphere's newbie readers) alike as well as the advertisers who help pay us. Being generally focused on more geekly topics, I'm oft reminded that my interests are shared only amongst other "middle aged white guys talking about Ruby on Rails" (hey, I'm not that narrow!), but the blogosphere is deep and wide across topics, locales and other demographics.

Yep, even discounting the excess hype aspect, Rails is cool. As a matter of fact, I'm pleasantly surprised with the easy read that Agile Web Development with Rails is. Buy it, it's a great book! Unfortunately, it's kind of an academic interest for me at this point. I actually want to reduce the number of programming languages that we're using at Technorati. While it's great to enable developers to contribute to and consume our service oriented architecture with the languages and frameworks that they are most productive in, there's also a battle against degeneration of standards and practices; the blade cuts both ways. Expertise sharing across different programming environments can be difficult and hiring the polylingual programmer is sufficiently challenging already (did I fail to mention that we're hiring?); simplify by constraining is one of the recurrent themes of AWDR's reference to "convention over configuration." In the case of programming language repertoire, less can be more. Anyway, since I'm the only one at Technorati (that I know of) with a rising enthusiasm for Rails, it's unlikely it'll be in use for work stuff anytime soon, bummer. Meantime, I'm just another white guy talking about Ruby on Rails... JAWGTAROR.

On the topic of who the bloggers are, it's an ever-unfolding story; Dave's State of The Blogosphere is one snapshot into it. Well, I'll be at O'Reilly Emerging Technology Conference's session on data ogling, The Data Dump: Fun with Graphs and Charts, with some more goodies. The Technorati platform has a rich set of data streams to tap, so this should be fun! See ya in San Diego!

( Feb 13 2006, 09:17:20 AM PST ) Permalink

20060208 Wednesday February 08, 2006

BerkeleyDB's "Tied Hash" for Java

One of the really wonderful and evil things about Perl is the tie interface. You get a persistent hash without writing a boat load of code. With Sleepycat's BerkeleyDB Java Edition you can do something very similar.

Here's a quick re-cap: I've mentioned fiddling with BerkeleyDB-JE before with a crude "hello world" app. You can use the native code version with Perl with obscene simplicity, too. In years past, I enjoyed excellent performance with older versions of BerkeleyDB that used a class called "DB_File" -- today, the thing to use is the "BerkeleyDB" library off of CPAN (note, you need db4.x+ something for this to work). Here's a sample that writes to a BDB:


use BerkeleyDB;
use Time::HiRes qw(gettimeofday);
use strict;

my $filename = '/var/tmp/bdbtest';
my %hash = ();
tie(%hash, 'BerkeleyDB::Hash', { -Filename => $filename, -Flags => DB_CREATE });
$hash{'539'} = "d\t" . join('',@{[gettimeofday]}) . "\tu\thttp://www.sifry.com/alerts";
$hash{'540'} = "d\t" . join('',@{[gettimeofday]}) . "\tu\thttp://epeus.blogspot.com";
$hash{'541'} = "d\t" . join('',@{[gettimeofday]}) . "\tu\thttp://http://joi.ito.com";
Yes, I'm intentionally using plain old strings, not Storable, FreezeThaw or any of that stuff.
To prove that our hash was really persistent, we might do this:

use BerkeleyDB;
use strict;

my $filename = '/var/tmp/bdbtest';
my %hash = ();
tie(%hash, 'BerkeleyDB::Hash', { -Filename => $filename, -Flags => DB_RDONLY });
for my $bid (keys %hash) {
    my %blog = split(/\t/,$hash{$bid});
    print "$bid:\n";
    while(my($k,$v) = each(%blog)) {
        print "\t$k => $v\n";
    }
}
Which would render output like this:
541:
        u => http://http://joi.ito.com
        d => 1139388034903283
539:
        u => http://www.sifry.com/alerts
        d => 1139388034902888
540:
        u => http://epeus.blogspot.com
        d => 1139388034903227
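
For comparison, the same tab-delimited value format can be decoded on the Java side with a few lines of plain JDK code. This is a minimal sketch, assuming the values are always an even number of tab-separated fields as in the Perl writer above; the class name is made up for illustration and it has nothing to do with BerkeleyDB itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BlogRecord {

    // Splits "k1\tv1\tk2\tv2..." into an ordered key/value map, the way the
    // Perl reader assigns split(/\t/, ...) to a hash.
    static Map<String, String> parse(String record) {
        String[] parts = record.split("\t");
        if (parts.length % 2 != 0) {
            throw new IllegalArgumentException("odd number of fields: " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i += 2) {
            fields.put(parts[i], parts[i + 1]);
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> blog =
                parse("d\t1139388034902888\tu\thttp://www.sifry.com/alerts");
        for (Map.Entry<String, String> e : blog.entrySet()) {
            System.out.println("\t" + e.getKey() + " => " + e.getValue());
        }
    }
}
```

Unlike a Perl hash, the LinkedHashMap keeps the fields in the order they were written, so "d" always prints before "u" here.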

Java has no tie operator (that's probably a good thing). But Sleepycat has incorporated a Collections framework that's pretty cool and gets you pretty close to tied hash functionality. Note however that it's not entirely compatible with the interfaces in the Java Collections Framework but if you know those APIs, you'll immediately know the Sleepycat APIs.

com.sleepycat.collections.StoredMap implements java.util.Map with the following caveats:

  1. It doesn't know how big it is, so don't call the size() method unless you want to see an UnsupportedOperationException
  2. You can't just abandon java.util.Iterators that have been working on a StoredMap; you have to use com.sleepycat.collections.StoredIterator's .close(Iterator) method to tidy up.
But that's no big deal.

So what does the code look like? Well, let's say you wanted to store a bunch of these vanilla beans in the database:

import java.io.Serializable;

public final class ImmutableBlog implements Serializable {

    private static final long serialVersionUID = -7882532723565612191L;
    private long lastmodified;
    private String url;
    private int id;
    public ImmutableBlog(final int id, final long lastmodified, final String url) {
        this.id = id;
        this.lastmodified = lastmodified;
        this.url = url;
    }
    public int getId() {
        return id;
    }
    public long getLastmodified() {
        return lastmodified;
    }
    public String getUrl() {
        return url;
    }
    public boolean equals(Object o) {
        if (!(o instanceof ImmutableBlog))
            return false;
        if (o == this)
            return true;
        ImmutableBlog other = (ImmutableBlog)o;
        // compare the same three fields that hashCode() mixes together
        return other.getId() == this.getId() &&
            other.getLastmodified() == this.getLastmodified() &&
            other.getUrl().equals(this.getUrl());
    }
    public int hashCode() {
        return (int) (id * 51 + url.hashCode() * 17 + lastmodified * 29);
    }
    public String toString() {
        StringBuffer sb = new StringBuffer(this.getClass().getName());
        return sb.toString();
    }
}
Note that it implements java.io.Serializable.
This is a class that knows how to persist ImmutableBlogs and provides a method to fetch the Map:
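import java.io.File;
import java.util.Map;
import com.sleepycat.bind.serial.SerialBinding;
import com.sleepycat.bind.serial.StoredClassCatalog;
import com.sleepycat.bind.tuple.IntegerBinding;
import com.sleepycat.collections.StoredMap;
import com.sleepycat.je.*;

public class StoredBlogMap  {
    private StoredMap blogMap;
    public StoredBlogMap() throws Exception {
        init();
    }
    protected void init() throws Exception {
        File dir = new File(System.getProperty("java.io.tmpdir") +
                File.separator + "StoredBlogMap");
        dir.mkdirs(); // the environment directory has to exist
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true); // create the environment on first use
        Environment env = new Environment(dir, envConfig);
        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true); // likewise for the two databases
        Database blogsdb = env.openDatabase(null, "blogsdb", dbConfig);
        Database classdb = env.openDatabase(null, "classes", dbConfig);
        StoredClassCatalog catalog = new StoredClassCatalog(classdb);
        // Integer keys, serialized ImmutableBlog values, writes allowed
        blogMap = new StoredMap(blogsdb,
                new IntegerBinding(), new SerialBinding(catalog, 
                        ImmutableBlog.class), true);
    }
    public Map getBlogMap() {
        return blogMap;
    }
}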
The majority of the code is just plumbing for setting up the underlying database and typing the keys and values.
Here's a unit test:
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import com.sleepycat.collections.StoredIterator;
import junit.framework.TestCase;

public class StoredBlogMapTest extends TestCase {

    private static Map testMap;
    static {
        testMap = new HashMap();
        testMap.put(new Integer(539), 
                new ImmutableBlog(539, System.currentTimeMillis(), 
                        "http://www.sifry.com/alerts"));
        testMap.put(new Integer(540), 
                new ImmutableBlog(540, System.currentTimeMillis(), 
                        "http://epeus.blogspot.com"));
        testMap.put(new Integer(541), 
                new ImmutableBlog(541, System.currentTimeMillis(), 
                        "http://joi.ito.com"));
    }
    private StoredBlogMap blogMap;
    protected void setUp() throws Exception {
        blogMap = new StoredBlogMap();
    }
    public void testWriteBlogs() throws Exception {
        Map blogs = blogMap.getBlogMap();
        for (Iterator iter = testMap.entrySet().iterator(); iter.hasNext();) {
            Map.Entry ent = (Map.Entry) iter.next();
            blogs.put((Integer)ent.getKey(), (ImmutableBlog)ent.getValue());
        }
        // size() is unsupported, so count the keys by iterating
        int i = 0;
        Iterator keys = blogMap.getBlogMap().keySet().iterator();
        while (keys.hasNext()) {
            keys.next();
            i++;
        }
        StoredIterator.close(keys);
        assertEquals(testMap.size(), i);
    }
    public void testReadBlogs() throws Exception {
        Map blogs = blogMap.getBlogMap();
        Iterator iter = blogs.entrySet().iterator();
        try {
            while (iter.hasNext()) {
                Map.Entry ent = (Map.Entry) iter.next();
                ImmutableBlog test = (ImmutableBlog) testMap.get(ent.getKey());
                ImmutableBlog stored = (ImmutableBlog) ent.getValue();
                assertEquals(test, stored);
            }
        } finally {
            StoredIterator.close(iter); // tidy up even if an assertion fails
        }
    }

    public static void main(String[] args) {
        junit.textui.TestRunner.run(StoredBlogMapTest.class);
    }
}
These assertions all succeed, so assigning to and fetching from a persistent Map works! One notable thing about the BDB library: it will allocate generous portions of the heap if you let it. The upside is that you get very high performance from the BDB cache. The downside is... using up heap that other things want. This is tunable; in the StoredBlogMap ctor, add this:
// cache size is the number of bytes to allow Sleepycat to nail up
envConfig.setCacheSize(cacheSize);
// ... now setup the Environment

The basic stuff here functions very well; however, I haven't run any production code that uses Sleepycat's Collections yet. My last project with BDB needed to run an asynchronous database entry remover, so I wanted to remove as much "padding" as possible.

( Feb 08 2006, 12:22:21 AM PST ) Permalink

20060206 Monday February 06, 2006

PHP Best Practices, Frameworks and Tools

I've annoyed PHP enthusiasts, friends and colleagues alike, with my distaste for PHP. There's nothing intrinsically bad, buggy or poorly performing about PHP per se. It's real simple: a lot of PHP code that I've had to pick up the hood on is a mess and is susceptible to worlds of instability and bugs. The common symptoms I see are business logic, display code and SQL all scrambled up together, undeclared variables and globals, and a complete absence of automated tests -- an intractable mess as soon as you want to refactor it. Sorry, my PHP-loving friends, it's nothing personal. I've used PHP longer than most of you. In 1995 or thereabouts it was a refreshing change from Perl CGIs with "print" statements. But now, I frankly don't get all of the zealous passion that PHP proponents have. I'm sure some of the suggestions I've heard ("turn off globals in php.ini", "read Sterling Hughes", "buy PHP 5 Objects, Patterns, and Practice", etc) are all good. I'm sure there are PEAR contributions that are legible and well factored (though there are those that are not). But all of that misses the point. I'm confident that I or someone else could eventually derive a tool set that meets a rigorous standard for maintainable code. What concerns me are the prevalent practices and establishing best practices. I want to work with that someone else to establish them.

OK. So if it were up to me to establish best practices with PHP, what would I do?

Well, for one, I'd insist on using PHP classes with clear APIs. I hate seeing PHP code with a long list of require_once statements, all of which can bring new functions and globals into the current scope. When files are used as grab-bags of functionality, when you're asking yourself "Which included file provided this function or that function? Rewind to read the source code and remember it..." you're in a World of Hurt. Better to define a class, instantiate it or call its static methods. I've been accused of writing PHP code that looks like Java. Well, I'm not sure if that's a disparagement but I think it was intended to be. Thank you very much!

I think PHP has an equivalent to Perl's "use strict" pragma. Gotta have it. Also, I think PHP 5 has exceptions. Gotta have that, too.

I'd use frameworks to encourage a separation of concerns. Either use an existing one or invent one if none of them are up to the task. On my list of things to look into:

The Web Application Component Toolkit project has a somewhat overwhelming list of PHP MVC frameworks. I'm concerned when I see preambles along the lines of "the goal of this project is to port struts to php..." That sounds like a bad idea. Struts' reliance on inheritance and an awkward XML configuration grammar isn't really something to aspire to... I think the Ruby on Rails folks got it right: convention over configuration.

I'd insist on unit testing. I don't know anybody using PHPUnit but I'm willing to be convinced that it's good. Tracing without random writes to error_log (or worse) is also a must-have; proper use of log4php is probably the ticket.

What am I missing? What are the best practices when programming with PHP? Any experts with these topics, come talk to me. Technorati is hiring.

( Feb 06 2006, 09:14:36 PM PST ) Permalink