What's That Noise?! [Ian Kallen's Weblog]

20070430 Monday April 30, 2007

Blogging Upright

I've been asleep just about all day; the painkillers and muscle relaxants they gave me last night were that good.

It all started a few weeks ago when EBMUD sent me a water bill that indicated over three times our normal water usage (and three times the cost). Everything seemed fine with all of the household plumbing. I called for an inspection, but their inspector didn't show up on the day I expected them. We did get a note left on the door, though, saying that while nobody was home the water meter was running continuously and that our usage continues to be unusually high.

Over the weekend, I checked around the house more diligently. What I thought may have been a wet spot by the side of the garage (not far from a spigot) seemed like a good candidate, so I got the shovel and started digging. The soil didn't get much softer as I dug deeper. There was no specific motion or event that I recall being more vigorous than others but in the hours that followed, a pain in my lower back grew. And grew. And grew to a point of intensity that everything I did hurt in my lower back. Sitting down. Getting up from a sitting position. Lying down. Everything hurt, intensely! A doctor friend of mine told me that I musta skipped chapter 2 of the "You're over 40 now" manual where it is specified not to do any more shoveling. Doh!

At the emergency room, they gave me a cocktail of toradol, dilaudid and phenergan and a prescription for soma and percocet. The shot last night really knocked me out; I've been asleep off and on most of the day today. I'm gonna be doing a lot of lying down with ice on my back. A lot of walking around. But not a lot of sitting. So, I'm writing this post woozy from the drugs but standing upright with the pooter on the kitchen counter. Gonna go for a walk next. I need to resolve things with the water company and the plumbing on our premises.

           

( Apr 30 2007, 04:47:38 PM PDT ) Permalink


20070426 Thursday April 26, 2007

Temperature Swings At The Old Ball Game

The rhythm of baseball is always about hot streaks and cold streaks. In the 2006 season, the Giants couldn't put together any sustained hot streaks; it was a dark time for Giants fans -- I don't think they won more than 3 games in a row, and even that they only did a few times. The first weeks of 2007 baseball were even darker; losing 7 out of the first 9 games disheartened a lot of fans. But what a difference now: the Giants have gone from a polar chill to an equatorial blaze in a matter of weeks; they've won 9 of their last 10!

Matt Cain finally has the victory he's been deserving; he's got a 1.55 ERA but what should be a 4-0 record is only at 1-1 so far. I think we're gonna see his W:L ratio shifting favorably in the weeks ahead. Barry Bonds is getting pitches, and smashing them. I'm sure soon enough competing team managers will get the message: the old Barry is back and crushinger than ever and we'll see lots of 4 finger calls. But for now, enjoy the ride.

Yesterday's victory came on the backs of a partial relief squad (Todd Linden and Lance Niekro) as Omar Vizquel and Dave Roberts took a rest (Roberts came on late in the game as a pinch runner and scored). Next up this evening, Russ Ortiz will duel against Brad Penny and I'm looking forward to an exciting game. Three words: beat sweep el aye!

       

( Apr 26 2007, 07:14:56 AM PDT ) Permalink


20070425 Wednesday April 25, 2007

Intel Migration Pain With Perl

There's a bunch of code that I haven't had to work on in months. Some of it predates my migration from PPC Powerbook to the Intel based MacBook Pro. Now that I'm dusting this stuff off, I'm running into binary incompatibilities that are messin' with my head. I recompiled my Apache 1.3/mod_perl installation just fine, but after doing a CVS up on the code I need to work on and updating the installation, there's a new CPAN dependency. No problem, use the CPAN shell. Oh, Class::Std::Utils depends on version.pm and it's ... the wrong architecture. Re-install version.pm. Next, XMLRPC::Lite is unhappy 'cause it depends on XML::Parser::Expat and it's ... the wrong architecture.

Aaaaugh!

The typical error looks like

mach-o, but wrong architecture at /System/Library/Perl/5.8.6/darwin-thread-multi-2level/DynaLoader.pm
I just said "screw it" and typed "cpan -r" ... which looks to be the moral equivalent of "make world" from back in my FreeBSD days. Everything that has an XS interface just needs to be recompiled.

Compiling... compiling... compiling. I guess that'll give me time to write a blog post about it. OK, that's done, seems to have fixed things: back to work.

                 

( Apr 25 2007, 05:19:37 PM PDT ) Permalink


20070423 Monday April 23, 2007

Simple is as simple ... dohs!

I was working on an Evil Plan (tm) to serialize python feedparser results with simplejson.

 import feedparser
 import simplejson
 parsedFeed = feedparser.parse(feedUrl)
 print(simplejson.dumps(parsedFeed))
Unfortunately, I'm hitting this:
TypeError: (2007, 4, 23, 16, 2, 7, 0, 113, 0) is not JSON serializable 
I'm suspecting there's a dictionary in there that has a tuple as a key and that's not allowed in JSON-land. So much for simple! Looks like I'll be writing a custom serializer for this. I was just trying to write a proof-of-concept demo; what I've proven is that just 'cause "simple" is in the name, doesn't mean I'll be able to do everything I want with it very simply.

I've had a long day. A good night's sleep and fresh eyes on it tomorrow will probably get this done but if yer reading this tonight and you happen to have something crafty up your sleeve for extending simplejson for things like this, let me know!
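
One way this could be tackled, sketched here in present-day Python 3 with the standard library's json module standing in for simplejson. The nine-element tuple in the error looks like a time.struct_time, which feedparser uses for its *_parsed date fields, so the idea is to walk the parsed result and turn those values into ISO 8601 strings before serializing. The helper name to_jsonable and the sample dictionary are made up for illustration, and a real feedparser result may hold other types that need their own handling.

```python
import json
import time


def to_jsonable(value):
    """Recursively convert a parsed-feed structure into JSON-friendly types."""
    # struct_time is checked first because it is also a tuple subclass.
    if isinstance(value, time.struct_time):
        # Assumption: the timestamp is in UTC, hence the trailing "Z".
        return time.strftime("%Y-%m-%dT%H:%M:%SZ", value)
    if isinstance(value, dict):
        # JSON object keys must be strings.
        return {str(key): to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


# Hypothetical stand-in for what feedparser.parse() might return.
sample = {
    "title": "What's That Noise?!",
    "updated_parsed": time.struct_time((2007, 4, 23, 16, 2, 7, 0, 113, 0)),
}
serialized = json.dumps(to_jsonable(sample), sort_keys=True)
```

Converting up front, rather than hooking into the encoder, keeps the date format under your control and also copes with non-string dictionary keys.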

     

( Apr 23 2007, 10:50:21 PM PDT ) Permalink


20070422 Sunday April 22, 2007

Linux Virtual Memory versus Apache

I ran into a very peculiar case of an Apache 2.0.x installation with the worker MPM completely failing to spawn its configured thread pool. The hardware and kernel versions weren't significantly different from other systems running Apache with the same configuration. Here are the worker MPM params in use:

ServerLimit         40
StartServers        20
MaxClients        2000
MinSpareThreads     50
MaxSpareThreads   2000
ThreadsPerChild     50
MaxRequestsPerChild  0
But on this installation, same version of Apache and RedHat Enterprise Linux 4 like the rest, every time httpd started it would cap the number of threads spawned and leave these remarks in the error log:
[Fri Apr 20 22:54:24 2007] [alert] (12)Cannot allocate memory: apr_thread_create: unable to create worker thread 

It turns out that a virtual memory parameter had been adjusted: vm.overcommit_memory had been set to 2 instead of 0. Here's the explanation of the parameters I found:

overcommit_memory is a value which sets the general kernel policy toward granting memory allocations. If the value is 0, then the kernel checks to determine if there is enough memory free to grant a memory request to a malloc call from an application. If there is enough memory, then the request is granted. Otherwise, it is denied and an error code is returned to the application. If the setting in this file is 1, the kernel allows all memory allocations, regardless of the current memory allocation state. If the value is set to 2, then the kernel grants allocations above the amount of physical RAM and swap in the system as defined by the overcommit_ratio value. Enabling this feature can be somewhat helpful in environments which allocate large amounts of memory expecting worst case scenarios but do not use it all.
From Understanding Virtual Memory
The vm.overcommit_ratio value is set to 50 on all of our systems but rather than fiddling with that, setting vm.overcommit_memory to 0 had the intended effect; Apache started right up and readily stood up to load testing.

So, if you're seeing this kind of evil message in your Apache error log, use sysctl and check out the vm parameters. I haven't dug further into why the worker MPM was conflicting with this memory allocation config; next time I run into Aaron, I'm sure he'll have an explanation in his back pocket.
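
For a rough feel of the numbers involved, here is a small Python sketch. It rests on the kernel's documented rule that with vm.overcommit_memory set to 2 the commit limit is roughly swap plus overcommit_ratio percent of physical RAM. The function names are made up for illustration, and the idea that thread stack reservations are what ran into the limit is only a guess, not something established above.

```python
from pathlib import Path


def read_vm_param(name, root="/proc/sys/vm"):
    """Read an integer vm.* setting from procfs (Linux only)."""
    return int((Path(root) / name).read_text().strip())


def strict_commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """Approximate commit limit in kB when vm.overcommit_memory is 2.

    Ignores huge pages and other adjustments the kernel makes.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


def worker_thread_count(server_limit, threads_per_child):
    """Upper bound on worker threads implied by the MPM settings."""
    return server_limit * threads_per_child


def stack_reservation_kb(threads, stack_kb=8192):
    """Address space reserved for thread stacks.

    The 8 MB default stack size is an assumption; check `ulimit -s`.
    """
    return threads * stack_kb
```

On a Linux box, read_vm_param("overcommit_memory") and read_vm_param("overcommit_ratio") return the same values sysctl reports. With ServerLimit 40 and ThreadsPerChild 50, worker_thread_count gives 2000 threads, and comparing stack_reservation_kb for that count against strict_commit_limit_kb for the machine's RAM and swap shows how quickly a strict limit could be exhausted.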

                 

( Apr 22 2007, 08:19:57 PM PDT ) Permalink


20070421 Saturday April 21, 2007

The Users of Your Service Are Your Best Friends

I try to keep my ride on the cluetrain rolling by listening to what users of the services I help maintain have to say. The Technorati support forums have provided me with a great opportunity to hear what problems Technorati's members are experiencing. For the uninitiated, Technorati's crawler analyzes web pages to identify blog posts, make them searchable and identify links that measure what the blogosphere is paying attention to. There are a fair number of blogs that get caught in our automated blog flagging; the service processes several million pings per day and amidst that throughput, there are going to be mistakes in the flagging heuristics (flagged blogs are, naturally, called "flogs"; sometimes they end up demoted as "splogs" but others turn out to be legit blogs). I'm trying to reduce the mistake rate; the indexing hazards that folks run into are a source of much grief (it doesn't take much to find folks who are very vocal about such lapses).

So, I've been on a tear over the last few weeks chasing down problems in Technorati's crawler and identifying its failure conditions. It's code that, until recently, I've not been too intimate with but inheriting responsibility for its functioning has forced me to study it more closely and gain a firmer command of python programming. A peculiar failure case that had me puzzled for a while involved blogs that had (sufficiently) well formed pages and feeds; there didn't seem to be anything wrong with the data that'd prevent us from indexing them and yet they consistently failed to get indexed. I first became aware of it in this topic.

The issue moved to a new topic where an initial diagnosis I offered (corrupted gzip encoding from Apache 2.2's mod_deflate, I thought) didn't quite pan out. But follow-ups from Technorati users KilRoY66 and wa7son helped clarify that the culprit was the gzip encoding that wordpress was configured to do. Apache 2.2/mod_deflate, you're off the hook. Their blogs (TNTVillage blog and justaddwater.dk | Instant Usability & Web Standards, respectively) both used Apache 2.2 but they both are also hosted on wordpress.org installations. For reasons yet to be explained, python's gzip library detects the encoding returned by wordpress as corrupted. Thank you, Technorati members, for helping identify this issue!

I'm going to patch the code (based on Mark Pilgrim's openanything) to recover from encoding errors and raise a proper exception if it's truly unrecoverable (as it is currently, the code catches any exceptions from decompressing the bytes, prints a message and moves along, essentially swallowing a fundamental error). In the meantime, if you're not getting indexed by Technorati and you have wordpress' compression on, try turning it off and see if that makes a difference.
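
The shape of that patch might look something like the following Python 3 sketch. It is not the crawler's actual code: decode_body and UnrecoverableEncodingError are hypothetical names, and the fallback (stream-decompressing whatever is readable) is just one possible recovery strategy.

```python
import gzip
import zlib


class UnrecoverableEncodingError(Exception):
    """Raised when a response body cannot be decompressed at all."""


def decode_body(raw, content_encoding):
    """Return the decompressed body, salvaging damaged gzip data if possible."""
    if content_encoding != "gzip":
        return raw
    try:
        return gzip.decompress(raw)
    except (OSError, EOFError, zlib.error):
        pass  # fall through to the more forgiving path below
    try:
        # A streaming decompressor hands back whatever it could read
        # instead of insisting on a complete, checksummed stream.
        salvaged = zlib.decompressobj(zlib.MAX_WBITS | 16).decompress(raw)
    except zlib.error as exc:
        raise UnrecoverableEncodingError("gzip body is unreadable") from exc
    if not salvaged:
        raise UnrecoverableEncodingError("gzip body yielded no data")
    return salvaged
```

The point of the custom exception is that the caller has to decide what to do with a failed fetch, rather than the error being printed and forgotten.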

                 

( Apr 21 2007, 02:15:01 PM PDT ) Permalink


20070420 Friday April 20, 2007

WiFi on the Train: Blogging From BART

I'm currently about 60 feet under Market Street in downtown San Francisco, inside a BART station. But I'm connected to the wifi_rail network with 5 bars. I haven't fired up any YouTube streams yet but for IM, twitter updates and ...blogging, this is groovy; I'll take it!

I haven't seen any official announcements about BART's wifi system but as a serendipitous user, I hope it's here to stay. In fact, I hope it's extended to cover the track between stations, the transbay tube and the east bay stations as well! Maybe I'm being a little over-appreciative (greedy).

               

( Apr 20 2007, 07:57:26 PM PDT ) Permalink


20070419 Thursday April 19, 2007

A Giant Turn Around?

Could the extra-inning push last night, kicked off by Barry Bonds' tying slash homer in the 8th, be the harbinger of baseball to come? I'm quite impressed with how Armando Benitez and Jonathan Sanchez held back the Cardinals long enough for a 12th inning surge from Randy Winn, Omar Vizquel and Rich Aurilia. We're seeing real solid playing from those guys and Ray Durham. The pitching rotation is solid; the losses that Matt Cain has suffered... are really an injustice. The guy's pitched fantastic; if we see the run support turn on he'll be putting up the W's. I expect Barry Zito's shutout the other day to be the first of many. Noah Lowry, Matt Morris and Russ Ortiz get props too; those guys and much of the roster are pretty damned solid.

Today's 6-2 romp over the Cards has me thinking that the Giants won't be spending too much more time down there at the bottom of the division. I think the offensive slump from the season's start can be declared officially over. What remains to be seen is whether they can sustain this kind of solid play day in and day out. I have faith they will! Let's Go Giants!

Now if only the temperatures felt like baseball weather; it's cold!

( Apr 19 2007, 06:57:28 PM PDT ) Permalink


20070416 Monday April 16, 2007

Character Encoding Foibles in Python

I was recently stymied by an encoding error (the exception thrown was kicked off by UnicodeError) on a web page that was detected as utf-8. The W3 Validator said it was utf-8, but in all my efforts to get a parsing class derived from python's SGMLParser to handle it, it consistently bombed out. I tried chardet:

>>> import chardet
>>> import urllib
>>> urlread = lambda url: urllib.urlopen(url).read()
>>> chardet.detect(urlread(theurl))
{'confidence': 0.98999999999999999, 'encoding': 'utf-8'}
...and yet the parser insisted that it had hit the "'ascii' codec can't decode byte XXXX in position YYYY: ordinal not in range(128)" error. WTF?!

On a hunch, I decided to try forcing it to be treated as utf-16 and then coercing it back to utf-8, like this

parser.feed(pagedata.encode("utf-16", "replace").encode("utf-8"))
That worked!

I hate it when I follow an intuited hunch, it pans out but I don't have any explanation as to why. I just don't know the details of python's character encoding behaviors to debug this further; most of my work is in those Curly Bracket languages :)
If any python experts are having any "OMG don't do that, here's why..." reactions, please let me know!
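
For what it's worth, the usual cause of this particular message in Python 2 is an implicit ASCII decode that kicks in when byte strings and unicode strings get mixed; whether that is what happened inside the SGMLParser subclass here is an assumption. A minimal sketch in Python 3, where the decode has to be explicit (to_text is a hypothetical helper name):

```python
def to_text(raw_bytes, declared_encoding="utf-8"):
    """Decode fetched bytes once, up front, so the parser only sees text."""
    return raw_bytes.decode(declared_encoding, errors="replace")


# Sample bytes containing non-ASCII characters encoded as UTF-8.
page_bytes = "caf\u00e9 \u2019".encode("utf-8")

# Decoding with the wrong codec reproduces the kind of error quoted above.
try:
    page_bytes.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

page_text = to_text(page_bytes)
```

Decoding with the detected encoding before calling parser.feed() means the parser never has to guess.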

           

( Apr 16 2007, 11:28:31 AM PDT ) Permalink


20070415 Sunday April 15, 2007

San Francisco's 1980s metal scene in 2007

The underground metal scene of years gone by had a reunion on a Thursday night at the Bottom of the Hill. The circumstance that summoned this event into being was a sad one, the tragic passing of Curtis Grant. Proceeds from the show went to Curtis' family. Amazingly, while so many of us have gone very separate ways, word still managed to get around. How surreal it was to see friends, roommates, x-bandmates, drinking buddies, partners in crime and everybody else (some of these guys I know from 7th grade) who emerged from the woodwork into the dimly lit, loud tweaky PA-system and drinks ambiance of Bottom of the Hill. Stranger still was running into people and, after so many years, not remembering their names or exactly who they were. But that didn't really matter, for on Some Enchanted Evening all shall gather whatever memories that still carry from decades gone by, re-introduce themselves and celebrate.

The evening's kick off with Mercenary and Mordred got things off to a bombastic start. American Heartbreak came out after them with a great set, they rocked! The Steve Scate's Mordred formation was awesome, so heavy! 20 years ago, I'd have never imagined that the cavemen-in-the-ice-berg could thaw out and turn it on but that's what it seemed like -- Ruthie's Inn 1985... ZAP! Bottom of the Hill 2007. Frozen in time... Sven still looks the same! Why isn't he all salt-n-peppa gray like me? Sven, where's your goddamned fountain o' youth? Maybe I'm just working too hard. I should find out what brand of vitamins he's been taking. And the years have been kind to Ron Quintana too, look at him here mugging it up with me. Metal Mania! Photo courtesy of umlaut, thanks

At the top of the bill was Anvil Chorus. Like the private Anvil Chorus reunion I wrote about a few years ago, this reminder of what could have been, what should have been, a break-out act 25 years ago was mind blowing. Good coverage has already been rendered by umlaut; it would be duplicative to go into the set they did in detail. Suffice to say, they are a superbly talented bunch and it was fantastic to see them perform! Thaen, Joe, Aaron, Doug and (whoever you were playing keyboards) - thanks!

Here's a lil "Blondes in Black" to get you in the moment:

But wait! There's more!


Mr. Wizard sez: "Dreezle, Drazzle, Druzzle, Drone. Time for this one to come home!" (to 1983):

Kudos to Eric Lannon for getting this together for Curtis' family. Good luck to Thaen, on his way to Tokyo to tour with Vicious Rumours!

So I think I've had my fill of nostalgia for now but I understand the 25th anniversary of the show I founded on KUSF 90.3 FM, Rampage Radio, is coming up in a few weeks. So maybe I'll see you there and if you want to hear some Black Sabbath or Mercyful Fate at 7am, I might just be there to dish it out for old times' sake!

                   

( Apr 15 2007, 01:40:33 AM PDT ) Permalink


20070414 Saturday April 14, 2007

Finally, some Giants' offense

After a few weeks of sleep walking to the batter's box, it's invigorating to see the San Francisco Giants bring on the show of force in last night's win in Pittsburgh. With Barry Bonds hitting a pair of HR's (no. 736 and 737!) and Russ Ortiz reeling in the strikeouts, we're finally seeing the team that I was imagining going into opening day: lotsa long ball for the innings and tight pitching for the outings.

It's really irritating that Major League Baseball's websites are so... last century. Where are the blogs, widgets, microformats and feeds? Just for giggles, I took my Technorati Favorites and plugged them into a Giants page. I also wrote a little script that puts the next Giants game in the header. An hCalendar implementation shouldn't be too hard. It's really lame that MLB doesn't just put that on their web sites; CSV files and instructions for importing into Outlook are just so silly. This morning, it is raining like mofo in the Bay Area, hmm... maybe a fine day to write a CSV to hCalendar converter.
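
A converter along those lines could start out as small as this Python sketch. The column names (Subject, Start Date, Start Time) and the date format are assumptions about an Outlook-style export, not the actual layout of MLB's files; the vevent, dtstart and summary class names come from the hCalendar microformat.

```python
import csv
import io
from datetime import datetime
from html import escape


def csv_to_hcalendar(csv_text):
    """Turn schedule rows into hCalendar vevent markup, one div per game."""
    events = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed input format, e.g. "04/14/2007" and "04:05 PM".
        start = datetime.strptime(
            row["Start Date"] + " " + row["Start Time"], "%m/%d/%Y %I:%M %p"
        )
        events.append(
            '<div class="vevent">'
            '<abbr class="dtstart" title="{iso}">{human}</abbr> '
            '<span class="summary">{summary}</span>'
            "</div>".format(
                iso=start.strftime("%Y-%m-%dT%H:%M"),
                human=start.strftime("%Y-%m-%d %H:%M"),
                summary=escape(row["Subject"]),
            )
        )
    return "\n".join(events)


# Hypothetical sample row, not taken from a real schedule file.
sample_csv = "Subject,Start Date,Start Time\nGiants at Pirates,04/14/2007,04:05 PM\n"
markup = csv_to_hcalendar(sample_csv)
```

The machine-readable start time goes in the abbr element's title attribute, which is how hCalendar keeps the visible text free-form.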

             

( Apr 14 2007, 12:01:09 PM PDT ) Permalink


20070413 Friday April 13, 2007

Throw Out The Bums

The Bush administration and their friends run the gamut from "that's fishy" (WMD's? In Iraq?) to "that's wrong" (poor judgement of intelligence) to "more corrupt than any Presidential regime in history" -- there's no salvaging this presidency, it's an unmitigated train wreck. The judgement is not just regarding the subterfuge of warring on Iraq premised on phoney Al-Qaeda links, the misinformation around the "we're winning in Iraq" meme, nor the recently illuminated goose-stepping in the Justice Department. George Bush, with the aid of Dick Cheney and Karl Rove, will certainly be judged by future retrospectives as the worst president in American history. Let's pile on the recently divulged shenanigans of Paul Wolfowitz, who was one of the primary architects of the President's Big Lies of Foreign Policy. Wolfowitz has been perking up his personal dalliances on the tax payer's dime! The details emerging in the news this week (see Wolfowitz Apologizes For 'Mistake' - At World Bank, Boos Over Pay for Girlfriend) underscore what a buncha corrupt and loathsome creeps these hypocritical neo-con bozos are.

Impeachment proceedings? Criminal prosecution? Nuremberg trials? I'm not sure where it should stop but there clearly is much to be done. Throw out the bums!

         

( Apr 13 2007, 08:50:47 AM PDT ) Permalink


20070412 Thursday April 12, 2007

Ain't it great when things just work?

This morning, I was chasing down a bug in some python code, fatal errors like this

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 193: ordinal not in range(128)
were getting tickled in some debug logging code (IOW, it wasn't even the code's core purpose but a bug in the debuggerin'). I put some proper exception handling in the debug logging and voila: bug gone. It's great when things work!
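
The general idea, sketched in Python 3 rather than the Python 2 this code was written in: debug logging should never be able to take down the job it is reporting on. The logger name and both helper names below are hypothetical.

```python
import logging

log = logging.getLogger("crawler.debug")  # hypothetical logger name


def describe_payload(payload):
    """Return text that is safe to log, whatever bytes the payload holds."""
    if isinstance(payload, bytes):
        # backslashreplace keeps undecodable bytes visible as \xNN escapes.
        return payload.decode("utf-8", errors="backslashreplace")
    return str(payload)


def debug_payload(label, payload):
    """Log a payload without letting a logging failure become fatal."""
    try:
        log.debug("%s: %s", label, describe_payload(payload))
    except Exception:
        log.exception("could not log payload for %s", label)
```

Catching broadly is usually a smell, but around diagnostics it is a reasonable trade: the failure still gets recorded and the real work carries on.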

Working on that over breakfast made me kinda late and I needed to drive into town today 'cause tonight I'm gonna see some friends from the old school. I'm gonna be late! I probably won't be able to pick up any carpoolers.

I rolled into the carpool pick-up near the Orinda BART station, waited a few minutes and promptly got two riders. The radio was warning about an injury accident on the upper deck of the Bay Bridge... darn, I'll be late rolling into The City. Well, I get to The Bridge and alles goot. I made it into San Francisco in good time and got to what I needed to do. It's great when things work!

                   

( Apr 12 2007, 11:29:10 AM PDT ) Permalink


20070410 Tuesday April 10, 2007

Technorati's Blog Top Tags Widget

We've been working on technologies to support scalable widget publishing and serving over at Technorati. One product of that effort is the recently announced Blog Tag Cloud Widget. With this widget, your readers can browse your blog by pivoting on the tags in your posts.

View blog top tags
Some geeky details: The widget technologies leverage Technorati's internal event distribution system to trigger content generation with asynchronous publishing. There's still much to be done but we're shaking out the kinks to enable a broader repertoire of useful, timely and easy to use widgets. We've got boatloads of ideas for more widgets, check out the Technorati Tools page for more information about the Blog Top Tags and the other widget goodies we've cooked up. And we love to see what independent developers such as Doug Karr come up with; if you DIY with the Technorati API, tell us about it. But even if you don't feel like rolling your own, let us know what grabs your fancy. If there's data on Technorati that you think would be great to embed in your blog, let us know; maybe we'll make a widget out of it!


       

( Apr 10 2007, 05:07:24 PM PDT ) Permalink


20070406 Friday April 06, 2007

Embracing Hats At Technorati

Wow! I'm sure the executive search announcement (Embracing Change) that Dave Sifry just posted will elicit a range of responses from the blogosphere. While I expect a lot of speculation and innuendo around it, I probably have a unique perspective owing to my three years working with Dave at Technorati (this week was my third anniversary). I'd like to share some of that perspective.

It started for me in the Spring of 2004; I was trying to figure out what to do next. At the time, I was frankly very skeptical of Technorati when a friend suggested that I make that my next stop. Most of the time that I visited, the website was unavailable or had PHP errors all over the place, what a trainwreck! Maybe I can fix it.

What I expected to be a short conversation with Dave that fateful day in March 2004 turned into one that lasted for hours. After much discussion about the impact technology developments have on publishing and social discourse, I was struck by Dave's insight, inspiration and passion. In turn, I committed myself to taking what I knew about scaling web sites, learning whatever I needed to scale for the blogosphere's unique requirements and fixing the technical problems that plagued Technorati; I joined Dave to make Technorati the real time engine that would provide micropublishers with the connective tissue of community.

In the years since then, I've worn many hats. Software developer, DBA, sysadmin or whatever-it-takes; I came to Technorati determined by-any-means-necessary to sustain and improve Technorati's state of the art. The company has grown (there were about five of us back then). The blogosphere has grown (there were only a few million blogs back then). And all of us working together at Technorati have grown as people. Today, I still collaborate closely with Dave and Adam. I lead the Core Services group and work with our fabulous front end (led by Dorion) and search engineering (led by Brian) teams as an architect of Technorati's evolution. In helping lead the reshaping of Technorati's infrastructure, we've sought the right path between oft conflicted goals of flexibility, economy and run time optimization. We haven't always gotten it right. Technorati's storied episodes of instability and poor performance were often the source of much sleepless grief and perhaps opportunity costs. There have been business directions that have led our attention up some blind alleys. There have been technical errors. Project execution errors. Hiring errors. And so on. But, if I may add without being too immodest, we've done a lot of things well; I think ultimately the right things have happened. It is an honor and privilege to work with these folks; except for how wonderful my kids are, I couldn't be prouder of them!

So that brings us to where we are today and my ongoing working relationship with Dave. Dave's not going anywhere else. He may do some hat trading. But I expect to enjoy the benefits of working with him... for as long as I can!

Through good times and bad, Dave provides vision and inspiration. Synthesizing new ideas, asking tough questions and supporting the creativity and enthusiasm of all of us working on Technorati, Dave is a catalyzing force. Among the principles Dave demonstrates that are important to me is internal transparency. Being open with factual matters about the company, the markets we address and the competitive landscape enables me and others at Technorati to contribute our smarts and creativity in ways that a closed environment would never benefit from. Because that openness isn't always extended to parties outside the company, Technorati has been accused of being secretive. Well, sure. We don't answer rumors or trumpet funding events. But being judicious about external transparency is part of life; we could spend all day fielding the probes and inquiries from outside parties but to what benefit?

Throughout my years working with Dave, we've made it a point to hire only great people at Technorati. "Good" isn't good enough and "can do the job" can't; we only hire great people, period (and we've passed on a lot of good people who could ostensibly do the job if there was doubt about their greatness). It seems entirely plausible that there may be great people other than Dave who can wear the CEO hat better than he can. There might not be. I support Dave's launch of this search to find out. I'm looking forward to meeting this person; if he or she exists, there are some incredible shoes to fill (and hats and a buncha good stuff in between)! However, I expect to continue collaborating with Dave irrespective of if or when there is another CEO. Regardless of what hat he's wearing and which one I am, I'm looking forward to working with Dave towards Technorati's continued success.

So, wow! I'm posting this as an embrace of Dave, a virtual hug and an assurance of my unflagging support.

( Apr 06 2007, 12:16:22 PM PDT ) Permalink