What's That Noise?! [Ian Kallen's Weblog]


20070629 Friday June 29, 2007

Power Bet

Last night I was among an invited group that Powerset brought in to witness how their natural language search sausage is made. It was actually kinduva cold cut platter: not exactly a meal but an interesting variety was offered for consumption.

When I was a kid, I thought that by 2007 we'd all have flying cars, rocket packs and computers would be all-seeing/all-knowing accoutrements on our wrists. I think all of us who ever watched Scotty verbally ask the Enterprise questions and get responsive answers in English sentences have had hunger pangs for satisfying natural language search. Powerset is trying to advance human-computer interfaces a little closer to that satisfaction, leapfrogging previous efforts, by licensing Xerox PARC's technology and hiring a buncha heavy hitters to make it real.

Powerset COO Steve Newcomb introduced some of the sluggers in their line-up, walked attendees through the thinking behind their PR and release strategy and provided a peek into their search capabilities.

Among the impressive powersetters are people who have been-there/done-that with scaled-up search such as ex-Yahoo!s Chad Walters and Tim Converse (read Tim's post the other day about term proximity and linguistics, great stuff), as well as experts in natural language search with backgrounds at PARC and Ask Jeeves. As a company, they're not just-another-web2.0 rails app built by 2 guys and trying to get to the next level. Powerset is more of a bold bottled-lightning science experiment embracing ruby n' rails as a way to get it in front of people.

Powerset has signed up 10K people since announcing the availability of updates and previews on PowerLabs a few weeks ago. Newcomb characterized their labs preview effort as a way to use social software to guide product management decisions, "a mashup of Digg, Facebook and Google apps." I'm a big fan of transparency and community inclusion; it will be interesting to see how inclusive/closed this effort is.

OK, so after all of that, the "Where's the beef?" moment arrived. A side-by-side comparison interface was demonstrated with Powerset results on the left and Google results on the right. The test index was scoped to Wikipedia, so the goog results were similarly scoped down. The Powerset use case was demonstrated with a query like "What politicians were killed by disease?" On goog, the results match the query terms (and variants on their stems): "politicians", "killed" and "disease". Powerset matches semantically similar tokens and their grammatical relationships.

So Powerset's top result for that query highlighted "Sir Edward Heath died from pneumonia" on Wikipedia's page for Edward Heath. Highlighting a completely different snippet (none of the query terms were matched but the semantics were) that accurately answers the query is very impressive. Powerset is using Freebase's ontology and WordNet's synonym mappings to connect indexed sentence structures to the query. They do all of this analysis and mapping at index time, which undoubtedly raises the cost of indexing tremendously. They're making a big bet that the raised search results quality will pay those costs back.
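
To make the term-matching versus semantic-matching distinction concrete, here's a toy Python sketch. To be clear, this is my own illustration and not Powerset's pipeline: the `CONCEPTS` table is a made-up stand-in for the kind of ontology and synonym lookups described at the demo (it is not real Freebase or WordNet data), the second sentence is an invented distractor, and the sketch ignores grammatical relationships entirely. The only point is that if you expand tokens into concepts once at index time, a sentence can answer a query without sharing a single word with it.

```python
import re

# Hypothetical token-to-concept table (illustrative only, not real
# Freebase or WordNet data).
CONCEPTS = {
    "politicians": {"politician"},
    "heath": {"politician"},
    "killed": {"death"},
    "died": {"death"},
    "disease": {"disease"},
    "pneumonia": {"disease"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_score(query, sentence):
    """Count the literal tokens shared by the query and the sentence."""
    return len(set(tokenize(query)) & set(tokenize(sentence)))


def concepts(text):
    """Map every known token in the text to its concept labels."""
    found = set()
    for token in tokenize(text):
        found |= CONCEPTS.get(token, set())
    return found


def build_index(sentences):
    """Do the (expensive) concept expansion once, at index time."""
    return [(sentence, concepts(sentence)) for sentence in sentences]


def semantic_search(query, index):
    """Return the indexed sentence sharing the most concepts with the query."""
    query_concepts = concepts(query)
    best = max(index, key=lambda entry: len(query_concepts & entry[1]))
    return best[0]


QUERY = "What politicians were killed by disease?"
HEATH = "Sir Edward Heath died from pneumonia."
CATTLE = "The disease killed thousands of cattle."

if __name__ == "__main__":
    index = build_index([CATTLE, HEATH])
    print(keyword_score(QUERY, HEATH))    # literal overlap with the right answer
    print(keyword_score(QUERY, CATTLE))   # literal overlap with the distractor
    print(semantic_search(QUERY, index))  # concept-based pick
```

In this toy, plain keyword overlap prefers the cattle sentence, while the concept match picks the Heath sentence. All of the expansion work lives in `build_index`, which is where the extra indexing cost would show up.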

When asked about the computational horsepower required to index web documents with the sentence structure decomposition and semantics mappings, Newcomb hedged at first ("Barney's gonna kill me", referring to CEO Barney Pell). But alas, he convinced himself (or did a good job method-acting conviction) that it was safe to reveal that it takes them about a second to grammatically analyze and index a typical document. When he lamented his confession again, someone from the audience quipped the query, "Which CEO killed Steve Newcomb?" Yeah, he didn't search their index for that.

On the subject of Google comparisons, Newcomb kinda squirmily described Powerset as reverent of ("not cocky about") what Google has accomplished but taking a different approach to web search. Doing side-by-side comparisons with Google as their demo does is pretty ballsy and it seems to get them in trouble; being positioned as a "Google killer" by their audience of search wonks and journalists when things are still very much at a proof-of-concept level seems rather premature. I think Powerset needs to reel that in lest they awaken a sleeping giant and fill him with a terrible resolve while they're still on the tarmac. If you've designed a new aircraft, you don't trumpet about revolutionizing aeronautics before the test pilots have taken off. Particularly if folks are proclaiming that Boeing is in trouble. When Powerset indexes a real web corpus, it will be interesting to see how successfully they can overlay web graph, clustering/disambiguation, time and other relevance components. I think that will provide a real moment-of-truth.

Powerset is making a big bet on natural language search as a transformative technology. They've got a lot of great people and a lot of great technology. All in all, the presentation felt a little dog-and-ponyish with the limited corpus but I'm looking forward to hearing more from them later this year when they release a major iteration.


( Jun 29 2007, 10:41:46 AM PDT ) Permalink

20070616 Saturday June 16, 2007

Natural Hackasters

I'm reading with amusement and wonder the events that unfolded at the Yahoo! Hackday in London. Apparently the Alexandra Palace main hall (the BBC's venue for this) has a roof that opens up. And it did. This was precipitated by a lightning strike on the building as a storm blew over (precipitated, storm: no pun left shall be unpunned). Yes, audience members' laptops are open, PA system all set up... and it's raining inside the hall. Not to worry, all Londoners are equipped with umbrellas at all times. That's a fact. "I thought a bomb went off", sez Chad of the lightning strike when he was on IM a few hours later. Is the roof there like Chase Field where the Diamondbacks play baseball in Phoenix? I dunno, I'm checking out pictures of "Ally Pally" to assess. Anyway, power and wifi are back and the show goes on.

Follow along with Hackday London Lightning on Technorati's hackdaylondon tag stream.

( Jun 16 2007, 10:45:59 AM PDT ) Permalink

20070609 Saturday June 09, 2007

Disappearance of the Desktop Interface

I was sick of various computer OS desktop metaphors 10-12 years ago. At the time, I thought virtual reality technologies were gonna take over (anybody else remember VRML?). I remember the Windows 95/98 releases, lauded by Microsoft as such great advancements, striking me as just laughable in their utter lack of imagination (even if they were big upgrades from the Windows 3.x mess). When that "innovation" made it to Windows XP, I realized that Microsoft was hopelessly lost as far as OS interface design goes.

Since then, I've seen a lot of technology changes that I view as the harbingers of the desktop metaphor's demise. Graphics card technology that was once only found on $15-50k SGI pizza-box workstations is now cheap as pizza. Jeff Han's demonstration of high resolution multi-touch applications at eTech and TED last year was fantastic. At TED again this year, the Photosynth demonstration got a big round of "oohs" and "aahs" from a rapt audience (you must see the detail zooming, also check out this Photosynth demo reel).

So when are we gonna see these technologies in our everyday lives? Apparently, soon. It's funny how different Apple's and Microsoft's forays into this are. In a few weeks, Apple is coming out with a $500 phone (the multi-touch usage is demonstrated at 3:55 into this MacWorld TV report from last January). By the end of the year, we will reportedly see Microsoft's $10k coffee table appearing in hotel lobbies. Can't wait? Fishing in your pocket for an extra $10k? Into StarCraft? There are some folks working on a multi-touch DIY kit (Microsoft: 0, Hackers: 1).

Putting on my futurist hat: Five years from now, Intel's 80-cores-on-a-fingernail chip, voice recognition audio inputs and multi-touch screens on commodity devices will make the desktop metaphor seem like a quaint joke. Kids born today will shake their heads in disbelief that desktops were productive tools. I've yet to explain a command line interface to my kids, who are grade school age; as familiar and comfortable as those interfaces are to me, the youngins look at me typing in a shell window with puzzlement. In their youthful eyes, I may as well be composing Vulcan legal tracts (the reality is probably more frightful, it might really be Perl). Computing interfaces will fade away into our intuition.

I just wish the iPhone was coming out in time for father's day (yes, honey, that's a hint). In the meantime, I'm still putting up with Apple and Microsoft's OS interfaces, wincing at the trash cans, recycle bins, folder icons, etc. It'll be good riddance.


( Jun 09 2007, 10:23:46 AM PDT ) Permalink

20070601 Friday June 01, 2007

Web Spam As Signs of the Times

There was a time not long ago when Findory offered a credible value proposition for participants and consumers of the blogosphere. The idea of a blog recommendation and reader personalization service is a good one. I guess things didn't work out as planned at Findory. Earlier this year, Greg Linden announced that Findory was riding into the sunset.

The old Findory blog (@ http://findory.blogspot.com/) has been dormant for some time (the last posts from Greg were in 2005). Now it's been taken over by a splogger who has been grabbing abandoned blogspot URLs (this one has PageRank of 3) and posting link farm links and German keywords to them. Sad.

I'd recommend holding on to your blogspot URLs forever; even if you're not using 'em anymore it's better to maintain the museum piece than contribute to the web spam problem.


( Jun 01 2007, 12:55:10 PM PDT ) Permalink