What's That Noise?! [Ian Kallen's Weblog]


20081214 Sunday December 14, 2008

OpenEdge vs. Net Neutrality vs. CDN

The Wall Street Journal reported today that Google Wants Its Own Fast Track on the Web, describing it as an example of the decline of support for net neutrality amongst The Powers That Be (the usual suspects: Google, Yahoo, Microsoft, Amazon). Plenty of deals have been getting struck anyway between TPTB and data carriers (most prominently AT&T + Yahoo DSL) but outright transit preference doesn't seem to be an issue here. What Google appears to be getting into, called OpenEdge, sounds like an arrangement that amounts to co-locating their gear in the major carriers' datacenters. This would move serving capacity closer to the end-users of their services and thereby accelerate the user experience. Since it doesn't concern transit per se, this actually doesn't sound like a net neutrality issue at all; it sounds like another form of datacenter dispersion.

So what exactly is the big deal? All of the TPTB and loads of other online services have content delivery network (CDN) deals. Yahoo, Amazon, Facebook... they all operate or partner with a CDN in some shape or form (full disclosure: I've been working on a CDN evaluation for Technorati). With a CDN, publishers pay specifically to have their content cached at points-of-presence (PoP) around the intertubes that, through some DNS and routing magic, enables web content to get to end-users more quickly. The next step beyond a CDN is to put equipment in the carrier's datacenter. Here's what WSJ said

Google's proposed arrangement with network providers, internally called OpenEdge, would place Google servers directly within the network of the service providers, according to documents reviewed by the Journal. The setup would accelerate Google's service for users. Google has asked the providers it has approached not to talk about the idea, according to people familiar with the plans.
Asked about OpenEdge, Google said only that other companies such as Yahoo and Microsoft could strike similar deals if they desired. But Google's move, if successful, would give it an advantage available to very few.
It seems perfectly logical, actually.

Nonetheless, I am concerned about wavering support for net neutrality. Lawrence Lessig, fresh off of his Big News post concerning setting up shop at Harvard Law School, is quoted as saying

There are good reasons to be able to prioritize traffic. If everyone had to pay the same rates for postal service, than you wouldn't be able to differentiate between sending a greeting card to your grandma versus sending an overnight letter to your lawyer.
But the counter argument says that there's a big difference. Grandma isn't trying to compete with your attorney (at least, not usually). If the big guys are paying more to be faster, who will be able to afford to challenge them? The intertubularly rich will get richer, the poor will stay poor. TPTB will ensconce themselves as dynastic media walking on paths paved with gold while all of us commoners walk in the gutter.

The dumb pipes should stay dumb. If an internet service wants to operate out of multiple datacenters, lease dedicated pipes to accelerate their inter-datacenter data distribution and peer with the carrier's PoPs proximate to their datacenters, mazel tov. This can be augmented with CDNs. It can even be taken to the next step by installing gear directly in the carrier's datacenters. But at the network exchanges and pipes connecting them, everyone's packets should remain equal.

UPDATE GigaOM posted about a clarification from Google which says that the WSJ was "confused". The hubbub in that article really was misplaced, it's a CDN deal.


( Dec 14 2008, 10:52:25 PM PST ) Permalink
Comments [1]

20081210 Wednesday December 10, 2008

Cloud Hype, An Amazon Web Services Post-Mortem

In the last few years, the scope of Amazon Web Services (AWS) has broadened to cover a range of infrastructure capabilities and has emerged as a game changer. The hype around AWS isn't all wrong, a whole ecosystem of tools and services has developed around AWS that makes the offering compelling. However, the hype isn't all right either. At Technorati, we used AWS this year to develop and put in production a new crawler and a system that produces the web page screenshot thumbnails now seen on search result pages. But now that that chapter is coming to a close, it's time to retrospect.

There's a prevailing myth that using the elasticity of EC2 makes it cheaper to operate than fixed assets. The theory is that by shutting down unneeded infrastructure during the lulls, you're saving money. In a purely fixed infrastructure model, Technorati's data acquisition systems must be provisioned for their maximum utilization capacity threshold. When utilization ebbs, a lot of that infrastructure sits relatively idle. That much is true but the reality is that flexible capacity is only saving money relative to the minimum requirements. So the theory only holds if your variability is high compared to your minimum. That is, if the difference between your minimum and maximum capacity is large or you're not operating a 365/7/24 system but episodically using a lot of infrastructure and then shutting it down. Neither is true for us. The normal operating mode of Technorati's data acquisition systems follows the ebb and flow of the blogosphere, which varies a lot but is always on. The sketch to the left shows the minimum capacity and the variable capacity distinguished.
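To put some arithmetic behind that, here's a minimal sketch comparing the two models. Every number in it is made up for illustration: the hourly rates and the demand curves are assumptions, not Technorati's or Amazon's actual figures.

```python
# Hypothetical numbers throughout: rates and demand curves are illustrative only.

def fixed_cost(demand, owned_rate):
    """Owned gear is provisioned for the peak and paid for every hour."""
    return max(demand) * owned_rate * len(demand)


def elastic_cost(demand, rented_rate):
    """Rented gear is paid for only while it is running."""
    return sum(demand) * rented_rate


# Machines needed in each hour of a day: always on, with a daytime swell.
always_on = [8] * 8 + [12] * 8 + [10] * 8
# An episodic job: idle most of the day, then a four-hour batch.
episodic = [0] * 20 + [12] * 4

OWNED_RATE = 0.05   # assumed amortized cost per machine-hour in a colo
RENTED_RATE = 0.10  # assumed on-demand cost per machine-hour

for name, demand in (("always on", always_on), ("episodic", episodic)):
    print(name,
          "fixed:", round(fixed_cost(demand, OWNED_RATE), 2),
          "elastic:", round(elastic_cost(demand, RENTED_RATE), 2))
```

With these assumed rates the always-on workload comes out cheaper on owned gear, while the episodic one comes out cheaper rented. Change the rates or the shape of the curve and the answer can flip, which is really the point: the comparison depends on how big the swings are relative to the floor.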

In response to some of the fallacies posted on an O'Reilly blog the other day (by George Reese), On Why I Don't Like Auto-Scaling in the Cloud, Don MacAskill from SmugMug wrote a really great post yesterday about his SkyNet system, On Why Auto-Scaling in the Cloud Rocks. Don also emphasizes SmugMug's modest requirements for operations staff. In an application with sufficient simplicity and automation around it, it's easy to imagine a 365/7/24 service having meager ops burdens. I think we should surmise that the cost of operating SmugMug with autonomic de/provisioning works because it fits their operating model. I understand Reese's concern, that folks may not do the hard work of really understanding their capacity requirements if they're too coddled by automation. However, that concern comes off as a shill for John Allspaw's capacity planning book (which I'm sure is great, can't wait to read it). Bryan Duxbury from RapLeaf describes their use of AWS and how the numbers work out in his post, Rent or Own: Amazon EC2 vs. Colocation Comparison for Hadoop Clusters. Since the target is to serve a Hadoop infrastructure, AWS must get a thumbs down in their case. Hadoop's performance is impaired by poor rack locality and the latencies of Amazon's I/O systems clearly drag it down. If you're going to be running Hadoop on a continuous basis, use your own racks, with your own switches and your own disk spindles.

At Technorati, we're migrating the crawl infrastructure from AWS to our colo. While I love the flexibility that AWS provides and it's been great using it as a platform to ramp up on, the bottom line is that Technorati has a pre-existing investment in machines, racks and colo infrastructure. As much as I'd like our colo infrastructure to operate with lower labor and communication overhead, running on AWS has amounted to additional costs that we must curtail.

Cloud computing (or utility computing or flex computing or whatever it's called) is a game changer. So when do I recommend you use AWS? Ideally: anytime. If your application is architected to expand and contract its footprint with the demands put upon it, provision your minimum capacity requirements in your colo and use AWS to "burst" when your load demands it. Another case where using AWS is a big win is for a total green field. If you don't have a colo, are still determining the operating characteristics of your applications and need machines provisioned, AWS is an incredible resource. However, I think the flexibility vs. economy imperatives will always lead you to optimize your costs by provisioning your minimum capacity in infrastructure that you own and operate.
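The "own the minimum, burst the rest" idea can be sketched in a few lines. Again, the rates and the demand curve below are assumptions picked for illustration, and `hybrid_cost` is a hypothetical helper, not anything from a real billing API.

```python
# Hypothetical rates and demand; the helper name is made up for this sketch.

def hybrid_cost(demand, baseline, owned_rate, rented_rate):
    """Pay for `baseline` owned machines every hour, rent only the overflow."""
    burst_hours = sum(max(needed - baseline, 0) for needed in demand)
    return baseline * owned_rate * len(demand) + burst_hours * rented_rate


# Machines needed per hour: a steady floor of 8 with an eight-hour swell.
demand = [8] * 16 + [12] * 4 + [10] * 4

OWNED_RATE = 0.05   # assumed amortized cost per machine-hour in a colo
RENTED_RATE = 0.10  # assumed on-demand cost per machine-hour

all_rented = hybrid_cost(demand, 0, OWNED_RATE, RENTED_RATE)
all_owned = hybrid_cost(demand, max(demand), OWNED_RATE, RENTED_RATE)
own_the_floor = hybrid_cost(demand, min(demand), OWNED_RATE, RENTED_RATE)

print("all rented:", round(all_rented, 2))
print("all owned:", round(all_owned, 2))
print("own the floor, burst the rest:", round(own_the_floor, 2))
```

Setting the baseline to zero gives the all-rented case and setting it to the peak gives the all-owned case, so one function covers the whole range. Under these assumed numbers, owning the floor and renting the swell is the cheapest of the three.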

There's also another option: instead of buying and operating your own machines and racks, you may be able to optimize costs by renting machines provisioned to your specs in a contract from the services that have established themselves in that market (Rackspace, Server Beach, ServePath, LayeredTech, etc). Ultimately, I'm looking forward to the emergence of a compute market place where the decisions to incur capital expense, rent by the hour or rent under a contract will be easier to traverse.


( Dec 10 2008, 11:53:19 PM PST ) Permalink
Comments [2]

20081209 Tuesday December 09, 2008

The Solar Decade

Ten years ago, you might have been advised that solar energy, while sounding nice, was a bad investment. The installations were failure prone and not cost effective. I don't know if I bought that then, I know of solar panels in San Francisco installed in the 80's that paid for themselves, just slowly. But what we're seeing isn't your father's solar panel. From Google's solar panels to residential rooftops, it seems pretty clear that the Economics of Solar Power Are Looking Brighter. Fast Company is running an article The Solar Industry Gains Ground that strikes a chord that we're hearing a lot of. Solar energy is getting more and more cost effective. What's projected is that the cost of solar power may share up-and-to-the-right properties of Moore's Law. The fabs that make the silicon enabling you to read this may also enable an energy giant leap forward. The Germans have their own "Solar Valley" and their industry projection graph appears pretty Moorish (look at the large yellow area).

The big lift off is 10 years away but the investment that has been made in the area and the advances being made seem to put the benefits close at hand. But the big win, when dependence on fossil fuels is on a clear decline, is at least 10 to 20 years out. But I think it can happen, I think the solar decade is coming. It should be the coming decade. However, it will require an Apollo-mission like focus from the Obama administration to succeed. And I hope we can make it a reality.


( Dec 09 2008, 11:55:24 PM PST ) Permalink

20081205 Friday December 05, 2008

It's Only The Biggest Country In the World

Should the confirmed reports that Technorati is banned in China be worn as a badge of honor? I understand the Chinese authorities value stability but these kinds of things, treating more than a billion people like little children that need to be sheltered, will ultimately destabilize them.

We've waited 18 years for Chinese Democracy, isn't that long enough? (sorry, couldn't resist the joke)

Best wishes to the Chinese people. At least most of you.


( Dec 05 2008, 06:12:19 PM PST ) Permalink

20081204 Thursday December 04, 2008

Technorati Release Fixes Some UI Peeves

In general, I regard successful user interfaces as the ones that provide the least amount of hunting and astonishment. No one is delighted when the things they're looking for aren't obvious, the data displayed requires lots of explanation and the paths through an application are click-heavy. In this regard, Technorati was long saddled with a user interface that I regarded as delightless. However, I see that changing now and I'm delighted to see that!

Technorati's front end was released today with a handful of significant improvements. One is a long standing peeve of mine: the tag pages were conflated with keyword search. That meant that if your post was about the president-elect and you tagged it "obama", your expectation that the posts aggregated at http://technorati.com/tag/obama would also be tagged "obama" would be disappointed; there would also be a bunch of keyword matches mixed in. That came out of last year's attempt to "simplify" the experience by making keyword search and tag browsing the same thing, which was, in all honesty, a George Bush level failure. Sure there are folks who don't know, "What's this 'tag' thing you're talking about?" But for the folks who do know what the difference is between browsing blog posts grouped by tags and keyword search results, the mix wasn't received as a simplification but as a software defect.

I tagged my post "obama" but all of these other posts aren't tagged "obama", what's going on?
I'm glad we've gone back to keeping search and tags distinct.

The other failed aspect of the prior design was the demotion of the search box. The form input to type in your search was sized down and moved to the right, as if it were a "site search" feature. Yes, we'd like folks to explore our discovery features but the navigation for those features wasn't great and the de-emphasis on search was again a source of more puzzlement than anything. The release today puts the search box back where it should be: bigger and right in the middle of the top third of the page, yay!

Oh, and earlier today Technorati Media released its Engage platform to beta. This is a major step in opening up the ad market place for the blogosphere.

So far, the feedback I've seen on these releases has been thumbs-up. Check 'em out, there are some more goodies in the works but these things only get better with your feedback. And yes, we know there's still more to do, I'm certainly busy with the backend stuff with our cloud platform, ping systems and crawlers (but did you notice the screenshot thumbnails on the search result and tag pages? I need to shake out the latencies producing and refreshing those). Kudos to Dave White, the front end team and the ad platform team for getting these releases out. Onward and upward!


( Dec 04 2008, 06:43:39 PM PST ) Permalink
Comments [2]

20081203 Wednesday December 03, 2008

Can We Just Call It "Flex Computing"?

The moniker "cloud computing" has been overloaded to mean so many things, it's beginning to mean nothing. When someone refers to it generically, you have to ask them to disambiguate; which of these are they referring to?

  1. IT infrastructure offered as a service
  2. Hosted application functionality
  3. A virtualized server deployment
InfoWorld wrote What cloud computing really means last spring to help clarify the distinctions but still, I often have to stop folks when they use the C-word just to make sure we're talking about the same things.

Examples of the first definition are services like Amazon Web Services (AWS) or GoGrid. They provide metered virtual machines, you pay for what you use and have full access (root) on the machines while you use them. Additional goodies such as load balanced clusters, storage facilities and so forth are part of the deal too. Capacity can be scaled up or down on demand and typically, in very rapid fashion. When Peter Wayner reviewed these guys last summer in InfoWorld, he was enamored with the GUI front ends. Call me old fashioned (or a dyed in the wool geek) but unless they're really saving me a lot of time, I have an aversion to the slick GUI's. For his part, Wayner complained about the AWS command line utilities. Actually, when I use AWS, I use a GUI for an overview of running instances, it's a Firefox plugin (Elasticfox) but what I really like about AWS is programmatic access. Integrating application deployment with command and control functionality is very powerful, my tool of choice is boto, a Python API for AWS.
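For a taste of what programmatic access buys you, here's a small sketch that tallies instances by state. It's written against a boto-style EC2 connection rather than importing boto directly, so it runs without credentials; `summarize_instances` is a name I made up for illustration, and the only assumption about `conn` is that it behaves like what `boto.connect_ec2()` returns.

```python
def summarize_instances(conn):
    """Count EC2 instances by state using a boto-style EC2 connection.

    `conn` is assumed to behave like the object returned by
    boto.connect_ec2(): get_all_instances() returns reservations, and each
    reservation carries a list of instances exposing `id` and `state`.
    """
    counts = {}
    for reservation in conn.get_all_instances():
        for instance in reservation.instances:
            counts[instance.state] = counts.get(instance.state, 0) + 1
    return counts


# With boto installed and AWS credentials configured, usage would look like:
#   import boto
#   print(summarize_instances(boto.connect_ec2()))
```

Once the inventory is a plain dictionary, it's a short step to wiring it into deployment scripts, alerts or whatever else needs to know what's running.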

The second definition refers to hosted application functionality, in years gone by they were referred to as Application Service Providers (ASP). The more modern label is Software as a Service (SaaS). However, these services have to provide more than a console for functionality, they have to provide web service API's that enable them to be integrated into other applications. SalesForce.com was an early leader in this space (remember the red cross-outs, "No Software"), their example and the proliferation of RSS is really what inspired the proliferation of APIs and mashups we see today.

The last definition refers to VMWare, Xen and so forth. By themselves, those aren't really cloud computing in my book. However, you can use them to create your own "private cloud" with tools like Enomaly and Eucalyptus. This is an area of great interest to me.

In his review, Wayner pointed out how very different all of the services are. I don't know why he included Google's App Engine at all in his write-up. Don't misunderstand, GAE is a great service but it more closely resembles an application container than infrastructure services.

I'm imagining IT infrastructure management interfaces coalescing around standards (de-facto ones, not ones fashioned out of IETF meetings). Eucalyptus is a good discussion point. Eucalyptus provides an EC2 "work-alike" interface on top of a Xen virtual server platform. So picture this: if the Rackspaces, ServePaths, Server Beaches and LayeredTechs of this world were to provide compatible interfaces built on top of Eucalyptus, buying compute power by the hour would become more like buying gasoline. There may be pros and cons to this station or that but fundamentally, if you don't like the pumps at one gas station or the prices are too high, you can go to the gas station across the street. Given compatible interfaces, management of the infrastructure, be it with boto, Elasticfox or using services such as RightScale can be as dynamic as the server deployments in those clouds. Such a compute market place would unleash new rounds of innovation as it eases starting up and scaling online services.
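Here's a toy version of the gas station picture. Everything in it is hypothetical: the `ComputeProvider` class, its method names and the prices are stand-ins for whatever a common interface might look like, not any vendor's real API.

```python
class ComputeProvider:
    """Hypothetical common interface; all names and prices are made up."""

    def __init__(self, name, hourly_rate):
        self.name = name
        self.hourly_rate = hourly_rate  # assumed price per instance-hour
        self.running = 0

    def run_instances(self, count):
        self.running += count

    def terminate_instances(self, count):
        self.running = max(self.running - count, 0)


def move_to_cheapest(providers, count):
    """Shut instances down everywhere else and run `count` at the cheapest provider."""
    cheapest = min(providers, key=lambda p: p.hourly_rate)
    for provider in providers:
        if provider is not cheapest:
            provider.terminate_instances(provider.running)
    needed = count - cheapest.running
    if needed > 0:
        cheapest.run_instances(needed)
    elif needed < 0:
        cheapest.terminate_instances(-needed)
    return cheapest


stations = [ComputeProvider("station-a", 0.12), ComputeProvider("station-b", 0.10)]
stations[0].run_instances(5)          # start out at the pricier station
chosen = move_to_cheapest(stations, 5)
print(chosen.name, [p.running for p in stations])
```

Because both providers answer to the same calls, switching is a loop rather than a rewrite, which is the whole appeal of a de-facto standard.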

The Eucalyptus folks will be the first to fess up that their project is more academic in nature than industrial strength. However, it is the harbinger of AWS as a standard. Yes, I'm referring to AWS as a standard because of the level of adoption it's enjoyed, the comprehensive set of APIs it provides and the rich ecosystem around it. What I foresee is that the first vendor to embrace and commoditize a standard interface for infrastructure management changes the game. The game becomes one of a meta-cloud because computing capacity will be truly fluid, flexibly shrinking and growing with hosted clouds, private clouds and migrating between clouds.


( Dec 03 2008, 10:44:46 PM PST ) Permalink
Comments [3]

20081202 Tuesday December 02, 2008

Social Media Backlash Against Cheaters and Fleshmongers

As long as there is any media, pornographers will figure out how to use it to purvey their wares. The other week, I mentioned on the Technorati blog that I'd been focusing on some spam scrubbing efforts, including removing porn. Apparently we're not the only social media service taking a look at the bottom line impact of miscreant activities. A few related items of interest percolated recently.

Social network service provider Ning announced their End of the Red Light District. The high infrastructure costs, lack of revenue and administrative burdens (DMCA actions) were among the reasons cited. Sounds very familiar, we get our share of that kind of pointless nonsense at Technorati too.

Today, YouTube posted that they were going to crack down or reduce the visibility of porny videos. YouTube's measures include

As expected in these cases, the trolls come out to cry foul. But this isn't about free speech or puritan ethics, the issue more closely resembles the tragedy of the commons. It's really very simple: these parasitic uses consume a lot of resources but bring no benefits to the host and degrade the service for other users.

Also today, Digg Bans Company That Blatantly Sells Diggs was reported by Mashable. Apparently Digg has directed a cease-and-desist at USocial.net's practice of selling diggs.

It seems to be an accepted truism that social media oft demonstrates, All Complex Ecosystems Have Parasites. Yep, I've talked to folks from Six Apart, Wordpress, Tumblr, Twitter and elsewhere. We're all feeling the pains of success. Over the past month at Technorati, we've purged about 80% of the porn that was active in the search index. Sure, we're not spam free yet but the index is getting a lot cleaner.


( Dec 02 2008, 11:42:34 PM PST ) Permalink
Comments [2]

20081201 Monday December 01, 2008

System Gaming and Its Consequences

Technorati's authority metric is based on a real simple concept: the count of the unique set of blogs linking to you in the trailing 180 days constitutes your authority. By its very nature, it's a volatile metric. The top 100 of a few years ago bears little resemblance to the one today. When some folks observe their authority rising, they twitter w00ts of joy; when it's falling they complain bitterly that Technorati is "downgrading" them.
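The concept fits in a few lines of code. This is only a sketch of the idea as described above, not Technorati's production implementation: the function name, the input shape and the choice to treat the cutoff day as inside the window are all assumptions.

```python
from datetime import date, timedelta


def authority(inbound_links, today, window_days=180):
    """Count distinct blogs that linked to a blog within the trailing window.

    inbound_links: iterable of (source_blog_url, link_date) pairs.
    The cutoff day itself is counted as inside the window (an assumption).
    """
    cutoff = today - timedelta(days=window_days)
    return len({source for source, linked_on in inbound_links
                if linked_on >= cutoff})


links = [
    ("http://a.example/", date(2008, 11, 20)),
    ("http://a.example/", date(2008, 11, 25)),  # same blog twice counts once
    ("http://b.example/", date(2008, 10, 1)),
    ("http://c.example/", date(2008, 1, 15)),   # older than 180 days, ignored
]
print(authority(links, today=date(2008, 12, 1)))  # prints 2
```

The volatility falls right out of the window: a link that counted yesterday can age out today, so a blog that stops attracting fresh links watches its number drop without anyone "downgrading" it.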

Authority is not a perfect metric (crawl coverage variations, etc) nor the only important measurement of a blog (traffic and comments are other metrics we'd like to measure), however it is one that Technorati has been objectively calculating for years.

What I find surprising is the surprise (or denial) that some people find when they learn there are consequences to gaming the system. On a fairly regular basis, someone comes up with the wholly unoriginal idea, "Hey, add your URL to my list of links, re-post it and urge others to follow suit to make your Technorati authority explode!" Or some variant of a viral link exchanging scheme. Some folks take the news graciously, "Oh, that's not OK? I had no idea. It won't happen again." But some of these folks get downright hostile, as if the blog authority metric is their god given right to game. These are probably the same people who expect appreciation on their home's property value to be a god given right. News flash: it's not. Since it's (apparently for some) not obvious: the attention you garner in the blogosphere and the price someone will pay for your house are driven by market forces. If your authority is dropping, create posts that are link-worthy. There's no shortcut. Blogs engaging in viral linking schemes stand a good chance that indexing will be suspended or the blog removed altogether from Technorati's index.

Use the blogosphere to converse, to entertain, to teach and to learn. We'll do our best to measure it and to build applications with those measurements. If you want to play games, get a Wii.

( Dec 01 2008, 10:40:34 PM PST ) Permalink

20081130 Sunday November 30, 2008

Big Is The Problem

I usually don't rant about economics but I wasn't shopping on "Black Friday" (nor will I be tomorrow on "Cyber Monday") - I'm trying to figure out how to tighten my belt. How is it that I, someone outside of the real estate, finance and auto industries that are so problem plagued, am getting caught in our economy's downdraft? Well, let's see.

Last January, Business Week raised the question "When is an institution too big to fail?" Until September of this year, the financial industry's downward spiral meandered along, like a persistent flu. There were bank failures but the conventional wisdom seemed to be that this was the market at work, winnowing the weak. The bad news ebbed and flowed: mortgage failures, rising oil prices and the weak dollar were countered by stimulus package checks, housing sales leveling off or even rising (where prices crossed their local tipping points) and vibrant web 2.0 and green enterprises. There had been bank failures this year but it took the evaporation of really Big institutions, Lehman Brothers and Merrill Lynch, to put Business Week's question on everyone's lips. To free market purists, the answer is obvious: whatever may come, let the failures fail. But the reality is that when an enterprise is so big that its failure disrupts significant portions of the overall national and global economy, whatever may come of its failure won't be good. Everyone suffers and bigness is the problem. When these companies become indispensable institutions, we should be afraid.

It seems for years there's been a breakdown in accountability. Loan originators could resell their loans and write new ones, no harm no foul. Right? But one of the key problems with that system is that the originators don't have any skin in the game. They have a money merry-go-round and whoever is left holding the paper (big institutions and their investors) draws the short straw. It's total madness. To date, all of the bank failures have resulted in consolidation in some form or another. Lehman is absorbed by Barclays. Merrill by BoA (which already absorbed Countrywide). The big are getting bigger as the competitive field shrinks. Ironically, this perpetuates the problem: bigness. What happens when Barclays or BoA start wobbling next? Now we have yet bigger institutions that are again too big to fail.

Among the remedies dismissed by free-market adherents is one of the Federal Government investing, taking an ownership stake in the banking, insurance and auto giants who have exposed themselves to risk that has subsequently blown up in their faces. "The government won't know any better how these companies should be run" goes the admonishment. But as if it isn't clear by now, the executives paid the big bucks to know how they should be run apparently don't either. As Newsweek explains in The Monster That Ate Wall Street - How Credit Default Swaps Became a Timebomb, the financial industry had no shortage of creativity when it came to skirting the liquidity requirements imposed on them in the years following the S&L crisis. Is it really such a surprise? As Michael Lewis (Liar's Poker, Moneyball) recounts in The End of Wall Street's Boom (Portfolio Magazine / December 2008), there were those calling Bullshit but things were just going too damned well for those alarms to be heeded.

It's inescapably clear now that the old adage applies, "if it's too good to be true, it probably is." Until recently, I thought this was only impacting me with the difficulty I had getting my mortgage. But no, the cavalier rating agencies ("the fox was guarding the hen house"), excessively leveraged financial arrangements and detached accountability have led us down this financial rabbit hole into what some now describe as a death spiral. It's not just a Wall Street problem, its spillover to Main Street has cascaded down Sand Hill Road. Here's the ominous and infamous slide deck from Doug Leone and friends at Sequoia Capital:

These slides were cited during Technorati's company meeting last week around the layoffs and salary cuts. That really dropped the dark cloud of what's happening in the broader economy close to home.

As bummed as I am about seeing colleagues depart and seeing my paycheck shrink, I'm actually optimistic about the future. Valuations on real estate seem to be reaching reality: they're hitting thresholds that people can afford with conventional financing. Technology continues to fuel innovation and innovation holds the potential to re-shape markets. Come Inauguration Day, it looks like Obama is coming into office surrounding himself with a team of economic advisors who are committed to preserving free markets but are also not so steeped in ideology that they're paralyzed about how to intervene.

I'm looking forward to this cloud lifting. That's my rant.


( Nov 30 2008, 06:05:51 PM PST ) Permalink

20081129 Saturday November 29, 2008

Getting Past Bad Checksums in MacPorts

Back in the 1990's I used FreeBSD fairly extensively. One of my favorite things about the FreeBSD project was the "ports and packages" system for installing libraries and application software. Since Mac OS X is, essentially, BSD with a lot of updated chrome, it's not surprising that there's a well functioning "ports and packages" system for it, MacPorts. While it's not perfect, MacPorts seems to function and dovetail nicely with everything I use my Mac for, more so than Fink. Sure, dpkg/apt-get seems to work OK on Debian, but every effort I've encountered to apply that model elsewhere has left me disappointed... anyway, yum seems to work well enough, I don't expect to use Debian again.

Recently I found myself with a port that would not install,

port install postgis
would bomb out:
"Target org.macports.checksum returned: Unable to verify file checksums" postgis
It's not a very helpful error message. After some RTFM ("man port", imagine that), I figured there musta been some cruft in the way, so I did this:
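port -d selfupdate          # update MacPorts itself and the ports tree (-d prints debug output)
port clean --all postgis    # remove the port's work files and previously downloaded distfiles
port install postgis        # fetch fresh sources and try the install again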
And I'm in business with the latest version of PostGIS. Yes, I coulda installed all of that stuff by hand but MacPorts generally has just what I need in a time-saving way. Note, I do all my MacOS X system administration as root so I'm not typing "sudo" all of the time.
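My reading of that error is that a previously downloaded source file no longer matched the checksum the port expected, which is why clearing out the cached download and fetching again did the trick. The check itself is simple enough to sketch in Python. This is an illustration of the general technique, not MacPorts' actual code; the function names and the choice of SHA-1 are assumptions.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Hash a file in chunks so large tarballs don't have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_ok(path, expected_hex, algorithm="sha1"):
    """True when the file on disk matches the digest recorded for it."""
    return file_digest(path, algorithm) == expected_hex.lower()


# Demo with a throwaway file standing in for a downloaded source tarball.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pretend this is a source tarball")
    demo_path = tmp.name

recorded = file_digest(demo_path)
matches = checksum_ok(demo_path, recorded)   # digest recorded for this exact file
mismatch = checksum_ok(demo_path, "0" * 40)  # digest recorded for some other file
os.remove(demo_path)
print(matches, mismatch)
```

A stale or partial download fails the comparison in the same way the second check does, and no amount of retrying the install helps until the file itself is replaced.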


( Nov 29 2008, 03:10:44 PM PST ) Permalink

20081128 Friday November 28, 2008

Redistributing the Karma

Since Technorati announced pay cuts for the staff earlier this week, I've been a little worried. The mortgage, an upcoming bat mitzvah (nothing opulent, really), doctor bills... the world won't wait for the economy's doldrums to turn around. I think I'll find ways to tighten our belts (bag lunches, cancel the gym membership, etc) but if you're currently more fortunate than I am and so inclined, this PayPal Donate button is a way you can help.

If I end up with more than needed, I'll simply donate the excess to a worthy charity.

( Nov 28 2008, 09:28:29 PM PST ) Permalink

20081127 Thursday November 27, 2008

Topic Clustering Visualized in Library Search

Public service announcement: your low-tech dowdy public libraries have slicked up high-tech. The old days of long searches through card catalogs and filling out forms in triplicate are gone. Since moving to the east bay several years ago, I've been impressed with the Contra Costa County Library's online catalog that searches all of the branches in the county, online reservations and inter-branch transfers. One of my favorite features is the visual topic clustering.

When searching for "django", a hub-and-spoke is displayed with related nodes such as "reinhardt" and "guitar" as well as misspell candidates. The search results are pretty good too, the first result is for a Gypsy jazz guitar (Django Reinhardt's signature style) instructional video by the main guy from Hot Club San Francisco (Paul Mehling can often be found gigging here in the east bay at the Left Bank in Pleasant Hill, good stuff). Overall, the selection of books, CD's and videos matching "django" was what I expected.

As fond as I am of Gypsy jazz, I'm also interested in the web application framework written in the Python programming language. Changing my query to "python django" brings up a different visual cluster with some of the same cluster terms ("reinhardt" and "guitar") but adds some new ones "monty", "boa" and "computer". The search results were exactly what I wanted: The Definitive Guide to Django: Web Development Done Right by Adrian Holovaty and Jacob Kaplan-Moss and Sams Teach Yourself Django in 24 Hours by Brad Dayley. I'm planning on using django (the python web app framework) for a project (not work related) and, while the online docs are pretty good, having a book (or two) to refer to is definitely welcomed.

All said, I'm a fan of the search and clustering technology enabled by AquaBrowser that the CCC library is using, it's had me wondering how well it would perform against the more volatile data set flowing through Technorati.


( Nov 27 2008, 11:43:13 AM PST ) Permalink

20081126 Wednesday November 26, 2008

Wordpress Security Revisited

The incidence of Wordpress compromises I wrote of in the spring is still high but the rate of new infections has dropped considerably. A lot of people learned of their blogs' affliction because they were not getting indexed by Technorati. Props to the folks from Google and the Wordpress team for getting the message out too.

Yesterday's release of Wordpress 2.6.5 doesn't target SQL injection or XML-RPC vulnerabilities; this time it's a cross-site scripting vulnerability.

The security issue is an XSS exploit discovered by Jeremias Reith that fortunately only affects IP-based virtual servers running on Apache 2.x. If you are interested only in the security fix, copy wp-includes/feed.php and wp-includes/version.php from the 2.6.5 release package.
2.6.5 contains three other small fixes in addition to the XSS fix. The first prevents accidentally saving post meta information to a revision. The second prevents XML-RPC from fetching incorrect post types. The third adds some user ID sanitization during bulk delete requests. For a list of changed files, consult the full changeset between 2.6.3 and 2.6.5.
So jump on it Wordpress users, time to update!
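For anyone wondering what a cross-site scripting hole looks like in general terms, here's a generic sketch in Python. It is not the Wordpress patch (Wordpress is PHP and I haven't reproduced its code); the function name and the feed link markup are made up for illustration. The idea is simply that any value taken from the request gets escaped before it lands in markup.

```python
from html import escape


def render_self_link(host, path):
    """Build a link tag from request-supplied values, escaping them first."""
    url = "http://%s%s" % (host, path)
    return '<link rel="self" href="%s" />' % escape(url, quote=True)


# A well-behaved request and one carrying a script injection attempt.
print(render_self_link("example.com", "/feed"))
print(render_self_link('example.com"><script>alert(1)</script>', "/feed"))
```

In the second call the angle brackets and quote come out as harmless entities instead of live markup, so the injected script never runs in a reader's browser.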


( Nov 26 2008, 07:09:11 AM PST ) Permalink

20081125 Tuesday November 25, 2008

Fifteen Hiccups Of Fame

It's been a long time since I've felt hopeful about the outcome of an election. I remember well the civil rights and anti-war marches of my childhood. The recent years have felt like a return of profound delusional corruption and polarization that marked the Nixon era. The most peculiar aspect of it is how it came to a head with the desperate gasp of Sarah Palin's VP nomination. That the questions about her qualifications were questioned at all demonstrates the height of delusion. While the republican fringe ran to embrace her, those with a brain could only ask "WTF?" and cross party lines. If she had been equipped otherwise between those legs, it would never have happened. It's like reverse sexism, if it had been a man with that background and view points, he would have been laughed out the door as a naive hillbilly. (Yep, I am a reasonably educated urban elitist, so?) Instead, we were treated to tragic comedy in slow mo. The graphic included here is a chart of the blogosphere's mention trajectory for Sarah Palin over the prior 100 days.

The timeline starts off with the igloo phase: practically nobody has heard of her and nobody is talking about her. Then there's the nomination at the republican convention. Followed by interviews, the Tina Fey phenom, the election and ...back to the igloo. So long, Caribou Barbie.

All of the talk about her PR representation, book deals, etc. is for naught; she's proven at every opportunity that she has nothing meaningful to say that contributes to moving our society forward. Nonetheless, it's amusing to read the conservative bloggers who talk about Sarah Palin as "the future of the republican party." As long as they stick to that meme, they're assuring themselves of falling further adrift of where this country is going. Bon voyage, don't let that iceberg hit you in the butt on the way out!


( Nov 25 2008, 02:12:45 PM PST ) Permalink

20081123 Sunday November 23, 2008

This Blog Is Not Dead!

At long last, I'm reviving this blog from dormancy. A lot has happened since my prior posting here. In no particular order:

... but wait, there's more! Things you may even care about ;)

But they'll wait for another post. It's nice to be back!

( Nov 23 2008, 04:17:54 PM PST ) Permalink