Sunday March 18, 2007
1979: Black Flag played at the Temple Beautiful
2007: Henry Rollins is on Twitter
black flag henry rollins twitter temple beautiful
Thursday March 15, 2007
Do you ever meet somebody who reminds you of someone else and have to remind yourself, "No, don't mix up the two"? An odd moment fell upon me when I accidentally found myself watching Some Kind of Monster on TV the other night. There was a flashback. And then a flashforward. Ya see, when I was a teenager, our crowd, we hated the music industry and wanted to subvert it by any means necessary. Disco music and the just-following-orders spineless vermin that ran the press establishment and radio stations (the program managers) amused and sickened us. I had a radio show on KUSF and actively tapped underground music scenes, looking for that undiscovered stroke of worldchanging noise. Perhaps my efforts of the last 15 years or so building applications on the web are an extension of the same jaded disdain I have for the media establishments. They're in-bred pay-for-play following-orders brain-washers. Let me tell you how I really feel. Complacency is not an option.
I haven't seen them for a few years but when the lot of us were teenagers, I used to fervently headbang with my pals in Metallica. They lived in a house on Carlson Street in El Cerrito. We drank a lot. And listened to New Wave of British Heavy Metal acts like Tank and Angel Witch. And though they weren't Britons, we listened to a lot of Mercyful Fate, too. Here's a picture of us, long straggly hair, pimples, listening to Venom no doubt and expressing a whole cargo ship full of fervor.
Fast forward to the 21st century. In the past few years, I've run into Jeff Veen a number of times in professional and social situations. I don't hob-nob in the web 2.0 "scene" much but when I do, there's Veen. He's not famous for too-loud/too-fast/too-go-f*k-yerself sounds of pummeling. He's not famous for alienating fans trading his goods on peer to peer networks. But he's known for his web design sophistication. He's usually a bit happier, too. See, look at the picture. Happy guy!
Hetfield rides tricked out choppers. I don't make it to San Rafael much except to visit Bubbe (I hope I make it through the long haul the way she has!). But if I lived in San Rafael I'd probably run into Hetfield all over the place; our kids are similar ages so we'd probably share Thin Lizzy on a boombox while the offspringlings are in the sandbox. When I was living in The City, I'd run into Kirk Hammett all of the time (like, the vegetable aisle at Whole Foods).
On the other hand, I work in the South-of-Market part of The City and Veen is oft spotted tooling around in a Mini Cooper. I don't think he's into Mercyful Fate or Thin Lizzy. But then, I've never asked him.
"Hey, what's James doing driving 'round here in that? Oh, yea... that's Jeff Veen."
Hetfield and company have composed sonic booms such as "The Frayed Ends of Sanity" (...And Justice for All).
Veen, on the other hand, published The Art and Science of Web Design.
Hetfield goes on stage and exhorts about anger, misery and suffering. 50 thousand pairs of finger-and-pinkie horns are raised in response. And 50 bucks will get you a t-shirt.
Veen goes on stage, dressed a little smarter, and discusses design principles. 50 million web pages get their style sheets revised (a subject I'm inept with, judging from my markup skillz). I don't know about the web design merchandising potential; would you pay 50 bucks for a microformats t-shirt?
Hetfield doodles on his guitars.
So they're about the same height and share some conversational mannerisms. Maybe their hairlines are about the same. The spectacles and goatee look might do it, too. But (rational voice) these guys are nothing alike and the contexts I know them from are completely different. And yet the introverted sensing part of me always draws the association when I run into one of them face to face or flipping TV channels and stumbling upon VH1; there is the odd familiarity of Hetfield when I run into Veen. I should invite Veen to share a bottle of Stolichnaya and crank up Black Sabbath to see how he takes to it; maybe it'd just come naturally.
(Photos: Hetfield | Veen)
metallica dejavu jeff veen james hetfield kusf
( Mar 15 2007, 11:52:20 PM PDT ) Permalink View blog reactions
Wednesday March 14, 2007
Clocks have received (or are overdue for) daylight saving time adjustments, flags with Matt Cain and Barry Bonds are on the lamp posts of 3rd Street, kids are at play; recreational softball/baseball on the weekend and Odyssey of the Mind state championships are coming up. The hills are green and blossoms and blooms abound. Spring has sprung.
Have a nice day!
( Mar 14 2007, 01:30:52 PM PDT )
Sunday February 25, 2007
In case you haven't noticed, there's been an unmistakable groundswell around OpenID in recent months. The proliferation of new web 2.0 services and the resulting "password fatigue" (except for those who are using OpenID) are contributing mass to the movement, but the adoption of OpenID by established services (Technorati, Digg) and big players (AOL, Microsoft) is catalyzing the acceleration. I'll cite these headlines as evidence:
Tim Bray raised a lot of great questions about OpenID. I'll summarize a response by simply saying that one of the virtues of using OpenID for URL-based user-centric authentication is the granularity of control available; it's all "opt-in":
Given current implementations, I'm probably not ready to use OpenID for online banking or verification that I'm old enough to buy wine (but I'll be grateful if you ask, I miss getting carded). However, I see no reason why the standards and practices can't be advanced to support those activities. The potential for phishing and man-in-the-middle attacks is a concern but there are a lot of applications today where there are many benefits to the opting-in parties but few for the attacker.
Right now, if Tim's comment authentication system were OpenID-enabled, I'd be able to use my Technorati profile URL (http://technorati.com/profile/spidaman) to sign in to post a comment on his blog. For "low-gravity" authentication requirements (blog comments), OpenID works great, today. For more rigorous authentication, user-attribute verification and trust requirements (like credit score lookups) there's a lot of great discussion underway. What I find heartening is that there seems to be broad acceptance of the Laws of Identity and increasing understanding that there's a big difference between the identity requirements for uploading photos and trading stocks in your IRA account.
URL-based authentication will likely go through the same growing pains raised by using email addresses to identify people. Back in the day when there were email addresses that people paid for and those were distinguishable from free ones, mailing list policies against subscribers with, say, hotmail addresses were often implemented. We may be approaching an era where some URLs are more equal than others, I dunno. But in the meantime, there are a lot of useful services you can use with OpenID today. If you haven't tried OpenID, do so right now by logging in to your Technorati account and then using your profile URL to log in to Zooomr; this stuff is easier to use than it is to explain. I wouldn't be surprised if Yahoo! and/or Google get on the bus in the next few months. As the snowball gains mass, you should know how and when to utilize user-centric authentication systems such as OpenID.
openid identity technorati aol digg microsoft zooomr timbray
( Feb 25 2007, 08:17:17 AM PST )
Saturday December 02, 2006
In my wild, youthful daze of ... indulgence, I could count on Sam Kress to goad me on to indulge more. We shared a common bond, loathing all that spankles and poofs, reveling in the too-loud-and-too-fast-so-suck-it-up sounds of the day. I learned this day via my bud @ Umlaut of Sam's passing. Crap! Sam was one of those people who leaves an indelible mark on your memory with his fervent exhortations and swaggering enthusiasm. Even just recently (at the Godsmack show), I thought of Sam when I put down a shot of Jack Daniels (yea, I know, I should stick to the cabernet, they say resveratrol is much healthier, good 'nuff excuse for me). I last saw Sam at a reunion party of sorts (a gathering of old-schoolers) about a year ago; I hadn't seen him in maybe 15 years or so but that mischievous gleam in his eye was still there.
Heathen were formed in 1984 by guitarist Lee Altus and drummer Carl Sacco. Even without a bassist, this lineup played a single gig, on April 21st, 1985. Then, Jim Sanguinetti left to form Mordred and was replaced by Doug Piercey on guitars. Sam Kress, who was a better songwriter than vocalist, was kicked out in late 1985 and David Godfrey of Blind Illusion was asked to join.
Yea, Dave's a better singer but Sam could growl like a mofo.
So long, Sam. I don't drink much whiskey any more but next time I have the occasion, I'll be raising it for you. I'll probably drink straight out of the bottle and pass it around, just for old times' sake.
old school metal heathen sam kress
( Dec 02 2006, 04:20:45 PM PST )
Friday November 17, 2006
The new Technorati link count widget provides a way for bloggers to display how many links a blog post gets. Doing it in Roller is easy: add the Velocity macro below to WEB-INF/classes/weblog.vm and then call that macro from the default blog page template (Weblog).
#macro( showCosmosLink $entry )
<script src="http://embed.technorati.com/linkcount"
type="text/javascript"></script>
<a href="http://technorati.com/search/$absBaseURL/page/$userName/#formatDate($plainFormat $entry.PubTime )"
rel="linkcount">View blog reactions</a>
#end
Tantek has more details about the release on the Technorati Blog.
( Nov 17 2006, 08:31:30 PM PST )
Wednesday November 15, 2006
Last night, Casey Duncan gave a nice presentation on how Pandora partitions and populates their PostgreSQL data warehouse. The San Francisco PostgreSQL Users Group meeting discussion, Horizontal Scalability with Postgresql: A Case Study, covered how Pandora segments their data set and uses pgpool for connection aggregation and Slony for replication and log shipping. I also had a great conversation with Greenplum CTO Luke Lonergan; they're definitely pushing the envelope of what's possible with open source database technologies. Kudos to Pandora and Greenplum, great stuff.
postgresql greenplum pandora slony pgpool datawarehousing
( Nov 15 2006, 10:50:56 AM PST )
Sunday November 12, 2006
An interesting introduction came over the transom recently. I've read Kim Cameron's blog before but the honest truth is: I've really been flummoxed by the wide range in the cast of characters and agendas in the identity fray. Some seem overly concerned with identity as a line of business, others concerned with seeing themselves at the center of the discussion. Meeting Kim was a treat; even though he had the cards stacked against him coming from Microsoft, we had a great conversation. When I think of Microsoft I think of the many aspersions: "the Borg", "the evil empire", "The Man", "the big cathedral", "stifling monopolists", "makers of the Blue Screen Of Death", "vendor lock-in creeps", "virus and security-hole mongering dumbos." OK, I'll stop. Of course the reality is that good people also show up in bad places and they make good things happen nonetheless. C# looks and the .Net framework does great stuff for developer productivity. There's a lot of innovation happening in Microsoft's search and online services divisions. To be fair, a lot of Microsoft bashing is another form of bigotry that we have to get beyond. Microsoft has a lot of great people and their executive leadership has done a lot of really bad things, so move along. The good guys inside the cathedral need constructive engagement lest they never prevail over the Matrix; more than anyone they (and Melinda) have the capacity to draw the Sith away from the Dark Side (re "constructive engagement": I'm thinking Clinton's Sino-American oppositional/collaborative stance that rides on the inevitable, not Reagan's failure vis-a-vis South Africa, which was wimpy coddling of the anti-divestment movement).
Speaking of the Jedi and Neo archetype, characters and ranches in Santa Barbara, endorsements from Doc Searls always get my attention:
When the conversation started to heat up after DIDW, the Neo role was being played by a character with the unlikely title of "Architect", working inside the most unlikely company of all: Microsoft. Kim Cameron is his name, and his architecture is the Identity Metasystem. Note that I don't say "Microsoft's Identity Metasystem". That's because Kim and Microsoft are going out of their way to be nonproprietary about it. They know they can't force an identity system on the world. They tried that already with Passport and failed miserably.
I prefer to think of the various roles of Identity Providers, Relying Parties and People as part of an ecosystem. But metasystem is fine, let's just stick to that vernacular. Kim is the author of Laws of Identity. Again citing the same article from Doc for a nice summarization:
User Control and Consent: digital identity systems must reveal information identifying a user only with the user's consent.
Limited Disclosure for Limited Use: the solution that discloses the least identifying information and best limits its use is the most stable, long-term solution.
The Law of Fewest Parties: digital identity systems must limit disclosure of identifying information to parties having a necessary and justifiable place in a given identity relationship.
Directed Identity: a universal identity metasystem must support both "omnidirectional" identifiers for use by public entities and "unidirectional" identifiers for private entities, thus facilitating discovery while preventing unnecessary release of correlation handles.
Pluralism of Operators and Technologies: a universal identity metasystem must channel and enable the interworking of multiple identity technologies run by multiple identity providers.
Human Integration: a unifying identity metasystem must define the human user as a component integrated through protected and unambiguous human-machine communications.
Consistent Experience across Contexts: a unifying identity metasystem must provide a simple consistent experience while enabling separation of contexts through multiple operators and technologies.
This is powerful stuff. I'm very pleased with our implementation of OpenID to support blog claiming but I know that this is the tip of the iceberg. There are people on the web who aren't authoring and sharing; they may not have nor want a URL that they can use for their identity. So while I'm committed to extending our support for OpenID, I'm also looking beyond it. The Laws are exemplary guiding principles in my exploration of the topic. Kim and Doc joined Kristopher Tate (the Zooomr dude), Tantek and myself to talk about CardSpace, Microsoft's implementation of an identity metasystem.
After discussing some of the high-level issues facing the web, the blogosphere and user generated content participant created artifacts in general, we dived deep on CardSpace. Since CardSpace will be shipping with Vista (as well as distributed for Windows XP), by my estimation the coming ubiquity of user-centric identity isn't something to ignore. As we worked through the CardSpace workflow with Kim, Tantek and I came up with this diagram (Glossary: "IDP" = "Identity Provider", "RP" = "Relying Party", CardSpace is a page-embedded app so there's both interaction via the browser and directly in the OS). This is of course just Microsoft's implementation but the Good Thing is that they aren't clutching it tightly; folks working on open source implementations (keep an eye on the OSIS working group) will make sure that the identity metasystem isn't a Borg in sheep's clothing.
Identities on the contemporary web suffer from a lot of accountability, authenticity and siloization deficiencies. Pings, trackbacks and comments all suffer from these and in turn we all do in the form of web spam. Reputation systems (such as Technorati's authority ranking) mitigate some of these problems but there is still much to do. I'm really pleased to have met Kim; he's one of the good guys and I look forward to working with more folks pushing the online identity envelope. If you're going to be joining Internet Identity Workshop coming up, I'll see you there!
identitymetasystem cardspace technorati openid microsoft
( Nov 12 2006, 01:25:24 PM PST )
Sunday October 29, 2006
Everyone knows what a great product Movable Type is. But if you find yourself in care of a Movable Type deployment that nobody seems to be able to log in to with superuser privileges, it may seem pretty hopeless; if you need to perform privileged operations, especially if the installation is backended by a Sleepycat, er, Oracle BerkeleyDB database, the data is somewhat opaque. AFAIK, MT doesn't seem to ship with any "break glass with this little hammer if the superuser was hit by a bus" contingencies and with BerkeleyDB there's no SQL command prompt; in fact, the only way to dig into it is to write some code. So I was fiddling with just such a MT-3.33 installation; I had an account but not much in the way of privileges. After opening the BerkeleyDB files with DB_File, dumping contents with Data::Dumper and going through some of the MT libraries, I found what I was looking for. Here's the Perl I hacked up to grant myself superuser privileges:
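#!/usr/bin/perl
# Grant superuser privileges to one author in a BerkeleyDB-backed MT-3.33 install.
use strict;
use warnings;
use Fcntl;      # O_RDWR
use DB_File;
use lib qw( /path/to/MT-3.33/lib );
use MT;
use MT::Serialize;
use MT::ConfigMgr;

# Use the same serializer the installation is configured with.
my $serializer = MT::Serialize->new(MT::ConfigMgr->instance->Serializer);

# Open the existing author database read/write (no O_CREAT, so a wrong
# path fails instead of silently creating an empty file).
my %hash;
tie %hash, 'DB_File', '/path/to/MT-3.33/author.db', O_RDWR, 0666, $DB_BTREE
    or die "Cannot open author.db: $!";

# Scan for the author record to promote, remembering the key it is stored under.
my ($key, $data);
while (my ($k, $v) = each %hash) {
    my $rec = $serializer->unserialize($v);
    if (${$rec}->{'name'} eq 'Ian Kallen') {
        ($key, $data) = ($k, ${$rec});
        last;
    }
}
die "Author record not found\n" unless defined $key;

# Flip the flag and write the record back under its own key
# (the original run hardcoded that key as '12').
$data->{'is_superuser'} = 1;
$hash{$key} = $serializer->serialize( \$data );
untie %hash;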
For other fixes to Movable Type installations, consider MT-Medic.
( Oct 29 2006, 09:35:21 PM PST )
Thursday October 19, 2006
As I announced on the Technorati Weblog, we rolled out support for blog claiming with OpenID. I'm really proud of the work that Chris and the team have done to make this a reality. If you're not familiar with OpenID, here is one good place to start. Sure, I'm well aware of the concerns about phishy user interface vulnerabilities. The idea of logging in without a password may seem weird.
One weird thing, for new users, is that instead of logging into an OpenID-using site (like Zooomr) with a user name and password, you just give it your personal OpenID URL -- and no password. Then your browser pops over to your authenticating site (like myopenid.com) to verify that you want to use your persona on the new site. This is bound to initially confuse people, and since users may not be asked for a password, it can also appear to be less secure, although it is not.
ZDNet: OpenID has a potential cure for Website password overload - Rafe Needleman
Frankly, I'm not certain what the best resolutions are for those concerns. However I'm more comfortable with adopting OpenID "as-is" and evolving as the technology advances than sitting around waiting for it to be perfected. Welcome to now.
Distributed identity ideas have been gestating for a long time while identity cathedrals have been built and fallen. If your blog is your voice, your URL can be your identity.
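To make the "give it your URL, then your browser pops over to your authenticating site" step a bit more concrete, here's a minimal sketch of the redirect URL an OpenID-using site might build. It assumes OpenID 1.1-style checkid_setup parameters; the function name, the idp.example server and the photos.example site are all made up for illustration, and the discovery, association and signature-verification steps are left out entirely.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_checkid_url(server_url, identity_url, return_to, trust_root):
    """Build the URL the browser is redirected to on the user's OpenID server.

    Note what is NOT here: no password ever passes through the relying site.
    """
    params = {
        "openid.mode": "checkid_setup",      # interactive "do you approve?" request
        "openid.identity": identity_url,     # the URL the user typed in
        "openid.return_to": return_to,       # where the server sends the user back
        "openid.trust_root": trust_root,     # the site asking for verification
    }
    # Append to any query string the server URL already carries.
    separator = "&" if urlsplit(server_url).query else "?"
    return server_url + separator + urlencode(params)


if __name__ == "__main__":
    # Hypothetical server and relying site; the identity URL is the one from the post.
    print(build_checkid_url(
        "https://idp.example/server",
        "http://technorati.com/profile/spidaman",
        "https://photos.example/openid/return",
        "https://photos.example/",
    ))
```

The relying site only ever learns whether the server vouches for the URL; whatever login happens is between the user and their own authenticating site.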
( Oct 19 2006, 11:42:04 PM PDT )
Whenever I look at page to page, post to post, blog to blog and domain to domain relationship statistics (and permutations across them) interesting things often emerge. Microsoft's Live Search recently released a linkfromdomain operator that can help dig into these linking relationships. For instance, linkfromdomain:arachna.com ruby returns the pages that I've linked to that have ruby in the text. Combined with the site operator, I can do a search of the pages I've linked to on Technorati with linkfromdomain:arachna.com site:technorati.com.
Looks like the blogosphere is noticing; within the last two days Technorati has seen 57 links to the linkfromdomain announcement blog post. Kudos to MSN's search team for a cool innovation.
One apparent problem with their crawls is javascript/flash-plugin handling: the site:youtube.com linkfromdomain:technorati.com SERP shows pages referenced from Technorati's most linked-to YouTube videos; however, all of the SERP items have the text
Hello, you either have JavaScript turned off or an old version of Macromedia's Flash Player. Click here to get the latest flash player.
heh!
search livesearch msn technorati
( Oct 19 2006, 06:56:16 AM PDT )
Wednesday October 18, 2006
This was on NASA's Astronomy Picture Of The Day site a few days ago and I haven't been able to close the browser tab with it... I just keep gazing at the surreality of it.
In the shadow of Saturn, unexpected wonders appear. The robotic Cassini spacecraft now orbiting Saturn recently drifted in giant planet's shadow for about 12 hours and looked back toward the eclipsed Sun. Cassini saw a view unlike any other. First, the night side of Saturn is seen to be partly lit by light reflected from its own majestic ring system.
NASA goes on to explain that the eclipse revealed newly detected strata of rings around Saturn.
( Oct 18 2006, 10:45:16 AM PDT )
Tuesday October 17, 2006
Between Google's extensive use of employee shuttles, their green data centers proposal last month and yesterday's announcement, Google to Convert HQ to Solar Power, I'm really impressed with the ecologically conscientious initiatives they're taking! Personal note: the solar installation will be led by Energy Innovations; EI president Andrew Beebe is a friend from years ago whom I've long lost touch with, but I was very pleased to see his name associated with this project.
( Oct 17 2006, 06:52:10 AM PDT )
Saturday October 07, 2006
It's broadly appreciated how scaling up is usually driven by business demand, but the requirements for scaling down are rarely as appreciated. Questions about how web 2.0 businesses scale up abound these days. As the challenges of service growth and business plans stress technical infrastructure, startups try to squeeze everything they can out of their architecture with a number of widely accepted practices. However, scaling considerations for the other direction are oft neglected.
End-to-end testing that doesn't require duplication of production infrastructure is a strategic advantage. I know of a financial analytics system run by a large institution that is untestable. This system has cron jobs, data feeds and query systems built on top of Perl code going back at least a decade. The inputs and outputs are so convoluted that nobody can exercise the system outside of production. So if this code is making the bank that owns it tens of millions of dollars every day (it is!), what's wrong with that? Well, it could probably be more profitable if it could be changed and optimized safely. As it stands, the folks maintaining the code don't really know what modifications might break the system and with income produced at that scale, who wants to risk it? So look at the systems you're working on now, think about the "scaling up" considerations you've made and ask yourself: Is a system testable in a developer's environment? Can they unit test? Can they perform functional tests? Do the tests require access to resources only available at the data center? Is "now" hardcoded to the present in your code? Using scaled down database, messaging, caching and application runtimes that have no dependencies on a connected network and production infrastructure should be considered up front in your design considerations.
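On the hardcoded "now" question, one way to keep time-dependent logic testable on a developer's laptop is to pass the clock in rather than reading it deep inside the function. Here's a minimal sketch; the is_stale check and its 24 hour threshold are made up for illustration and aren't from any system mentioned here.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_updated, now=None, max_age=timedelta(hours=24)):
    """Return True if last_updated is more than max_age before now.

    `now` defaults to the real clock, but a test can pin it to any moment.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_updated > max_age


if __name__ == "__main__":
    # A test pins "now" instead of depending on when it happens to run.
    pinned_now = datetime(2006, 10, 7, 12, 0, tzinfo=timezone.utc)
    fresh = datetime(2006, 10, 7, 3, 0, tzinfo=timezone.utc)
    old = datetime(2006, 10, 5, 12, 0, tzinfo=timezone.utc)
    print(is_stale(fresh, now=pinned_now))
    print(is_stale(old, now=pinned_now))
```

Production callers just omit the now argument; the same idea extends to handing in scaled down database or messaging stand-ins instead of reaching out to the data center.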
If a system makes assumptions about the process space it runs in that allows for functionality to be accessed from other runtimes, bravo: you may be headed in the right direction of service oriented architecture and horizontal scaling. But can the application stack be collapsed? This is like the OMG-moment when folks first started running J2EE application tiers over remote interfaces and realized that they've ended up with so much complexity and overhead, they have no choice but to scale up. That complexity can have all kinds of expensive side effects with how effectively systems can be triaged when they ail.
Businesses are run by people. People make mistakes. Wetware is imperfect. When you buy a long term commitment to a data center, you may be assuming liabilities that will outlive the business proof. Make sure the hardware footprint you're signing up for is one you can sustain or one you can get out of. When you build gratuitous tiers, the costs of taking them out when it's time to consolidate functionality can be stifling. So ask yourself: If systems scaled up to meet business objectives that aren't met, can you "retreat" from the scale-up offensive?
Every time I see a system that's hard to test, has sysadmins overwhelmed or is not meeting business objectives and has to be reeled in, I'm reminded of the importance of thinking about scaling in both directions. No, I haven't read the book yet but as someone burdened with too much stuff at home, I've got it on my list.
web 2.0 unit testing functional testing technical operations system architecture software
( Oct 07 2006, 03:53:58 PM PDT )
Saturday September 30, 2006
I find it really fascinating to see the acceptance of a publishing paradigm that lies in between the micropublishing realm of blogging, posting podcasts and videos and "old school" megapublishing. There are of course magazines; your typical piece in the New Yorker is longer than a blog post but shorter than a traditional book. But there's something else on the spectrum, for lack of a better term I'll call it minipublishing.
If you want to access expertise on a narrow topic, wouldn't it be cool to just get that, nothing more, nothing less? For instance, if you want to learn about the user permissions on Mac OS X, buy Brian Tanaka's Take Control of Permissions in Mac OS X. TidBITS Publishing has a whole catalog of narrowly focused publications that are bigger than a magazine article but smaller than your typical book. O'Reilly has gotten into the act too with their Short Cuts series. You can buy just enough on Using Microformats to get started; for ten bucks you get 45 pages of focused discussion of what microformats are and how to use them. Nothing more, nothing less. That's cool!
What if you could buy books in part or in serial form? Buy the introductory part or a specific chapter; if it seems well written, buy more. Many of us who've bought technical books are familiar with publish bloat, dozens of chapters across hundreds of pages that you buy even though you were probably only interested in a few chapters. Sure, sometimes publishers put a few teaser chapters online hoping to entice you to buy the whole megilla. Works for me, I've definitely bought books after reading a downloaded PDF chapter. But I'm wondering now about buying just the chapters that I want.
publishing microformats macosx media micropublishing minipublishing
( Sep 30 2006, 07:04:31 PM PDT )
Wednesday September 27, 2006
Colonel Jessup has assumed control of Newsweek:
Ignorance is bliss
How meta:
See ya at the gulag.
media newsweek ministry of truth afghanistan taliban bush
( Sep 27 2006, 04:16:31 PM PDT )
Tuesday September 26, 2006
At today's Intel Developer Forum, Google is presenting a paper that argues that the power supply standards that are built into today's PCs are anachronistic, inefficient and costly. With the maturing of the PC industry and horizontal scaling becoming a standard practice in data center deployments, it's time to say good-bye to these standards from the 1980s.
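For a sense of scale, here's a back-of-the-envelope check on the figures reported from the white paper: 100 million desktop PCs running eight hours a day, 40 billion kilowatt-hours and more than $5 billion saved over three years. The assumption that the PCs run all 365 days of each year is mine, so treat the per-PC number as a rough sketch rather than anything Google stated.

```python
# Figures as reported from the white paper.
PCS = 100_000_000            # desktop PCs
HOURS_PER_DAY = 8            # running time per PC per day
YEARS = 3
ENERGY_SAVED_KWH = 40e9      # kilowatt-hours saved over the period
DOLLARS_SAVED = 5e9          # "more than $5 billion", so this is a floor

# Assumption (mine): every PC runs all 365 days of each year.
pc_hours = PCS * HOURS_PER_DAY * 365 * YEARS

# Average power saved per running PC, converted from kW to W.
watts_saved_per_pc = ENERGY_SAVED_KWH / pc_hours * 1000

# Electricity price implied by the dollar figure, in $/kWh.
implied_rate = DOLLARS_SAVED / ENERGY_SAVED_KWH

if __name__ == "__main__":
    print(f"{watts_saved_per_pc:.1f} W saved per PC while running")
    print(f"${implied_rate:.3f} per kWh implied rate (at least)")
```

Under those assumptions the claim works out to a bit under 50 watts shaved off every running PC, at an implied rate of about twelve and a half cents per kilowatt-hour.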
John Markoff reported in the NY Times today (Google to Push for More Electrical Efficiency in PCs):
The Google white paper argues that the opportunity for power savings is immense, by deploying the new power supplies in 100 million desktop PC's running eight hours a day, it will be possible to save 40 billion kilowatt-hours over three years, or more than $5 billion at California's energy rates.
Nice to see Google taking leadership on the inefficiencies of the PC commodity hardware architectures.
( Sep 26 2006, 06:02:09 PM PDT )
Monday September 25, 2006
The other week I reflected on the scaling-web-2.0 theme of The Future of Web Apps workshop. Another major theme there was how social software is different, how transformative architectures of participation are. There was one talk that stood out, from Tom Coates: Greater than the sum of its parts. A few days ago the slides were posted; I poked through 'em since and they jogged some memories loose, so I thought I'd share Tom's message, late though it is, and embellish with my spin.
Tom's basic thesis is that social software enables us to do "more together than we could apart" by "enhancing our social and collaborative abilities through structured mediation." Thinking about that, isn't web 1.0 about structured mediation? Centralized services, editors & producers, editorial staff & workflow, bean counting eyeballs, customer relationship management, demographic surveys and all of that crap? Yes, but what's different is that web 2.0 structured mediation is about bare sufficiency in that it's better to have too little than too much, the software should get out of the way of the user, make him/her a participant, not lead him/her around by the nose.
Next, Tom highlighted that valuable social software should serve
Citing The Success of Open Source, he likened social software participants' motivations to this ranked list of open source contributors' motivations
Here are some social software "best practices":
futureofwebapps-sf06 social software virtual community
( Sep 25 2006, 10:24:40 AM PDT )
Thursday September 21, 2006
I mused about people-powered topic classification for blogs after playing with the Google Image Labeller the other week. It seems like a doable feature for Technorati because the incentives to game topic classification are low.
That same week, Rafe posed a question about community driven spam classification:
Why couldn't Blogger or Six Apart or a firm like Technorati add all of the new blogs they register to a queue to be examined using Amazon's Mechanical Turk service? I'd love to see someone at least do an experiment in this vein. The only catch is that you'd want to have each blog checked more than once to prevent spiteful reviewers from disqualifying blogs that they didn't agree with.
The catch indeed is that the incentive is high for a system like this to be gamed. Shortly after Blogger implemented their flag, spammers
"Bloggerbowling" - the practice of having robots flag multiple random blogs as splogs regardless of content to degrade the accuracy of the policing service.
As previously cited from Cory, all complex ecosystems have parasites. So I've been thinking about what it would take to do this effectively: what would it take to overcome the bloggerbowling efforts of the blogosphere's parasites? The things that come to mind for any system of community policing are about rewards and obstacles. For example
At the end of the day, I don't have the answers. But I think Rafe, Doc and so many others concerned with splog proliferation are asking great questions. Technorati is currently keeping a tremendous volume of spam out of its search results but, at the end of the day, there's still much to do. And this post is the end of my day, today.
spam splog splogs technorati virtual community blogs web spam
( Sep 21 2006, 11:06:22 PM PDT )
Wednesday September 13, 2006 A few weeks ago, Adam mentioned some of the shuffling going on at Technorati's data centers. Yep, we've had our share of operational instability lately; when you have systems that expect consistent network topologies and those topologies have to change, I suppose these things will happen. It seems to be a common theme I keep hearing in conversations and presentations about web based services: the growing pains.
This morning, Kevin Rose discussed The digg story: from one idea to nine million page views at The Future of Web Apps workshop. Digg has had to overcome a lot of the "normal" problems (MySQL concurrency, data set growth, etc.) that growing web services face and has turned to some of the usual remedies: rethinking the data constructs (they hired DBAs) and memcached. This afternoon, Tantek was in fine form discussing web development practices with microformats, where he announced updates to the search system Technorati's been cooking, again a growth-induced revision. Shortly thereafter, I enjoyed the stats and facts that Steve Olechowski presented in his 10 things you didn't know about RSS talk. And so it goes; this evening it was Feedburner having an episode. "me" time -- heh, know how ya feel <g>
While Feedburner gets "me" time, Flickr gets massages when they have system troubles. Speaking of Flickr, I'm looking forward to Cal Henderson's talk, Taking Flickr from Beta to Gamma at tomorrow's session of The Future of Web Apps. I caught a bit of Scaling Fast and Cheap - How We Built Flickr last spring; Cal knows the business. I've been meaning to check out his book, Building Scalable Web Sites.
Perhaps everybody needs a therapeutic message for the times of choppy seas. When Technorati hurts, it just seems to hurt. Should it be getting meditation and tiger balm (hrm, smelly)? Some tickling and laughter (don't operate heavy machinery)? Animal petting (could be smelly)? Aromatherapy (definitely smelly)? Data center feng shui? Gregorian chants? R.E.M. samples?
futureofwebapps-sf06 palaceoffinearts flickr feedburner digg technorati microformats memcached
( Sep 13 2006, 09:26:42 PM PDT )
Monday September 04, 2006
Hey, I'm in Wired! The current Wired has an article about blog spam by Charles Mann that includes a little bit of my conversation with him. Spam + Blogs = Trouble covers a lot of the issues facing blog publishers (and in a broader sense, user generated content, or rather participant created artifacts, in general). There are some particular challenges faced by services like Technorati that index these goods in real time; not only must our indices have very fast cycles, so must our abilities to keep the junk out. I was in good company amongst Mann's sources; he talked to a variety of folks from many sides of the blog spam problem: Dave Sifry, Jason Goldman, Anil Dash, Matt Mullenweg, Natalie Glance and even some blog spam perps.
I've also had a lot of conversations with Doc lately about blog spam and the problems he's been having with kleptotorial. A University of Maryland study of December 2005 pings on weblogs.com determined that 75% of the pings are spam AKA spings. By excluding the non-English speaking blogosphere and not taking into account the large portions of the blogosphere that don't ping weblogs.com, that study ignored a larger blogosphere but overall, that assessment of the ping stream coming from weblogs.com seemed pretty accurate. As Dave reported last month, by last July we were finding over 70% of the pings coming into Technorati to be spam.
Technorati has deployed a number of anti-spam measures (such as targeting specific Blogger profiles, as Mitesh Vasa has. Of course there's more that we've done but if I told you I'd have to kill you, sorry). There are popular theories in circulation on how to combat web spam involving blacklists of URLs and text analysis but those are just little pieces of the picture. Of the things I've seen from the anti-splog crusader websites, I think the fighting splog blog has hit one of the key vulnerabilities of splogs: they're just in it to get paid. So, hit 'em in the wallet. In particular, splog fighter's (who is that masked ranger?) targeting of AdSense's Terms of Service violators sounds most promising. Of course, there's more to blog spam than AdSense, Blogger and pings. The thing gnawing at me about all of these measures is their reactiveness. The web is a living organism of events; the tactics for keeping trashy intrusions out should be event driven too.
Intrusion detection is a proven tool in computer security practice. System changes are a disturbance in the force, significant events that should trigger attention. Number one in the list of The Six Dumbest Ideas in Computer Security is "Default Permit." I remember the days when you'd take a host out of the box from Sun or SGI (uh, who?) and it would come up in "rape me" mode. Accounts with default passwords, vulnerability-laden printing daemons, rsh, telnet and FTP (this continued even long after the arrival of ssh and scp), all kinds of superfluous services in /etc/inetd.conf and so on. The first order of business was to "lock down" the host by overlaying a sensible configuration. The focus on selling big iron (well, bigger than a PC) into the enterprise prevented vendors from seeing the bigger opportunity in internet computing and the web. And so reads the epitaph of old-school Unix vendors (well, in Sun's case Jonathan Schwartz clearly gets it -- reckoning with the "adapt or die" options, he's made the obvious choice). Those of us building public facing internet services had to take the raw materials from the vendor and "fix them". The Unix vendors really blew it in so many ways, it's really too bad. The open source alternatives weren't necessarily doing it better, even the Linux distros of the day had a lot of stupid defaults. The BSDs did a better job but, unless you were Yahoo! or running an ISP, BSD didn't matter (well, I used FreeBSD very successfully in the 90's but then I do things differently). Turning on access to everything but keeping out the bad guys by selectively reacting to vulnerabilities is an unwinnable game. When it comes to security matters, the power of defaults can be the harbinger of doom.
The "Default Deny" approach is to explicitly prescribe what services to turn on. It's the obvious, sensible approach to putting hosts on a public network. By having very tightly defined criteria for what packets are allowed to pass, watching for adversarial connections is greatly simplified. I've been thinking a lot about how this could be applied to providing services such as web search while also keeping the bad guys (web spammers) out.
Amongst web indexers, the big search services try to cast the widest net to achieve the broadest coverage. Remember the mine is bigger than yours flap? Search indices seemingly follow a Default Permit policy. On the other extreme from "try to index everything" is "only index the things that I prescribe." This "size isn't everything" response is seen in services like Rollyo. You can even use Alexa Web Search Platform to cobble your own index. But unlike the case of computer security stances, with web search you want opportunities for serendipity; searching within a narrowly prescribed subset of the web greatly limits those opportunities. Administratively managed Default Deny policies will only get you so far. I suspect in the future effective web indexing is going to require more detailed classification, a Default Deny with algorithmic qualification to allow. Publishers will have to earn their way into the search indices through good behavior.
The blogosphere has thrived on openness and ease of entry but indeed, all complex ecosystems have parasites. So, while we're grateful to be in a successful ecosystem, we'd all agree that we have to be vigilant about keeping things tidy. The junk that the bad guys want to inject into the update stream has to be filtered out. I think the key to successful web indexing is to cast a wide net, keep tightly defined criteria for deciding what gets in and to use event driven qualification to match the criteria. The attention hijackers need to be suppressed and the content that would be misappropriated has to be respected. This can be done by deciding that whatever doesn't meet the criteria for indexing should be kept out. Not that we have to bid adieu to the yellow brick road of real time open content but perhaps we do have to set up checkpoints and rough up the hooligans who soil the vistas.
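To make "Default Deny with event driven qualification" a little more concrete, here's a rough sketch. Each incoming ping is treated as an event, and it only gets through if every check passes; no checks configured means nothing gets in. The `Ping`, `Publisher` and `Qualifier` names, the two checks and the 12-per-hour limit are all made up for illustration and say nothing about what Technorati actually runs.

```python
from dataclasses import dataclass, field


@dataclass
class Ping:
    blog_url: str
    content_hash: str   # hash of the feed content fetched for this ping
    timestamp: float    # seconds since epoch


@dataclass
class Publisher:
    last_hash: str = ""
    recent_pings: list = field(default_factory=list)


def has_new_content(ping, pub):
    # A ping that announces nothing new is suspicious.
    return ping.content_hash != pub.last_hash


def within_rate_limit(ping, pub, max_per_hour=12):
    # Hypothetical ceiling on how often one blog may ping.
    recent = [t for t in pub.recent_pings if ping.timestamp - t < 3600]
    return len(recent) < max_per_hour


CHECKS = [has_new_content, within_rate_limit]


class Qualifier:
    def __init__(self, checks=CHECKS):
        self.checks = checks
        self.publishers = {}

    def handle(self, ping):
        """Return True only when every check passes (default deny)."""
        pub = self.publishers.setdefault(ping.blog_url, Publisher())
        # An empty check list denies everything rather than permitting it.
        allowed = bool(self.checks) and all(c(ping, pub) for c in self.checks)
        pub.recent_pings.append(ping.timestamp)
        if allowed:
            pub.last_hash = ping.content_hash
        return allowed
```

The point of the shape is that the decision happens at the moment of the event and the publisher's own history feeds the next decision, so good behavior is what earns a way in.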
spam web spam splog splogs adsense technorati wired
( Sep 04 2006, 11:10:15 PM PDT )
Saturday September 02, 2006 I spent way too much time last night giving Google some free labor. The Google Image Labeler is kinda fun, in a peculiar way. In 90 second stretches that AJAX-ishly links you to someone else out there in the ether, you are shown images and a text box to enter tags ("labels" is apparently Google's preferred term, whatever). Each time you get a match with your anonymous partner, you get 100 points. The points are like the ones on Whose Line Is It Anyway, they don't matter. And yet it was strangely fun. The most I ever got in any one 90 second session was 300 points. Network latency was the biggest constraint, sometimes Google's image loading was slow. Also, the images are way too small on my Powerbook ... this is the kinda thing you want a Cinema Display for (the holidays are coming, now you know what to get me).
So what if Technorati did this? Suppose you and some anonymous cohort could be simultaneously shown a blog post and tag it. Most blogging platforms these days support categories. But there are a lot of blog posts out there that might benefit from further categorization. Authors are already tagging their posts and blog readers can already tag their favorite blogs but enabling an ESP game with blog posts sounds like an intriguing way to refine categorization of blogs and posts.
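A back-of-the-envelope sketch of how the matching might work for blog posts: two partners type labels independently, only the labels they both entered count, and a tag sticks to a post only after more than one pair has agreed on it. The 100 points per match comes from the Image Labeler; the function names, the normalization and the `min_pairs` cutoff are my own assumptions.

```python
def normalize(label):
    # Fold case and collapse whitespace so "MySQL " matches "mysql".
    return " ".join(label.lower().split())


def score_round(labels_a, labels_b, points_per_match=100):
    """Return (matched labels, points) for one shared round."""
    seen_a = {normalize(l) for l in labels_a if l.strip()}
    seen_b = {normalize(l) for l in labels_b if l.strip()}
    matches = sorted(seen_a & seen_b)
    return matches, points_per_match * len(matches)


def agreed_tags(rounds, min_pairs=2):
    """Keep tags that at least `min_pairs` independent pairs agreed on."""
    counts = {}
    for labels_a, labels_b in rounds:
        matches, _ = score_round(labels_a, labels_b)
        for tag in matches:
            counts[tag] = counts.get(tag, 0) + 1
    return sorted(tag for tag, n in counts.items() if n >= min_pairs)
```

Requiring agreement between strangers is what keeps the incentive to game it low: a lone prankster can't make a tag stick without a partner who happens to type the same thing.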
tagging esp game google image labeler mechanical turk
( Sep 02 2006, 12:31:26 PM PDT )
Wednesday August 30, 2006 Last week I was in Albuquerque for some family time and relaxation. It was truly wonderful to see the desert in full bloom; the monsoonal flow of weather coming up from the Gulf of Mexico this time of year has brushed the whole landscape with lovely shades of green. The weather was mild, the raspberries luscious and abundant and, though the trout weren't biting, the rivers roared beautifully; it was really great. No, I didn't accept payola from the New Mexico visitors bureau, really, my gushing is legit.
Anyway, I also took the opportunity to do some geeky ogling at the Eclipse Aviation facility in Albuquerque. I'm not normally an airplane nerd but last Friday, I was. What interested me about this company is that they are producing a truly disruptive technology. Commercial aviation and metropolitan airports are high ceremony affairs; security lines, taking off your shoes and taking out your laptop, finding the right carousel to get your luggage... and praying that it shows up there intact. The Eclipse jets will commoditize high altitude cruising in a pressurized cabin at speeds that aren't too far behind the big boys (and twice the speed of propeller planes) and do so at a price point on par with the cost of many single family homes in the San Francisco Bay Area. What will that mean for you? What would it mean to you if 2 to 3 hour rides up and down the west coast of the United States are cheap and abundant? If a two day drive or commercial jetliner and airport rigamarole can be replaced with fast, low ceremony travel, then the world gets a lot smaller again. That would mean a lot to me! Welcome to reality, the smaller world where commoditization is good.
Eclipse is a newcomer, a startup (I know a bit about disruptive startups) in the aviation industry. As you'd expect, they're doing things differently. Which isn't a surprise given founder Vern Raburn's pedigree. Raburn is an early Microsoft and Lotus guy, has been a pilot since he was a teenager and has a passion for innovation (along with some cash to throw behind it). Eclipse's friction stir welding process for joining the aluminum shell results in a light but strong hull (without relying on composites); most planes are pieced together with rivets. Eclipse has extensive IT infrastructure that provides flight plans, collects metrics on the planes while they're in flight, detects component failures and is poised to assist from a state-of-the-art operations center. The avionics are displayed on redundant touch screens and the controls are vastly simplified over what you find in traditional aircraft. Here are some specs on the Eclipse 500:
So, do the math and this works out a lot cheaper than flying most piston engine planes (costs per hour may be lower but you're in the air twice as long with those). OK, I admit I don't have one and a half mill to drop for one of these babies (however, I have aspirations to be a "qualified buyer") but still, the potential to bring this kind of travel within easy reach is at hand. Even if you don't buy one of them, using one like you'd use a cab seems like a huge improvement over current modes of air travel. On your next trip, as you endure the TSA confiscating that toothpaste you forgot was in your carry-on luggage, imagine jet travel that operates more like a car service, like a cab. DayJet is going to provide exactly that using fleets of Eclipse jets. On the factory floor, I saw a few DayJet-logo'd planes getting prepped for delivery. Apparently a gaggle of "air taxi" services similar to DayJet are in the works; they'll also be powered by fleets of Eclipse 500s. We'll embark on the era of very light jets (VLJ) when the first customers start taking delivery of their aircraft within the next month. This may not be Kitty Hawk but I do think this will rank high in the list of significant aviation events.
aviation jets civil aviation eclipse aviation disruptive technologies
( Aug 30 2006, 09:09:06 PM PDT )
Tuesday August 29, 2006 In this corner: Doc is going to attack kleptotorial splogs by employing cleaner living through better licensing (a creative commons flavor). And in this corner: Elliott Back says he is a victim. He has been slammed by Scoble (and Scoble was gracious enough to apologize). I have no sympathy for Elliott Back. Sure, he's just the gun maker, not the shooter. But weapon makers producing wares without safeties get sued for negligence. Basically, any tool that programmatically harvests and posts other people's feeds should at least have the common decency to not ping. If you re-inject something into the update stream that you've appropriated from someone else, you're scamming the update stream. This isn't about quoting or citing, this is about fraudulent pings, "I've updated my blog (never mind the fact it's with OPP)" -- keep your feed harvesting to yourself, please.
( Aug 29 2006, 09:51:57 AM PDT )
Monday August 28, 2006 The MySQL query cache has rarely been of much use to me since it's pretty much just an optimization for read-heavy data. Furthermore, if you have a pool of query hosts (e.g. you're using MySQL replication to provide a pool of slaves to select from), each with its own query cache in a local silo, there's no "network effect" of benefitting from a shared cache. MySQL's heap tables are a neat trick for keeping tabular data in RAM but they don't work well for large data sets and suffer from the same siloization as the query cache. The standard solution for this case is to use memcached as an object cache. The elevator pitch for memcached: it's a thin distributed hash table in local RAM stores accessible by a very lightweight network protocol and bereft of the featuritis that might make it slow; response times for reads and writes to memcached data stores typically clock in at single digits of milliseconds.
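Here's a toy sketch of the object cache pattern, just to show the shape of it. Plain Python dicts stand in for both the memcached servers and the database, so nothing here touches a real memcached client or MySQL; `FakeMemcachedNode`, `ShardedCache`, `fetch_profile` and `update_profile` are hypothetical names. Picking a server with a simple hash-modulo is also a simplification of what real client libraries do.

```python
import hashlib


class FakeMemcachedNode:
    """In-process stand-in for one memcached server (assumption: a real
    deployment would reach memcached over the network instead)."""

    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value):
        self.store[key] = value

    def delete(self, key):
        self.store.pop(key, None)


class ShardedCache:
    """Hashes the key to pick a node, so every web host agrees on where
    a key lives and all of them share one logical cache."""

    def __init__(self, nodes):
        self.nodes = nodes

    def _node_for(self, key):
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def get(self, key):
        return self._node_for(key).get(key)

    def set(self, key, value):
        self._node_for(key).set(key, value)

    def delete(self, key):
        self._node_for(key).delete(key)


def fetch_profile(cache, db, user_id):
    """Read path: try the cache, fall back to the database, then
    populate the cache for the next reader."""
    key = "profile:%d" % user_id
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = db[user_id]   # stand-in for a SELECT against a replica
    cache.set(key, row)
    return row


def update_profile(cache, db, user_id, row):
    """Write path: update the database, then drop the stale cached copy."""
    db[user_id] = row
    cache.delete("profile:%d" % user_id)
```

Because the key decides which node holds the value, a row cached by one web host is a hit for every other host, which is exactly the "network effect" the per-host query cache can't give you.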
RDBMS-based caches are often a glorified hash table: a primary key'd column and a value column. Using an RDBMS as a cache works but it's kinda overkill; you're not using the "R" in RDBMS. Anyway, transacting with a disk based storage engine that's concerned with ACID bookkeeping isn't an efficient cache. MySQL has the peculiar property of supporting pluggable storage backends. MyISAM, InnoDB and HEAP backends are the most commonly used ones. Today, Brian Aker (of Slashdot and MySQL AB fame) announced his first cut release of his memcache_engine backend.
Here's Brian's example usage:
mysql> INSTALL PLUGIN memcache SONAME 'libmemcache_engine.so';
mysql> create table foo1 (k varchar(128) NOT NULL, val blob, primary key(k)) ENGINE=memcache CONNECTION='localhost:6666';
mysql> insert into foo1 VALUES ("mine", "This is my dog");
Query OK, 1 row affected (0.01 sec)
mysql> select * from foo1 WHERE k="mine";
+------+----------------+
| k | val |
+------+----------------+
| mine | This is my dog |
+------+----------------+
1 row in set (0.01 sec)
mysql> delete from foo1 WHERE k="mine";
Query OK, 1 row affected (0.00 sec)
mysql> select * from foo1 WHERE k="mine";
Empty set (0.01 sec)
Brian's release is labelled a pre-alpha; some limitations apply, your mileage may vary, prices do not include taxes, customs or agriculture inspection fees.