Podcastplayer.org news

2005/4/13

RSS, storage, and the myth of the “long tail”

Filed under: — Frank @ 11:01 am

Anyone who’s interested in podcasting and media in general will probably have encountered a lot of trumpeting about “the long tail”. This is the idea that although there are potentially rich pickings in servicing the “most popular” of something, there can actually be a larger potential market in “the rest”. There’s plenty of web material available if you want to read more, for example this blog, this Wikipedia entry, or this Wired article.

This all sounds wonderful. The idea of empowering customer choice by making the whole “back catalog” available is an enticing prospect.

BUT forces similar to those that pushed everything from superstores to TV stations into concentrating on the single largest identifiable group are at work in podcasting and other alternative media, even though most pundits seem to prefer to ignore them.

Consider these recent blog entries:

From Digital Strips : The Web Comics Podcast with Zampzon & Daku:

we’re running a little tight on server space so we are going to have to trim down the show archives a bit. I will limit the archived shows to 8 episodes at a time. Now is your last chance to grab our earlier ones.

From Dave’s Chalkboard:

I didn’t download the podcasts as the episodes were being aired the first time because I didn’t want to listen to them so soon. Now that I want to listen to them, they are not available.

Supporting the long tail with anything other than hot air costs. It costs in storage space. It costs in index complexity. It places an ever-increasing burden on the freedom to change site designs and structures.

The most insidious part of all this, though, is the way that RSS has become almost entirely a “what’s new” mechanism. Finding a few “most recent” podcasts, or blogs, or whatever is easy. Finding anything else is ridiculously hard. As an example, I recently discovered RocketBoom. I liked the few episodes I received from the feed and wanted to download some older ones. But they are not in the feed. Instead I had to trawl through a complex and somewhat irritating set of “archive” web pages, each of which tried to force me to play the show in-page rather than offering a simple download link. In the end I wrote a small script in Ruby which guessed at archived filenames and sat in the background trying the next one, then sleeping for a while. I still didn’t get a complete set, though.
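For the curious, the guess-and-sleep approach described above might look something like the sketch below. It is written in Python rather than Ruby, and it is not the original script, which isn't reproduced here. The host, the filename pattern, the pause length, and every function name are hypothetical.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, Optional
from urllib.request import urlopen


def candidate_urls(base: str, pattern: str, numbers: Iterable[int]) -> Iterator[str]:
    """Yield one guessed URL per episode number (the pattern is an assumption)."""
    for n in numbers:
        yield base.rstrip("/") + "/" + pattern.format(n=n)


def http_fetch(url: str) -> Optional[bytes]:
    """Return the file body, or None if the guess was wrong or the host is unreachable."""
    try:
        with urlopen(url, timeout=30) as response:
            return response.read()
    except OSError:
        # HTTPError, URLError and socket timeouts are all subclasses of OSError.
        return None


def grab_archive(
    urls: Iterable[str],
    fetch: Callable[[str], Optional[bytes]] = http_fetch,
    pause_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, bytes]:
    """Try each guessed URL in turn, sleeping between attempts to stay polite."""
    found: Dict[str, bytes] = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(pause_seconds)  # pause before every attempt except the first
        body = fetch(url)
        if body is not None:
            found[url] = body
    return found
```

Something like `grab_archive(candidate_urls("http://example.invalid/shows", "episode_{n:03d}.mp3", range(1, 100)))` would then plod through the guesses. Guessing only finds files whose names happen to fit the pattern, which is exactly why a set collected this way ends up incomplete.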

As more and more podcasts, videoblogs, digital photos, independent music and other large media files hit the limits of storage, I predict we are going to see a huge shakeout of old stuff. In turn, the culture will subtly change, and people will take to pre-emptively grabbing stuff “just in case” rather than relying on it being on the net if they need it. Unfortunately this will just move the burden from storage to bandwidth, increasing costs for everyone.

So, if you can, please, please commit to keeping all your old material available. And provide RSS feeds listing the old stuff, so it can be grabbed by regular RSS media-catcher software.
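As a rough illustration of how little work an archive feed needs, here is a minimal Python sketch that builds an RSS 2.0 document with one enclosure per old episode, using only the standard library. The function name, the episode dictionary keys, and the default MIME type are assumptions made for the example, not part of any particular podcasting tool.

```python
import xml.etree.ElementTree as ET


def build_archive_feed(title, link, description, episodes):
    """Return an RSS 2.0 document (as a string) listing every episode as an enclosure."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description

    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "pubDate").text = ep["pub_date"]
        ET.SubElement(item, "guid").text = ep["url"]
        # RSS 2.0 enclosures carry three attributes: url, length (in bytes) and type.
        ET.SubElement(
            item,
            "enclosure",
            url=ep["url"],
            length=str(ep["size_bytes"]),
            type=ep.get("mime", "audio/mpeg"),
        )

    return ET.tostring(rss, encoding="unicode")
```

Feeding this the full list of old shows and publishing the result alongside the regular “what’s new” feed would let ordinary media-catcher software fetch the back catalog without any page scraping.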

podscope - We’re listening. You’re searching

Filed under: — Frank @ 10:27 am

Another intriguing idea. A speech-recognizing search engine that “listens” to podcasts and indexes the words. The site is full of bullish claims, but I think I’ll wait until I see it in action before jumping on the bandwagon.

Podscope is the Internet’s first spoken-word search engine for audio and video podcasts.

Theoretically, parsing words from something like a podcast should be a better deal than real-time speech recognition. The software can take as long as it likes (within reason) trying different approaches to get a reasonable result. What worries me, though, is the diverse nature of podcasters and podcast content.

All real-time speech-recognition systems that I’m aware of require some sort of “training” to get a grip on how the speaker uses even well-known words. Attempting to process an unknown podcast which may be in any language and any accent, may be a mixture of voices, and may have background music or chunks of non-spoken content seems a tall order.

My guess is that they will initially just “cherry pick” words that they are pretty sure about, and simply not index the rest. The trouble is that this is often the opposite of what’s needed when providing a searchable index. When searching you quickly learn that searching for rarer, more-specific words provides a better result; but these are just the kind of words that an automatic parser will lack the context to recognize.
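To make the worry concrete, here is a small Python sketch of what “cherry picking” could look like: an index that only keeps words the recognizer scored above some confidence threshold. This is purely my guess at the general shape, not a description of how Podscope works; the function name, the threshold, and the confidence scores are all made up for illustration.

```python
from collections import defaultdict


def build_index(transcripts, min_confidence=0.8):
    """Map each confidently recognized word to the set of episodes containing it.

    `transcripts` maps an episode id to a list of (word, confidence) pairs,
    where confidence is a hypothetical recognizer score between 0 and 1.
    """
    index = defaultdict(set)
    for episode, words in transcripts.items():
        for word, confidence in words:
            if confidence >= min_confidence:
                index[word.lower()].add(episode)
            # Anything below the threshold is silently dropped from the index.
    return dict(index)


# Made-up scores: common words come through clearly, unusual names do not.
transcripts = {
    "ep1": [("the", 0.99), ("podcast", 0.90), ("Zampzon", 0.35)],
    "ep2": [("the", 0.97), ("RocketBoom", 0.40)],
}
index = build_index(transcripts)
```

With these invented numbers, “the” is indexed for both episodes while “Zampzon” and “RocketBoom” never make it in, so the index would end up strongest on exactly the words nobody bothers to search for.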

Maybe they’ll get smart and support a wiki-style mass-participation system to allow anyone to correct words and feed back into teaching the system about hot ideas and specific podcasting styles.

Read more at: podscope - We’re listening. You’re searching

Podshows.com

Filed under: — Frank @ 10:09 am

I’ve seen this all over the blogs, but I had to mention it. A bunch of ex-names from BBC radio are making shows available in a paid podcast format. The deal seems to be that they put together a “radio” show lasting an hour or so, play some music (but only 60% of each track), do the DJ talk thing, and generally massage the nostalgia of people who used to listen to radio back when it seemed to matter.

You pay roughly the same as an iTunes song, but get an hour of part-songs and relatively mindless blithering. Seems inoffensive enough, but I can’t say that I’ll be rushing to buy any.

Read more at Podshows.com

mrbrown: mrbrown’s podcast workstation

Filed under: — Frank @ 10:00 am

As I browse the net, I’m looking to collect details of the hardware and software that people use to create their ‘casts. Here’s the setup used by “mrbrown”:

This is where all it happens. My little “studio” for the mrbrown show.

Read more at: mrbrown: mrbrown’s podcast workstation

This site is licensed under a Creative Commons License
