Refined RSS feeds

posted Aug 6 @ 11:29 PM by Jason Kottke

I took a few minutes recently to make sure the RSS feeds for kottke.org are correct, validate, and such. In addition to a problem that Brent noted in my remaindered links feed, I’ve received several emails lately about my feeds not working in some RSS readers and aggregators. The MT RSS templates I was using predated the encode_xml attribute, so there was most of my problem right there (well, along with a lack of understanding of the intricacies of RSS on my part). So, with help from the RSS 1.0 spec, the RSS 2.0 spec, the default MT templates, and Brad Choate’s Non-Funky MT RSS 2 Template, here are the RSS files for kottke.org for your enjoyment, approval, perusal, applause, and jeers:

RSS 1.0 with short excerpts
RSS 1.0 with medium-length excerpts
RSS 1.0 with full posts
RSS 2.0 with medium-length excerpts
RSS 1.0 feed of kottke.org remaindered links

As of right this minute, all of these feeds validate. When I get the chance to play around with a newsreader (I don’t use them), I’ll tinker a bit more to improve feed usability.

In the process of repairing my RSS files, I’ve gained a new understanding of the ongoing battle over RSS that has been going on since forever (it’s the web version of the Hundred Years War). I’d never really looked at the RSS code or read the specs before, and I was struck by how much more human readable an RSS 2.0 file was compared to an RSS 1.0 file. Reading and then writing my RSS files, I “got” the 2.0 file right away, but I still don’t really understand what the 1.0 file is all about…it felt a little kludgy and inelegant. I also got the sense that the 1.0 format is more usefully complex (powerful?) by default than the 2.0 format.

Ostensibly, RSS files are meant to be written by machines to be read by machines (robot to robot) so human readability shouldn’t matter. But looking at the bigger picture, human readability of something like RSS can be important in developing new RSS-related memes (I’m using meme here in the traditional sense, not in the “it’s on Daypop Top 40” sense that seems to have taken over the blogger mindspace). If hardcore developers of RSS readers and authoring tools are the only ones technically savvy enough to understand RSS files, the pool of potential memes is limited by the size and narrow focus (not to mention, for the most part, gender) of that group. But if the format is fairly human readable (more like HTML 3.2 markup than, say, Perl code), you’re going to get more people from different backgrounds hacking away at it, coming up with new memes that could be useful in the long run.** From that standpoint, RSS 2.0 has the advantage over the more powerful but less readable RSS 1.0.

And just to keep completely rambling on and on, the above is what Dave was getting at with his comment about XML as a literary space (I think). Ben’s rebuttal about namespaces and literature is an important point as well, but namespaces could be seen as locking things down too tightly: poetry only for poets, writing only for writers, code only for coders, etc. And if Shakespeare had been constrained by namespaces, how could he have made all those wonderfully bawdy puns? (Are we talking rooster or phallus here? Both! Take that, semantic web!)

** For an example of what I’m talking about, look no further than HTML. Like RSS, HTML started out as a fairly simple robot to robot markup language…there was no reason to be able to “view source” or edit the code by hand. It should have worked like:

human -> interface -> HTML markup -> browser -> human

with users never needing to see the HTML at all. Instead, browsers exposed the markup. Because it was fairly human readable, all sorts of people learned HTML by viewing source, those people hacked away at the code by hand, readers became writers, it got messy, and the web *exploded* in almost every way conceivable.

Reader comments

dowingba · Aug 06, 2003 at 11:50PM

Yeah, I learned HTML by "viewing source", and my site is pretty much the opposite of "valid". But I have Mozilla and IE, and I test it on both frequently, so at least it's a bit compatible.

Zach · Aug 07, 2003 at 12:28AM

What exactly is the purpose of an RSS feed. I'm so lost.

Anil · Aug 07, 2003 at 12:41AM

Everybody likes the view source parallel between HTML and RSS, but it's not an appropriate comparison. At the simplest level, I know of two sites that manually create RSS feeds. Two. And they're both personal weblogs by people who are picky about hand-coding.

This reality is not due to the complexity of the format. For all the feeds people like to trumpet, whether it be the New York Times, or Amazon, or About.com, exactly zero are hand-made. The visual complexity of an XML format is as arbitrary a manner of judging a format's utility as it is to complain that JPEG images are hard to edit in a text editor. Consider syndication to be a binary format which happens to be human-readable. That's what it is.

The fact that you see the feeds at all is an artifact. MT shouldn't require you to see template code, no site should have anything as ridiculous as a bright orange button that reveals gibberish when a user clicks on it. The only people who read syndication feeds, regardless of format, are people using tools that consume them. You can't create an image by viewing a JPEG in a hex editor. Photoshop doesn't have a "view source" command. And that's okay.

More to the point, syndication is a great thing. But its specifics, its formats, those are arbitrary and best left as hidden details. I could make a full spec for the Microsoft Word format and offer it in the public domain, but that wouldn't make the format open and it wouldn't mean that people would start using Word more often because they could manually turn characters bold. I care about formats my mom can use to express herself, and if I insist on having her manually wrap her words in <tags>, she'll just keep on calling me on the phone, because that's not how humans communicate.

Namespaces and other extensions give us the ability to say *more* about our words, to give us more context. That's the *opposite* of locking people down. And they're optional, so ambiguity can be preserved in content while not being allowed in presentation.

Having only one place to put my text in a syndication format, and then having it be undecided whether it's text or HTML or some gibberish combination of both, and escaped or unescaped... how does that make it View Source friendly? I don't see that.

The problem is, people have (as we always do) confused a technology for its applications. The magic isn't RSS, the magic is syndication. The magic isn't people manually creating feeds, it's consistently being able to pull information into one place. With email, we get the benefits of aggregating information from many sources into one place. We also see the dangers of loosely defining and not checking data, as it becomes trivial to fake or corrupt any part and make things like spam.

The way we make syndication fail, the way we limit it to boys who love tech, is by focusing on its technology. Writers don't want to edit markup, and we've seen that choice over and over. Choosing a format because it can be manually edited is the way to ghettoize syndication forever. Call up the ghosts of WordPerfect and see if people choose Reveal Codes or Word's WYSIWYG.

Instant messaging has no way to view source on its messages, and people are still somehow able to communicate and make it work on phones, PCs, PDAs. Do you know any normal person who views headers on all their email? Who telnets in to port 25? No? And yet, somehow, email has succeeded. Even with non-geeks. Astounding. Cars are more reliable now that they're more of a sealed box that people can't tinker with. I don't have to manually set an IRQ to plug in my mouse anymore. Would USB have been better replaced by a series of DIP switches on the front of my machine that gave me easier access to system interrupts?

People are getting this wrong because we're all geeks. A format that's expressive, even at the expense of being opaque to non-geeks, wins. Focusing on the technology and forcing end users to learn a family of XML dialects? That's not what's going to let people express themselves.

(Pre-emptive note to anybody who disagrees: If you believe in human-readability of your markup and in the power of XML, and your website isn't valid XHTML, you're contradicting yourself.)

Jonathan Bruder · Aug 07, 2003 at 4:02AM

Frankly, it seems that both sides are missing the boat. It's true, Grandma will never want to hack around in the RSS 2,385.2 spec. Also true: If one happy Trott or the other hadn't started hacking around, there would be no Six Apart. We can fairly assume that some threshold of complexity existed that, had it been crossed by the related technologies, would have prevented kottke.org and moveable type and slashdot and ama-whatsit.fuk.

To remain forward-thinking on this issue, we must acknowledge increasing convergence between the way machines and humans communicate. I know that I communicate *every day* by wrapping my words in tags, and not to be snarky, Anil, but you do too. The smart thing that you and I do is keep the wrappers and build machines to refill them. It might not be that long until the semantic web *gets* the zen-like essence of the two-fold cock; or until we have rss aggregators implanted in our earlobes. Heaven knows I'm not the only one that's written code on the bus, and to me, that's close enough to tagged. Human intelligence and machine intelligence have been converging since the first thrown rock.

Ultimately, there will be either a compromise or a schism between the two camps; let us hope that 1) there is a compromise. Also, there are two types of competing compromise: a compromise of least sacrifice, and a compromise of greatest efficiency. These are not necessarily mutually exclusive, but be they or not, let us put our priorities in the right place and hope for 2) a compromise of greatest efficiency.

Looking for 1) a compromise and 2) that that compromise be the compromise of greatest efficiency, several tasks must be accomplished.

Some seriously accepting brainstorming needs to take place. This is already happening, but it needs to be happening faster, like "finish reading this becursed comment and solve the rss problem now" speed. Past ridiculous speed to ludicrous speed. The brainstorming must focus on the only certain fact: RSS feeds must be read by BOTH HUMANS AND MACHINES, whatever the reason. After all, something like a well-devised post-RSS spec could both bring semantic machine intelligence to par with Willie "the Shake" and bring consumer-level programming intelligence beyond "set the clock."

A Plan:
1) Enlist a group of linguists unassociated with either side of the debate to develop the spec.
2) Propose a syntax based on plain language, and aggregators and readers empowered with adaptive semantic learning software.
3) Fund by normal means.

KO · Aug 07, 2003 at 4:23AM

I suppose the more readable RSS 2.0 spec is also easier to parse for both man and machine?

Anil: When you open an HTML page, you do see a whole lot of tag soup; it's just that the browser renders the page. So maybe the next version of Mozilla needs to have an RSS parser built in by default, or hand it off to a relevant app. One of the reasons for using XML is that it's easier to make sense of XML files than binary ones, so it makes sense to try to keep the encoding as simple as possible. You give the Microsoft Word format as an example of how we don't need to concern ourselves with the internal structure/encoding of a format. Well, that could just as easily be given as a bad example, as no one beyond Microsoft can deal with Word files properly. It's always going to be better to have an open, easy to use/read format, even for a simple thing like RSS, than to have some mess of code only understandable by a program.

You seem to be missing one thing: it is predominantly the *geek* sites/people who are focusing on the tech behind RSS. Once they have it all ironed out no one beyond a few die hards is going to bother. I am sure whatever the RSS spec is, in the end it's the tools like MT which will be implementing it, not end users. The only reason I have an RSS feed is because MT makes it by default, and I'm not going to change it unless Tom posts the MT templates he came up with for the RSS feeds, or a new version of MT comes out.

brian w · Aug 07, 2003 at 8:38AM

So, Mr. Kottke, why don't you use a newsreader? I'd love to hear your reasons for holding out.

jkottke · Aug 07, 2003 at 9:49AM

Anil, I can't tell whether you're agreeing or disagreeing with me. From the length of the reply, I'm assuming it's a rebuttal.

For all the feeds people like to trumpet, whether it be the New York Times, or Amazon, or About.com, exactly zero are hand-made.

In a way. But my point is that someone at the NYTimes is hand-coding the templates that their server is using to output RSS...the process is more manual than you assert. Which is, as you say, insane because we should all be using RSS editors to produce valid RSS files instead of mucking directly with the code.

But that's not the way it is. And all I'm saying is maybe that's not a bad thing and that there are advantages to RSS being human readable. As you said, syndication, not RSS, is the point, but maybe syndication takes off if it's human readable and maybe it doesn't take off if it's not.

(Pre-emptive note to anybody who disagrees: If you believe in human-readability of your markup and in the power of XML, and your website isn't valid XHTML, you're contradicting yourself.)

Or just lazy.

megnut · Aug 07, 2003 at 10:20AM

What exactly is the purpose of an RSS feed. I'm so lost.

The purpose of an RSS feed is to allow other programs to syndicate your website's content. Just like the NY Times put its articles out on the wire (via AP or Reuters, etc.) for other newspapers to print, RSS allows other programs (such as news readers) and web sites to display your content. It's simply a format for syndication.
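To make the "other programs" part concrete, here is a minimal sketch in Python of a program writing a tiny RSS 2.0 file and another reading the item titles back out, which is roughly the first step of what a news reader does. The function name `build_rss2`, the example.com URLs, and the sample text are made up for illustration; only the element names (rss, channel, item, title, link, description) come from the RSS 2.0 format.

```python
import xml.etree.ElementTree as ET


def build_rss2(title, link, description, items):
    """Build a bare-bones RSS 2.0 document and return it as a string.

    `items` is a list of dicts with "title", "link" and "description" keys.
    This is an illustrative helper, not part of any weblog tool.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    for tag, value in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = value
    for item in items:
        node = ET.SubElement(channel, "item")
        for tag in ("title", "link", "description"):
            # ElementTree escapes &, < and > in text when serializing.
            ET.SubElement(node, tag).text = item[tag]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical sample data.
feed_xml = build_rss2(
    "Example weblog",
    "http://example.com/",
    "A made-up site used for illustration",
    [
        {
            "title": "First post",
            "link": "http://example.com/1",
            "description": "Tips & <b>tricks</b>",
        }
    ],
)

# The consuming side: parse the feed and pull out each item's title.
parsed = ET.fromstring(feed_xml)
titles = [item.findtext("title") for item in parsed.iter("item")]
print(titles)
```

The publishing side and the reading side never have to know about each other; they only have to agree on the element names.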

Ben · Aug 07, 2003 at 10:21AM

If you believe in human-readability of your markup and in the power of XML, and your website isn't valid XHTML, you're contradicting yourself.

That's like saying if you're a usability expert and you have a terribly unusable site, you're... oh, wait.

Nick · Aug 07, 2003 at 10:28AM

Anil, I love your point about the absurdity of brightly colored "XML" buttons on Web pages that reveal a confusing tag soup upon being clicked. My only problem is that I can't figure out a better way to distribute and promote RSS feeds. Sure, the link tag helps autodiscovery to some extent, but as for methods other than that and the (admittedly confusing) orange buttons, I'm at a loss.

jkottke · Aug 07, 2003 at 10:39AM

So, Mr. Kottke, why don't you use a newsreader? I'd love to hear your reasons for holding out.

Newsreaders are a fast, efficient way to read web sites. With a few improvements, people with RSS readers could very well be regularly reading 10,000 weblogs (well, getting the gist of them anyway). I don't use one because I miss the context and serendipity I get from reading web sites.

Take a look at Mena's TypePad site. I like getting a dose of Mena's personality (oranges are fun!) while reading what she has to say. The way a piece of writing is presented (roughly the "design" of it) is sometimes as important as what it has to say. And hey, a new photo album! Oh, and she's listening to that CD...I should pop over to Amazon and see if that band's new album is out. You don't get that context and swerve using newsreaders. A rough analogy might be getting a soda from a machine vs. at the friendly neighborhood corner store. The soda machine might be faster and less hassle, but you'll miss out on conversation with the shopkeep and an impulse cookie purchase...it all depends on personal preference and what you're after.

Steven Garrity · Aug 07, 2003 at 12:39PM

I'm with kottke when it comes to reading weblogs in context (and Mena's site is a great example - is it possible to have a crush on a website?).

That said, I use an RSS aggregator and love it. I use it more to look for updates on the sites I read, rather than reading the full entries in the RSS reader.

Dave Winer · Aug 07, 2003 at 12:56PM

Jason, congrats -- I agree with (almost) everything you said.

To Anil, it very much matters that the format be transparently simple, even if no user ever reads it. That's the protection developers need that the format is implementable.

Compare SOAP to XML-RPC, for example. You could implement all of XML-RPC in a week, some people have done it in a day.

You could never finish a SOAP implementation. The spec is too open-ended, offers too many options. If a group of vendors decide to interop among themselves and only themselves, they can easily do it with SOAP and no one can say they're not conforming to the spec. This is enough to confuse most people, developers included, and the result is interop among a select group of products. Usually it means no interop.

If everyone did straight Kottke-compatible RSS, there would be no chance for monkey business. I've already seen some discussions where people have said "That won't work with xxx" and people sort of shrug it off as "too bad." Well RSS works with everything. It can't not work with something. It's too transparent.

Now if MT gets bought by IBM maybe you want to play the game that way (I kind of hope not) and maybe that's what Google is doing (a side-deal with IBM maybe, lots of behind the scenes meetings, usually a bad sign). But I hope even if you cash out and get rich that you still want this market to have lots of little guys doing cool stuff. In order for that to happen, imho, it's got to be simple.

Dave Winer · Aug 07, 2003 at 1:20PM

It sounds like I was saying Jason is to be congratulated because I agree with him. Not so. He is to be congratulated for having the guts to think for himself, and write about a subject that he isn't an expert in. Predictably, the experts swoop in and tell him not to worry about it, they'll take care of it. Imho this is why you can't trust developers. You should insist on understanding it. If they say it would take too long to explain it to you, tell them you have the time. If they still say it would take too long, tell them to simplify it so it doesn't take so long. The only way we escape the pit that HTML browsers fell into is if users insist on not being bullshitted by vendors.

xian · Aug 07, 2003 at 1:22PM

Jason, curious about why you settled on RSS 1.0 for most of your feeds (given your analysis), and also how you generate two different length excerpts.

Tim Bray · Aug 07, 2003 at 1:32PM

I don't think it's reasonable that Jason should have to make the effort to understand the differences between half a dozen formats and generate multiple feeds.

Dave Winer · Aug 07, 2003 at 1:40PM

Tim, I totally agree with that.

That's the shame of RSS.

I'd love to figure out a way to simplify it for him, but it's not in my power to do so.

Honestly, the main power is with Rael Dornfest. Just renaming the RDF format something other than RSS would help sort it out. It wouldn't mean any less support for the format, ironically it would probably mean more support for it, because when confusion is lifted, more people move.

Why should Jason have to make such a political decision? How embarrassing for our industry that we expose such confusion to a person who has decided to adopt and deploy our technology.

When I work with a publisher to get their RSS feed together, if they publish an email address in their feed, they get flamed by people who advocate the RDF format. I have to warn them about this in advance, or advise them to leave out the email address.

It's not an acceptable situation. But there's nothing I can do about it other than plead the case.

thewind · Aug 07, 2003 at 1:42PM

True, people can use IM without knowing the markup, but how many developers will integrate IM into their software? I would have no idea because it is closed. If it had something like RSS then we could all think we could do it, which means we would be thinking about lots of cool new projects.

Danny · Aug 07, 2003 at 1:42PM

I agree with Anil, particularly in re. view source :

http://dannyayers.com/2003/musings/view-source.htm

All this talk of the simplicity of RSS 2.0 only makes sense if you only do simple news-style publishing/reading with it. People now expect more - the jobs feed is a good example. When you try adding extensions to RSS 2.0 the complexity rockets, because it has no consistent extension mechanism.

Basically "the protection developers need that the format is implementable" - doesn't work for RSS 2.0 + extensions, unless you implement all the extensions in all possible combinations. Which is a worse case than the one suggested for SOAP.

RSS 1.0 has a formal base and uses a framework, so the complexity of extensions is avoided. All being well Pie/Echo/Atom will include a similar mechanism, whilst avoiding the syntactical ugliness of RDF/XML.

Re. two cocks - the semantic web can already tell the difference, see WordNet (the data is available in RDF) :

http://www.cogsci.princeton.edu/cgi-bin/webwn1.7.1?stage=1&word=cock

"...if MT gets bought by IBM..." - the old Fear, Uncertainty and Doubt trick is getting a bit tired Dave.

Dave Winer · Aug 07, 2003 at 1:43PM

Tim, it's not a half a dozen formats, there are two formats, that's bad enough, but no need to make it sound worse than it is. I wrote this up in my Political FAQ.

Kjell · Aug 07, 2003 at 1:54PM

Of course it's not reasonable (to make the effort to understand a half-dozen formats). So if everyone will just get together and agree on a format, we'll all be better off. ;-)

(I mean, there's some really nice stuff in the latest Echo snapshot, but really, very little that couldn't be accomplished with an RSS 2.5. But let's not get off on that tangent here)

And incidentally, to go with one of the previous comments, I'm in the extreme minority that hand-codes his RSS, mainly because I haven't bothered to script the RSS->XHTML conversion yet. However, human-readability is important for me even for the scripting, because if I have to write the code to output the RSS, I need to understand the markup, tag hierarchy, and so forth. RDF is NOT built for human (programmer) consumption, period. It's a powerful format, sure, but it makes me feel icky whenever I am forced to look at it.

Karl Dubost · Aug 07, 2003 at 2:01PM

Jason, do you think my feed is complex? RSS 1.0 only.

I do not write my feed by hand. I'm writing my pages in XHTML 1.0 and I have created an XSLT which generates my feed. The problem is not so much about readability in the sense of the name of tags :) but more in the way you organize it in your file.

For nick and the button: Autodiscovery is the answer, if you put the right link tag in your header.
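As a rough sketch of what autodiscovery looks like from the reader's side, the Python below scans a page's markup for link tags that advertise a feed. The class name `FeedLinkFinder`, the sample page, and the example.com URL are invented for illustration, and the set of MIME types checked is an assumption about common practice, not a rule taken from this thread.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collect (title, href) pairs from <link rel="alternate"> feed tags."""

    # Assumed set of feed MIME types; adjust to taste.
    FEED_TYPES = {"application/rss+xml", "application/rdf+xml"}

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        mime_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel_values and mime_type in self.FEED_TYPES and href:
            self.feeds.append((attributes.get("title"), href))


def find_feeds(html_text):
    """Return every advertised feed found in the given HTML text."""
    finder = FeedLinkFinder()
    finder.feed(html_text)
    return finder.feeds


# Hypothetical page header with one stylesheet link and one feed link.
page = """<html><head>
<title>Example weblog</title>
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS" href="http://example.com/index.xml">
</head><body></body></html>"""

print(find_feeds(page))
```

With something like this in the reading tool, the visitor only has to hand over the site's address; the orange button becomes optional.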

Dave Winer: The SOAP and XML-RPC comparison is not a good one, I guess. I'm not able to create an implementation of SOAP nor XML-RPC, but I can design an RSS 1.0 feed by hand; it took me one morning to do it. For the RSS 2.0 spec, I guess I will be able to understand it too.

Jason: I remember that one day I tried to transform an RSS 1.0 feed into n3 to explain it to someone. I don't know if it can help you.

http://www.la-grange.net/2003/05/05-feed.n3
http://www.la-grange.net/2003/05/06#n3-rdf

The advantage for ***my own use*** is the sharing of information and combination of different source files very easily. The power of RDF is that you have the right to be messy in your updates and for me it's essential.

xian · Aug 07, 2003 at 2:07PM

Danny, if the cock thing is a pun, then you don't want to distinguish between the two meanings, you want to reflect on both meanings simultaneously, which requires a sort of fuzzy logic-type cognitive dissonance, doesn't it? How does RDF handle puns? And doesn't explaining a joke famously spoil it?

jkottke · Aug 07, 2003 at 2:24PM

Jason, curious about why you settled on RSS 1.0 for most of your feeds (given your analysis)

When I first did my feeds with MT, the RSS options were 0.91 and 1.0. Figuring that 1.0 was better than 0.91 based purely on math (1.0 > 0.91), I went with 1.0. I just implemented the RSS 2.0 feed and will probably be adding more options with that flavor.

and also how you generate two different length excerpts.

Here's the MT code:

Full post:
<$MTEntryBody encode_xml="1" remove_html="1"$>

Medium length excerpt:
<$MTEntryExcerpt encode_xml="1"$>

Short length excerpt:
<$MTEntryBody remove_html="1" trim_to="100"$>

The reason I omitted the encode_xml attribute in the short excerpt file is that MT trims the entry down to 100 characters after it encodes it for XML, meaning that if in the process of encoding for XML MT puts opening and closing CDATA tags around the text, the closing CDATA tag is chopped off if it's not in the first 100 chars. Because of the omission, I'm pretty sure the short excerpt file will fail on some newsreaders eventually (I have no idea under what circumstances) even though it currently validates.
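The ordering problem described above can be sketched in a few lines of Python. This is not MT's code: `wrap_cdata` is a hypothetical stand-in for an encoder that wraps text in a CDATA section, and the 100-character limit simply mirrors the trim_to value in the template. The point is only that cutting after encoding can slice off the closing marker, while cutting the raw text first cannot.

```python
def wrap_cdata(text):
    """Hypothetical encoder: wrap the text in a CDATA section."""
    return "<![CDATA[" + text + "]]>"


def encode_then_trim(text, limit):
    # The order described above: encode first, then cut to `limit` characters.
    return wrap_cdata(text)[:limit]


def trim_then_encode(text, limit):
    # The alternative order: cut the raw text first, then encode what is left.
    return wrap_cdata(text[:limit])


post = "x" * 150  # made-up entry body, longer than the limit

broken = encode_then_trim(post, 100)
safe = trim_then_encode(post, 100)

print(broken.endswith("]]>"))  # False: the closing marker was cut off
print(safe.endswith("]]>"))    # True: the section is closed properly
```

The second ordering costs a few extra characters in the output (the markers are not counted against the limit), which seems a fair price for a feed that stays well-formed.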

Rael Dornfest · Aug 07, 2003 at 2:54PM

Howdy,

First and foremost, Jason, I feel your pain -- both with respect to my own sites and tools and some of the history we've suffered over the past 3 years.

Second, I feel I should respond to Dave's mention of my position:

Honestly, the main power is with Rael Dornfest. Just renaming the RDF format something other than RSS would help sort it out.

I am the (rather inactive, for various reasons) chair of the RSS 1.0 WG (RSS-DEV). I do not have any more of a vote than the rest of the WG members. We've talked about a name change a number of times, taking polls and so forth. If memory serves, on each and every one of these I've voted _for_ a name change, effected in one form or another. The overall vote, however, was not in favour.

I've since posted an extensive essay on the topic of RDF (or my issues with it), RSS 0.91x-style with namespaces (i.e. my original rssx proposal, various suggestions by Sam, Mark, et al for RSS 2.0, and the final landing spot, Dave's reformulation thereof).

I'm not fond of some of the shenanigans that have brought us to 2.0. I am fond of the lion's share of 2.0 -- it's all I was after in the first place.

So, beyond all this, and the various times I've attempted to build bridges (again, see my extensive post mentioned above), what is it in my power to do?

And I ask that honestly.

Rael Dornfest · Aug 07, 2003 at 2:55PM

P.s. I read the fine-print about not using p and br whilst checking out my post posted as it should have. Must wear glasses!

Matt Haughey · Aug 07, 2003 at 3:04PM

I've been thinking about the code readability thing a lot lately, I've had to hand-code RSS templates for MetaFilter, Blogroots, and my old personal site because I didn't have any slick weblog software that did it for me. It wasn't too hard to download a feed and basically copy everything over. While I've only done 0.91 feeds, the 1.0 feeds do look more complex, but not so much that it would be impossible to create 1.0 feeds as well. I would guess it's another few minutes of work reverse-engineering what database information goes where, and preparing the timestamp formats.
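On the timestamp point: the 0.91 and 2.0 flavors use RFC 822 style dates, while RSS 1.0 feeds typically carry an ISO 8601 style date in dc:date. A minimal Python sketch of producing both from one stored value follows; the particular date and the UTC time zone are assumptions made for the example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumed sample value: a post time as it might come out of a database, in UTC.
posted = datetime(2003, 8, 6, 23, 29, 0, tzinfo=timezone.utc)

# RFC 822 style date, the kind used for dates in RSS 0.91 and 2.0.
rfc822_date = format_datetime(posted)

# ISO 8601 style date, the kind commonly placed in dc:date in RSS 1.0 feeds.
iso_date = posted.isoformat()

print(rfc822_date)  # Wed, 06 Aug 2003 23:29:00 +0000
print(iso_date)     # 2003-08-06T23:29:00+00:00
```

Once the two format strings are settled, the rest of the template really is a matter of deciding which database field goes in which element.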

When it comes to the question of whether or not to make readability a design goal or requirement for a syndication format, I'm not sure I support that. HTML is simple yes, but it is also powerful, and its simplicity isn't affected by the numerous extensions to HTML. Editing raw HTML was also the only way to author documents online for many years.

When looking at syndication formats, things are a bit different. While readability is great the one time I as a developer need to create a feed, the vast majority of people using and creating syndication don't have to do that. The audiences are clearly different, with HTML being something anyone and everyone can/should write, and syndication formats being something a small subset of developers have to deal with, while everyone else just uses their automatic tools. Also, I don't see the current crop of syndication formats having the same extendible power found in HTML. With HTML I can write a three line HTML page that says Hello World, and I can also (by using javascript) code the entire game environment of Wolfenstein in under 5k of code. Now the question is, as an HTML writer, I can understand the Hello World example, but can I make heads or tails out of the Wolfenstein one? Should I be able to? (Because I can't understand a single line of it after "script".)

Then the question becomes, are the syndication formats flexible enough to offer both human readability and power uses? Because if the capabilities of a language or spec are diminished due to the readability requirement, then it seems like the trade off just isn't worth it. Syndication formats were indeed designed for robots to talk to other robots in robot talk, and while it'd be nice if developers could understand them and reverse-engineer them by hand, it's not worth it if the vocabulary is limited due to this constraint.

I wouldn't want to be in the situation someday where my bank can't offer me a syndication feed of my checking account activity because the technology didn't allow encryption or security key exchanges in lieu of making the format readable by humans.

jkottke · Aug 07, 2003 at 3:06PM

Rael, I fixed the spacing in your comment.

nick · Aug 07, 2003 at 3:06PM

anil's post is the type of close-minded mentality that bothers me with tech-heads. i get real tired of people trying to suggest that we must do something this way or that.

from what i can tell, all kottke was saying was: "wouldn't it be cool if the markup was understandable to a human." there's nothing wrong with that. that doesn't hinder anyone's ability to use it the way it is right now. all it does is open another door so another handful of people can understand how it works. and that's really beneficial.

to start getting into things like, "If you believe in human-readability of your markup and in the power of XML, and your website isn't valid XHTML, you're contradicting yourself," is just being argumentative. i understand anil works for a company whose goal (like many others) is to make a type of technology easy to approach and use for many people. that is a great thing - for some people. some people like to tinker, and that's okay, too.

Danny · Aug 07, 2003 at 3:08PM

xian - wasn't me dissected the pun!

Heh, pun detection might make quite a good Semantic Web challenge - same word, two different meanings used simultaneously...hmm..not easy!

There's quite a bit of disinformation creeping into this thread -

RSS 0.90 != RSS 0.91 != RSS 1.0 != RSS 2.0 etc. The version tag is a big hint.

Rael Dornfest doesn't have the power to change the name of RSS 1.0, the standard is maintained by a community group. On the other hand Dave could have single-handedly prevented any confusion by making RSS 2.0 a unified format.

Anyhow, as Kjell suggests, we should have a unified format (and API) in the near future anyway.

I think it extremely unlikely that any advocate of RDF would get an email address out of a feed to flame the publisher. Perhaps we could see some evidence?

Dave Winer · Aug 07, 2003 at 3:40PM

Rael, if you're serious about asking what you can do, you could go a bit further, and write an essay and publish it on O'Reilly alongside the ones that Edd has been posting lately that say RSS 1.0 is the way to go. O'Reilly still seems to want to have its way in this space, or maybe it's just Edd, but do something to balance the confusion coming from that corner. As my colleague John Palfrey said today, think about all the things we have to do, why try to stop something good from happening. Your use of the word shenanigan here (I think) to describe my work is an example. Why do you say things like that Rael? What's the point?

Greg GershmanAug 07, 2003 at 3:44PM

Most people don't create HTML by hand (anymore). They use a tool. RSS is usually created with a tool as well.

As the tools grow, we have the ability to make the Web more understandable and easier to navigate by making the data better, in as painless a way to the user as possible. Why should we compromise on the power of RSS 1.0 simply because RSS 2.0 is easier to read?

A nice example, and this isn't talked about much, is that RSS 1.0 allows you to represent an individual post independent of a channel. This is very powerful, yet there is no way to do this in 2.0 (that I know of). RSS is a syndication application, while RDF/RSS 1.0 is more of a technology. When all is said and done, we want the best technology, not an application, as the foundation of the Web.

jkottkeAug 07, 2003 at 4:01PM

Dave said: Your use of the word shenanigan here (I think) to describe my work is an example. Why do you say things like that Rael? What's the point?

I don't want to speak for Rael, but I think he was describing everyone's actions re: the development of RSS, not yours exclusively.

Also, Rael & Dave, you're starting to drift into a conversation that might be best conducted via email and not in this thread. Thanks.

Rael DornfestAug 07, 2003 at 4:09PM

Thanks, Jason, for clearing that up. And you're absolutely right about us taking this offline. I had, of course, to respond to Dave's initial comment (and did it in as peaceable manner as possible), but there's no need to continue littering your lovely site.

I now return to my RSS-less quietude and you to your thread interrupted.

HorstAug 07, 2003 at 5:41PM

Um, sorry to interrupt your discussion, but I've been following it for a while now, and not being a tech geek, my question as nothing but a very 'umble owner of a weblog is really which spec I should be using. Are there any consequences if I prefer RSS 2.0 over 1.0 or vice versa? At the moment I'm offering both formats, but I really feel I'm unnecessarily confusing my readers by offering them two RSS formats without knowing myself just why I'm doing this. Help, anyone?

GregAug 07, 2003 at 5:57PM

Let's define which humans we're really talking about when we say "human readable." Most humans don't want to read an RSS file, nor do they want to view HTML source. Only a very small subset of humans (which I will call "geeks" for lack of a better term) are concerned with the underlying formats.

You think the web took off fast because of "view source"? Bah! Think of what would have happened with web publishing in 1995 if there had been an effective way of publishing web sites without having to reverse engineer HTML from "view source" and hand-code your own pages. The web didn't succeed because of "view source"; the web succeeded in spite of "view source."

The reliance on people who would reverse engineer HTML source and the concurrent lack of effective personal publishing tools held back the web. Weblog tools revolutionize personal web publishing because they overcome the "view source barrier;" they allow your typical, non-geek human to publish to the web simply and effectively without ever having to view HTML source to do so. Why can't we expect the same sort of transparency from tools that produce and consume RSS feeds?

So when we speak of "human readable" HTML or RSS, we're actually talking about "geek readable" formats. And I'm really not concerned about geeks. Anyone who makes the effort to understand RSS certainly has the skills to understand, with a little more effort, RDF. I'm concerned about the users who want to be able to use the web effectively without having to open the hood.

I fear that the adherence to "human (aka geek) readable" as a threshold not to be bypassed puts us dangerously close to getting stuck in the same kind of human-hostile web development environment of 1995, where users who are already experts in carpentry or law or teaching or pastry-making are expected to "view source" to learn complex new skills to participate in web publishing.

Anil is right: the bright orange XML button is a hostile user interface. It implies that you already know what the acronym means and what to do with it; it communicates nothing to non-geek humans. If non-geeks learn to use it, they learned in spite of the interface, not because of it.

The world doesn't want "human readable" formats and orange XML branding; the world just wants functionality -- "syndicate my content" and "aggregate these other people's content." They don't care about the formats, and they shouldn't! They should care about dance and medicine and geology and pastries and all the other human specialties, and about communicating all that interesting stuff with each other via the web.

Geeks and developers should care about getting the technology out of their way so they can do it.

(FYI, this exact same debate took place on my weblog last week, albeit with far fewer heavy hitters involved.)

Ken MacLeodAug 07, 2003 at 6:37PM

I saw this mentioned a couple of times with no specific answer: Do weblog publishers need to provide both an RSS 0.9x/2.0 feed and an RSS 1.0 feed?

The answer is "no". A publisher can choose whichever format better suits their publishing needs. Clients will read either format.
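To illustrate why a client can read either format, here is a minimal sketch (not taken from any particular aggregator) that looks at the root element to decide which flavor it has been handed and then pulls out the item titles. The sample feeds and the function name `item_titles` are made up for the example; the two namespace URIs are the ones RSS 1.0 feeds declare.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS1_NS = "http://purl.org/rss/1.0/"

# Hypothetical sample feeds, trimmed to the bare minimum for illustration.
RSS1_SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.com/">
    <title>Example</title>
    <link>http://example.com/</link>
    <description>Sample channel</description>
  </channel>
  <item rdf:about="http://example.com/1">
    <title>Hello from 1.0</title>
    <link>http://example.com/1</link>
  </item>
</rdf:RDF>
"""

RSS2_SAMPLE = """
<rss version="2.0">
  <channel>
    <title>Example</title>
    <link>http://example.com/</link>
    <description>Sample channel</description>
    <item>
      <title>Hello from 2.0</title>
      <link>http://example.com/1</link>
    </item>
  </channel>
</rss>
"""


def item_titles(feed_xml):
    """Return (format_label, [item titles]) for an RSS 1.0 or 0.9x/2.0 feed."""
    root = ET.fromstring(feed_xml)
    if root.tag == f"{{{RDF_NS}}}RDF":
        # RSS 1.0: items are namespaced siblings of <channel> under rdf:RDF.
        items = root.findall(f"{{{RSS1_NS}}}item")
        titles = [i.findtext(f"{{{RSS1_NS}}}title", default="") for i in items]
        return "rss-1.0", titles
    if root.tag == "rss":
        # RSS 0.9x/2.0: items are un-namespaced children of <channel>.
        items = root.findall("channel/item")
        titles = [i.findtext("title", default="") for i in items]
        return "rss-" + root.get("version", "unknown"), titles
    raise ValueError("not a feed format this sketch recognizes")


print(item_titles(RSS1_SAMPLE))
print(item_titles(RSS2_SAMPLE))
```

The branch on the root element is the whole trick: once the client knows which family it is in, the rest is ordinary XML lookups.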

nickAug 07, 2003 at 6:52PM

you are dead wrong about 'the view source phenomenon.' that is precisely why people became fascinated with the web. it was a love for tinkering and getting under the hood at that point. self-publication has *become* a real draw for people but only marginally inspired people like myself to start creating web pages in 1995. it's the same fascination people had and still have with building their own PCs, cars, houses, etc.

specifically to the web, the ability to understand a tool and pervert it to do different unintended things was only possible by making the source accessible. all the hacks and crazy javascript functions people wrote wouldn't have been possible without the ability to get in there and make a mess of the code.

again, i think that tools to make it easy for people who don't want to see the mechanics of the web are great to extend tools further in society. and that's why developing tools like MovableType is wonderful, for example.

what i am hearing here is a little different. it sounds as if some people are pushing tools that have readability. and others are pushing extensibility. both are necessary. one for one type of user, one for another. i value both usertypes because they have helped develop technology for the web in their unique way.

personally, if the code for a JPEG binary was remotely understandable, you could probably make some interesting stuff. someone would make some interesting imagery i bet. that ability should be cherished, not shunned just because you might not find it interesting.

WilhelmAug 07, 2003 at 7:46PM

With all due respect, you're both wrong about the 'view source phenomenon' and the web taking off, because you're talking about totally different audiences, different times, and different definitions of taking off. So there.

'View source' helped get the first wave of unix techies on board and interested enough to make it into a big, widely implemented deal.

Dreamweaver, Word, Frontpage and their ilk snagged another gigantic wave of not-quite-techies who were more interested in the design than the implementation.

Weblogs, 6 Degrees, Livejournal, Friendster and their ilk snagged another gigantic wave of non-techies who were more interested in the pragmatics than the design or the implementation.

The momentum is building off that last wave towards the next wave: the people who are interested not in the pragmatics, or the design, or the implementation, but rather the innocent, naive, un*intentional* use.

Consider TCP. The first several hundred people were undoubtedly very excited about signalling strength, dB loss, reflectivity, etc., and so on. The next several thousand people were pretty interested in how to build a NIC. The next several million people really liked buying NICs so they could put them in their server and run networked apps. The next several billion people are pretty sure they know how to open up the Internet and read it.

Anil, bless his heart, is absolutely right. Nobody in wave 4 is going to give a damn if they can read the source. In fact, we've lost in a major way if you can still read the source, because we didn't create an expressive enough medium for them.

The RDF guys are right, but they're probably one wave too early.

GregAug 07, 2003 at 7:50PM

Nick, "view source" is precisely why geeks became fascinated with the web. Now geeks are people too :-), but they're a tiny subset of "people."

People became fascinated with the web (or, more generally, the Internet) because of easy-to-use email (e.g. AOL, Yahoo mail, Hotmail), instant messaging, content that was personally useful to them (job classifieds, personal ads, recipe archives, hometown newspapers, online courses, movie listings, etc.), e-commerce (eBay, Amazon, etc.), and so forth.

More to the point, people only began to become fascinated with web publishing when weblog authoring tools made "view source" irrelevant to web publishing.

I don't think people will become fascinated with content syndication until the technologists can manage to make "human-readable RSS" irrelevant to content syndication and re-focus their efforts on making human-usable software.

Nick, "view source" is precisely why geeks became fascinated with the web. Of course, geeks are people too :-) . . . but they're a tiny subset of "people."

People -- typical consumers, not technical consumers -- became fascinated with the web (or, more generally, the Internet) because of easy-to-use email (e.g. AOL, Yahoo mail, Hotmail), instant messaging, content that was personally useful to them (job classifieds, personal ads, recipe archives, hometown newspapers, online courses, movie listings, etc.), e-commerce (eBay, Amazon, etc.), effective search tools (Google) that let them find all that stuff, and so forth.

More to the point, people only began to become fascinated with web publishing when weblog authoring tools made "view source" irrelevant to web publishing.

I don't think people will become fascinated with content syndication until the technologists can manage to make "human-readable RSS" irrelevant to content syndication and re-focus their efforts on making human-usable software instead of human-readable formats.

GregAug 07, 2003 at 7:53PM

Oop! Note to self: when you copy and paste a draft from your text editor, remember to delete the previous stuff in the textarea. Doh!

AnilAug 07, 2003 at 8:41PM

it's the same fascination people had and still have with building their own PCs, cars, houses, etc.

My point exactly. What percentage of people build their own houses or cars? Would you forgo airbags because they're damned hard for build-it-yourself drivers to install in their steering wheels?

Crap, I've been suckered into another car analogy, and I haven't even owned a car in years.

And I don't believe I was saying "we must do something this way or that", I'm just stating my opinions about user experience with technology, the same as everyone else here.

I have an interest in tools, of course. But I'm certainly not against tinkering. We've probably got more small developers like us tinkering with our system than anybody. I just think that the audience of people who just want to express themselves or connect with others by reading their work is at least 2 orders of magnitude larger than the number of people who find XML markup, in any format, expressive.

malAug 07, 2003 at 9:35PM

Anil said... "Instant messaging has no way to view source on its messages"

Contrair Monfrair sp? heres a snip from one of my saved logs...
Sorry I don't know how the editor will present this stuff...

mal: duh

maa:

This is from a contest on Long Island. The requirements were to use the two words Lewinsky (The Intern) and Kaczynski (the Unabomber) in
a limerick.

Here are the winners:

malAug 07, 2003 at 9:40PM

Ok, maybe I'll have better luck showing what I mean by replacing the > with []...

[BODY BGCOLOR="#ffffff"][B][FONT COLOR="#ff0000" FACE="Times New Roman" SIZE=3]mal[!-- (11:13:03 PM)--][/B]:[/FONT][FONT COLOR="#000000"] [B][/FONT][FONT COLOR="#008080" FACE="Verdana" SIZE=2]duh[/FONT][BR]
[BODY BGCOLOR="#ffffff"][FONT COLOR="#0000ff" FACE="Times New Roman" SIZE=3]maa[!-- (11:14:58 PM)--][/B]:[/FONT][FONT COLOR="#000000"] [BR]
[BR]
[/FONT][FONT COLOR="#0000ff" FACE="Century Gothic" SIZE=2]This is from a contest on Long Island. The requirements were to use the two words Lewinsky (The Intern) and Kaczynski (the Unabomber) in
a limerick. [BR]
[BR]
Here are the winners: [BR]

Dave WinerAug 07, 2003 at 11:07PM

Okay Anil let's try this another way.

If you have a choice between readable and not readable, would you opt for the latter, all things being equal?

Ken MacLeodAug 07, 2003 at 11:13PM

Apropos of nothing, I present Ken MacLeod's RSS Political FAQ.

TremendoAug 08, 2003 at 12:47AM

Dave: If you have a choice between readable and not readable, would you opt for the latter, all things being equal?

Do you mean like a choice between SVG and Flash? I do love SVG, even use it (a little bit) *because* it's human readable. But you and I know that just about everyone opts for the non-readable Flash.

I agree with Matt, the format is meant to be understandable by software bots. It's not necessary to limit its power just so that the candidates for CA governor can read it if they happen to stumble on it.

Jonathan BruderAug 08, 2003 at 2:13AM

I know I was a little obtuse when I responded to Anil, so please bear with me this time:

Before I came back into this conversation, I reread what Jason wrote. I reread Dave's epiphany and Ben's response. I drew the following conclusions.

THE STATE OF THE BLOGOSPHERE

It's no trick of logic to see that both Dave and Rael are mired by their environments. Bogged down, if you will. From what I gather, most people deeply involved with this issue seem to feel this way.

Rael goes as far as to openly acknowledge this. Additionally, if Dave's high-pressure environment didn't affect his decisions, I can't imagine what could. After all, we draw our conclusions from the sum of our experiences.

With each party so embattled, it is pretty difficult to foster resolution.

WHY SOMETHING MUST CHANGE

Both parties want a fast solution. No such solution can arrive without a significant reduction in the tension between those who desire extensibility and those who do not. For both Pete's and Christ's sake, it says something (not sure what =] ) that history is being hammered out in Jason Kottke's comments. Fine comments, but in the comments nonetheless.

Finding a route to accord is the first step to providing the best possible evolution of syndication technology. Stop looking at trees! There is a forest.

HOW TO WIN THE METAGAME

The tension that divides the fronts is a symptom that, metascopically, something is about to give. Remember what Jason said about seeding memes? Strictly speaking, the result of this war will cause a lot of ripples in the pond.

The issue really boils down to whether syndication should further diversify or further unify. This is a rift in the blogosphere, and the noosphere. To heal this rift, it would help to have a strategy.

This issue is much more important than we yet know. As we approach a time when syndicated content is pervasive not just to geeks, but also to consumer markets, we must make decisions about the way we will interact with machines in fifty years. The community of online publishers must create a vision of the future.

Then, we must shoot for that vision as if it were part of the FY2003 project plan. We must accomplish it as efficiently as possible, without delay. If we take our eyes from the metascope for even a second, the world will end up with another hollow standard, subject to reopen for the same argumentative volley when upgrade time rolls around. That future is dangerously close.

Instead, let's think about what is possible if we elect some other, as-yet-unspoken-for alternatives. Outside the box. This dialogue is innately too far to the fringe to dismiss solutions because of their dissimilarity with previously implemented solutions.

What if we build aggregators and readers smart enough to do some simple context discernment without the explicit declaration of a namespace?*
Wouldn't that make us all happy? What if we could write our feeds in plain, un-marked-up text? Or a mixture of several languages, for that matter?

There are other solutions to be found, and those that have participated in this comment thread must lead the way in the discernment of those solutions, because innovation must be championed by giants to be successful. If we shift our focus from the positions of the two camps to the two directions in which those camps travel, surely there is a Rome where both roads meet.

The key now, it seems, is to find that Rome, and spare naught to get there.

=] Jonathan Bruder

* The machines we are talking about are uniquely positioned to receive data for dynamic learning -- they have thousands of publications to draw on. This idea, and hundreds like it, are not too far fetched.

AnilAug 08, 2003 at 2:36AM

If you have a choice between readable and not readable, would you opt for the latter, all things being equal?

Absolutely. I 100% support that, and I think it's critically important. I just think that, all things being equal, work should be tough for developers and easy and flexible for users, instead of the other way around.

Karl DubostAug 08, 2003 at 5:15AM

Just a small comment. Start of this thread. RSS 1.0 is not unreadable.

Dave WinerAug 08, 2003 at 5:45AM

Anil, I'm glad you agree that all things being equal readable is preferable to not readable. Why don't we pass that by the Blogger folk and see if they agree? Maybe we've got a first design principle. We need more of those, written down, committed to.

anandAug 08, 2003 at 8:05AM

RSS is a bandwidth hog whichever way you look at it. What is required is a ping-style notification mechanism (a pubsub notification mechanism), something like mimir on top of jabber, as mentioned in the article at http://www.jabber.org/people/interviews/ralphm.php.

Dave WinerAug 08, 2003 at 8:34AM

Anand, RSS 0.92 defined a ping protocol for notification. It's in RSS 2.0. It's the cloud element. It works. The problem's been solved. If someone else wants to run a notification server (it got too expensive for UserLand), any Radio site can participate, and it's easily implemented for other brands of blogging software.
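For anyone who wants to poke at the cloud element, here is a rough sketch of reading it out of an RSS 2.0 channel. The attribute names (domain, port, path, registerProcedure, protocol) are the ones the RSS 2.0 spec lists; the host `rpc.example.com`, the sample feed, and the function name `cloud_endpoint` are placeholders. This only shows where a subscriber would look up the notification endpoint, not the registration call itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed; rpc.example.com is a placeholder host.
CLOUD_SAMPLE = """
<rss version="2.0">
  <channel>
    <title>Example</title>
    <link>http://example.com/</link>
    <description>Sample channel</description>
    <cloud domain="rpc.example.com" port="80" path="/RPC2"
           registerProcedure="myCloud.rssPleaseNotify" protocol="xml-rpc"/>
  </channel>
</rss>
"""


def cloud_endpoint(feed_xml):
    """Return the <cloud> attributes as a dict, or None if the feed has no cloud."""
    cloud = ET.fromstring(feed_xml).find("channel/cloud")
    if cloud is None:
        return None
    return {
        "url": "http://{}:{}{}".format(
            cloud.get("domain"), cloud.get("port"), cloud.get("path")
        ),
        "procedure": cloud.get("registerProcedure"),
        "protocol": cloud.get("protocol"),
    }


print(cloud_endpoint(CLOUD_SAMPLE))
```

A feed without the element simply returns `None`, so an aggregator could fall back to ordinary polling.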

jkottkeAug 08, 2003 at 11:20AM

Anil, you think human readability of RSS (or any syndication format for that matter) should be avoided, yes? Then why not compile the code into binaries for syndication purposes, like programmers do when developing applications? What's with all these text files lying around on computers? HTML and XML as well...they don't need to be human readable in use either. Compiling everything would avoid the problem you're talking about of all those pesky humans sticking their noses into situations where they don't belong.

WilhelmAug 08, 2003 at 12:14PM

Anil didn't say that, Jason. It's your blog, but you might consider failing to knock down straw men.

AvdiAug 08, 2003 at 1:59PM

I disagree with the idea that RSS 1.0 vs. Everything Else is the battle of unreadable vs readable. I don't think I'm an exceptional programmer, and the first RSS 1.0 feed I saw didn't strike me as "unreadable", it struck me as "information rich". Compared to it RSS 2.0 and Atom are marginally more readable on first glance, but we're talking about a fractional difference. And RSS 1.0 conveys more information, because explicit namespaces give the first-time reader some context for the terms being used. Even if you have no knowledge of RDF you can learn enough from a 1.0 feed to mimic it successfully; a few hours of study is sufficient to fill you in on the data model behind it and give you a good idea for how to build on it.

A lot of people seem to think the difference is on par with an opaque binary format vs. an XML format. I'm the first to support a programmer-readable format over an opaque format; but that's not at issue here. What's at issue is a slightly-more-readable XML format, which has a somewhat less well-defined data and extension model; and a slightly less readable XML format which is grounded in a very well-defined model (RDF). Given the choice between those two, for this programmer the preference for the latter is easy. With RDF I know not just how the data is serialized, but how the data is intended to be viewed abstractly, and I have a box full of tools at my disposal ready-made for manipulating that data and merging it with other sources of data. I can spend my time thinking about interesting uses of the data rather than how to interpret the data. I wouldn't trade all semblance of readability for this, but thankfully I don't have to. In an industry that forces me to deal with things like cramped packet bitfields, inscrutable CSV files, and procmail scripts, RDF/XML is, frankly, one of the more readable and reflective formats around.

nickAug 08, 2003 at 2:22PM

anil, i too got the impression that "you think human readability of RSS (or any syndication format for that matter) should be avoided" rather than embraced equally. (perhaps this is why i agreed with jason's comments.) i don't think that just because typically developers have had a more difficult task, it wouldn't be wise to make that task easier.

it really doesn't matter that a small number of people participate in the type of tinkering i am referring to. we're probably in that situation because code in any situation is not terribly readable. but right now, making the ability for more people to tinker with a type of technology doesn't come at a cost to extensibility. i don't see how this negatively impacts the power of the tool.

WilhelmAug 08, 2003 at 2:30PM

It's all about the tradeoff. In general, the historical precedent has been that protocols which are expressive and featureful but hard to read 'win' over protocols that are less expressive, less featureful, but easier to read. This might seem to be a paradox, but it derives from the fact that the people who use the protocols outnumber the ones who read the protocols by many orders of magnitude.

Karl DubostAug 08, 2003 at 2:50PM

Readability?

karl% dict rss

DICT error: 552 no match [d/m/c = 0/0/106; 0.000r 0.000u 0.000s]

karl% dict rdf

4 definitions found

*** Source: The Free On-line Dictionary of Computing (09 FEB 02) ***
RDF

{Resource Description Framework}

Oooopsss. I wonder when people will stop thinking that RDF is less readable than XML. It's a myth.

As Avdi says, RSS 1.0 is information rich and in some cases, it can be very useful. In my own defence, I must admit that I was not convinced at the start, but since I've been using it for different applications, I'm all for it.

RDF simplifies the management of your data, it doesn't make it more complex. I would even say that it's harder to share data between two XML formats. Sharing two sets of RDF information is straightforward.

jkottkeAug 08, 2003 at 3:23PM

Anil didn't say that, Jason. It's your blog, but you might consider failing to knock down straw men.

Simmer down, Wilhelm. I'm not trying to knock anything down, I'm asking questions. I really would like to know why RSS isn't compiled for syndication...that would seem to offer some advantages over leaving it in plain text (smaller files, ensures machine readable only (no view source), etc.). And he did say not human readable feeds was "critically important"...which is actually stronger than what I said: "should be avoided".

AnilAug 08, 2003 at 4:13PM

I think human readability is a vague enough term that people would disagree on its meaning, so I'm loath to get into a pissing match about it. That being said, it's a nice and important goal. When it conflicts with functionality, flexibility, expressiveness, or user experience, I think it should be placed second.

Is that any clearer?

We shouldn't binary encode because there are tons of tools implemented already for the processing of standard XML. And XML compresses on the wire pretty effectively.

WilhelmAug 08, 2003 at 4:43PM

Sorry about misunderstanding your intent, Jason. The Intarweb is so frustrating. If only there were a way to attach semantic metadata, like intent, to web pages.

MarkAug 08, 2003 at 6:40PM

re: "I really would like to know why RSS isn't compiled for syndication"

Try mod_gzip (or mod_deflate)
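What mod_gzip does on the server can be approximated locally to see how well a feed compresses while staying plain XML on disk. This is only a sketch: the feed is generated by a made-up helper, `build_feed`, and the actual saving depends on the real content, so no particular ratio is claimed here.

```python
import gzip


def build_feed(n_items):
    """Build a small, repetitive RSS 2.0 document as bytes (illustrative only)."""
    items = "".join(
        "<item><title>Post {0}</title>"
        "<link>http://example.com/{0}</link>"
        "<description>Entry number {0}</description></item>".format(i)
        for i in range(n_items)
    )
    feed = (
        '<rss version="2.0"><channel><title>Example</title>'
        "<link>http://example.com/</link><description>Sample</description>"
        + items
        + "</channel></rss>"
    )
    return feed.encode("utf-8")


raw = build_feed(50)
packed = gzip.compress(raw)       # what travels over the wire
restored = gzip.decompress(packed)  # what the client parses

print(len(raw), "bytes raw,", len(packed), "bytes gzipped")
print("round trip ok:", restored == raw)
```

The file on the server stays readable with view source; only the transfer is compressed, which is the point of doing it at the HTTP layer instead of inventing a binary feed format.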

nickAug 08, 2003 at 7:58PM

i guess i have to agree to disagree. from my point of view, i do think readability is as important as functionality, flexibility, expressiveness, or user experience - precisely because it helps them develop. but assuming i am in your shoes, and i think that readability is a second priority - how does it conflict like you say? how does achieving readability hinder functionality, flexibility, expressiveness, or user experience?

anandAug 09, 2003 at 3:05AM

Dave : The cloud element looks promising. Maybe, I will write a plugin in nucleus to test it out.

Wonder why nobody else has played around with it.

Matt HaugheyAug 09, 2003 at 2:40PM

how does achieving readability hinder functionality, flexibility, expressiveness, or user experience?

If readability is the core design goal, features have to go on the chopping block. Go back to my earlier comment, how could I get my bank to provide a syndication feed of my last ten checking account transactions, and have that code be human readable by geek developers? What if I wanted to provide a syndicated feed for paid subscribers only? What if I envisioned a feed from my cell phone provider that gave me my last ten placed and received calls?

All the examples I just gave add new complications to syndication. User management, security, encryption, and authorization are things not traditionally found in RSS 2.0. So while human readability is great if you can get it, I'd prefer as a user to have the most powerful and wide-ranging applications possible.

AvdiAug 09, 2003 at 2:59PM

As a developer, all I want is sufficient readability. To me, that doesn't mean "the simplest possible format", it just means that it's readable enough I can eyeball it and tell what's going on when it doesn't work the way I expected. So a binary format imposes a hefty barrier to my getting involved in something. On the other hand XML, namespaces and all, meets and exceeds my sufficient readability requirement. I click on the feeds listed above and find no significant difference between versions except that the 1.0 feeds contain more information. I honestly wonder what difficulty RSS 1.0 feeds pose to the programmers who find them "too complex" - are you trying to parse them with regexes or something? What's the barrier?
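As a rough illustration of the non-regex route, a namespace-aware XML parser gets at the extra information in a 1.0 feed with a prefix map and a few lookups. The sample feed, the author name, and the function name `read_items` are invented for the example; the namespace URIs are the standard RDF, RSS 1.0, and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

# Prefix map handed to ElementTree so paths can be written as "rss:item".
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
RDF_ABOUT = "{" + NS["rdf"] + "}about"

# Hypothetical RSS 1.0 feed carrying Dublin Core metadata on the item.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.com/">
    <title>Example</title>
    <link>http://example.com/</link>
    <description>Sample channel</description>
  </channel>
  <item rdf:about="http://example.com/1">
    <title>First post</title>
    <link>http://example.com/1</link>
    <dc:creator>A. Writer</dc:creator>
    <dc:date>2003-08-09</dc:date>
  </item>
</rdf:RDF>
"""


def read_items(feed_xml):
    """Return one dict per item, including the namespaced extension fields."""
    root = ET.fromstring(feed_xml)
    out = []
    for item in root.findall("rss:item", NS):
        out.append({
            "about": item.get(RDF_ABOUT),
            "title": item.findtext("rss:title", default="", namespaces=NS),
            "creator": item.findtext("dc:creator", default="", namespaces=NS),
            "date": item.findtext("dc:date", default="", namespaces=NS),
        })
    return out


for entry in read_items(SAMPLE):
    print(entry)
```

The namespaces cost one dictionary up front; after that the lookups are no harder than the un-namespaced ones in a 2.0 feed.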

This thread is closed to new comments. Thanks to everyone who responded.
