On pubsubhubbub (Part 2) – Get with it, PuSH, you’re supposed to be realtime.

Or at least that’s what I thought you were supposed to be. But that’s not what I’m seeing. What I am seeing is the groundwork for a real time network: link rel="hub" has been added to every Blogger feed and every FeedBurner feed, no? What I am not seeing are the real time feed updates coming from that network.
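
For anyone wondering what that groundwork looks like from the subscriber side, here is a minimal sketch of finding the hub link in a feed. It assumes an Atom-namespaced link element with rel="hub"; the sample feed, its URLs, and the function name find_hub_links are made up for illustration and are not taken from My Status Cloud.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical feed, used only to exercise the function below.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <link rel="self" href="http://example.com/feed.atom"/>
  <link rel="hub" href="http://hub.example.com/"/>
</feed>"""


def find_hub_links(feed_xml):
    """Return the href of every Atom link element whose rel is "hub"."""
    root = ET.fromstring(feed_xml)
    hubs = []
    # iter() walks the whole tree, so this also finds atom:link
    # elements embedded in an RSS channel.
    for link in root.iter("{%s}link" % ATOM_NS):
        if link.get("rel") == "hub" and link.get("href"):
            hubs.append(link.get("href"))
    return hubs


if __name__ == "__main__":
    print(find_hub_links(SAMPLE_FEED))
```

A feed with no hub link simply yields an empty list, which is the cue to fall back to polling.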

I set up My Status Cloud as both RSS Cloud and PuSH enabled. But when I post a new Tweet or cloud message, I can only rely consistently on Dave Winer’s RSS Cloud hub to pass my update information on. The “official” pubsubhubbub server is hit and miss. Whether it’s rate limiting or being lazy, in my little decentralized 140-character network, not every status update is pushed to me immediately by PuSH. Some are grouped together after two updates have been sent. That’s not real time.

I’ve subscribed to many feeds that are PuSH enabled through My Status Cloud. Or at least the FeedBurner feed published indicates that they are. When do I get the updates? Often a large amount of time after they are published. Whenever I’ve gotten a notification from an RSS Cloud server, it is usually within seconds, sometimes up to a minute.

You. Are. Random. That’s the perception I have. Throughout the day I’m passed PuSH notifications for so many feeds that carry only old content and nothing new. Fat pings, useful? Yes. More fat pings than necessary? Not so much.

I’ll be honest. I haven’t taken the time to read through the complete documentation to see if I can figure out how the server end of things is supposed to work behind the scenes. When I decide to build a server, I will. Maybe I’m missing an explanation for the sporadic-ness that is coming out of there, but it really should be resolved. If we’re going to be real time, let’s be it already.

11 Replies to “On pubsubhubbub (Part 2) – Get with it, PuSH, you’re supposed to be realtime.”

  1. I have the opposite problem. The “official” rssCloud server goes down and misses updates. I guess there are rough edges around all new servers.

  2. The two circumstances are different. There are always rough edges, but… The rpc.rsscloud.org server you are probably referring to is maintained by Dave with no promise of uptime (correct me if I'm wrong, obviously) as a place to test your implementation. It is possible that the server can be rebooted for changes at any time. No big company is providing a constant connection here. The WP plugin for rssCloud creates a server on every blog that installs it. Problems have been few and far between with this. The pubsubhubbub server hosted by Google has been pushed into every Blogger feed and implemented heavily in FeedBurner feeds by Google. It pushes updates from multiple IPs, indicating a network of hubs that they are using to guarantee uptime. By doing this, they have told me that they are ready for real time.

  3. The “official” PubSubHubBub server that you are probably referring to is the server written by Brett Slatkin and deployed to Google App Engine. You can find the source for the server at the PubSubHubBub website. A quick browse of the source will show you that it's a simple app. There's no “network of hubs”. All App Engine applications use multiple IP addresses for connections. The number of IP addresses used by these apps should not be used to infer a guarantee of uptime. By the way, an application running on App Engine cannot subscribe to rssCloud because these applications do not use the same IP address for inbound and outbound connections.

  4. Yes, referring to the official PubSubHubBub server that Google employee Brett Slatkin wrote and deployed. The one that Google then pushed to their Blogger feeds. Got it, no multiple hubs. But by using multiple IPs from the AppEngine network, a distributed network and uptime are inferred. My point is: if you're going to push yourself into millions of feeds as a solution, then you are ready to go. BTW, I'm not arguing against PubSubHubbub here. I want it to work. That's all that I'm getting at.

  5. All applications running on App Engine are forced to use multiple IP addresses. You should not infer that applications running on App Engine have a guarantee of distribution or uptime. Recent blog posts from the App Engine team indicate that applications run in a single data center at a time. The apps are single homed, not distributed across multiple data centers. Plugging the hub into many feeds seems like a great way to bootstrap and test the realtime cloud, even if there are some rough edges.

  6. Cool, I'm not a big AppEngine guy. Not trying to argue the architecture. My perception is that it's stable and stays up. Plugging the hub into many feeds is a great way to test. But, two things. 1) When you're big like Google and you announce the implementation, the perception given to me, the developer, is that you're ready. 2) When I, the developer, do start using it, I'll get a perception on how it's working and I'll share it. If others have details on how it's working for them, I'd love to share examples. I'm having fun working with both rssCloud and PubSubHubBub and coding > arguing.

  7. Google did not make any announcements about Brett Slatkin's hub. It's not a Google product. There are rough edges on all the work that's going on. We should cut everybody some slack including Brett Slatkin.

  8. Google did make announcements though. http://adsenseforfeeds.blogspot.com/2009/07/wha… – Google announces PubSubHubBub support in FeedBurner feeds for AdSense, notifying a “Google-run Hub”. http://googlereader.blogspot.com/2009/08/pubsub… – Google announces PubSubHubBub support for shared items in Reader. http://buzz.blogger.com/2009/08/blogger-joins-h… – “All blog post feeds now contain a “hub” element, and will ping Google's hub on every post update.” http://googlecode.blogspot.com/2009/08/towards-… – “we have gone a step further and added PubSubHubbub support to Google Alerts.” I'm not trying to push any blame on Brett. Google owns this now and should help any issues along. I've written about some of the issues I'm seeing with Google's PubSubHubBub Hub. Constructive discussion about the issues I've been seeing is definitely welcome.

  9. Hey Jeremy, thanks for reporting your experiences so far. How long was the sample period for your testing? I wonder what your results would be over the course of a week. It would be great to see some more data on end-to-end latency, retry attempts, duplicate deliveries, bandwidth, etc., especially if it were broken down by feed type. Otherwise, what is your subscriber's average latency for handling notifications? The reference Hub is defensive about delivering to subscribers that track many feeds and are slow to respond. So, if you're taking over 5 seconds, you may see slowdowns. It's best to process incoming notifications asynchronously if you can.

  10. Hi Brett, thanks for stopping by. This comment stream aside, the original post was written more as my perception than science. Rereading, it's a pretty unorganized perception. Ahh, late nights. There's more to the rambling, but if you come away with one thing from the above, it's that I don't see the FeedBurner stuff being real time as I thought it would. I'll be grinding through the data more closely as the week goes on. The initial conclusions are based on a snapshot look at the initial 24 hours or so of use. I can't answer to the latency yet, but I also can't imagine it being too high. Not a perfect answer, I know, but the server is on Amazon's EC2 and overall latency (network and system) seems low. Almost the only traffic coming in is from rssCloud and PubSubHubBub notifications. From watching Dave's rssCloud log (light pings), the time posted is usually less than .300 seconds. The feeds that I've noticed the most issues with are from FeedBurner. My guess is that the delay and re-pushes are due to the ping scheduling between publisher->FeedBurner->PuSH. Once publishers start pinging directly to the hub instead of relying on a middle man, I would think that these issues clear up. See previous post directed at publishers. The feeds that I've noticed the best response with are from Google Reader shared items. Again, perception, but things seem to run pretty smoothly here. My generated twitter-link feeds seem to be sporadic when done in quick succession. @mmastrac pointed out after I posted last night that this could be a “race” between the feed writing and the hub reading if things are happening quickly enough. I still need to explore that. You've given me a bunch of stuff to look at. I'll do what I can to start logging and parsing all of it and then provide the results. Hopefully I find a few problems with my code to fix along the way. 🙂
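
The advice in comment 9 (answer the hub quickly and process notifications asynchronously) could be sketched roughly as below. This is not the code behind My Status Cloud. It assumes a plain Python standard-library HTTP callback. The port, the names pending, processed, process_notification, and CallbackHandler, and the choice of a 204 response are all illustrative assumptions, and the subscription verification step (the hub's GET request) is left out.

```python
import queue
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

pending = queue.Queue()   # raw notification bodies waiting to be handled
processed = []            # stand-in for wherever parsed updates end up


def process_notification(body):
    """Placeholder for the slow part: parsing the feed, storing entries."""
    processed.append(body)


def worker():
    """Drain the queue forever, off the request thread."""
    while True:
        body = pending.get()
        try:
            process_notification(body)
        finally:
            pending.task_done()


def start_worker():
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the body, queue it, and acknowledge with a 2xx right away
        # so the hub never waits on the slow processing step.
        length = int(self.headers.get("Content-Length", 0))
        pending.put(self.rfile.read(length))
        self.send_response(204)
        self.end_headers()


if __name__ == "__main__":
    start_worker()
    # Port 8080 is an arbitrary choice for this sketch.
    HTTPServer(("", 8080), CallbackHandler).serve_forever()
```

The point of the design is that the time the hub sees is only the time to read and queue the body, which should stay well under the 5 seconds mentioned above no matter how slow the parsing is.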
