Archive for CleanUp

feeds.wordpress.com

WordPress.com feeds recently got a rather ugly feature (not visible in the web view): they append a load of links and images to the end of each item’s content.

Currently it’s 8 images. Only one of them is of limited use, showing the current number of comments on the post. The other images are transparent.

One can only speculate on their purpose by reading the URLs. They are called ‘categories’, ‘tags’, ‘delicious’, ‘stumble’, ‘digg’ and ‘reddit’.

The last image points to ‘stats.wordpress.com’, most likely a counter for tracking feed readers and the like.

Overall, these images load rather slowly, taking up to 8 seconds compared to the 0.3 seconds of an average icon download. That’s the main reason why I’ve decided to remove them from the aggregated planet posts.

In case you are curious, here is the regex to clean out the html source:

re.compile(r'(?:<a\s[^>]*>\s*)?<img\s[^>]*src="http://(?:feeds|stats)\.wordpress\.com/[^>]*>(?:\s*</a>)?')
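For illustration, here is a minimal sketch of how such a cleanup could be applied to an item’s content. It assumes a stricter pattern that uses groups instead of character classes and never matches across a tag boundary. The function name and the sample markup are made up for this example and are not taken from the planet code.

```python
import re

# Assumed stricter pattern: an optional wrapping <a ...>, then an <img> whose
# src points at feeds.wordpress.com or stats.wordpress.com, then an optional </a>.
TRACKER = re.compile(
    r'(?:<a\s[^>]*>\s*)?'
    r'<img\s[^>]*src="http://(?:feeds|stats)\.wordpress\.com/[^>]*>'
    r'(?:\s*</a>)?'
)

def strip_trackers(html):
    """Return html with the wordpress.com feed images (and their links) removed."""
    return TRACKER.sub('', html).rstrip()

# Hypothetical item content, made up for illustration.
sample = (
    '<p>Hello <a href="http://example.org/">world</a></p> '
    '<a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/x/1/">'
    '<img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/x/1/" /></a> '
    '<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=x&amp;blog=1" />'
)
cleaned = strip_trackers(sample)
```

Ordinary links and images in the post body are left alone, because the pattern only fires when an image’s src points at one of the two wordpress.com hosts.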


atom:content

I’ve added a few Blogspot feeds to planetzope.org recently and realized that Blogspot feeds have completely dropped the item summary and now provide item content only.

I’ve changed the aggregator accordingly: RSS, RDF and Atom content information is now included with the feed items. A fallback to description/summary is still provided.
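The fallback could look roughly like the sketch below. It assumes each feed item has already been parsed into a plain dict. The key names are hypothetical stand-ins for the content and summary fields of the different feed formats, not the names the aggregator actually uses.

```python
def item_body(item):
    """Pick the full content of a feed item, falling back to its summary."""
    # Key names are assumptions: 'content' stands in for the full item content,
    # 'summary' and 'description' for the shorter fields of the feed formats.
    for key in ('content', 'summary', 'description'):
        value = item.get(key)
        if value and value.strip():
            return value
    return ''
```

An item that only carries a summary still gets a body this way, while a content-only item (like the Blogspot ones) is no longer rendered empty.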
