cartulary v0.7.1 Release Notes

Release Date: 2017-11-22
  • 🚀 This release focuses mainly on the aggregator and how it handles dead feeds. I define a dead feed as one that either returns a fatal HTTP response code (4xx or 5xx) or has an XML parsing error that prevents any usable data from being extracted from the feed.

    We keep an error counter for each feed and increment it by 4 for a 4xx response, by 5 for a 5xx response, and by 1 for a fatal parsing error. Once per day, the "clean_bad_feeds" script runs and checks a set of conditions; if all of them are met, the feed is marked as dead in the newsfeeds table.

    The death conditions are: error count greater than 1000 [AND] the last HTTP response code was 400 or greater [AND] the last sub-400 response code was received more than 30 days ago.

    Upon death, feeds are not deleted; they are only marked dead. This way their historical items remain searchable, but they no longer clog up the aggregator.

    I use some fake HTTP status codes to indicate other errors when pulling a feed, such as a connection reset or a request timeout. These are in the 900s, and each one increments the error counter by 1. 'ENOTFOUND' increments it by 10, because the hostname doesn't exist anymore.
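
The scoring and death rules above can be sketched roughly as follows. This is an illustration in Python, not the project's actual code: the names `error_increment`, `FeedState`, and `is_dead` are hypothetical, the specific 9xx value used in the usage lines is made up, and treating 'ENOTFOUND' as a separate error string (rather than one of the 9xx codes) is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Thresholds taken from the release notes; the constant names are hypothetical.
ERROR_THRESHOLD = 1000
GRACE_PERIOD = timedelta(days=30)


def error_increment(status: int, errno: str = "", parse_failed: bool = False) -> int:
    """Return how much a single fetch adds to a feed's error counter."""
    if errno == "ENOTFOUND":
        return 10  # hostname no longer exists
    if 900 <= status <= 999:
        return 1   # fake status: connection reset, timeout, and similar
    if 500 <= status <= 599:
        return 5
    if 400 <= status <= 499:
        return 4
    if parse_failed:
        return 1   # fatal XML parsing error on an otherwise good response
    return 0


@dataclass
class FeedState:
    error_count: int
    last_status: int      # most recent (real or fake) status code
    last_good: datetime   # last time a sub-400 response was received


def is_dead(feed: FeedState, now: datetime) -> bool:
    """All three death conditions must hold at once."""
    return (
        feed.error_count > ERROR_THRESHOLD
        and feed.last_status >= 400
        and now - feed.last_good > GRACE_PERIOD
    )


# Usage: a feed that has returned 404 for over a month with a high error count.
now = datetime(2017, 11, 22)
stale = FeedState(error_count=1001, last_status=404, last_good=now - timedelta(days=31))
print(is_dead(stale, now))
print(error_increment(404), error_increment(503), error_increment(0, errno="ENOTFOUND"))
```

Under this sketch a feed that is failing with a 9xx code also satisfies the "400 or greater" condition, which seems consistent with the rules as written, though the notes do not say so explicitly.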