We're reviewing our approach to archiving the college's news stories and I'm curious as to how other folks are doing it.
Right now our news site (http://www.lafayette.edu/about/news/) is a WordPress site with about 7 years of stories in it.
We have an older, pre-WordPress site as well that has another decade or so of articles in it.
We're debating a few different approaches, but the two big ones are:
1) Maintain a comprehensive archive of all the articles
2) Maintain just the last X years (where X = 3 or 4) and come up with a process for moving older articles into a less prominent archive that Google can't access.
The primary concern with #1 is that the comprehensive archive would inevitably degrade (e.g. link rot, out-of-date information) while polluting our search results with less relevant information (e.g. an article about a faculty member's cancer research from 2007 ranking higher than one from 2013).
Approach #2 lets you have a more focused, more current collection of news stories, but has its own problems. There's less for search engines to discover in the first place, and that could reduce traffic. Archiving or unpublishing older articles will also lead to broken links unless you perpetually maintain redirects to the archive.
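To make the redirect burden in #2 concrete, here's a rough sketch of the mapping we'd have to keep alive. It assumes stories live at /about/news/YYYY/... and that the archive would sit under /about/news/archive/; both paths and the `archive_redirect` name are made up for illustration, not how our site is actually laid out.

```python
import re
from datetime import date

# Hypothetical URL layout: /about/news/YYYY/<rest of path>
STORY_PATH = re.compile(r"^/about/news/(?P<year>\d{4})/(?P<rest>.+)$")


def archive_redirect(path, keep_years=4, today=None):
    """Return the archive path for an old story, or None if it stays where it is."""
    today = today or date.today()
    match = STORY_PATH.match(path)
    if match is None:
        return None  # not a dated story URL (or already under the archive)
    year = int(match.group("year"))
    if year > today.year - keep_years:
        return None  # still inside the "last X years" window
    # Hypothetical archive location; the old URL would 301 to this.
    return f"/about/news/archive/{year}/{match.group('rest')}"
```

Every story that ages out of the window needs a rule like this to stay in place indefinitely, which is the maintenance cost I'm worried about.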
My personal inclination is to go with Approach #1 and manage what's ultimately an SEO problem through a combination of managed collections in the college's own search appliance and judicious use of robots.txt rules. If we find an old article that's particularly prominent in search results, then my thinking is that we should update that article to point to more current and/or relevant stories.
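As a sketch of the robots.txt side, something like the following could generate per-year Disallow rules and sanity-check them with Python's standard `urllib.robotparser`. The year-based paths, the 1997 start year, the 2010 cutoff, and the `robots_rules` helper are all assumptions for illustration. Keep in mind that Disallow only stops crawling; a blocked URL can still show up in results if other pages link to it.

```python
from urllib.robotparser import RobotFileParser


def robots_rules(first_year, cutoff_year, prefix="/about/news"):
    """Build robots.txt lines that block crawling of stories older than cutoff_year."""
    lines = ["User-agent: *"]
    for year in range(first_year, cutoff_year):  # cutoff_year itself stays crawlable
        lines.append(f"Disallow: {prefix}/{year}/")
    return lines


# Hypothetical range: block everything published before 2010.
rules = robots_rules(1997, 2010)

# Check the generated rules the way a well-behaved crawler would read them.
parser = RobotFileParser()
parser.parse(rules)

old_story = "http://www.lafayette.edu/about/news/2007/05/01/example-story/"
new_story = "http://www.lafayette.edu/about/news/2013/05/01/example-story/"
print(parser.can_fetch("*", old_story))
print(parser.can_fetch("*", new_story))
```

Writing `"\n".join(rules)` out as robots.txt would be the last step; the articles themselves stay published and linkable, so nothing breaks for visitors.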
Bringing this back around to WordPress, I'm curious as to what your approach is, and what tools you leverage in WordPress to implement it.