The boilerpipe library provides algorithms to detect and remove the surplus 'clutter' (boilerplate, templates) around the main textual content of a web page.
A hosted instance with a web API is running at http://boilerpipe-web.appspot.com/.
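To make the idea of separating main content from clutter concrete, here is a minimal Python sketch of one simple heuristic: split a page into text blocks, then keep only blocks that are reasonably long and not dominated by link text. It does not use boilerpipe and does not reproduce its algorithms or API. The names `BlockCollector` and `extract_main_text`, and the thresholds `min_words` and `max_link_density`, are hypothetical choices made for illustration.

```python
from html.parser import HTMLParser

# Tags treated as block boundaries in this toy example (an assumption, not a standard list).
BLOCK_TAGS = {"p", "div", "li", "td", "tr", "ul", "ol", "table", "br",
              "h1", "h2", "h3", "h4", "h5", "h6",
              "article", "section", "nav", "header", "footer"}
SKIP_TAGS = {"script", "style"}


class BlockCollector(HTMLParser):
    """Collects (text, word_count, link_word_count) for each text block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._words = []
        self._link_words = 0
        self._in_link = 0
        self._skip = 0

    def _flush(self):
        # Close the current block, if it contains any words.
        if self._words:
            self.blocks.append(
                (" ".join(self._words), len(self._words), self._link_words))
        self._words = []
        self._link_words = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag == "a":
            self._in_link += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip = max(0, self._skip - 1)
        elif tag == "a":
            self._in_link = max(0, self._in_link - 1)
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if self._skip:
            return  # ignore script and style contents
        words = data.split()
        self._words.extend(words)
        if self._in_link:
            self._link_words += len(words)

    def close(self):
        super().close()
        self._flush()


def extract_main_text(html, min_words=10, max_link_density=0.33):
    """Keep blocks with at least min_words words and a low share of link words."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    kept = [text for text, n, link_n in parser.blocks
            if n >= min_words and link_n / n <= max_link_density]
    return "\n".join(kept)


SAMPLE = """
<html><body>
<div><a href="/">Home</a> | <a href="/news">News</a> | <a href="/about">About</a></div>
<p>Boilerplate detection tries to separate the article body from navigation bars, footers and other template text that surrounds it on a page.</p>
<div>Copyright Example Site</div>
</body></html>
"""

print(extract_main_text(SAMPLE))
```

On this sample, the short navigation bar and the copyright line are dropped and only the paragraph is kept. Real pages are much messier, so the thresholds here should be read as placeholders, not as tuned values.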