Is there anything out there to convert HTML to plain text (maybe a Nokogiri script)? Something that would keep the line breaks, but that's about it.
If I write som
Is simply stripping tags and excess line breaks acceptable?
html.gsub(/<\/?[^>]*>/, '').gsub(/\n\n+/, "\n").gsub(/^\n|\n$/, '')
The first gsub strips tags, the second collapses consecutive line breaks into one, and the third removes line breaks at the start and end of the string.
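If the HTML has no literal newlines in it, stripping tags alone would run paragraphs together. Here is a rough sketch of one way around that, using the same gsub approach: turn `<br>` and a few closing block tags into line breaks before stripping the rest. The helper name `html_to_text` and the list of block-level tags are assumptions for illustration, and a regex like this will not cope with comments, scripts, or `>` inside attribute values; a real parser such as Nokogiri is safer for messy input.

```ruby
# Hypothetical helper: the name and the set of block-level tags are assumptions.
def html_to_text(html)
  html
    .gsub(/<br\s*\/?>/i, "\n")                    # <br> becomes a line break
    .gsub(/<\/(?:p|div|li|h[1-6]|tr)\s*>/i, "\n") # closing block tags end a line
    .gsub(/<\/?[^>]*>/, '')                       # strip all remaining tags
    .gsub(/\n\n+/, "\n")                          # collapse runs of line breaks
    .gsub(/\A\n+|\n+\z/, '')                      # trim line breaks at both ends
end

puts html_to_text("<p>Hello <b>world</b></p><p>Second<br>line</p>")
```

The last step uses `\A` and `\z` rather than `^` and `$`, since in Ruby `^` and `$` match at every line boundary, not only at the ends of the string.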