I use mutt to read my email, and every now and then I get sent a message in HTML format with no plain text alternative. I don't like to load these files in a browser, since it'd go ahead and fetch any images, run scripts and so on with potential privacy risks. In other words, a message from a dubious source might phone home and confirm my email address or track me or whatever, just from opening their HTML in my browser.
So generally I just mind-parse the HTML. In more obfuscated cases (like the garbage that various websites send out as newsletters) I manually pipe the message through lynx or similar.
Then when I reply to the message I have to pipe it through again if I want to quote something other than the HTML code.
I got fed up with this and looked for a solution. It consists of two parts: changing the entries in my mailcap file so that filtering the HTML to plain text is preferred to opening a browser, and telling mutt to automatically filter text/html parts using the rules it finds in the mailcap file. I've added some redundancy to the mailcap entries so that they work both on my main machines (where I prefer pandoc, since Markdown is nice to read, then lynx over w3m or html2text, since lynx lists the links as references at the bottom) and on my phone (where only lynx is available). Each entry carries a test= clause, so an entry is skipped when its program isn't installed and the next one is tried.
In ~/.mailcap:
text/html; pandoc -f html -t markdown; copiousoutput; description=HTML Text; test=type pandoc >/dev/null
text/html; lynx -stdin -dump -force_html -width 70; copiousoutput; description=HTML Text; test=type lynx >/dev/null
text/html; w3m -dump -T text/html -cols 70; copiousoutput; description=HTML Text; test=type w3m >/dev/null
text/html; html2text -width 70; copiousoutput; description=HTML Text; test=type html2text >/dev/null
In ~/.mutt/muttrc:
auto_view text/html
Now HTML is automatically piped through one of those programs to turn it into plain text, and when I reply the quoted text is the plain text version rather than the raw HTML.
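If you want to check which of the converters a given machine would end up using, something like the following rough sketch mimics the fallback order of the mailcap entries. The first_available function is a hypothetical helper of my own naming, not part of mutt or mailcap, and it uses command -v instead of the type check from the mailcap tests, so treat it as an approximation rather than exactly what mutt does.

```shell
#!/bin/sh
# Hypothetical helper (not part of mutt): print the first of the
# given command names that is found on PATH, trying them in order.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    # None of the candidates is installed.
    return 1
}

# Same order of preference as the mailcap entries.
first_available pandoc lynx w3m html2text || echo "no HTML converter found"
```

On my main machines I'd expect this to print pandoc, and on the phone lynx, but that depends entirely on what is installed.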
 