Hunting for Unicode in Emacs
<nerdy>
Emacs has wonderful Unicode support. Copy and paste text from a Word document and Emacs will happily preserve your smart quotes, ellipses, and em dashes. There isn’t a canonical way, however, to convert these “special” characters to their saner ASCII counterparts.
The Unix command tidy does a good job of converting Unicode characters, but you are left with ugly HTML entities like &ldquo; instead of the usual quote character. We’ll need an alternative for Emacs, preferably written in Emacs Lisp.