Yesterday I discovered what I thought was an odd bug in BibTeX. For one of the journal articles in my bibliography, the BibTeX entry contained the field
journal = {Human–Computer Interaction},
but it was appearing in my bibliography as HumanComputer Interaction.
The problem turned out to be that the innocent-looking hyphen between “Human” and “Computer” was actually a Unicode en-dash. I didn’t intentionally insert it, but I guess I must have copied that bit of text from a Unicode-enabled website or email. LaTeX and BibTeX are happiest with plain ASCII characters, and once I replaced the en-dash with an ordinary hyphen, the journal name appeared correctly.
But that got me wondering if I had any more non-ASCII Unicode characters in my dissertation project. They’re nearly impossible to find by eye, so I wrote this little Perl one-liner to find them for me:
perl -ne "print if /[^[:ascii:]]/" *.bib *.tex
I discovered 3 more bad dashes, plus a stray smart quote.