November 30, 2011

Odd Google Translate Encoding issue with Japanese

I was translating a comment from the Japanese SEN tokenization library.

It seems that if your text includes the Unicode right arrow character, Google Translate somehow gets confused about the encoding. I saw this in both Firefox and Safari. Not a big deal, and it's strangely comforting to see even the big guys trip up on character encodings.

OK: サセン
OK: チャセ
Not OK: サセンチャセ?


