November 30, 2011

Odd Google Translate Encoding issue with Japanese

I was translating a comment in the Japanese SEN tokenization library when I ran into this.

It seems that if your text includes the Unicode right arrow character, Google Translate gets confused about the encoding. I saw this in both Firefox and Safari. Not a big deal, and strangely comforting to see even the big guys trip up on character encodings.

OK: サセン
OK: チャセ
Not OK: サセンチャセ?
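
I don't know what Google actually does internally, so the following is only a rough sketch of how this class of bug can arise in general: bytes written in one encoding get read back in another. The sample string (with U+2192 as the arrow, placed between the two words) and the choice of UTF-8 versus Shift_JIS are my assumptions for illustration, not something confirmed by the behavior above.

```python
# Illustrative sketch only: the encodings and the arrow code point are assumptions,
# not a diagnosis of what Google Translate does.
text = "サセン\u2192チャセ"  # assumed input: the two OK words joined by U+2192

# The same characters produce different byte sequences in different encodings.
utf8_bytes = text.encode("utf-8")       # 3 bytes per character here
sjis_bytes = text.encode("shift_jis")   # 2 bytes per character here

# Decoding with the matching codec round-trips cleanly.
round_trip = sjis_bytes.decode("shift_jis")

# Decoding UTF-8 bytes as Shift_JIS does not raise with errors="replace",
# but the result is garbled text rather than the original string.
garbled = utf8_bytes.decode("shift_jis", errors="replace")

print(len(utf8_bytes), len(sjis_bytes))
print(round_trip == text)
print(garbled == text)
```

If a service guesses the encoding from the bytes it receives, a single character outside the expected range can be enough to tip that guess the wrong way, which would be consistent with each word working alone but not the combined text.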

[Image: Google-translate-encoding]
