Pleco Instruction Manual - OCR

Accessing the OCR System

The Pleco Optical Character Recognizer system is a paid add-on module; if purchased, you can access it through the “Live Video” and “Still Image” options in the sidebar menu. There’s also a demo version available - no limit on lookups, but it only gives you Pinyin and no definition. You can download the demo or purchase the paid version through “Add-ons,” or buy it directly from our online store.

We strongly recommend that you try out that demo version before purchasing this module: this is a relatively new feature, not just for us but for Chinese dictionary software in general, and while we’re working hard to improve it, there are a number of limitations and you may not find that it works well enough to be usable for you yet.

Much like our handwriting recognizer, our OCR system works by matching characters to templates in a database: it turns the image of the character into a simple mathematical structure, identifies its key features (lengths / positions / curvatures of strokes, etc.), then searches through its database of 10,000+ Chinese characters to find the one that most closely matches that pattern.

However, while the handwriting recognizer always has a very clear picture of the character you drew - it knows exactly where every stroke is located, where it starts / ends, what order strokes were drawn in, where it overlaps other strokes - the OCR system has to contend with a much murkier one: characters on a camera image can be small, grainy, and out-of-focus, and the same calligraphic flourishes that make printed Chinese text so pretty to look at also make it harder to see the underlying structure of each character.

OCR is also up against some psychological hurdles compared to handwriting input: while a mis-recognized handwritten character can be chalked up to one’s poor handwriting / incorrect stroke order, with a printed character there’s nobody to blame but the recognition software. On top of which, because OCR must recognize multiple characters at a time, there’s less of an opportunity for it to show you its other, less likely matches like the handwriting recognizer does. Handling lots of characters at once also means that even if it gets a higher percentage of them accurate on the first try, if just a few of those are incorrect it’ll still feel as if it got the entire block of text wrong. So while handwriting only has to contend with one character at a time, and can even be forgiven for getting that character wrong as long as the correct character is among its top 5 matches, OCR has to deal with multiple characters and get every one of them exactly correct in order to seem like it’s doing its job.

Limitations

(this is all a convoluted way of asking you to be patient if things don’t work perfectly every time; we’re steadily working to bring this even closer to character recognizer perfection, but in the meantime we hope you’ll find it accurate enough to be useful in its current form)

Here are some specific limitations to keep in mind when using our OCR system:

Printed text only - the templates which our system matches characters to are based on common printed Chinese fonts rather than on handwritten characters. Very neat handwriting might occasionally work, but officially only printed text is supported.

Limited character set - our system recognizes a total of 6,763 simplified Chinese characters and 5,401 traditional ones (for Chinese computing geeks, it’s all of the characters in the GB 2312 standard and the characters from the commonly-used half of the Big5 standard), so some rare characters may not be recognized simply because they’re not in the database.
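For readers curious what "matching characters to templates" might look like in practice, here is a minimal sketch of the general idea, not Pleco's actual recognizer. The feature vectors (stroke count, mean stroke length, mean curvature), the tiny `TEMPLATES` table, and the `top_matches` function are all hypothetical, invented purely for illustration; a real system would use far richer features and a much larger database.

```python
import math

# Hypothetical templates: character -> (stroke count, mean stroke length, mean curvature).
# These numbers are made up for illustration only.
TEMPLATES = {
    "一": (1.0, 0.90, 0.00),
    "二": (2.0, 0.70, 0.00),
    "十": (2.0, 0.85, 0.00),
    "人": (2.0, 0.60, 0.30),
    "口": (3.0, 0.50, 0.10),
}


def top_matches(features, templates=TEMPLATES, k=5):
    """Return the k template characters closest to `features`, best match first."""
    # Rank every template by Euclidean distance to the observed feature vector.
    ranked = sorted(templates, key=lambda ch: math.dist(features, templates[ch]))
    return ranked[:k]


# A slightly noisy observation of something shaped like "人".
print(top_matches((2.0, 0.62, 0.28), k=3))
```

A handwriting recognizer can show the whole ranked list and let you pick; an OCR pass over a block of text generally has to commit to the first entry for every character.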
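The point about OCR needing to get every character in a block right can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that each character is recognized independently with the same accuracy; real errors are not independent, and the function name `block_accuracy` and the accuracy figures are hypothetical rather than measurements of this system.

```python
def block_accuracy(per_char_accuracy: float, num_chars: int) -> float:
    """Chance that every character in a block is correct,
    assuming independent errors with identical per-character accuracy."""
    return per_char_accuracy ** num_chars


# Even a recognizer that is right 98% of the time per character
# gets a whole 50-character block perfect only about a third of the time.
print(round(block_accuracy(0.98, 50), 3))
```

Under these assumptions, a single handwritten character recognized at the same per-character rate would be right 98% of the time, which is why the two input methods can feel so different in use.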
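If you want a rough idea of whether a particular character falls inside the character sets described above, the following sketch uses Python's built-in `gb2312` and `big5` codecs. Two assumptions are worth flagging: that the "commonly-used half" of Big5 corresponds to the standard's first level of frequently used characters (code range 0xA440 to 0xC67E, which also holds 5,401 characters), and that membership in these code ranges matches what the OCR database actually contains. The function names are hypothetical.

```python
def in_gb2312_hanzi(ch: str) -> bool:
    """True if `ch` is a Chinese character encodable in GB 2312."""
    try:
        raw = ch.encode("gb2312")
    except UnicodeEncodeError:
        return False
    # Hanzi occupy lead bytes 0xB0-0xF7; ASCII and symbols are excluded.
    return len(raw) == 2 and 0xB0 <= raw[0] <= 0xF7


def in_big5_common(ch: str) -> bool:
    """True if `ch` is in Big5's frequently used block (assumed 0xA440-0xC67E)."""
    try:
        raw = ch.encode("big5")
    except UnicodeEncodeError:
        return False
    return len(raw) == 2 and 0xA440 <= int.from_bytes(raw, "big") <= 0xC67E


for ch in "中體汉":
    print(ch, in_gb2312_hanzi(ch), in_big5_common(ch))
```

A character that fails both checks is a likely candidate for the "not in the database" case, though passing a check is no guarantee of successful recognition.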