OCR – Early results encouraging

Well after a couple of days of effort, I’ve hit my first major milestone. Using screen grabs from Words with Friends boards on the iPhone, I’m now able to parse out the board. Given the input:

Image

My board parser produces the following output.

            V Q
           YE A
           SR R
            BAM
             D
       CLEFT AT
J     SO     GI
IS   ZOO KITTEN
BE   AXLE  RI N
?POOFS I  HAND?
 T  O  E CINE
    ROUSTER
   DARN  LEG

Now there are a couple of mistakes here where letters have been mis-recognised and more worryingly, there are a couple of spots where the code hasn’t even worked out that there is a tile present. Nevertheless, for a simple algorithm I’m pretty happy with this as a first pass!

The broad approach here is to first locate tiles on the board using colour as a guideline. For each tile we then try to recognise the character represented on the tile. Since there are only 26 distinct tiles, this is a reasonably straightforward task (compared for example with recognising handwriting!)

Obviously I’m going to revisit the OCR code and train on more board positions (I only trained on 8 boards to get this level of recognition) but I think this validates the basic approach.

Beyond training, other things I might need to take a look at are:

1. Getting rid of the red circle with the score in it. This is definitely corrupting my character recognition. You can see the mis-recognised A was marred by the score.

2. If you kind of squint at the screen, you’ll see that there’s a colour gradient across the tiles. Tiles at the very top and bottom of the screen are darker yellow while tiles in the central band of the board are much lighter. Since my tile recogniser relies on detecting colours, it may be the paleness of these central tiles that’s causing them not to be detected.

Anyway, it’s late now so I’m off to bed but will post a much more detailed description of the algorithms used to find tiles and recognise letters tomorrow.

For those of you who can’t wait, you might like to sneak a peek at Peter Frey and David Slate’s paper, Letter Recognition Using Holland-Style Adaptive Classifiers.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s