Message bodies with REST GET

So today I ran into an issue testing a RESTful service API. The API allows me to query a resource using GET and to provide a couple of optional parameters. I can provide these as either request parameters on the URI or as a JSON payload in the request body.

Testing with curl works fine and I am able to put my parameters into the JSON message body and get an appropriate response from the service using a call like this:

curl -i \
     -H "Accept: application/json" \
     -H "Content-Type: application/json" \
     -X GET -d '{"parameter":"value"}' \
     http://myservice.com/resource

Testing with a couple of REST client tools (restclient and REST console), however, failed. On further investigation it turns out that these tools do not package the JSON payload into the request body, presumably because the request uses a GET rather than a POST or PUT method.

This made me wonder whether it was actually even legal to supply a message body with a GET request.

The HTTP specification says in section 4.3

A message-body MUST NOT be included in a request if the specification of the request method (section 5.1.1) does not allow sending an entity-body in requests.

Section 5.1.1 redirects us to section 9.x for the various methods. None of them explicitly prohibit the inclusion of a message body. However…

Section 5.2 says

The exact resource identified by an Internet request is determined by examining both the Request-URI and the Host header field.

and Section 9.3 says

The GET method means retrieve whatever information (in the form of an entity) is identified by the Request-URI.

Together these suggest that when processing a GET request, a server is not required to examine anything other than the Request-URI and Host header field.

In summary, the HTTP spec doesn’t prevent you from sending a message-body with GET, but there is sufficient ambiguity that it wouldn’t surprise me if it were not supported by all servers.
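
For anyone who wants to reproduce the curl test from Java, here is a rough sketch using Apache HttpClient 4.x (my choice for illustration, not something the tools above use). The stock HttpGet class won’t carry an entity, so the usual trick is to subclass HttpEntityEnclosingRequestBase and declare the method to be GET; the plain java.net.HttpURLConnection is no help here because it switches the method to POST as soon as you ask for an output stream.

import java.net.URI;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpEntityEnclosingRequestBase;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class GetWithBody {

    // HttpGet won't accept an entity, so subclass the entity-enclosing
    // base request and report the method as GET.
    static class HttpGetWithEntity extends HttpEntityEnclosingRequestBase {
        @Override
        public String getMethod() {
            return "GET";
        }
    }

    public static void main(String[] args) throws Exception {
        HttpGetWithEntity request = new HttpGetWithEntity();
        request.setURI(URI.create("http://myservice.com/resource"));
        request.setHeader("Accept", "application/json");
        request.setEntity(new StringEntity("{\"parameter\":\"value\"}",
                ContentType.APPLICATION_JSON));

        try (CloseableHttpClient client = HttpClients.createDefault()) {
            CloseableHttpResponse response = client.execute(request);
            System.out.println(response.getStatusLine());
            System.out.println(EntityUtils.toString(response.getEntity()));
        }
    }
}

Whether the server honours the body is, of course, exactly the question this post is about.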


Augmented Reality Tutorial Pt2. The Camera Preview

Last time I pointed out where you could download the various tools necessary to follow this tutorial. This time we’re going to make something happen. But first an overview of how Augmented Reality works.

About AR

Augmented Reality is all about adding extra information to the raw data we collect through our senses to provide an enhanced, or augmented, view. Mostly this has meant visual information, though there’s no reason why it couldn’t be audio. As an aside, Shazam is a great example of an audio-based AR app: it ‘listens’ to music playing and then provides you with information about the artist and song. It’s not too hard to imagine this integrated with, say, Google Glass to provide a ticker at the bottom of your field of view that constantly keeps you updated on the background music to your life.

However, most people think of AR in terms of video, and I’m no different, so we’re going to implement a video-based AR system in this tutorial. The basic approach to AR is shown in the diagram below. The flow looks like this:

[Diagram: ARFlow, the basic AR processing flow]

1. The camera captures an image of the world

2. The AR app examines the picture for some sort of recognisable feature – perhaps a face, a building, some text or something else. This can require some quite sophisticated image processing code.

3. Based on what it finds, the AR app then looks up some additional data to ‘augment’ the view with. This may be text, a 3D model or mesh, or something similar.

4. The AR app embeds this additional data in the image. It might float over the top of the recognised feature or, in the case of a 3D model, it may require sophisticated lighting and shading to make it blend in as if it were part of the image

5. Finally the app shows this picture to us on the screen of our AR device. If we can do this 30 times per second then we have a cool, immersive AR application.

Getting Started

The first thing we need to do is enable the camera, start capturing video frames and display them on the screen of our device. Create a new project with a blank Activity and create the following files:

activity_main.xml
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
   xmlns:tools="http://schemas.android.com/tools"
   xmlns:opencv="http://schemas.android.com/apk/res-auto"
   android:layout_width="match_parent"
   android:layout_height="match_parent"
   android:paddingBottom="@dimen/activity_vertical_margin"
   android:paddingLeft="@dimen/activity_horizontal_margin"
   android:paddingRight="@dimen/activity_horizontal_margin"
   android:paddingTop="@dimen/activity_vertical_margin"
   tools:context=".MainActivity" >
<org.opencv.android.NativeCameraView
   android:id="@+id/main_view"
   android:layout_width="wrap_content"
   android:layout_height="wrap_content"
   opencv:show_fps="true"
   opencv:camera_id="any" />
</RelativeLayout>

This layout simply defines a single view which will display the frames retrieved from the default camera.
The main item to note from the layout is that we need to include a new namespace definition, xmlns:opencv="http://schemas.android.com/apk/res-auto", in order to use OpenCV components in layouts.

We then go on to define a NativeCameraView. The line opencv:show_fps="true" enables the view to automagically show the number of frames per second being processed, which gives us a measure of how efficient our code is. Note also opencv:camera_id="any", which causes the device to use the first camera it finds to provide video frames. The camera_id could be set to a specific ID on devices with multiple cameras.

AndroidManifest.xml
<?xml version="1.0" encoding="utf-8"?> 
<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.sindesso.ardemo1" 
    android:versionCode="1" 
    android:versionName="1.0" > 

    <uses-sdk 
        android:minSdkVersion="17" 
        android:targetSdkVersion="17" /> 

    <uses-permission android:name="android.permission.CAMERA"/>
    <application 
        android:allowBackup="true" 
        android:icon="@drawable/ic_launcher" 
        android:label="@string/app_name" 
        android:theme="@style/AppTheme" > 

        <activity
            android:name="com.sindesso.ardemo1.MainActivity" 
            android:label="@string/app_name" >
            <intent-filter> 
                <action android:name="android.intent.action.MAIN" /> 
                <category android:name="android.intent.category.LAUNCHER" /> 
            </intent-filter> 
        </activity> 
    </application> 
</manifest>
Within the manifest it is important to ensure that we have added the permission <uses-permission android:name="android.permission.CAMERA"/> to allow camera access.

MainActivity.java

package com.sindesso.ardemo1; 

import org.opencv.android.BaseLoaderCallback; 
import org.opencv.android.CameraBridgeViewBase;
import org.opencv.android.CameraBridgeViewBase.CvCameraViewFrame; 
import org.opencv.android.CameraBridgeViewBase.CvCameraViewListener2;
import org.opencv.android.LoaderCallbackInterface; 
import org.opencv.android.OpenCVLoader; 
import org.opencv.core.Mat; 
import android.os.Bundle; 
import android.app.Activity; 
import android.util.Log; 
import android.view.Menu; 
import android.view.WindowManager; 

/* 1. Note that we implement CvCameraViewListener2 */
public class MainActivity extends Activity implements CvCameraViewListener2 { 
    /** For logging */
    private final String TAG = getClass( ).getCanonicalName(); 

    /** 2. The Camera View */
    private CameraBridgeViewBase mOpenCvCameraView;

    /** 3. Load and initialise OpenCV */
    private BaseLoaderCallback mLoaderCallback = new BaseLoaderCallback(this) { 
        @Override public void onManagerConnected(int status) {
            switch (status) { 
                case LoaderCallbackInterface.SUCCESS:
                    Log.i(TAG, "OpenCV loaded successfully"); 
                    mOpenCvCameraView.enableView(); 
                    break; 

                default:
                    super.onManagerConnected(status);
                    break; 
            }
         }
      };
      public MainActivity() {
          Log.i(TAG, "Instantiated new " + this.getClass());
      }

      @Override
      public void onCreate(Bundle savedInstanceState) {
          Log.i(TAG, "called onCreate");
          super.onCreate(savedInstanceState);
          /* 4. Ensure that screen doesn't turn off */
          getWindow().addFlags(WindowManager.LayoutParams.FLAG_KEEP_SCREEN_ON);
          setContentView(R.layout.activity_main);

          /* 5. Add myself as callback for camera */
          mOpenCvCameraView = (CameraBridgeViewBase) findViewById(R.id.main_view);
          mOpenCvCameraView.setCvCameraViewListener(this);
      }

      @Override
      public void onPause() {
          super.onPause();
          /* 6. Stop the view when the app is paused */
          if (mOpenCvCameraView != null)
               mOpenCvCameraView.disableView();
      }

      @Override
      public void onResume() {
           super.onResume();
           /** 7. Call the async loader on another thread */
           OpenCVLoader.initAsync(OpenCVLoader.OPENCV_VERSION_2_4_3, this, mLoaderCallback);
      }

      public void onDestroy() {
          super.onDestroy();
          if (mOpenCvCameraView != null) {
              mOpenCvCameraView.disableView();
          }
      }

      /* 8. Called every time the camera grabs a frame. */
      public Mat onCameraFrame(CvCameraViewFrame inputFrame) {
          return inputFrame.rgba();
      }

      @Override
      public void onCameraViewStarted(int width, int height) {}

      @Override 
      public void onCameraViewStopped() {} 
}

Code analysis

The Activity code itself is worth walking through in some detail.

  1. The Activity class implements the interface CvCameraViewListener2 which means it must also implement the following methods:
    • public Mat onCameraFrame(CvCameraViewFrame inputFrame)
    • public void onCameraViewStarted(int width, int height)
    • public void onCameraViewStopped()

    Of these, the only one we’re interested in is onCameraFrame, which will be called every time there is a new frame to process. This is described in more detail below.

  2. The camera view private CameraBridgeViewBase mOpenCvCameraView controls when the camera can be enabled, processes each frame, calls external listeners to make any adjustments to the frame and then draws the resulting frame to the screen.
  3. This block of code is called before the Activity constructor is called. It’s basically some boilerplate to force the OpenCV library to load and initialise. The line mOpenCvCameraView.enableView(); enables the view so that it can start to capture and display frames from the camera. More will be said about this code later, but for the time being we will leave it at this.
  4. Once we get started and onCreate is called, we ensure that the screen is kept on and set up the layout.
  5. This code mOpenCvCameraView.setCvCameraViewListener(this); adds the Activity as a listener to the view. Each time that a frame is produced, our method onCameraFrame(CvCameraViewFrame inputFrame) will be called, giving us a chance to process the image and find any useful or augmentable objects in it before it’s displayed.
  6. onPause stops the camera view when the app is paused.
  7. onResume restarts it when the app is resumed, by asking the OpenCV loader to initialise the library asynchronously; on success, the callback in (3) re-enables the view.
  8. This method is the guts of our AR capability. Here we are passed a copy of the frame and given an opportunity to process it before it’s displayed. We are expected to return a Mat instance, which is an OpenCV matrix (described here) representing the image content. In this example we just return the RGBA colour image from the frame. We could equally have used the gray() method of CvCameraViewFrame to return a greyscale image. A small variation is sketched just after this list.
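
As a quick illustration of what this hook lets us do, here is a sketch (not part of the tutorial code) of a variant of onCameraFrame that draws a label onto each frame before returning it. It uses Core.putText from the OpenCV 2.4 Java bindings, fully qualified so that no extra imports are needed in the listing above.

public Mat onCameraFrame(CvCameraViewFrame inputFrame) {
    // Grab the colour frame and draw straight onto it before it is displayed.
    Mat frame = inputFrame.rgba();
    org.opencv.core.Core.putText(frame, "AR demo",
            new org.opencv.core.Point(30, 60),
            org.opencv.core.Core.FONT_HERSHEY_SIMPLEX, 1.0,
            new org.opencv.core.Scalar(0, 255, 0, 255), 2);
    return frame;
}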

Test

Once you’ve created this app, run it up on a device and test it. You should see the camera output displayed in the main window of your app. There’s no AR here yet, but hopefully you can see how adding code to the onCameraFrame method will enable us to modify the displayed image, inserting our own text or objects.

Next time, we’ll look at some simple image processing we can do to make this app ‘proper’ AR.

Augmented Reality tutorial for Android pt 1

Augmented Reality, while it’s been around for many years, is grabbing the headlines at the moment, particularly due to the launch of the Google Glass project. The earliest AR applications were probably Head-Up Displays in fighter plane cockpits, and as early as 1994 William Gibson wrote Virtual Light, which features something awfully like Google Glass.

Anyway, this is such an interesting and entertaining field, and so ripe for exploration, that I thought I’d throw together a tutorial on how to get started developing AR applications on Android devices, partly to learn a thing or two myself (in anticipation of the day that Google Glass dev kits are available to people outside of the States) and partly to inspire others to do some cool stuff.

[Image: OpenCV logo]

I’m going to be building this tutorial using OpenCV for Android, mostly because I’m already familiar with OpenCV so the learning curve won’t be quite as steep, but also because it’s a free, open-source library with plenty of contributors and a lot of support if you (or I!) run into trouble. The device I’m using is the Nexus 7, principally because it’s so cheap and cheerful. The code should run on any device with a camera but I’ll leave it to you guys to test that.

In addition to your primary development environment you will need to get hold of OpenCV for Android. This library uses native C/C++ OpenCV code for performance and to make it easier for existing OpenCV developers to get started. Don’t worry though, you can still write your apps in Java, as we will do here; it’s just that you’ll need to do some specific setup in your dev environment to support OpenCV. I use Eclipse and the ADT as my development platform, so if you have an alternative toolkit you will need to adjust the following setup instructions to suit.

What you will need

NDK

The NDK is a toolset that allows you to implement parts of your app using native-code languages such as C and C++. You will need to install this before you can start using OpenCV. Full directions are available here.

OpenCV

OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. It’s written in C and C++ but is also available with an Android/Java wrapper, which is what we are going to use. You can get OpenCV for Android, along with installation instructions, here.

Once you have these two components installed you’re ready to start coding.

Android Layouts

So today I’ve been wrestling with Android layouts, trying to achieve a pretty screen layout as part of a project for my Masters. What ought to be really easy, given the pre-existing work from Java springs and struts, HTML and CSS as well as iPhone layouts, turns out to be a complete pig.

As an example, let me demonstrate the following. I have two images, one is 400×100 pixels and the other is a 100×100 square. What I would like to do is to lay them out so that together, they fill the width of the screen, scaling both as necessary while preserving their aspect ratios.

That is, the first should occupy 80% of the width of the screen and the second 20%. The second should remain square. I want no padding, margins or other wasted space in this layout.

My first foray is to make a horizontal LinearLayout and allocate the appropriate weights to each.

<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
   android:orientation="horizontal"
   android:layout_width="match_parent"
   android:layout_height="match_parent"
   android:weightSum="100" >
   <ImageView
      android:src="@drawable/im400x100"
      android:layout_width="wrap_content"
      android:layout_height="wrap_content"
      android:layout_weight="80"/>
   <ImageView
      android:src="@drawable/im100x100"
      android:layout_width="wrap_content"
      android:layout_height="wrap_content"
      android:layout_weight="20"/>
</LinearLayout>

The weights of the two components should now be set to 80% and 20% of the containing LinearLayout. Here are the results when displayed on a Nexus 1 (screen size 480×800)

[Screenshot: AndLayout1]

So, we know that the combined width of the two images should be 500 pixels. This is clearly bigger than the 480 pixels allowed on the Nexus 1 screen. So what I’d hoped would happen is that both images would be shrunk enough that they would fit the width. A quick calculation shows that this would mean the 100×100 image would be 96×96 and the 400×100 would be 384×96 pixels. This is not what has happened. It looks as though the 100×100 image has been kept full size and then the 400×100 image has been shrunk to fit the remaining space. Odd!

Let’s see what happens on the Nexus 7 with its 800×1280 screen.

[Screenshot: AndLayoutN7_1]

OK, so here the images are spaced out sufficiently to occupy the width but haven’t been scaled up to fill the available space. Further inspection of the Android documentation suggests that I ought to add some additional attributes to the images to force them to scale, so let’s try that.

Here I’ve added a scaleType to each of the images. The Android documentation describes the behaviour of each of the scaleType values.

On the N7 it doesn’t look too bad; in fact it’s done precisely what I wanted it to. Hurrah!

[Screenshot: AndLayoutN7_2]

The images have scaled up to occupy the horizontal real estate and have maintained their aspect ratio. How does this look on the N1?

[Screenshot: AndLayoutN1_2]

Oh dear. This time the images have been scaled down to fit the width but cropped at the edges. Well, I suppose that’s what the docs said. Let’s try again, this time using fitCenter, which is supposed to:

Compute a scale that will maintain the original src aspect ratio, but will also ensure that src fits entirely inside dst. At least one axis (X or Y) will fit exactly. The result is centered inside dst.

Here’s how that looks on the N7.

[Screenshot: AndLayoutN7_3]

Well, the ImageViews have resized correctly (as shown by the blue outlines) but the images are steadfastly refusing to resize. On the N1, we get this:

[Screenshot: AndLayoutN1_3]

Hmmmm… The horizontal ratio between the views looks OK, but it’s clear that the right-hand view hasn’t preserved the aspect ratio and, again, the images haven’t resized.

I’ve also tried making 9-patch versions of these two images to get them to scale better, but to no avail. It seems that as long as the image is set as the src, your choices are to scale it up to fill all the available space and crop it (which works for these images but not for any real image without a plain-coloured background) or to scale only one dimension.

Of course the problem here may be that I expect weights and scaling to apply equally well when I scale an image down as when I scale it up. Perhaps using an image that was smaller than the smallest screen width would be the way to go.

Also, I know I can use the image as a background and, without any scaling instruction whatsoever, it will expand to fit the available ImageView space, but it doesn’t maintain the aspect ratio.

Well, I’ve exhausted my energy here and may have to try to handle this in code. If anyone can shed some light on how to solve this problem (and judging by the threads on StackOverflow it’s not trivial) all help will be gladly received.
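
For what it’s worth, here is a rough sketch of what the ‘handle this in code’ fallback might look like: measure the screen width, size the two ImageViews to 80% and 20% of it explicitly (the wide view’s height then follows from its 4:1 aspect ratio), and let FIT_XY stretch each image into its now correctly proportioned view. The view ids im_wide and im_square are hypothetical, and this assumes it is called from the Activity’s onCreate after setContentView.

// Sketch only: explicit 80%/20% sizing of the two ImageViews.
private void sizeImagesToScreen() {
    android.util.DisplayMetrics metrics = getResources().getDisplayMetrics();
    int screenWidth = metrics.widthPixels;

    // 20% of the width, which is also the common height of both views
    // (the 400x100 image keeps its 4:1 ratio at 80% of the width).
    int squareSide = screenWidth / 5;

    android.widget.ImageView wide =
            (android.widget.ImageView) findViewById(R.id.im_wide);
    android.widget.ImageView square =
            (android.widget.ImageView) findViewById(R.id.im_square);

    wide.setLayoutParams(new android.widget.LinearLayout.LayoutParams(
            screenWidth - squareSide, squareSide));
    square.setLayoutParams(new android.widget.LinearLayout.LayoutParams(
            squareSide, squareSide));

    // FIT_XY fills each view exactly; the views already have the right
    // aspect ratios, so the images are not distorted.
    wide.setScaleType(android.widget.ImageView.ScaleType.FIT_XY);
    square.setScaleType(android.widget.ImageView.ScaleType.FIT_XY);
}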

 

Parsing the WWF board

As promised yesterday, today I’m going to spend a little time describing how we parse out the Words with Friends board to work out what tiles have already been played.

The screen shot contains a lot of detail but for the purposes of this post, we’re only interested in the playing area itself. While we could spend a lot of time using computer vision techniques to locate the main playing area, for example locating horizontal and vertical lines, looking for the corners of the playing area etc., it’s really not worth it because for a given device, the board is always in the same part of the image. If we can snapshot the image on iPad, iPhone 3, 4 and 5 then we can just count pixels to work out where the board is.


Cells on the board

We think of the board as being made up of a number of cells – 15 x 15 to be exact. A cell is just a spot where we can play a tile. The main things to note about cells are that they are square and that exactly 15 of them fill the width of the screen. This means we can work out how big they are by dividing the width of the snapshot image by 15.

Once we’ve worked out these measurements, we can find the right part of the image for any given cell by simply using the offset from the top of the board area and adding in an appropriate X and Y offset based on the cell width.
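
To make the arithmetic concrete, here is a minimal sketch of extracting the sub-image for a given cell. It uses the OpenCV Java bindings purely for illustration (the real project code may use something else entirely), and boardTop, the pixel offset of the top of the playing area, is the per-device constant discussed above.

import org.opencv.core.Mat;
import org.opencv.core.Rect;

public class BoardCells {
    public static Mat cellAt(Mat screenshot, int col, int row, int boardTop) {
        int cellSize = screenshot.cols() / 15;          // 15 square cells span the width
        Rect roi = new Rect(col * cellSize,             // x offset across the board
                            boardTop + row * cellSize,  // y offset from the top of the board
                            cellSize, cellSize);
        return screenshot.submat(roi);                  // a view onto that single cell
    }
}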

The next task is working out whether there is a tile played at a given cell or not.

My first thought was that I could do this by looking for a mostly yellow cell at each location but there’s a problem with that. As mentioned yesterday there is a slight colour gradient across the board which means that tiles at the top of the board are a darker yellow than tiles in the middle of the board.

Also, where a tile is played on a double or triple word or letter score, the colour of the underlying cell bleeds through. This means that a tile in the upper half of the board on a double word score (red) has pretty much exactly the same colour as an empty triple word score tile (orange).

This doesn’t mean we can’t use colour at all, it just means that we need to be aware of where a tile is on the board and what’s underneath it. There are six types of cell, each of which has its own colour characteristics. These are:

  • Empty – mostly grey
  • Double letter – mostly blue
  • Triple letter – mostly green
  • Double word – mostly red
  • Triple word – mostly orange
  • Middle tile – a white cross on a darkish red background

There are two important facts here. Firstly, we know exactly where we expect to find these cells on the board and secondly, the colour gradient used on the tiles is not used on the board cells. This means every TW cell is the same colour as every other TW cell.

[Image: tile average colours]

We first calculate the mean colour across each type of cell. You can see these colours here. Note that the colours are expressed in L*a*b* rather than the normal RGB triplets which is why there are negative values.

Now when we look at a cell on the board, we can calculate the average colour across that cell and if it differs from the expected values by more than some threshold (I use 10%) then we assume that it has a tile on it.
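
As a sketch of that test (again in OpenCV Java terms, with the expected per-cell-type colours and the threshold left as inputs rather than hard-coded):

import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Scalar;
import org.opencv.imgproc.Imgproc;

public class CellTest {
    // Returns true if the cell's mean L*a*b* colour is further than
    // 'threshold' from the expected colour for an empty cell of this type,
    // in which case we assume a tile has been played there.
    public static boolean looksLikeTile(Mat cellRgb, Scalar expectedLab, double threshold) {
        Mat lab = new Mat();
        Imgproc.cvtColor(cellRgb, lab, Imgproc.COLOR_RGB2Lab);
        Scalar mean = Core.mean(lab);

        double dist = 0;
        for (int i = 0; i < 3; i++) {
            double d = mean.val[i] - expectedLab.val[i];
            dist += d * d;
        }
        return Math.sqrt(dist) > threshold;
    }
}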

[Image: processing of tile E]

Once we have the locations of each tile, we can extract that cell and do some further image processing to work out what the letter on the tile is. First I take the greyscale version of the tile and then threshold it using Otsu’s method. This gives me a black and white version of the tile. From here, I calculate the bounding box of the letter itself (i.e. stripping out the score in the corner of the tile) before trying to recognise the letter.
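
Here is a sketch of that pre-processing, once more in OpenCV Java terms rather than the project’s actual code. It greyscales the tile, applies an Otsu threshold and takes the union of the bounding boxes of the foreground blobs; note that it makes no attempt to exclude the score digit in the corner.

import java.util.ArrayList;
import java.util.List;

import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.Rect;
import org.opencv.imgproc.Imgproc;

public class TilePreprocess {
    // Returns the bounding box of the dark glyph pixels, or null for a blank tile.
    public static Rect letterBoundingBox(Mat tileRgb) {
        Mat grey = new Mat();
        Imgproc.cvtColor(tileRgb, grey, Imgproc.COLOR_RGB2GRAY);

        // Otsu chooses the threshold automatically; invert so the letter is white.
        Mat bw = new Mat();
        Imgproc.threshold(grey, bw, 0, 255,
                Imgproc.THRESH_BINARY_INV + Imgproc.THRESH_OTSU);

        List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
        Imgproc.findContours(bw, contours, new Mat(),
                Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);

        Rect box = null;
        for (MatOfPoint c : contours) {
            Rect r = Imgproc.boundingRect(c);
            box = (box == null) ? r : union(box, r);
        }
        return box;
    }

    private static Rect union(Rect a, Rect b) {
        int x = Math.min(a.x, b.x);
        int y = Math.min(a.y, b.y);
        int right = Math.max(a.x + a.width, b.x + b.width);
        int bottom = Math.max(a.y + a.height, b.y + b.height);
        return new Rect(x, y, right - x, bottom - y);
    }
}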

OCR in general can be difficult and expensive; however, in this case we have a very limited domain. There are only 26 letters to recognise and they are all positioned in the same way within a cell, give or take a pixel or two.

This paper describes a set of 16 different measurements that can be made across a set of letters from a font to distinguish between them. I use these measurements, normalised to percentages, to build a descriptor of each possible letter A-Z. Then, for each tile on the board, I compare its measurements to the stored descriptors, calculating the Euclidean distance to each to select the nearest neighbour. A sketch of that matching step follows.
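
The nearest-neighbour step itself is simple enough to sketch directly in plain Java. The 16-element descriptors come from training and are assumed here to be stored in A-Z order; the measurement code that produces them is not shown.

public class LetterMatcher {
    // Returns the letter whose stored descriptor is closest (in Euclidean
    // distance) to the tile's feature vector.
    public static char nearestLetter(double[] tileFeatures, double[][] letterDescriptors) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int letter = 0; letter < letterDescriptors.length; letter++) {
            double sum = 0;
            for (int i = 0; i < tileFeatures.length; i++) {
                double d = tileFeatures[i] - letterDescriptors[letter][i];
                sum += d * d;
            }
            if (sum < bestDist) {       // comparing squared distances is sufficient
                bestDist = sum;
                best = letter;
            }
        }
        return (char) ('A' + best);     // descriptors stored in A-Z order
    }
}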

[Image: problem tiles]

This gives pretty fair, though not perfect results as seen yesterday. One of the key problems is with the letter W. As you can see from the processing of that tile, the score in the corner of the tile actually merges with the letter itself in the processed tile. This may be making it more difficult to recognise. Similarly, the red circle containing the score of the last turn played can also corrupt the letter on a tile making it hard to recognise. The next step is to improve the pre-processing of tile images to hopefully get better recognition.

I’m going to attempt to improve these results by trying to tidy up the source images a little.

OCR – Early results encouraging

Well after a couple of days of effort, I’ve hit my first major milestone. Using screen grabs from Words with Friends boards on the iPhone, I’m now able to parse out the board. Given the input:

[Screenshot: the Words with Friends board used as input]

My board parser produces the following output.

            V Q
           YE A
           SR R
            BAM
             D
       CLEFT AT
J     SO     GI
IS   ZOO KITTEN
BE   AXLE  RI N
?POOFS I  HAND?
 T  O  E CINE
    ROUSTER
   DARN  LEG

Now there are a couple of mistakes here where letters have been mis-recognised and more worryingly, there are a couple of spots where the code hasn’t even worked out that there is a tile present. Nevertheless, for a simple algorithm I’m pretty happy with this as a first pass!

The broad approach here is to first locate tiles on the board using colour as a guideline. For each tile we then try to recognise the character represented on the tile. Since there are only 26 distinct tiles, this is a reasonably straightforward task (compared for example with recognising handwriting!)

Obviously I’m going to revisit the OCR code and train on more board positions (I only trained on 8 boards to get this level of recognition) but I think this validates the basic approach.

Beyond training, other things I might need to take a look at are:

1. Getting rid of the red circle with the score in it. This is definitely corrupting my character recognition. You can see the mis-recognised A was marred by the score.

2. If you kind of squint at the screen, you’ll see that there’s a colour gradient across the tiles. Tiles at the very top and bottom of the screen are darker yellow while tiles in the central band of the board are much lighter. Since my tile recogniser relies on detecting colours, it may be the paleness of these central tiles that’s causing them not to be detected.

Anyway, it’s late now so I’m off to bed but will post a much more detailed description of the algorithms used to find tiles and recognise letters tomorrow.

For those of you who can’t wait, you might like to sneak a peek at Peter Frey and David Slate’s paper, Letter Recognition Using Holland-Style Adaptive Classifiers.