Wednesday, August 5, 2009

Activity 10: Preprocessing Text

For this activity, we used the imaging techniques we have learned so far to process a handwritten document. The goal of the first part is to isolate the letters from each other and from the lines of the form. To remove the lines, we designed a filter that blocks them in Fourier space. Since the lines repeat regularly in the form, they show up as intense peaks in Fourier space. Moreover, since I cropped the image such that only horizontal lines are seen, these peaks lie along the vertical axis, so we used a vertical filter. Prior to filtering, the image was rotated using GIMP to make the horizontal lines straight. The original image (cropped and rotated), the filter used, and the resulting image are shown below.
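To make the filtering step concrete, here is a minimal sketch in plain Python (not the software actually used for this activity). It assumes a tiny synthetic image and a mask that zeroes the vertical axis of the Fourier plane while keeping the DC term. The names `dft1`, `dft2`, and `remove_horizontal_lines` are made up for illustration, and the mask I actually used may have been shaped differently.

```python
import cmath

def dft1(x, inverse=False):
    """Naive 1D discrete Fourier transform (fine for tiny demo images)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def dft2(img, inverse=False):
    """2D DFT: transform every row, then every column. Result is indexed [v][u]."""
    rows = [dft1(r, inverse) for r in img]
    cols = [dft1(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def remove_horizontal_lines(img):
    """Zero the vertical axis of the spectrum (u = 0, v != 0), keeping the DC term."""
    F = dft2(img)
    for v in range(1, len(F)):
        F[v][0] = 0  # horizontal lines put their energy at zero horizontal frequency
    out = dft2(F, inverse=True)
    return [[p.real for p in row] for row in out]

# Synthetic 8x8 "form": two full-width horizontal lines plus one ink pixel.
img = [[0.0] * 8 for _ in range(8)]
for y in (0, 4):
    img[y] = [1.0] * 8
img[2][3] = 1.0

filtered = remove_horizontal_lines(img)
```

In this toy case the line rows come out with the same value as the blank rows, while the ink pixel still stands out from its own row.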
Notice that the lines of the form are no longer as visible as before, although some can still be seen in the resulting image. The next part is to isolate the letters from each other using morphological operators. Prior to that, the image is inverted since the operations work on the white part of the image. By examining the histogram, a threshold value was obtained and applied to the image to further separate the text from the background. After that, morphological operations were applied to isolate each letter and make each one 1 pixel thick. For this particular case I used
erode(erode(dilate(erode(inviconv,se),se),se),se2)
since this gave me the best result so far. The binarized image and the resulting image (after morphological operations were implemented) are shown below.
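For readers who want to see what inversion, thresholding, erosion, and dilation do pixel by pixel, here is a small sketch in plain Python. It is only a stand-in for the toolbox calls above: the structuring element `CROSS`, the threshold of 128, and the helper names are assumptions for illustration, since the actual `se`, `se2`, and threshold are not given here.

```python
def binarize(gray, threshold):
    """Invert an 8-bit grayscale image, then threshold it: dark ink becomes 1."""
    return [[1 if (255 - p) > threshold else 0 for p in row] for row in gray]

def erode(img, se):
    """Keep a pixel only if every structuring-element offset lands on a 1."""
    h, w = len(img), len(img[0])
    def on(y, x):
        return 0 <= y < h and 0 <= x < w and img[y][x] == 1
    return [[1 if all(on(y + dy, x + dx) for dy, dx in se) else 0
             for x in range(w)] for y in range(h)]

def dilate(img, se):
    """Turn a pixel on if any structuring-element offset reaches a 1."""
    h, w = len(img), len(img[0])
    def on(y, x):
        return 0 <= y < h and 0 <= x < w and img[y][x] == 1
    return [[1 if any(on(y - dy, x - dx) for dy, dx in se) else 0
             for x in range(w)] for y in range(h)]

# Hypothetical structuring element: a 3x3 cross given as (dy, dx) offsets.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

# White paper (255) with a dark 3x3 blob of ink (20) in the middle.
gray = [[255] * 5 for _ in range(5)]
for y in range(1, 4):
    for x in range(1, 4):
        gray[y][x] = 20

binary = binarize(gray, 128)        # 3x3 block of ones
eroded = erode(binary, CROSS)       # only the center pixel survives
restored = dilate(eroded, CROSS)    # grows back into a cross of 5 pixels
```

Chaining the calls, as in the expression above, is just a matter of nesting them; which order works best depends on the stroke thickness in the actual image.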
Notice in the figure that not all letters are 1 pixel thick and some letters are not visible anymore. This is the downside of applying morphological operations to the letters. Since the thickness of the letters varies significantly in the binarized image, letters (or parts of them) that were originally thin are erased when the operations are applied. A better image is expected if the letters have almost the same thickness :P
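The effect of uneven stroke thickness can be reproduced on a toy image. The sketch below (plain Python, with a hypothetical 3x3 square structuring element that is not necessarily the one I used) erodes one thin stroke and one thick stroke with the same element.

```python
def erode(img, se):
    """Keep a pixel only if every structuring-element offset lands on a 1."""
    h, w = len(img), len(img[0])
    def on(y, x):
        return 0 <= y < h and 0 <= x < w and img[y][x] == 1
    return [[1 if all(on(y + dy, x + dx) for dy, dx in se) else 0
             for x in range(w)] for y in range(h)]

# Hypothetical structuring element: a full 3x3 square of (dy, dx) offsets.
SQUARE = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

# 7x7 image: a 1-pixel-wide stroke at column 1 and a 3-pixel-wide stroke
# at columns 3..5, both spanning rows 1..5.
strokes = [[0] * 7 for _ in range(7)]
for y in range(1, 6):
    strokes[y][1] = 1
    for x in range(3, 6):
        strokes[y][x] = 1

eroded = erode(strokes, SQUARE)
thin_column = [row[1] for row in eroded]   # what is left of the thin stroke
thick_column = [row[4] for row in eroded]  # center column of the thick stroke
```

Here the thick stroke is reduced to a 1-pixel-wide line, while the thin stroke disappears completely, which is the same trade-off seen in the figure.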

To make the handwriting more readable, we have to compensate for the thickness. Using this operator instead
ero=erode(dilate(erode(inviconv,se),se),se)
gives a more readable result and erases the remaining lines visible in the binarized image. Although some of you may object that it is not readable enough, my claim is that that part is subjective :P

For the second part, we are to perform the correlation of the word "DESCRIPTION" with the text. Assuming that the word has the same size throughout the image, strictly speaking, only 3 intense bright spots (corresponding to high correlation) should be observed in the correlation computed with the Fourier Transform (FT), since "DESCRIPTION" appears only 3 times. However, if we look at the result, there are other bright spots. This is because there are white regions large enough to accommodate the whole area of "DESCRIPTION". This results in high correlation and hence a bright spot. An example is the logo on top of the form.
To verify this, I cropped the image such that the logo is not included anymore. Here we observed the 3 intense spots, as expected.
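The reason a solid white region scores as high as a true match can be seen in a small sketch. It is written in plain Python and sums the products directly instead of going through the FT, which by the correlation theorem gives the same scores apart from how the image borders are handled. The tiny template and image, and the names `correlate` and `peaks`, are made up for illustration.

```python
def correlate(img, tmpl):
    """Valid-mode cross-correlation: slide tmpl over img and sum the products."""
    H, W = len(img), len(img[0])
    h, w = len(tmpl), len(tmpl[0])
    return [[sum(tmpl[j][i] * img[y + j][x + i]
                 for j in range(h) for i in range(w))
             for x in range(W - w + 1)]
            for y in range(H - h + 1)]

def peaks(scores):
    """Return the (row, column) positions that reach the maximum score."""
    best = max(max(row) for row in scores)
    return [(y, x) for y, row in enumerate(scores)
            for x, s in enumerate(row) if s == best]

# Toy "word": a diagonal pattern of two white pixels.
template = [[1, 0],
            [0, 1]]

# Toy "form": the word at the top left and a solid white "logo" at the bottom right.
image = [[1, 0, 0, 0, 0, 0],
         [0, 1, 0, 0, 0, 0],
         [0, 0, 0, 1, 1, 0],
         [0, 0, 0, 1, 1, 0]]

with_logo = peaks(correlate(image, template))

# Crop away the columns containing the logo and correlate again.
cropped = [row[:3] for row in image]
without_logo = peaks(correlate(cropped, template))
```

With the logo present, the solid block ties with the true match because every white pixel of the template lands on white. After cropping, only the true match remains at the maximum.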

For this activity, I give myself a 7/10 since I was not able to isolate the letters from each other and produce readable 1-pixel-thick letters. For me, this seems to be the hardest activity.

Many thanks to Ma'am Jing, Gilbert and Neil for the insights :)
