iOS Tesseract OCR Image Preparation

I'm currently working on the same thing.
I found that a PNG saved in Photoshop worked fine, but an image originally sourced from the camera and then imported into the app never worked.
Don't ask me to explain it, but applying this function made those images work. Maybe it'll work for you too.

// this does the trick to have tesseract accept the UIImage.
UIImage *gs_convert_image(UIImage *src_img) {
    CGColorSpaceRef d_colorSpace = CGColorSpaceCreateDeviceRGB();
    /*
     * Note we specify 4 bytes per pixel here even though we ignore the
     * alpha value; you can't specify 3 bytes per pixel.
     */
    size_t d_bytesPerRow = src_img.size.width * 4;
    unsigned char *imgData = (unsigned char *)malloc(src_img.size.height * d_bytesPerRow);
    CGContextRef context = CGBitmapContextCreate(imgData,
                                                 src_img.size.width,
                                                 src_img.size.height,
                                                 8,
                                                 d_bytesPerRow,
                                                 d_colorSpace,
                                                 kCGImageAlphaNoneSkipFirst);

    UIGraphicsPushContext(context);
    // These next two lines 'flip' the drawing so it doesn't appear upside-down.
    CGContextTranslateCTM(context, 0.0, src_img.size.height);
    CGContextScaleCTM(context, 1.0, -1.0);
    // Use UIImage's drawInRect: instead of the CGContextDrawImage function,
    // otherwise you'll have issues when the source image is in portrait orientation.
    [src_img drawInRect:CGRectMake(0.0, 0.0, src_img.size.width, src_img.size.height)];
    UIGraphicsPopContext();

    /*
     * At this point, we have the raw ARGB pixel data in the imgData buffer, so
     * we can perform whatever image processing here.
     */

    // After we've processed the raw data, turn it back into a UIImage instance.
    CGImageRef new_img = CGBitmapContextCreateImage(context);
    UIImage *convertedImage = [[UIImage alloc] initWithCGImage:new_img];

    CGImageRelease(new_img);
    CGContextRelease(context);
    CGColorSpaceRelease(d_colorSpace);
    free(imgData);
    return convertedImage;
}
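
Usage is just a matter of running the camera image through the function before handing it to tesseract; the variable names below are only illustrative:

UIImage *fixedImage = gs_convert_image(cameraImage);
// pass fixedImage to tesseract instead of the original camera image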

I've also done a lot of experimenting with preparing the image for tesseract. Resizing, converting to grayscale, and then adjusting brightness and contrast seems to work best.
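
For what it's worth, here is a rough sketch of the resize-plus-grayscale part of that pipeline; the helper name and the scale factor parameter are my own, and the brightness/contrast adjustment would still be layered on top of this:

UIImage *prepareForTesseract(UIImage *src, CGFloat scaleFactor) {
    // Scale up (small text OCRs badly) and redraw into an 8-bit grayscale context.
    size_t width  = (size_t)(src.size.width  * scaleFactor);
    size_t height = (size_t)(src.size.height * scaleFactor);

    CGColorSpaceRef graySpace = CGColorSpaceCreateDeviceGray();
    CGContextRef ctx = CGBitmapContextCreate(NULL, width, height, 8, 0,
                                             graySpace, kCGImageAlphaNone);
    CGColorSpaceRelease(graySpace);
    if (ctx == NULL) return nil;

    // High-quality interpolation helps when upscaling small text.
    CGContextSetInterpolationQuality(ctx, kCGInterpolationHigh);
    CGContextDrawImage(ctx, CGRectMake(0, 0, width, height), src.CGImage);

    CGImageRef grayImage = CGBitmapContextCreateImage(ctx);
    UIImage *result = [UIImage imageWithCGImage:grayImage];
    CGImageRelease(grayImage);
    CGContextRelease(ctx);
    return result;
}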

I've also tried the GPUImage library: https://github.com/BradLarson/GPUImage
Its GPUImageAverageLuminanceThresholdFilter seems to give me a nicely adjusted image, but tesseract doesn't seem to work well with the result.
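
For reference, this is roughly how I'm driving that filter; imageByFilteringImage: is GPUImage's one-shot convenience method, but double-check the API against the version you pull in:

#import "GPUImageAverageLuminanceThresholdFilter.h"

UIImage *thresholdForOCR(UIImage *src) {
    GPUImageAverageLuminanceThresholdFilter *filter =
        [[GPUImageAverageLuminanceThresholdFilter alloc] init];
    // Produces a black-and-white image thresholded around the average luminance.
    return [filter imageByFilteringImage:src];
}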

I've also added OpenCV to my project and plan to try out its image routines, possibly even some box detection to find the text area (I'm hoping this will speed up tesseract).

Tesseract OCR Camera

Almost certainly the problem is "orientation". Apple tends to create images in one bitmap layout: the pixel data is laid out as if the camera were on its side with the volume buttons at the top right. Images that appear taller than they are wide are still laid out that way; an "orientation" entry in the EXIF data included with the image tells consumers how to rotate them for display.

I'm going to guess that tesseract does not look at the EXIF data, but expects the image in a "standard" layout, with the text positioned the way a person would read it.

You can test my hypothesis by using camera images taken with the volume buttons at the top right.

If those work, then what you will need to do is process the image yourself and rearrange the pixels according to the orientation setting. This is not all that hard to do, but it will require you to read up on vImage and/or bitmap contexts.
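
As a starting point, here is a minimal sketch (the helper name is mine) that simply redraws the UIImage so its pixel data matches the displayed orientation; drawInRect: applies the orientation transform for you, so you don't have to shuffle the bits by hand:

UIImage *normalizedImage(UIImage *image) {
    if (image.imageOrientation == UIImageOrientationUp) {
        return image; // pixel data already matches the displayed orientation
    }
    UIGraphicsBeginImageContextWithOptions(image.size, YES, image.scale);
    // drawInRect: honors imageOrientation, so the redrawn pixels come out upright.
    [image drawInRect:CGRectMake(0, 0, image.size.width, image.size.height)];
    UIImage *result = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
    return result;
}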

Implement Tesseract OCR in iPhone

Try this one:
http://tinsuke.wordpress.com/2011/11/01/how-to-compile-and-use-tesseract-3-01-on-ios-sdk-5/

There is also some older info here:
https://github.com/rcarlsen/Pocket-OCR

Best Free Library for OCR in iOS

You should try this library; it supports both Objective-C and Swift:

https://github.com/gali8/Tesseract-OCR-iOS
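
A minimal usage sketch with its G8Tesseract class looks roughly like this (check the project's README for the current API and for bundling the tessdata language files):

#import <TesseractOCR/TesseractOCR.h>

G8Tesseract *tesseract = [[G8Tesseract alloc] initWithLanguage:@"eng"];
tesseract.image = preparedImage; // any UIImage you've already prepared for OCR
[tesseract recognize];
NSLog(@"Recognized text: %@", [tesseract recognizedText]);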


