
Thursday, November 20, 2008


Basic OCR in OpenCV




Demo Source from GitHub

In this tutorial we create a basic OCR for digits: it classifies a handwritten digit into its class (0 to 9).

To do this we build on what we learned in the earlier tutorials: the simple basic painter and the basic pattern recognition and classification with OpenCV tutorial.

A typical pattern recognition classifier consists of three modules:

Preprocessing: in this module we process the input image, for example normalizing its size or converting it from color to black and white.

Feature extraction: in this module we convert the processed image into a feature vector to classify. It can be the pixel matrix flattened into a vector, or a contour chain-code representation.

Classification: this module takes the feature vectors and either trains the system or classifies an input feature vector with a classification method such as kNN.

In this basic OCR we use the following scheme:

We take a training set and a test set of images to train and test our classifier (kNN).

We have 1000 handwritten images, 100 of each digit. We use 50 images of each digit (class) to train the system and the other 50 to test it.

The first task is to preprocess all the training images, so we create a preprocessing function. It takes an image and the width and height we want for the result, and returns the image cropped to its bounding box and normalized to that size. The process is clearer in this graph:


Pre-processing code:

[c language="++"]
// Find the first and last columns that contain at least one non-white pixel.
void findX(IplImage* imgSrc,int* min, int* max){
    int i;
    int minFound=0;
    CvMat data;
    //A column has imgSrc->height pixels, so an all-white column sums to height*255
    CvScalar maxVal=cvRealScalar(imgSrc->height * 255);
    CvScalar val=cvRealScalar(0);
    //For each column sum: if sum < height*255 the column contains ink.
    //The first such column is the min; the last one seen is the max.
    for (i=0; i< imgSrc->width; i++){
        cvGetCol(imgSrc, &data, i);
        val= cvSum(&data);
        if(val.val[0] < maxVal.val[0]){
            *max= i;
            if(!minFound){
                *min= i;
                minFound= 1;
            }
        }
    }
}

// Find the first and last rows that contain at least one non-white pixel.
void findY(IplImage* imgSrc,int* min, int* max){
    int i;
    int minFound=0;
    CvMat data;
    //A row has imgSrc->width pixels, so an all-white row sums to width*255
    CvScalar maxVal=cvRealScalar(imgSrc->width * 255);
    CvScalar val=cvRealScalar(0);
    //For each row sum: if sum < width*255 the row contains ink.
    //The first such row is the min; the last one seen is the max.
    for (i=0; i< imgSrc->height; i++){
        cvGetRow(imgSrc, &data, i);
        val= cvSum(&data);
        if(val.val[0] < maxVal.val[0]){
            *max=i;
            if(!minFound){
                *min= i;
                minFound= 1;
            }
        }
    }
}
// Bounding box of the non-white pixels of the image.
CvRect findBB(IplImage* imgSrc){
    CvRect aux;
    int xmin, xmax, ymin, ymax;
    xmin=xmax=ymin=ymax=0;

    findX(imgSrc, &xmin, &xmax);
    findY(imgSrc, &ymin, &ymax);

    //xmax and ymax are inclusive indices, so add 1 to get the size
    aux=cvRect(xmin, ymin, xmax-xmin+1, ymax-ymin+1);

    //printf("BB: %d,%d - %d,%d\n", aux.x, aux.y, aux.width, aux.height);

    return aux;

}

IplImage preprocessing(IplImage* imgSrc,int new_width, int new_height){
    IplImage* result;
    IplImage* scaledResult;

    CvMat data;
    CvMat dataA;
    CvRect bb;//bounding box

    //Find bounding box
    bb=findBB(imgSrc);

    //Get the bounding box data; its aspect ratio is not corrected yet
    cvGetSubRect(imgSrc, &data, cvRect(bb.x, bb.y, bb.width, bb.height));
    //Create a square image (aspect ratio 1) whose side is the
    //larger of the width and height of our bounding box
    int size=(bb.width>bb.height)?bb.width:bb.height;
    result=cvCreateImage( cvSize( size, size ), 8, 1 );
    cvSet(result,CV_RGB(255,255,255),NULL);
    //Copy the data into the center of the square image
    int x=(int)floor((float)(size-bb.width)/2.0f);
    int y=(int)floor((float)(size-bb.height)/2.0f);
    cvGetSubRect(result, &dataA, cvRect(x,y,bb.width, bb.height));
    cvCopy(&data, &dataA, NULL);
    //Scale result
    scaledResult=cvCreateImage( cvSize( new_width, new_height ), 8, 1 );
    cvResize(result, scaledResult, CV_INTER_NN);
    //The intermediate square image is no longer needed
    cvReleaseImage(&result);

    //Return processed data (the image header is copied by value)
    return *scaledResult;

}[/c]
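To check the bounding-box idea without OpenCV, here is a minimal sketch on a plain row-major pixel buffer. The names Box and findBoundingBox are hypothetical and not part of the demo source, and the sketch assumes a white background, so any pixel below 255 counts as ink.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical type for this sketch: a rectangle in pixel coordinates.
struct Box {
    int x, y, width, height;
};

// Bounding box of every pixel darker than white (255) in a row-major,
// 8-bit, single-channel image. Returns a zero-sized box for an all-white image.
Box findBoundingBox(const std::vector<std::uint8_t>& img, int width, int height) {
    int xmin = width, xmax = -1, ymin = height, ymax = -1;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::size_t idx = static_cast<std::size_t>(y) * static_cast<std::size_t>(width)
                              + static_cast<std::size_t>(x);
            if (img[idx] < 255) {
                xmin = std::min(xmin, x);
                xmax = std::max(xmax, x);
                ymin = std::min(ymin, y);
                ymax = std::max(ymax, y);
            }
        }
    }
    if (xmax < 0) {
        return Box{0, 0, 0, 0};
    }
    // Both ends are inclusive indices, hence the +1
    return Box{xmin, ymin, xmax - xmin + 1, ymax - ymin + 1};
}
```

A stroke that is a single pixel wide still gets a box of width 1 here, because the size is computed from inclusive first and last indices.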

We use the getData function of the basicOCR class to create the training data and training classes. This function reads all the images under the OCR folder, which contains one folder per class. Each file is a PBM file named cnn.pbm, where c is the class {0..9} and nn is the image number {00..99}.
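As a small illustration of this naming scheme, the sketch below builds the path of one sample. The helper samplePath is hypothetical (it is not part of basicOCR), and the root folder "../OCR/" used in the usage note is only an assumed example value.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: builds "<root><c>/<c><nn>.pbm".
// `root` is assumed to end with a path separator.
std::string samplePath(const std::string& root, int digitClass, int index) {
    char buf[512];
    // %02d zero-pads the image number, so index 7 becomes "07"
    std::snprintf(buf, sizeof(buf), "%s%d/%d%02d.pbm",
                  root.c_str(), digitClass, digitClass, index);
    return std::string(buf);
}
```

With these assumptions, samplePath("../OCR/", 3, 7) gives "../OCR/3/307.pbm". Indices 0 to 49 would address the training images and 50 to 99 the test images of each class.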

Each image we read is preprocessed and then converted into the feature vector we use.
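The conversion itself is simple: every 8-bit pixel becomes one float in the range 0 to 1, and the image rows are laid out one after another. Here is a library-free sketch of that step; toFeatureVector is a hypothetical name, and the input is assumed to be a row-major grayscale buffer.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: turns a row-major 8-bit grayscale image into a
// float feature vector with values in [0, 1].
std::vector<float> toFeatureVector(const std::vector<std::uint8_t>& pixels) {
    std::vector<float> features;
    features.reserve(pixels.size());
    for (std::uint8_t p : pixels) {
        // Dividing by 255 maps 0..255 to 0..1
        features.push_back(static_cast<float>(p) / 255.0f);
    }
    return features;
}
```

The factor 0.0039215 used in the tutorial code is approximately 1/255, so it performs the same scaling.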

basicOCR.cpp getData code:

[c language="++"]
void basicOCR::getData()
{
    IplImage* src_image;
    IplImage prs_image;
    CvMat row,data;
    char file[255];
    int i,j;
    for(i =0; i<classes; i++){
        for( j = 0; j< train_samples; j++){

            //Load file
            if(j<10)
                sprintf(file,"%s%d/%d0%d.pbm",file_path, i, i , j);
            else
                sprintf(file,"%s%d/%d%d.pbm",file_path, i, i , j);
            src_image = cvLoadImage(file,0);
            if(!src_image){
                printf("Error: Can't load image %s\n", file);
                //Skip this sample: preprocessing would dereference a NULL image
                continue;
            }
            //process file
            prs_image = preprocessing(src_image, size, size);

            //Set class label
            cvGetRow(trainClasses, &row, i*train_samples + j);
            cvSet(&row, cvRealScalar(i));
            //Set data
            cvGetRow(trainData, &row, i*train_samples + j);

            IplImage* img = cvCreateImage( cvSize( size, size ), IPL_DEPTH_32F, 1 );
            //convert the 8-bit image to a 32-bit float image, scaling 0..255 to 0..1
            cvConvertScale(&prs_image, img, 0.0039215, 0);

            cvGetSubRect(img, &data, cvRect(0,0, size,size));

            CvMat row_header, *row1;
            //reshape the size x size data matrix into a single-row vector
            row1 = cvReshape( &data, &row_header, 0, 1 );
            cvCopy(row1, &row, NULL);

            //the row now holds its own copy of the data
            cvReleaseImage(&img);
            cvReleaseImage(&src_image);
        }
    }
}[/c]

After processing the images and obtaining the training data and classes, we train our model with them. In our sample we use the kNN method:

[c language="++"]knn=new CvKNearest( trainData, trainClasses, 0, false, K );[/c]
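To make clearer what a kNN classifier does with this data, here is a small library-free sketch. It is not OpenCV's implementation: Sample, squaredDistance and knnClassify are hypothetical names, and breaking ties toward the smallest label is an arbitrary choice for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One labeled training sample: a feature vector and its class (0..9).
struct Sample {
    std::vector<float> features;
    int label;
};

// Squared Euclidean distance between two vectors of equal length.
float squaredDistance(const std::vector<float>& a, const std::vector<float>& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Returns the majority label among the k training samples closest to `query`,
// or -1 if there is no training data. Ties go to the smallest label.
int knnClassify(const std::vector<Sample>& train,
                const std::vector<float>& query, std::size_t k) {
    std::vector<std::pair<float, int> > dist;  // (distance, label)
    for (const Sample& s : train) {
        dist.push_back(std::make_pair(squaredDistance(s.features, query), s.label));
    }
    k = std::min(k, dist.size());
    // Only the k smallest distances need to be in order
    std::partial_sort(dist.begin(), dist.begin() + static_cast<std::ptrdiff_t>(k), dist.end());

    std::map<int, int> votes;  // label -> number of neighbors with that label
    for (std::size_t i = 0; i < k; ++i) {
        ++votes[dist[i].second];
    }
    int best = -1;
    int bestVotes = 0;
    for (const auto& v : votes) {
        if (v.second > bestVotes) {
            best = v.first;
            bestVotes = v.second;
        }
    }
    return best;
}
```

"Training" in this sketch is just storing the samples; all the work happens at classification time, which is why kNN gets slower as the training set grows.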

Now we can test our model, and use the test result to compare against other methods or to see the effect of changes such as reducing the image size. The basicOCR class has a function for this, test.

This function takes the other 500 samples, classifies them with our selected method and checks the results.

[c language="++"]void basicOCR::test(){
    IplImage* src_image;
    IplImage prs_image;
    CvMat row,data;
    char file[255];
    int i,j;
    int error=0;
    int testCount=0;
    for(i =0; i<classes; i++){
        //the images from index 50 onward are the test set of each class
        for( j = 50; j< 50+train_samples; j++){

            sprintf(file,"%s%d/%d%d.pbm",file_path, i, i , j);
            src_image = cvLoadImage(file,0);
            if(!src_image){
                printf("Error: Can't load image %s\n", file);
                //Skip this sample: preprocessing would dereference a NULL image
                continue;
            }
            //process file
            prs_image = preprocessing(src_image, size, size);
            float r=classify(&prs_image,0);
            if((int)r!=i)
                error++;

            testCount++;
            cvReleaseImage(&src_image);
        }
    }
    //percentage of misclassified test samples
    float totalerror=100*(float)error/(float)testCount;
    printf("System Error: %.2f%%\n", totalerror);

}[/c]

The test function uses the classify function, which takes the image to classify, preprocesses it, builds the feature vector and classifies it with the find_nearest method of the kNN class. We also use this function to classify the images drawn by the user:

[c language="++"]float basicOCR::classify(IplImage* img, int showResult)
{
    IplImage prs_image;
    CvMat data;
    CvMat* nearest=cvCreateMat(1,K,CV_32FC1);
    float result;
    //process file
    prs_image = preprocessing(img, size, size);

    //Set data: convert to 32-bit float in 0..1 and reshape to a single row
    IplImage* img32 = cvCreateImage( cvSize( size, size ), IPL_DEPTH_32F, 1 );
    cvConvertScale(&prs_image, img32, 0.0039215, 0);
    cvGetSubRect(img32, &data, cvRect(0,0, size,size));
    CvMat row_header, *row1;
    row1 = cvReshape( &data, &row_header, 0, 1 );

    //result is the predicted class; nearest receives the labels of the K neighbors
    result=knn->find_nearest(row1,K,0,0,nearest,0);

    //Count how many of the K neighbors agree with the predicted class
    int accuracy=0;
    for(int i=0;i<K;i++){
        if( nearest->data.fl[i] == result)
            accuracy++;
    }
    float pre=100*((float)accuracy/(float)K);
    if(showResult==1){
        printf("|\t%.0f\t| \t%.2f%%  \t| \t%d of %d \t| \n",result,pre,accuracy,K);
        printf(" ---------------------------------------------------------------\n");
    }

    //Release temporaries (row1 points into img32, so release it after use)
    cvReleaseMat(&nearest);
    cvReleaseImage(&img32);

    return result;

}[/c]
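The percentage printed by classify is not the accuracy of the system; it is the share of the K neighbors that voted for the returned class. A minimal sketch of that computation, with the hypothetical helper neighborAgreement working on plain integer labels:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: percentage of neighbor labels equal to the predicted label.
float neighborAgreement(const std::vector<int>& neighborLabels, int predicted) {
    if (neighborLabels.empty()) {
        return 0.0f;
    }
    std::size_t agree = 0;
    for (int label : neighborLabels) {
        if (label == predicted) {
            ++agree;
        }
    }
    return 100.0f * static_cast<float>(agree)
           / static_cast<float>(neighborLabels.size());
}
```

For example, if 7 of 10 neighbors carry the predicted label the function returns 70, which is the kind of value shown in the second column of the printed table.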

All the training and testing work is in the basicOCR class: once we create a basicOCR instance, we only need to call classify to classify an input image. We then use the basic Painter we created in an earlier tutorial to let the user draw an image interactively and classify it.

Demo Source

About David Millán Escrivá

David completed his studies at the Universidad Politecnica de Valencia in IT with a Master's degree in artificial intelligence, computer graphics, and pattern recognition, focusing on pattern recognition and Computer Vision. David has more than 15 years of experience in IT, with more than ten years of experience in Computer Vision, computer graphics, and pattern recognition, working on different projects and start-ups, applying his knowledge of Computer Vision, optical character recognition, and augmented reality. Co-author of two OpenCV books and reviewer of a few more.

368 comments:

  1. Hey, this seems a really nice tutorial. Do you think it can work in text numbers like those of a picture, for example

    http://img519.imageshack.us/my.php?image=20070609211136zk8.jpg

    Anywayz, nice job :)

  2. Thanks for your comment.

    Yes, you can use this for text, numbers, faces and anything you want, but in your sample you first need to segment the image to detect the text, then segment the text into characters, and then recognize each character with this method.

  3. You mean, to take the part of the picture with the time and date(for example the bottom of the picture), then separate each character and finally doing this, it ll work ?

  4. Yes! But you need to train this algorithm with all characters, not only numbers. It's the same process.

  5. Oh, I get it. Do you have some function for the training?
    Do you have the whole thing as working source code?

  6. Actually, I am interested in finding time, so I am mostly going to need number recognition. Well, I am going to try this and see what it happens.

  7. OK then: if you can find where the time is in the image, segment it to retrieve each number and then use my algorithm. But my algorithm is trained with handwritten numbers, and you must train it with OS fonts.

  8. Well, segmenting the picture is not that difficult for what I am doing now. How many samples do you think are adequate for making good classification ?

    Also, do you think that the size of the characters will play a role ? Because I tested tesseract and the performace was poor.

  9. I guess, though, it won't need that many samples, because they are produced by a computer and the characters are identical.

  10. Yes, the characters are identical if they use the same font. But you can use more fonts for classification.

    In theory, the more samples the better: 1,000, 100,000... In your case, 500 per character or fewer is sufficient.

    The best character size is best found with empirical tests: try 10x10 px, 15x15, 20x20, 40x40 and use the one that gives you the best results.

  11. I just used that, and I can say that it's pretty impressive. Well done!

    Actually, I have many pictures from the same camera, so I have lots of training data. However, the size of the characters is fixed and approximately 10x10 if I don't do any scaling up. I will try them and I'll let you know.

    The bad thing is that I would like to make it pretty abstract, so that someone can select the area of the picture where the letters are and the training is done automatically. Well, I'll see how I am going to make it work. :)

  12. Try 5x5 px resolution too.

    If you know the position of the characters in the image then you can do it automatically; if not, this task is huge.

  13. Do the images have to be in PBM format or I can store them in JPG as well ?

  14. The images can be in any format OpenCV supports. JPG is fine, but the processing is done in grayscale mode.

  15. I did a first test, which was absolutely basic, and it worked. So I am optimistic. I wanted to ask you if you know how to save the training results, so as not to train the algorithm each time I run my software. Is there any way to save CvKNearest as an "image" or raw data or something else?

  16. Stratis, OpenCV has file storage functions to save settings and anything else you need. See the OpenCV documentation; the functions are cvFileStorage, cvFileNode, cvAttrList, cvOpenFileStorage, cvReleaseFileStorage, cvStartWriteStruct, cvEndWriteStruct, cvWriteInt, cvWriteReal, cvWriteString, cvWriteComment, cvStartNextStream, cvWrite, cvWriteRawData, cvWriteFileNode, cvGetRootFileNode, cvGetFileNodeByName, cvGetHashedKey, cvGetFileNode, cvGetFileNodeName, cvReadInt, cvReadIntByName...

    See the doc: http://opencv.willowgarage.com/wiki/CxCore#DataPersistenceandRTTI

  17. respected sir

    I have collected handwritten data and scanned it. Tell me how I can make my database for this OCR; is there any tool?

  18. Hi anikumar, in this OCR every image is stored as a single file in its class folder; the algorithm then takes each image and processes it to feed the kNN algorithm.

    You must put each handwritten character in its own folder (0, 1, 2, ..., 9, a, b, c, ..., z, A, B, ...) and then modify the OCR algorithm so that it accepts the characters a, b, ... and processes them too.

    I hope this answers your question.

  19. Hey there,

    I am trying to extract characters from an image but before that i need to deskew and denoise it. Can u help me out with this? I am trying the projection profile technique for the same. Any help would be really appreciated..

    Regards,

    Vikas

  20. from where i download this code of ocr

  21. please tell me good tutorial for open cv

  22. Hi Vikas, to denoise you can use a filter or a morphological algorithm; to deskew, you must look for the lines that define the skew transformation and then apply its inverse.

  23. Dear sir, I have installed OpenCV and VC++. I am able to compile and build the code, but how do I see the output?

  24. Sorry, which code: the OCR code or the OpenCV code?

    If you compiled my code, you only need to run the resulting executable.

  25. sir i am working with windows

  26. Yes, but are you sure it works correctly? Do you get errors when compiling? If you don't, it may be that Windows can't find the OCR images... check whether the program reports errors when it runs and can't find the files.

  27. please help me in this matter

  28. I don't have VC installed and i think i don't have any windows here, but give me some time to install a virtual machine with windows and install VC++ and openCV and then create a VC project that you can download.

  29. I have installed OpenCV on Linux and I am getting errors about include files and library files. Kindly guide me in this matter.

  30. Yes, sure, you only need to set this variable

    PKG_CONFIG_PATH=/usr/local/lib/opencv/pkgconfig

    and then

    export PKG_CONFIG_PATH

    I'm not sure about the path, because it depends on where you installed the OpenCV library.

  31. where i should set the path

  32. In the terminal you only need to write:

    PKG_CONFIG_PATH=/usr/local/lib/opencv/pkgconfig

    then write

    export PKG_CONFIG_PATH

    and to finish write

    make

  33. I am getting the error: no make target defined

  34. You are not in the OCR source folder.

  35. but go into ocr source folder ;) and type make

  36. these are all my paths of the files please guide me

    /root/Desktop/opencv-1.1.0/cv/include
    /root/Desktop/opencv-1.1.0/cv/src
    /root/Desktop/opencv-1.1.0/cxcore/include
    /root/Desktop/opencv-1.1.0/cxcore/src
    /root/Desktop/opencv-1.1.0/interfaces
    /root/Desktop/opencv-1.1.0/cvaux/include
    /root/Desktop/opencv-1.1.0/cvaux/src
    /root/Desktop/opencv-1.1.0/ml/include
    /root/Desktop/opencv-1.1.0/ml/src
    /root/Desktop/opencv-1.1.0/otherlibs/highgui

  37. OK, I understand you have never used Linux?

    You downloaded the OpenCV source, so you must compile and install it. These are the steps:

    ./configure
    make
    sudo make install

    If it finishes without errors then OpenCV is installed, and you only need to do:

    PKG_CONFIG_PATH=/usr/local/lib/opencv/pkgconfig
    export PKG_CONFIG_PATH

    then go to the OCR folder and type

    make

    But if you have never used Linux, I recommend compiling it on Windows... it's not difficult, I think...

  38. sir i am getting error while installing opencv
    suggest me alternate version

  39. If you get errors installing this version, you will surely get errors with other versions too.

    Try installing with Synaptic, a Linux package manager where you can install most Linux software: go to the administration menu and run Synaptic, then search for opencv and install the libs and libsXXX_dev packages.

  40. Hi Damiles!first of all excuse me for my English...This tutorial is very good!It gave me an idea...I would use the same system to create software for assisted driving. I want to capture frames with a webcam mounted on the dashboard of a car. For each frame I find the edges and then I only analyze the ROI that covers the road's board. At the end I have a binary image for each frame containing the contours. I want to use a classifier like yours for predict the steering angle (in a scale from 0 to 9). For example if the car is very near to the left border, the system will alert the driver and the output will be 0. If the car is in the right trajectory, the output will be 5. If the car is very near to the right board the output will be 9,....
    I have written some lines of code but I have never used classifiers... so if you want I can send you my sources by email, and if you have time maybe you could have a look. Contact me by email if you want.

    Thanks, Phelipe

  41. Hi
    I need to find a text in a scene with a complex background. Can you help me?
    Regards

  42. Hi Hamed, this week I'm going to write a basic and an advanced tutorial on background subtraction, so keep an eye on my blog!

  43. Hi Damiles.
    Thanks for the response and sorry to interupt you again.
    I have examined ur blog but could not find it.
    I have one picture and i cant form the background (might help!).
    Regards

  44. Hamed, tomorrow i will post it, please give me some time to write it hehehe... :D

  45. Thanks for the rapid response.
    I am waiting eagerly to hear from you.
    If you have any idea about License Plate Location please do me a favor and share it too.

  46. Hi ,
    I'm having difficulties compiling/binding your package on Mac OS. I have OpenCV done. Any suggestions?
    Thanks,

  47. Hi sergio, what is the problem? What error do you get? It compiles perfectly for me on Mac OS... I can create a CMake cross-platform build if that is better for you.

  48. This might be a stupid question, but I'm trying to understand how and where in your code you extract features from the images in your samples. More specifically how are you extracting the features and what features are you looking at for the images of the numbers? How are you evaluating these features when comparing it to the image file you are testing the accuracy of the classifier on?

  49. There are no stupid questions.

    The features are the pixels of the image.
    This happens in the "basicOCR::getData()" function.
    I use the trainData variable to store the features.
    I use trainClasses to store the class identifiers.

    First we get the row of trainData that we are going to fill:
    cvGetRow(trainData, &row, i*train_samples + j);

    Then we create the image that will hold the data:
    IplImage* img = cvCreateImage( cvSize( size, size ), IPL_DEPTH_32F, 1 );
    And convert the preprocessed image into a 32-bit float image:
    cvConvertScale(&prs_image, img, 0.0039215, 0);

    I convert the image matrix into a vector:
    cvGetSubRect(img, &data, cvRect(0,0, size,size));
    CvMat row_header, *row1;
    row1 = cvReshape( &data, &row_header, 0, 1 );

    Finally we copy the data into row, which is a header pointing at that row of trainData:
    cvCopy(row1, &row, NULL);

  50. Hi damiles,

    I really appreciate your example code.
    I am a newbie of OpenCV. I were assigned a project to separate the text from the image, then to recognize the text and also analysis the image structure(with text removed). But I really have no any idea how to separate the text from image. Any suggestions or example code?

    Thank you in advance!

  51. hello,

    can help me?
    I yet study openCV and I would compile your application with VC++ (MS-Visual Studio C++), do you have any example?

  52. Hi Togan, I don't have any example of separating the text from the image, sorry.

    Has, it's not difficult to compile with VC++; you only need to set up the libraries correctly. I'm going to create a CMake file for this example, for proper cross-platform compiling.

  53. hello,
    I just downloaded OpenCV and was trying to get the hang of it. I was going through the OCR code, because I want to work on OCR too... When I checked the findX and findY functions, it seemed to me that this will work only for text on white backgrounds. Am I right? How do I work around that? Also, how do I get the color of a pixel, e.g. if I want to test whether (20,30) is black?

  54. Yes, you're right.

    To test the color of the pixel at position x,y you can check the FAQ that comes with the OpenCV source (doc/faq.htm).

    The FAQ says:

    How to access image pixels

    (The coordinates are 0-based and counted from image origin, either top-left (img->origin=IPL_ORIGIN_TL) or bottom-left (img->origin=IPL_ORIGIN_BL))

    * Suppose, we have 8-bit 1-channel image I (IplImage* img):

    I(x,y) ~ ((uchar*)(img->imageData + img->widthStep*y))[x]

    * Suppose, we have 8-bit 3-channel image I (IplImage* img):

    I(x,y)blue ~ ((uchar*)(img->imageData + img->widthStep*y))[x*3]
    I(x,y)green ~ ((uchar*)(img->imageData + img->widthStep*y))[x*3+1]
    I(x,y)red ~ ((uchar*)(img->imageData + img->widthStep*y))[x*3+2]

    e.g. increasing brightness of point (100,100) by 30 can be done this way:

    CvPoint pt = {100,100};
    ((uchar*)(img->imageData + img->widthStep*pt.y))[pt.x*3] += 30;
    ((uchar*)(img->imageData + img->widthStep*pt.y))[pt.x*3+1] += 30;
    ((uchar*)(img->imageData + img->widthStep*pt.y))[pt.x*3+2] += 30;

    or more efficiently

    CvPoint pt = {100,100};
    uchar* temp_ptr = &((uchar*)(img->imageData + img->widthStep*pt.y))[pt.x*3];
    temp_ptr[0] += 30;
    temp_ptr[1] += 30;
    temp_ptr[2] += 30;

    * Suppose, we have 32-bit floating point, 1-channel image I (IplImage* img):

    I(x,y) ~ ((float*)(img->imageData + img->widthStep*y))[x]

    * Now, the general case: suppose, we have N-channel image of type T:

    I(x,y)c ~ ((T*)(img->imageData + img->widthStep*y))[x*N + c]
    or you may use macro CV_IMAGE_ELEM( image_header, elemtype, y, x_Nc )
    I(x,y)c ~ CV_IMAGE_ELEM( img, T, y, x*N + c )

    There are functions that work with arbitrary (up to 4-channel) images and matrices (cvGet2D, cvSet2D), but they are pretty slow.

  55. Hi,
    Could you please explain your getData function to me....I'm not understanding the functions you've used and why you've used them.

    1.How are you exactly setting trainData? Are trainClasses and trainData initialized to something or do I just declare them as CvMat* variables?

    2.I know cvGetRow gets the row value from a matrix but I what is this line doing exactly:
    cvGetRow(trainData, &row, i*train_samples + j);

  56. Hi again,

    I got it going....took me time to understand but finally did...thanks for the earlier reply btw.... just a lingering doubt, everytime I run the program do I need to train it?

  57. Yes, you need to do this if you don't save the training data, because you need this information to classify.

    You can save this data with OpenCV's data persistence functions to avoid processing the training images again.

  58. Sir,
    I have to train a set of retinal images for classifying another test set. Can you explain the working of knn classifier and cvNearest method.

    Thanking You

  59. Hi Rahul, have a look at the previous post, where I explain a simple kNN demo. Here is the link: http://blog.damiles.com/?p=84 (The basic pattern recognition and classification with openCV)

  60. Dear damiles

    do you have a c# version of this code??

    thanks

  61. No, i don't have a c# version of this.

  62. Hello,
    Thanks for a good post! I've downloaded the basicOCR.tar.gz however the makefile is empty. Is this a mistake?

    /Karl

  63. KarKrukow, there is an ORCbuild.sh script to compile it.

  64. I'm using opencv1.1.pre1, btw.

    /K

  65. hi damiles how r u, im a studing computer science i need some help in a program is like ur program but i have some difficulties, could u help me plz??

  66. How can I help you, angela?

  67. befor, could u give me ur mail to contact with u cuz it will be difficult to talk there

  68. Sir,
    May i know how to compile using ORCbuild.sh cz i nvr use this kind of compilation method before.

    regards,
    wei

  69. This is a simple shell script for Unix systems; it only executes these commands:

    g++ -ggdb `pkg-config opencv --cflags --libs` preprocessing.c -c
    g++ -ggdb `pkg-config opencv --cflags --libs` preprocessing.o basicOCR.cpp -c
    g++ -ggdb `pkg-config opencv --cflags --libs` preprocessing.o basicOCR.o main.c -o OCR

  70. Sir,
    I get the compilation method already.
    May i know that if i wan do text recognition in real time, then how can i extract the text feature from camera captured image.

    regards,
    wei

  71. Very good info. Thank you Damiles.

  72. hi, do know how can i connect a scanner to opencv? for scanned images to be processed in a program, thanks

  73. christian, you must save the scanned image data in order to read it in OpenCV.

    Or you must access the scanner through the scanner's library to get the image data and pass it to OpenCV as an IplImage.

    Regards

  74. hello sir, im a friend of christian's. did you mean the TWAIN library?

  75. thank u for d reply. :)

    but do u have an idea how can i process multiple images one after another without saving the scanned images?

  76. troi, TWAIN for example is good.

    christian, if you use TWAIN, you get a TWAIN image structure in memory that you can save or convert to another structure, such as an IplImage for OpenCV. When you finish with this image memory you can save the image, or release it and capture another...

  77. are you familiar with computer vision? what library could you suggest? thank you for the info

  78. Thanks for reply Damiles.

    I download the VC++ visual studio express, but the debug button (F5) is disabled. I couldn't compile the files, any idea.

    note that I am using windows vista SP1, and visual studio 2008.

  79. hi damils

    thanks for your demo source it really helped me understanding more about opencv

    but i have some doubts..

    what is the significance of using 0.0039215 in the cvConvertScale function?

    i understand that the scale parameter of the function multiplies the value of each pixel and the shift parameter adds to every pixel

    but how multiplying by 0.0039215 gets the image to the 32 floating point image?? i want to know how to get this number

    and if i am using jpeg images for the sample data than pbm files, do i have to use the cvConvertScale function?

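  80. Yes, 0.0039215 is used for the conversion to a 32-bit floating-point image: it is approximately 1/255, so the 8-bit values in 0..255 are scaled to 0..1.

    If you use a JPEG, the image is also loaded with values up to 255, so you need the same conversion.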

    Regards.

  81. nivasan, you must debug your application and make sure you are loading the images correctly and not overwriting pointers. Very often the image is not loaded correctly and we end up with an array filled with 0's...

  82. oh it was my mistake

    Well, I only gave 4 training samples and two classes and my k was 10, so obviously it will always return the first value as the nearest match (at least I hope this is the explanation). I reduced k to 5 and it gave me some OK results.

    sorry for bad English

    thanks for your support and guidance,

    Nivasan Sharma

  83. Hi Damiles,

    I try to run your OCR program but there is an errors such as:

    error C2258: illegal pure syntax, must be '= 0'
    error C2252: 'K' : pure specifier can only be specified error C2065: 'K' : undeclared identifier

    basicOCR.obj - 3 error(s), 0 warning(s)

    Please help me.

    Amin

  84. Hi again.

    I try run again and the error reduce from 3 to 2 errors:
    which are:

    error C2252: 'K' : pure specifier can only be specified for functions
    error C2065: 'K' : undeclared identifier

    basicOCR.obj - 2 error(s), 0 warning(s)

    Please help me.

    Thanks again.

    Amin

  85. Hi damiles

    in the preprocessing file you can use the cvfindcontours and take the bounding box easily rather than using the findx and findy and findbb functions

    please clarify me if i am wrong.

    Regards

    Nivasan Sharma

  86. nivasan, that's correct, you can use findcontours... and use these contours to classify, but you must train with contours too.

  87. Hello!

    have you tried statistical moments (cvMoments) as features?

    By using moments you don't have to find the smallest bounding box, because they are scale and place invariant.

    it would be interesting how the performance of your pretty straightforward approach differs to the complicated one.

  88. hi phen, thanks for your comment.

  89. thanks for this great blog!

    keep us updated if you continue to work on it :)

  90. Hi! I was surfing and found your blog post... nice! I love your blog. :) Cheers! Sandra. R.

  91. Hi
    I am using Dev-cpp and I am trying to compile the program. I receive the following error:
    File format not recognized
    ld returned 1 exit status
    D:\Panto\OCR\Makefile.win [Build Error] [OCR.exe] Error 1

    Anybody an idea ?

  92. You can replace four lines by one:
    ----------------------------------
    if(j<10)
    sprintf(file,"%s%d/%d0%d.pbm",file_path, i, i , j);
    else
    sprintf(file,"%s%d/%d%d.pbm",file_path, i, i , j);

    ----------------------------------
    sprintf(file,"%s%d/%d%02d.pbm",file_path, i, i , j);
    ----------------------------------
    This is basic C know-how :P

  93. Hey,

    Thanks for your article.
    However, I'm quite disappointed with this very line :
    cvCopy(row1, &row, NULL);
    which basically means that you train the classifier with the *whole image*.
    How long did the knn classifier take for those 50 samples of 40² pixels ?

    More importantly : which "high level" criterion would you recommend (area, spline contour parameters, moments...) ?

    thanks!

  94. Hi
    first of all thanks for this wonderful tutorial.
    I am doing a project that needs to extract a string of numbers from an image. I can get an image where it only contains numbers.
    My problem is how can i segment the image so i can get each number and test it against the sample?

  95. You can use the cvBlobsLib library http://opencv.willowgarage.com/wiki/cvBlobsLib

    Or use any type of segmentation, or use contours or similar.

    Regards.

    ReplyDelete
  96. damiles, thanks for this great tutorial. it was actually very interesting and complete!

    ReplyDelete
  97. Hey...very nice tutorials. I was wondering if you have any tutorial on human action recognition as well...using some feature point descriptors and then training the model to classify the actions....

    ReplyDelete
  98. Hi,

    I compiled and ran your code in VC++. I changed your pbm images to jpg, but it does not give good results: if I draw a single horizontal line, it is classified as class 4. Is any code change needed? Please help me.


    Thanks in advance
    Manivasakan B

    ReplyDelete
  99. Sorry vasakan, I don't understand what error you have. Can you explain it in more detail?

    Thanks

    ReplyDelete
  100. Thank you very much for your reply.

    If I draw a number from 0 to 9, it is classified into the corresponding class. But if I draw a shape that is not a number (a different shape), that shape is also classified into one of the number classes, with 100% accuracy.

    How can I solve this problem?

    Thanks
    Manivasakan.

    ReplyDelete
  101. Ah, OK. The system is prepared for only 10 classes, the digits, so you must draw only digits. If you draw something that is not a digit, the system returns the digit class that most closely matches your drawing. If you want a "not a number" class, you must train with an additional class for the non-digit objects, but that is more complex.

    The accuracy is the fraction of the k nearest neighbours that belong to the class the knn algorithm selects as best. So if 5 of 10 neighbours agree, you get 50%.

    To solve the problem of an undefined class, you can work with a different learning algorithm such as EM, which is unsupervised and in some cases more flexible.

    Regards.

    ReplyDelete
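
The "accuracy" described in comment 101 can be illustrated with a small, self-contained vote. This is a sketch in plain C++ rather than the tutorial's OpenCV code; `Sample`, `Vote` and `classify` are hypothetical names, and ties are broken in favour of the smaller label, which is an arbitrary choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One training sample: a feature vector and its class label.
struct Sample {
    std::vector<float> features;
    int label;
};

// Result of a vote among the k nearest neighbours.
struct Vote {
    int label;        // class with the most votes
    float agreement;  // fraction of the k neighbours that voted for it
};

// Squared Euclidean distance between two vectors of equal length.
float squaredDistance(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Majority vote among the k training samples closest to the query.
Vote classify(const std::vector<Sample>& train,
              const std::vector<float>& query, std::size_t k) {
    assert(k > 0 && k <= train.size());
    std::vector<std::pair<float, int> > byDistance;
    for (std::size_t i = 0; i < train.size(); ++i) {
        byDistance.push_back(std::make_pair(
            squaredDistance(train[i].features, query), train[i].label));
    }
    std::partial_sort(byDistance.begin(), byDistance.begin() + k, byDistance.end());

    std::map<int, int> counts;
    for (std::size_t i = 0; i < k; ++i) ++counts[byDistance[i].second];

    Vote best = {-1, 0.0f};
    int bestCount = 0;
    for (std::map<int, int>::const_iterator it = counts.begin(); it != counts.end(); ++it) {
        if (it->second > bestCount) {
            bestCount = it->second;
            best.label = it->first;
        }
    }
    best.agreement = static_cast<float>(bestCount) / static_cast<float>(k);
    return best;
}
```

Note that a query far away from every training sample can still get full agreement, because only the labels of the nearest neighbours are counted and not how far away they are. That matches the 100% result reported for non-digit shapes in comment 100.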
  102. Thanks

    I really appreciate your valuable help. Currently I am working on pit pattern classification. I want to classify patterns from cancer tumor images; right now I am trying to classify the Type III L pit pattern, which looks like a tubular shape. I hope you can help.

    I have basic knowledge in image processing with opencv


    thanks
    Manivasakan .

    ReplyDelete
  103. vasakan, your project is more advanced than this example, but the example can help you get started. You need a good database of image properties to train your classifier with, and two classes: tumor and no tumor.

    There are a lot of medical papers on these topics; I recommend you search the technical literature.

    Regards.

    ReplyDelete
  104. thanks


    Is the KNN classifier a good choice for pattern classification? I have good knowledge of the Type III L pit pattern. I would like to use contour points for processing. Can you give me your mail id? I will send the images to you for clarification.

    Regards
    Manivasakan.

    ReplyDelete
  105. KNN is a good classifier for most tasks, but there are others that you should check.

    ReplyDelete
  106. Thank you very much.


    I have an image of size 640X480 and I want to create a template of size 128X128 using contour points in OpenCV.

    Can you suggest a good idea for template creation?

    I have tried to convert the contour points manually from 640X480 to 128X128, but it did not give a good template.


    Regards
    Manivasakan.

    ReplyDelete
  107. First, try using features other than raw pixels; I use pixels because they are the simplest to understand. I suggest reading scientific papers to see which template and algorithm are best. I can't properly explain which are best for these computer vision tasks.

    ReplyDelete
  108. Thank you very much for spending a lot of time on me.


    I will read good papers to find the best template and algorithm, and then I will get back to you.


    Thanks
    Manivasakan

    ReplyDelete
  109. Hi,

    Have you seen my images? I want to create classes for those images. Can you give me some ideas for template creation? I am going to create two classes: one for the Type III L pit pattern and another for non Type III L.

    Is there any tool for template creation?

    Thanks
    Manivasakan B.

    ReplyDelete
  110. Hi!

    Can I use OpenCV in recognizing the font used in a document image? I am a noob in OpenCV. Can you give me some tips to make my project possible?

    Thank you in advance!

    ReplyDelete
  111. Hi,

    Can I use OpenCV in my code for OCR on a mobile phone? I am having a hard time figuring out where or how to start my project. By the way, I'm a student.
    I hope you can give me a more basic tutorial.

    thanks in advance.

    ReplyDelete
  112. Hi, there are OpenCV ports for some mobile platforms, such as Symbian OS or iPhone; search the web for OpenCV mobile ports.

    This tutorial covers only the very basics of OCR and pattern recognition; it is not the best way to do pattern recognition for OCR. For mobile you should use other approaches, such as chain codes or similar, that simplify the data and run faster.

    Regards David.

    ReplyDelete
  113. how do i segment textlines and words of a document using opencv?? this is for offline handwritten recognition..pls help

    ReplyDelete
  114. I'm using the Tesseract OCR engine for my handwritten character recognition project. The letters are recognized one by one and finally output as words. But I want to improve the accuracy by reviewing the entire words after the character recognition. In that case we can check whether the word has any meaning by comparing it with the words in our database.

    But I don't have any idea how to do it with OpenCV. So can you please help me with this issue? Thanks in advance.

    Thilanka.

    ReplyDelete
  115. Hi Damiles!! Your basic OCR is very good)) But I have some questions: do you use neural networks? And if so, what recognition method does this OpenCV OCR use? Is it the neural network back-propagation method? Where in this code do you use neural networks? Thanks)))

    ReplyDelete
  116. Hi EstateMX, I don't use neural networks; I use a simple knn algorithm.

    ReplyDelete
  117. Hey there. Do you have code for motion tracking of multiple objects that don't necessarily appear in the first frame?

    ReplyDelete
  118. Jay, I am going to prepare a new tutorial for this task. Wait a few days.

    ReplyDelete
  119. I would like for my robot to do two things. First to simply be able to read type written pages. Hand written is nice also like this lesson,but typewritten is what I wish for. Next, I would like for him to be able to recognize objects such as a medicine bottle, coke can, etc. Is this possible with a few mods to THIS program. Oh, I also need for it to be in OpenSource because I am on a fixed income. Your tutorial was very good!

    Thanks!

    ReplyDelete
  120. Ooops! I forgot. The robot is using standard web-cams to see with.


    Thanks again,
    :-)

    ReplyDelete
  121. Hi MovieMaker, yes, you can do it with some modifications to this app; other people have used this simple sample as the base for a sign language project. But this is a sample and is not the best approach for a real, professional project. It is good for simple projects and for learning how the functions and the basics of pattern recognition work.

    Regards.

    ReplyDelete
  122. Hi!! thanks for the post

    Please help me :( I've tried to implement the classifier, but I always get the highest class back. Here is some of my code:
    #define cantidad 100
    int clases = 2;
    int K = 2;
    CvMat* trainData = cvCreateMat( clases*cantidad , 36 , CV_32FC1 );
    CvMat* trainClasses = cvCreateMat( clases*cantidad , 1 , CV_32FC1 );
    printf("\nObteniendo datos...");
    // Load the data from the file
    printf("...");
    trainData = ( CvMat* ) cvLoad( "datos.xml" );
    printf("%d,%d",trainData->rows,trainData->cols);
    CvMat row;
    for( int j = 0 ; j < clases ; j++ )
        for( int i = 0 ; i < cantidad ; i++ )
        {
            // Set class label
            cvGetRow( trainClasses , &row , j*cantidad + i );
            cvSet(&row, cvRealScalar( j+1 ));
        }
    printf( "...Datos obtenidos" );
    printf("\nEntrenando...");
    // Train the classifier
    printf( "..." );
    CvKNearest knn( trainData, trainClasses, 0, false, K );
    printf( "...Entrenamiento terminado" );
    printf("\nProbando...");
    // Test it
    printf( "...\n\n" );
    int error = 0 , cuenta = 0 , k;
    CvMat *prueba = cvCreateMat( clases*50 , 36 , CV_32FC1 );
    prueba = ( CvMat* ) cvLoad( "test.xml" );
    for( int j = 0 ; j < clases ; j++ )
    {
        for( int i = 0 ; i <50 ; i++ )
        {
            cvGetRow( prueba , &row , j*50 + i );
            float r = knn.find_nearest( &row , K , 0 , 0 , 0 , 0 );
            printf( "%f, " , r );
            k = j+1;
            if( (int)r != k )
                error++;
            cuenta++;
        }
    }
    float porcentaje = ( 100. * ( (float)error ) ) / ( (float)cuenta );
    cvReleaseMat( &prueba );
    printf( "...prueba terminada\nError promedio: %.2f" , porcentaje );
    printf("\nTodo salio bien ");

    The problem is that find_nearest always returns 2 ¬_¬. I thought the problem was in the files "datos.xml" and "test.xml", but I've checked and the matrices are correct. If I use cvRealScalar( j ) instead of cvRealScalar( j+1 ), the only response is 1 ¬_¬'.
    Please help me,
    Thanks in advance
    Greetings

    ReplyDelete
  123. Hi Ricardo, I see you use a low K value, 2. I recommend using 8 or higher when constructing the knn, and then a value of 4 or higher, but lower than 8, for find_nearest.

    Regards David.

    ReplyDelete
  124. ohhh, lot of thanks man!!!!

    What a stupid error =P. You're the man: I set k=10 and it's OK. Oh my god, just 2 hours before I have to present it =D

    Oh, by the way, when I was looking for help on the web I saw somebody who copy-pasted your post: http://leemin.tistory.com/7

    Thank u again =D

    ReplyDelete
  125. is this only for handwritten image? or it can be use for other digit image(those in photos or pictures)

    ReplyDelete
  126. Hi lalafell, this tutorial uses handwritten images as its example, because I wanted to explain how to train and use basic pattern recognition with handwriting. But you can use it for any other kind of image, such as vehicle registration plates. I know this tutorial was used for a sign language project.

    ReplyDelete
  127. Hello
    Sorry, but I am a new programmer here and I want to make an OCR application. Unfortunately I can't even run your code. I have Visual Studio 2005!
    Is there any readme file or something else to help?

    Thanks,

    ReplyDelete
  128. Are you using opencv 1.10? Try updating to the latest OpenCV version...

    Regards.

    ReplyDelete
  129. Hi damiles,
    It works well for me under Windows XP, and the system error is 9.80%.
    But there is something I want to ask about: when I tried to load an image instead of drawing one with the cursor, using imagen=cvLoadImage("pic.png");
    it caused an exception:
    "OpenCV Error: Assertion failed (src.channels() == dst.channels()) in unknown function, file ..\..\..\src\cxcore\cxcopy.cpp, line 484"
    Thanks,

    ReplyDelete
  130. Hi Damiles,
    Its an awesome tutorial.... But its for handwritten text.... I wanna do character recognition on computer fonts only. Specifically, I want to remove the labels in a map. Can you please tell me how to go about it?
    Thanks.

    ReplyDelete
  131. It's the same process; you only need to train your model with computer fonts instead of handwritten text.

    ReplyDelete
  132. Hi Damiles,

    it seems to me a great tutorial but, I'm very beginner with OpenCV. Can you tell me which version of the library was used in your project ?

    I'm working on Windows XP with VC 2005. With the version 2.0.0 I couldn't compile (the constructor CvKNearest::CvKNearest you used in basicOCR::train was changed) and with the 2.1.0 version I can compile but at runtime I always get an error message that the application couldn't start.

    ReplyDelete
  133. I think I compiled it with OpenCV 2.0. But what error do you get? What is the runtime error? Can you debug to see where the runtime error occurs?

    ReplyDelete
  134. With version 2.0 the main problem was at runtime, where I got messages like "cannot find function cvDestroyWindow in cv200.dll" (the wrong DLL); so, I guess it's a linking problem.

    With version 2.1 I first had a compilation problem in "cvinternal.h" at line 117 (#include "pmmintrin.h" - the file is missing). I guess it's a bug. I commented out the line and then it compiled with no problem.

    At runtime in debug, when I link with the libraries ending in "d" (like cv210d.lib), the application fails with the message "The application was unable to start correctly (0x0150002)". So I guess this is another bug and the libraries are not linked correctly; it doesn't work even if I make a program with only one call to cvNamedWindow.

    At runtime in release it works perfectly (I tried it today) when I link with the libraries without the "d" ending (like cv210.lib).

    ReplyDelete
  135. It's strange. I don't use VC++, but it could be incorrect project settings...

    Regards

    ReplyDelete
  136. hi,
    I am a student working on hand gesture recognition. First I have to do hand gesture segmentation. Please help me.

    ReplyDelete
  137. hello,
    Thanks for this tutorial. Actually, I am using the latest version of OpenCV, which requires CMake. Frankly, I'm not that familiar with CMake, so if you could provide the CMakeLists.txt for this tutorial, I would appreciate it!

    ReplyDelete
  138. please tell me how to get a boundingbox over a specified pixel area of a binary image

    ReplyDelete
  139. Hi mithun, you must find the contours of your binary image with findContours, and then you can use the boundingRect function to get the bounding box of each contour.

    findContours function: http://opencv.willowgarage.com/documentation/cpp/structural_analysis_and_shape_descriptors.html?highlight=findcontours#findContours

    boundingRect function:
    http://opencv.willowgarage.com/documentation/cpp/structural_analysis_and_shape_descriptors.html?highlight=findcontours#cv-boundingrect

    Regards.

    ReplyDelete
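
For readers who only need the box around all foreground pixels, the idea can be sketched without OpenCV. This is not the `findContours` plus `boundingRect` route recommended in comment 139, which gives one box per contour; it is a simpler stand-in under the assumption that the image is a row-major 8-bit buffer where any non-zero value is foreground. `Rect` and `boundingBox` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Axis-aligned box in pixel coordinates.
struct Rect {
    int x;
    int y;
    int width;
    int height;
};

// Computes the smallest box that contains every non-zero pixel.
// Returns false when the image has no foreground pixel at all.
bool boundingBox(const std::vector<unsigned char>& pixels,
                 int width, int height, Rect& box) {
    assert(width > 0 && height > 0);
    assert(pixels.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    int minX = width, minY = height, maxX = -1, maxY = -1;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (pixels[static_cast<std::size_t>(y) * width + x] == 0) continue;
            if (x < minX) minX = x;
            if (x > maxX) maxX = x;
            if (y < minY) minY = y;
            if (y > maxY) maxY = y;
        }
    }
    if (maxX < 0) return false;  // nothing but background
    box.x = minX;
    box.y = minY;
    box.width = maxX - minX + 1;
    box.height = maxY - minY + 1;
    return true;
}
```

To restrict the box to one object, such as a hand in a camera frame, the pixels of the other objects would have to be cleared first, which is what working per contour achieves.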
  140. Hi Mhd, OpenCV requires CMake to build and install OpenCV itself, but your own projects can be compiled with makefiles, CMake, sh, VC++ or whatever you want. CMake is a makefile generator, so you can also just use a makefile.

    In this tutorial I use a shell script, OCRbuild.sh, to compile; you can execute this script to compile.

    But in a few days I will create the CMake build for you ;) Regards David.

    ReplyDelete
  141. First of all, thanks a lot for your BasicOCR; it helps me a lot in my studies. Well done.
    But I have several questions about your code:
    (1) In both functions "findX" and "findY" you calculate maxVal using
    "CvScalar maxVal=cvRealScalar(imgSrc->width * 255)".
    But I think maxVal for X and Y should be different;
    in "findX" it should be CvScalar maxVal=cvRealScalar(imgSrc->height * 255).
    (2) In the functions "getData" and "classify", in order to use OpenCV's knn, you convert the image depth from 8 bit to 32 bit using the parameter 0.0039215. Why?

    Thank you very much

    ReplyDelete
  142. Hi, thanks for your suggestion. Now my problem is how to select the starting point for the bounding box using the boundingRect function, so that in the camera image the bounding box appears around my hand only.

    ReplyDelete
  143. Hi Yanlin

    1.- Yes, it should be height instead of width, but my OCR images have the same width and height, so you can use either the width or the height for the max value.

    2.- It converts 8 bit to 32 bit. 8-bit values are in the range 0 to 255, but the 32-bit float values should be in the range 0 to 1, so you don't only convert the internal format, you also rescale the values. Converting 0 - 255 to 0 - 1 gives the formula pixel_src/255, that is pixel_src * 1/255, and 1/255 ≈ 0.0039216; the code uses the truncated value 0.0039215.

    ReplyDelete
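
Point 1 of comment 143 only matters for non-square images. A minimal sketch of the column scan, assuming an 8-bit row-major image with a white (255) background as in the tutorial's findX, shows why the limit for a column sum is height * 255. `findColumnRange` is a hypothetical name, and this is plain C++ rather than the OpenCV calls used in the tutorial.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Finds the first and last column that contain at least one non-white pixel.
// Returns false when every column is completely white.
bool findColumnRange(const std::vector<unsigned char>& pixels,
                     int width, int height, int& minCol, int& maxCol) {
    assert(width > 0 && height > 0);
    assert(pixels.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    // A column holds `height` pixels, so an all-white column sums to height * 255.
    const long fullColumn = 255L * height;
    bool found = false;
    for (int x = 0; x < width; ++x) {
        long sum = 0;
        for (int y = 0; y < height; ++y) {
            sum += pixels[static_cast<std::size_t>(y) * width + x];
        }
        if (sum < fullColumn) {  // something darker than white in this column
            maxCol = x;
            if (!found) {
                minCol = x;
                found = true;
            }
        }
    }
    return found;
}
```

On a 4 x 3 image, for example, a white column sums to 765. Comparing against width * 255 = 1020 would flag every column, which is harmless only when width and height are equal. The row scan is the mirror case, with a limit of width * 255.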
  144. Hi mithun, I don't understand what your problem is. Sorry, my English is so bad.

    To get the bounding rect you should first search for the contours, and then use these contours to get the bounding rect.

    ReplyDelete
  145. Awesome! Exactly what I was looking for!! Thank you!

    ReplyDelete
  146. free ocr is a online ocr service. You can have a try.

    ReplyDelete
  147. This is a great example!
    Did you do this on Mac in Xcode?

    ReplyDelete
  148. With the latest demo download, you can generate the project with the cmake utility:

    cmake -G Xcode .

    ReplyDelete
  149. Hello.
    Thanks for this example.

    I think you want maxVal with cvRealScalar(imgSrc->height) in the findX() function (around line 34).
    Is that right?

    ReplyDelete
  150. Hello Seongman, this line is correct; it is the findY function, line 55, that must use imgSrc->height. But it's no problem in my example because the images have the same width and height.

    Regards. David

    ReplyDelete
  151. hi,when I run basicOCR,I have found some errors:
    d:\basicocr\basicocr.h(31):error C2258: illegal pure syntax, must be '= 0'
    d:\basicocr\basicocr.h(32) : error C2252: 'k' : pure specifier can only be specified for functions
    d:\basicocr\basicocr.h(33) : error C2143: syntax error : missing ';' before '*'
    d:\basicocr\basicocr.h(33) : error C2501: 'CvKNearest' : missing storage-class or type specifiers
    d:\basicocr\basicocr.h(33) : error C2501: 'knn' : missing storage-class or type specifiers

    that is in this line "const static int k=10".Do you know why?

    ReplyDelete
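
The messages in comment 151 (C2258 and C2252) are what older Microsoft compilers, Visual C++ 6 in particular, typically print for an in-class initialiser such as `const static int k=10`. Assuming that is the cause here, one common workaround is an anonymous enum. The class below is a hypothetical stand-in, not the tutorial's basicOCR class.

```cpp
#include <cassert>

// Hypothetical stand-in for the tutorial's class; only the constant matters here.
class OcrSettings {
public:
    // An anonymous enum gives a compile-time constant without an in-class initialiser.
    enum { k = 10 };

    // Clamps a requested neighbour count to the range 1..k.
    static int clampNeighbours(int requested) {
        assert(requested > 0);
        const int maxK = static_cast<int>(k);
        return requested < maxK ? requested : maxK;
    }
};
```

The remaining errors about `CvKNearest` on line 33 may simply follow from the first one, or from the OpenCV machine learning header not being included; that part is a guess.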
  152. goodocr.com is an interesting free web tool that reproduces the features of OCR software directly online, this would be enough to upload an image of formats like pdf, jpg, gif, tiff, bmp, png and get the recognizied results.

    ReplyDelete
  153. This is a great post!

    I would like to use the data, is this your own data? How can I reference it in a publication?

    ReplyDelete
  154. Hey! Thanks for the great code!
    I have a dataset of 128x128 images of the 26 letters (alphabets)
    What modifications will I need to implement in the code?
    What I've done:
    a. Set size = 128
    b. set K=26

    I'm not sure if the K setting is correct. Is it?

    I am not getting satisfactory detection with my sent image. So, I wanted to know if there was something wrong with my sent image, or in my implementation. Thank you!

    ReplyDelete
  155. You must use a k value between 2 and 8, for example. If you use 26, you ask for 26 neighbours, and then you get all the letters; you can use, for example, 4 neighbours. K is the number of neighbours you want to check. You can find more info about the knn algorithm at http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm

    Regards David.

    ReplyDelete
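
The point of comment 155, that k is a number of neighbours and not a number of classes, can be tried out on toy data. The sketch below uses a single feature per sample to stay short; `Point`, `predict` and `errorRate` are hypothetical names, and ties are broken in favour of the smaller label.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// A sample with a single feature, to keep the example small.
struct Point {
    float x;
    int label;
};

// Majority vote among the k training points closest to the query value.
int predict(std::vector<Point> train, float query, std::size_t k) {
    assert(k > 0 && k <= train.size());
    std::sort(train.begin(), train.end(),
              [query](const Point& a, const Point& b) {
                  return std::fabs(a.x - query) < std::fabs(b.x - query);
              });
    std::map<int, int> votes;
    for (std::size_t i = 0; i < k; ++i) ++votes[train[i].label];
    int best = -1;
    int bestCount = 0;
    for (std::map<int, int>::const_iterator it = votes.begin(); it != votes.end(); ++it) {
        if (it->second > bestCount) {
            bestCount = it->second;
            best = it->first;
        }
    }
    return best;
}

// Fraction of test points whose predicted label differs from the true one.
double errorRate(const std::vector<Point>& train,
                 const std::vector<Point>& test, std::size_t k) {
    assert(!test.empty());
    std::size_t wrong = 0;
    for (std::size_t i = 0; i < test.size(); ++i) {
        if (predict(train, test[i].x, k) != test[i].label) ++wrong;
    }
    return static_cast<double>(wrong) / static_cast<double>(test.size());
}
```

With one mislabelled training point, k = 1 can copy the noise, a moderate k outvotes it, and a k equal to the whole training set always returns the most frequent class. Comparing `errorRate` for a few values of k on held-out data is a simple way to choose.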
  156. Hi David,
    your tutorial helped me a lot.
    The vector doesn't need to be trained each time, so how do I persist the vector data?
    Appreciate your work!

    ReplyDelete
  157. Hi, you can use, for example, this tutorial to save and load the vector: http://blog.damiles.com/?p=262. There are also functions specifically for saving OpenCV vector data; see the OpenCV docs.

    Regards.

    ReplyDelete
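
In the tutorial's own code, OpenCV's persistence functions are the natural choice (comment 122 loads its matrices with `cvLoad` from an XML file). The sketch below only illustrates the idea of saving the training vectors once and reloading them later. It uses a made-up plain-text format and hypothetical names (`FeatureMatrix`, `saveFeatures`, `loadFeatures`), not OpenCV's storage format.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

typedef std::vector<std::vector<float> > FeatureMatrix;

// Writes "rows cols" on the first line, then one row of values per line.
void saveFeatures(std::ostream& out, const FeatureMatrix& rows) {
    const std::size_t cols = rows.empty() ? 0 : rows[0].size();
    out << rows.size() << ' ' << cols << '\n';
    out << std::setprecision(9);  // 9 significant digits are enough to round-trip a float
    for (std::size_t r = 0; r < rows.size(); ++r) {
        assert(rows[r].size() == cols);
        for (std::size_t c = 0; c < cols; ++c) out << rows[r][c] << ' ';
        out << '\n';
    }
}

// Reads a matrix written by saveFeatures. Returns false on malformed input.
bool loadFeatures(std::istream& in, FeatureMatrix& rows) {
    std::size_t rowCount = 0, cols = 0;
    if (!(in >> rowCount >> cols)) return false;
    FeatureMatrix loaded(rowCount, std::vector<float>(cols));
    for (std::size_t r = 0; r < rowCount; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            if (!(in >> loaded[r][c])) return false;
        }
    }
    rows.swap(loaded);
    return true;
}

// Usage sketch: round-trips a matrix through an in-memory stream.
FeatureMatrix roundTrip(const FeatureMatrix& rows) {
    std::stringstream buffer;
    saveFeatures(buffer, rows);
    FeatureMatrix loaded;
    const bool ok = loadFeatures(buffer, loaded);
    assert(ok);
    (void)ok;
    return loaded;
}
```

Replacing the `std::stringstream` with `std::ofstream` and `std::ifstream` would write to and read from a file instead.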
  158. Hi,
    Thank you so much for the help you are providing to the OpenCV community. I was wondering if you have, or know of, any website with datasets of the English alphabet.

    ReplyDelete
  159. No, sorry, I don't know of any public dataset.

    ReplyDelete
  160. Hi, I'm having these errors when i try to run at vs2010, even i have stdafx.h at the same folder. Would you give me an advice about this?
    Thanks for the tutorials
    1> basicOCR.cpp
    1>d:\background\robot_image\preprocessing.h(9): fatal error C1083: Cannot open include file: 'stdafx.h': No such file or directory
    1> main.cpp
    1>d:\background\robot_image\basicOCR.h(12): fatal error C1083: Cannot open include file: 'stdafx.h': No such file or directory
    1> preprocessing.cpp
    1>d:\background\robot_image\preprocessing.h(9): fatal error C1083: Cannot open include file: 'stdafx.h': No such file or directory
    1> Generating Code...
    1>

    ReplyDelete
  161. Did you create the Visual Studio project with the CMake builder? Please download the latest source with CMake, create the VS project from CMake, and then open the project and compile it.

    Regards.

    ReplyDelete
  162. Hi, thanks for your tutorials. I am doing a Sri Lankan ancient coin recognition system for my final year project. Most of the coins have been minted with the king's name. In your example, each image contains only one number, right? How can I specify where each character is on the coin? For example, if I consider the first character on a coin, it may be at position (x,y), but on another coin the first character may not be at the same (x,y) position. So how can I segment the characters? Can you please help me? I am an undergraduate student in Sri Lanka. Thanks a lot in advance.

    ReplyDelete
  163. Hi, before you segment the characters you need to preprocess your image to detect their position. To do this, you need to analyze the coins' characteristics to work out how to find the center of mass and the rotation. Then normalize the coins by applying the transformations you need; in this case rotating, cropping and scaling all coins to the same size may be sufficient. At that point you know the (x,y) position of the first character. For the other characters you can use a radial histogram from the coin center to detect each one.

    Regards David.

    ReplyDelete
  164. Hi. thx for your reply. I am trying it now. I have sent you a mail attaching some of coin images that I am using and what I have done so far. Your email is david@artresnet.com right?

    ReplyDelete
  165. Hello David,

    I am trying to implement an OCR for digits, I am stuck at what features to use. Can You please provide me with the list of features you used for k-NN classification ?

    regards,
    vamsidhar

    ReplyDelete
  166. In this tutorial I use each pixel as a feature, but you can use other features, such as the 7 invariant Hu moments, or any you choose.

    ReplyDelete
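
To give a feel for what a moment-based feature looks like, here is a sketch of the first of the seven Hu invariants, computed from a binary image in plain C++. OpenCV has its own moment functions, which would normally be used instead. `firstHuInvariant` is a hypothetical name, and the image is assumed to be a row-major 8-bit buffer where non-zero means foreground.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// First Hu invariant: eta20 + eta02, the sum of the two normalised
// second-order central moments of the foreground pixels.
double firstHuInvariant(const std::vector<unsigned char>& pixels,
                        int width, int height) {
    assert(width > 0 && height > 0);
    assert(pixels.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    // Raw moments: pixel count and coordinate sums.
    double m00 = 0.0, m10 = 0.0, m01 = 0.0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (pixels[static_cast<std::size_t>(y) * width + x] == 0) continue;
            m00 += 1.0;
            m10 += x;
            m01 += y;
        }
    }
    assert(m00 > 0.0);  // needs at least one foreground pixel
    const double cx = m10 / m00;
    const double cy = m01 / m00;

    // Central moments are taken around the centroid, which removes position.
    double mu20 = 0.0, mu02 = 0.0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (pixels[static_cast<std::size_t>(y) * width + x] == 0) continue;
            mu20 += (x - cx) * (x - cx);
            mu02 += (y - cy) * (y - cy);
        }
    }
    // For second-order moments the normalisation factor is m00 squared.
    return (mu20 + mu02) / (m00 * m00);
}
```

The same shape drawn at two different positions gives the same value, so no bounding-box crop is needed for this feature. A feature vector built from such invariants would replace the row of pixel values passed to the classifier.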
  167. Hi, Do we have to save 7 invariant hu moments in a file for classification. How I can use these hu moments for classification. IF I have to save it to a file, how can I save it and which format should I use to save? Can u pls help me?

    ReplyDelete
  168. Hi, I recommend saving the classification data to a file. Using the 7 Hu moments for classification works the same way as I explain with pixels in this tutorial; you only replace the n x m pixels with the 7 Hu moments. You can check the OpenCV storage functions and this tutorial: http://blog.damiles.com/?p=262

    Regards

    ReplyDelete
  169. Thanks. When I run knearest algorithms, It's displaying this error.
    undefined reference to `CvKNearest::CvKNearest(CvMat const*, CvMat const*, CvMat const*, bool, int)'
    collect2: ld returned 1 exit status

    I am using Qt and OpenCV and I have already included library ml210, source path "D:\OpenCV2.1\src\ml" and include path. Furthermore, I have included header file also. Can u pls help me?

    ReplyDelete
  170. Hello :)

    Thank you for your example.

    I am trying to understand this line of code:

    cvConvertScale(&prs_image, img, 0.0039215, 0);

    This value: "0.0039215"
    - where does it come from?
    - why do you use it (why not make it 1)?

    Thank you

    panJames

    ReplyDelete
  171. Nadeeshani:

    This is my Linker->Input->Additional Dependencies:
    cv210d.lib cxcore210d.lib highgui210d.lib ml210d.lib

    Do you have them all?

    panJames

    ReplyDelete
  172. Hello panJames, I answered this question in another comment, but here is the answer again.

    It converts 8 bit to 32 bit. 8-bit values are in the range 0 to 255, but the 32-bit float values should be in the range 0 to 1, so you don't only convert the internal format, you also rescale the values. Converting 0 - 255 to 0 - 1 gives the formula pixel_src/255, that is pixel_src * 1/255, and 1/255 ≈ 0.0039216; the code uses the truncated value 0.0039215.

    ReplyDelete
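
The rescaling described in comment 172 amounts to one multiplication per pixel. A minimal sketch in plain C++ follows, with `toFloatRange` as a hypothetical name; it mirrors the effect of the `cvConvertScale` call quoted in comment 170 when the shift is 0, but is not OpenCV code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Converts 8-bit pixel values (0..255) to floats by multiplying with `scale`.
// The default 1/255 maps 0..255 onto 0..1.
std::vector<float> toFloatRange(const std::vector<unsigned char>& pixels,
                                float scale = 1.0f / 255.0f) {
    assert(scale > 0.0f);
    std::vector<float> result(pixels.size());
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        result[i] = static_cast<float>(pixels[i]) * scale;
    }
    return result;
}
```

With the constant 0.0039215 from the tutorial's code, a white pixel maps to about 0.99998 instead of exactly 1, a difference that is negligible for this classifier.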
  173. Thanks for replying to Nadeeshani

    ReplyDelete
  174. Thanks panJames. I have included all libraries.
    In Qt, we have a .pro file which consists of all paths. Mine is as follows
    ---------------
    INCLUDEPATH += "D:\OpenCV2.1\include\opencv" \
    "D:\OpenCV2.1\src\cv" \
    "D:\OpenCV2.1\src\cvaux" \
    "D:\OpenCV2.1\src\cxcore" \
    "D:\OpenCV2.1\src\ml" \
    "D:\OpenCV2.1\src\highgui"

    LIBS += -L"D:\OpenCV2.1\lib" \
    -lcv210 \
    -lcvaux210 \
    -lhighgui210 \
    -lml210 \
    -lcxcore210
    ---------------------------

    Here, you can see that I have included all lib files. But, still I am getting the issue.

    ReplyDelete
  175. Hi.. problem solved. Problem was I was using GCC compilers. I used MSVC compiler and now it's compiling well.

    I got the solution from the OpenCV community (Samuel Audet): "If you are trying to use with GCC the precompiled binaries of OpenCV for Windows, it won't work for C++ functionality. Have you recompiled OpenCV with GCC? C++ binaries produced by MSVC and GCC are not compatible against each other."

    Thanks for all. :)

    ReplyDelete
  176. I would like to do Basic OCR in java
    I am new to programming
    Please help
    I can't make heads or tails of the program

    ReplyDelete
  177. Hi, what exactly don't you understand?

    ReplyDelete
  178. i can't build this program, and there is problem it is breaking at line
    CvScalar maxVal=cvRealScalar(imgSrc->width*255);

    can u help me?

    ReplyDelete
  179. Hi,

    I love the tutorials and the sample code. I am planning to adapt your OCR implementation to work with letters. My goal is to better identify traffic signs (starting with the stop sign) by the letters on the sign as well as the shape.

    Nick

    ReplyDelete
  180. Hi smiley, what is your problem? Which OS are you working on? Please send the error you get when you compile.

    ReplyDelete
  181. Hi nwadams, for traffic signs OCR is the last thing you should do. There is a limited number of traffic signs, and you can identify them by shape and other features. You should use OCR only to read, for example, the maximum km/h number or similar, not to detect which sign it is.

    It's simpler and more robust to use other methods for traffic sign detection.

    Regards.

    ReplyDelete
  182. to damiles, i can correct it thank you very much.
    :)

    ReplyDelete
  183. to damiles, i can fix this problem thank you very much.
    :)

    ReplyDelete
  184. damiles, i don't understand about Neural Network. i've to do a project which is about handwriting recognition

    can you help me? please

    ReplyDelete
  185. Dear Damiles,
    I have a task to build an OCR application. The goal of the application is to read characters from an LCD display. Can you give a suggestion for how to do this, or maybe a flowchart for OCR detection?
    I am new to image processing, but I am really interested in this area.

    Hope I can get your answer shortly

    Best Regards

    ReplyDelete
  186. Hi Yopy, the flowchart for this or any other OCR is similar to the flowchart in the Basic OCR tutorial. You can combine the "Segmentation and feature extraction. Contours and blob detection" tutorial with this one to build your LCD reader.

    Regards.

    ReplyDelete
  187. cvConvertScale(&prs_image, img, 0.0039215, 0)
    why is 0.0039215?

    ReplyDelete
  188. Please read the other comments; I have answered this twice already, and this is the third time...

    The 32-bit float image must be in the range 0 - 1, and the source image is an 8-bit integer image with range 0 - 255, so you must scale the values by 1/255 ≈ 0.0039215.

    Regards.

    ReplyDelete
  189. Thanks, but I tried setting it like this:
    cvConvertScale(&prs_image, img, 1, 0)
    and the result is right, too.
    In this case, I think it doesn't matter.
    What do you think?

    ReplyDelete
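
The observation in comment 189 is consistent with how a Euclidean nearest-neighbour search behaves: multiplying every feature of every sample by the same positive constant multiplies all distances by that constant, so the ranking of the neighbours does not change (apart from floating-point rounding). The sketch below checks this on toy data. `nearestIndex` is a hypothetical name, and the argument only holds when the training data and the query use the same scale.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the training vector closest to the query after both have been
// multiplied by `scale`.
std::size_t nearestIndex(const std::vector<std::vector<float> >& train,
                         const std::vector<float>& query, float scale) {
    assert(!train.empty() && scale > 0.0f);
    std::size_t best = 0;
    float bestDistance = -1.0f;  // negative means "not set yet"
    for (std::size_t i = 0; i < train.size(); ++i) {
        assert(train[i].size() == query.size());
        float distance = 0.0f;
        for (std::size_t j = 0; j < query.size(); ++j) {
            const float d = train[i][j] * scale - query[j] * scale;
            distance += d * d;
        }
        if (bestDistance < 0.0f || distance < bestDistance) {
            bestDistance = distance;
            best = i;
        }
    }
    return best;
}
```

Scaling to the range 0 - 1 starts to matter when the pixel features are mixed with other features on a different scale, or when the training data and the queries are converted with different factors.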
  190. Hello, Damiles
    Did you try the SVM method?
    About feature extraction, did you try other methods,
    such as LBP (local binary patterns), PCA (principal component analysis), Isomap or LLE?
    It's said that PCA is a linear method, while Isomap and LLE are nonlinear methods.
    What do you think about that?
    I have tried the PCA+SVM method for doorplate recognition; the recognition rate is 97%.
    The number of training samples is 500, and the number of test samples is 500, too.
    I achieved this by using CvSVM and cvCalcPCA( ) in OpenCV.

    ReplyDelete
  191. Thanks a lot for you BasicOCR, it helps me a lot in my study.

    ReplyDelete
  192. I don't know how to read pbm files on the iPhone, but surely the iPhone has libraries to read this type of image. Once you have read the image, you will have a data array where the image values are stored; then you only need to create an IplImage and point its data to this array.

    Regards Damiles

    ReplyDelete
  193. Hi, I haven't tested these feature extraction methods on this case. They may be better, but for the tutorial I think pixel values are the simplest to understand.

    The best method depends on each case, so the best approach is to create a test and compare which method gives the best results.

    Regards David (damiles)

    ReplyDelete
  194. I like that my tutorials help people understand computer vision basics.

    Regards.

    ReplyDelete