Computer-implemented method for automatic extraction of data...

G - Physics – 06 – K

Patent

Details

354/59

G06K 9/36 (2006.01) G06F 17/24 (2006.01) G06K 9/20 (2006.01)

Patent

CA 2000012

A computer-implemented method, operable with conventional OCR scanning equipment and software, extracts character data from printed forms. A blank master form is scanned and its digital image stored. Clusters of ON bits in the master form image are first recognized as parts of a line and then connected to form lines. All of the lines in the master form image are then identified by row, column start position, and column end position, thereby creating a master-form-description. The resulting image, which consists only of the lines in the master form, can then be displayed. Regions, or masks, are then created in the displayed image of master form lines, each mask corresponding to a field where data would be located in a filled-in form. Each data mask is spaced from nearby lines by a predetermined data margin, referred to as D.

A filled-in form, or data form, is then scanned, and its lines are recognized and identified in the same manner to create a data-form-description. The data-form-description is compared with the master-form-description by computing the horizontal and vertical offsets and the skew of the two forms relative to one another. The data masks, whose positions with respect to the master form have already been determined, are then transposed into the data form image using the computed offsets and skew. In this manner, the data masks are correctly located on the data form, so that the actual data values in the data form reside within the corresponding data masks.

Routines are then implemented for detecting extraneous data intruding into the data masks and for growing the masks, i.e., enlarging them to capture data that may extend beyond their perimeter. The data masks are thus adaptive: they are grown if data does not lie entirely within their perimeter. During the mask growth routine, lines that are part of the background form are detected and removed by line removal algorithms.
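The line identification and registration steps described above can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the `min_length` threshold, the choice of two matched reference points per form, and the rigid rotation-plus-translation model are all assumptions made for illustration. The image is assumed to be a list of rows of 0/1 bits.

```python
import math

def find_horizontal_lines(image, min_length=5):
    """Describe each horizontal run of ON bits as (row, col_start, col_end).

    Runs shorter than min_length (a hypothetical threshold) are treated
    as character data rather than form lines and are skipped.
    """
    lines = []
    for row, bits in enumerate(image):
        start = None
        for col, bit in enumerate(bits):
            if bit and start is None:
                start = col                      # a run of ON bits begins
            elif not bit and start is not None:
                if col - start >= min_length:
                    lines.append((row, start, col - 1))
                start = None                     # the run has ended
        if start is not None and len(bits) - start >= min_length:
            lines.append((row, start, len(bits) - 1))  # run reaches the edge
    return lines

def estimate_registration(master_pts, data_pts):
    """Estimate (offset_x, offset_y, skew) from two matched (x, y) points.

    Assumes the data form is a rotated and shifted copy of the master form.
    """
    (mx0, my0), (mx1, my1) = master_pts
    (dx0, dy0), (dx1, dy1) = data_pts
    skew = (math.atan2(dy1 - dy0, dx1 - dx0)
            - math.atan2(my1 - my0, mx1 - mx0))
    c, s = math.cos(skew), math.sin(skew)
    # Offset is whatever remains after rotating the first master point.
    offset_x = dx0 - (c * mx0 - s * my0)
    offset_y = dy0 - (s * mx0 + c * my0)
    return offset_x, offset_y, skew

def transpose_point(point, registration):
    """Map a master-form point (e.g. a mask corner) into the data form."""
    x, y = point
    offset_x, offset_y, skew = registration
    c, s = math.cos(skew), math.sin(skew)
    return (c * x - s * y + offset_x, s * x + c * y + offset_y)
```

In this sketch, a mask would be transposed by passing each of its corners through `transpose_point`. The reference points could be taken from the start positions of matched lines in the two form descriptions, though the abstract does not say how lines are matched.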
Following the removal of extraneous data from the masks, the growth of the masks to capture data, and any subsequent line removal, the remaining data from the masks is extracted and transferred to a new file. The new file then contains only data comprising characters of the data values in the desired regions, which can then be operated on by conventional OCR software to identify the specific character values.
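The mask growth and extraction steps can be sketched in the same hedged way. The growth rule used here (enlarge a side by one pixel whenever ON bits lie immediately outside it, up to a hypothetical `max_growth` limit) is an assumption, not the rule from the patent, and the line removal and extraneous-data routines are left out. Masks are assumed to be axis-aligned `(top, left, bottom, right)` rectangles with inclusive bounds.

```python
def grow_mask(image, mask, max_growth=3):
    """Enlarge a mask while ON bits lie just outside its perimeter.

    max_growth is a hypothetical cap on the number of growth passes.
    """
    top, left, bottom, right = mask
    height, width = len(image), len(image[0])
    for _ in range(max_growth):
        grew = False
        if top > 0 and any(image[top - 1][left:right + 1]):
            top -= 1
            grew = True
        if bottom < height - 1 and any(image[bottom + 1][left:right + 1]):
            bottom += 1
            grew = True
        if left > 0 and any(image[r][left - 1] for r in range(top, bottom + 1)):
            left -= 1
            grew = True
        if right < width - 1 and any(image[r][right + 1] for r in range(top, bottom + 1)):
            right += 1
            grew = True
        if not grew:
            break                                # data is fully enclosed
    return top, left, bottom, right

def extract_masks(image, masks):
    """Copy only the pixels inside the masks into a new, otherwise blank image."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for top, left, bottom, right in masks:
        for r in range(top, bottom + 1):
            out[r][left:right + 1] = image[r][left:right + 1]
    return out
```

The image returned by `extract_masks` plays the role of the new file in this sketch: it holds only the pixels inside the grown masks and could then be handed to OCR software.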

