Document structure identifier

Patent

Details

G06F 17/27 (2006.01) G06F 17/22 (2006.01) G06K 9/20 (2006.01)

CA 2486528

A method of automated document structure identification based on visual cues is disclosed. The two-dimensional layout of the document is analyzed to discern visual cues related to its structure, and the text is tokenized so that similarly structured elements are treated alike. The method can be applied to generating Extensible Markup Language (XML) files, natural language parsing, and search engine ranking mechanisms.

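The abstract gives no implementation detail, so the following is only a minimal sketch of the general idea, not the patented method. It assumes a prior layout pass has already reported a font size, bold flag, and indent for each line; the `Line` class, the `visual_signature` key, the `T0`, `T1`, ... token names, and the `block`/`class` XML vocabulary are all hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """One rendered line plus the visual attributes a layout pass might report (assumed)."""
    text: str
    font_size: float  # points
    bold: bool
    indent: int       # left offset in arbitrary layout units


def visual_signature(line):
    """Collapse a line's visual attributes into a hashable key (hypothetical cue set)."""
    return (line.font_size, line.bold, line.indent)


def tokenize_by_visual_cues(lines):
    """Give the same structural token to lines that look alike.

    Tokens are numbered in order of first appearance: T0, T1, ...
    """
    token_for = {}
    result = []
    for line in lines:
        sig = visual_signature(line)
        if sig not in token_for:
            token_for[sig] = f"T{len(token_for)}"
        result.append((token_for[sig], line.text))
    return result


def to_xml(tokenized):
    """Emit one <block> element per line, tagged with its structural token."""
    root = ET.Element("document")
    for token, text in tokenized:
        ET.SubElement(root, "block", {"class": token}).text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    page = [
        Line("Introduction", 16.0, True, 0),
        Line("Some body text.", 11.0, False, 0),
        Line("Methods", 16.0, True, 0),
    ]
    print(to_xml(tokenize_by_visual_cues(page)))
```

In this sketch the two lines that share a size, weight, and indent receive the same token, so a downstream step (XML generation, parsing, or ranking) can treat them as the same kind of element without knowing in advance that they are headings.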

Profile ID: LFCA-PAI-O-1623037
