G - Physics – 06 – F
Patent
G06F 17/30 (2006.01) G06F 17/27 (2006.01) H04L 29/00 (2006.01)
CA 2315413
A method for partitioning a database containing a plurality of documents into desired and undesired type documents is provided. The plurality of documents contain text and/or links to and from other documents in the database. The method includes the steps of: providing a source document of the desired type; providing a sink document for providing access to the database; identifying a cut-set of links which is the smallest set of links such that removing them from the database completely disconnects the source document and its linked documents from the sink document and its linked documents into first and second subsets of documents, respectively; and defining the first subset of documents as desired type documents and the remaining documents as undesired type documents. Preferably, the database is the World Wide Web, the documents are web pages, and the links are hyperlinks between web pages. The identifying step preferably comprises: mapping at least a portion of the database into a graph structure; and applying a maximum flow algorithm to the graph structure, the subset of the graph structure which remains after application of the maximum flow algorithm being the first subset of documents. Also provided are a computer program product and program storage device for carrying out the method of the present invention and for storing a set of instructions to carry out the method of the present invention, respectively.
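The identifying step described above (map the documents into a graph, then apply a maximum flow algorithm) can be illustrated with a minimal sketch. It is not the patented implementation: the function name `partition_by_min_cut`, the toy page names, and the choice to give every link a capacity of 1 in both directions are all assumptions made for illustration, and the Edmonds-Karp variant of maximum flow is used only because it is short to write. Once no more flow can be pushed, the pages still reachable from the source form the first subset, and the links crossing out of that subset form a minimum cut-set.

```python
from collections import defaultdict, deque


def partition_by_min_cut(links, source, sink):
    """Split pages into a source side and the rest via max-flow / min-cut.

    Hypothetical helper: `links` is a list of (page, page) pairs, each
    treated as an undirected link of capacity 1 (an assumption).
    """
    # Residual capacities, one unit in each direction per link.
    cap = defaultdict(lambda: defaultdict(int))
    for u, v in links:
        cap[u][v] += 1
        cap[v][u] += 1

    def reachable_from_source():
        # Breadth-first search over edges with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable_from_source()
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal
        # Walk back from the sink to recover the augmenting path.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck

    # Pages still reachable from the source are the "desired" subset.
    desired = set(parent)
    cut_links = [(u, v) for u, v in links if (u in desired) != (v in desired)]
    return desired, cut_links


# Toy data (hypothetical): two tightly linked clusters joined by one link.
links = [
    ("s", "a1"), ("s", "a2"), ("a1", "a2"),   # cluster around the source
    ("t", "b1"), ("t", "b2"), ("b1", "b2"),   # cluster around the sink
    ("a2", "b1"),                             # the single bridging link
]
desired, cut_links = partition_by_min_cut(links, "s", "t")
```

On this toy graph the only link that must be removed to separate the two clusters is the bridge between `a2` and `b1`, so the source-side subset is the source's own cluster. When several minimum cuts exist, this reachability rule returns the one closest to the source; a real system would need its own policy for choosing among them, and its own way of deciding which portion of the web to map into the graph.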
NEC Corporation
Smart & Biggar