Thursday, January 25, 2007

The algorithm is as follows:

1. /Parse the page and locate the form (Form Analysis):/ Parse and
process the form to build an internal representation based on the
above model.
2. /Identify the fillable elements of the form and assign suitable
values to them (Value assignment and ranking):/ Use approximate
string matching between the form labels and the labels in the LVS
table to generate a set of candidate value assignments. (A /value
assignment/ is an assignment of a value to each element of a
form.) Use fuzzy aggregation functions to combine the individual
value weights into a single weight per value assignment, and use
these weights to rank the candidate assignments.
3. /Submit the form (Form Submission):/ Use the top N value
assignments to repeatedly fill out and submit the form.
4. /Inspect the returned pages and determine whether the submission
failed (Response Analysis and Navigation):/ Analyze the response
pages (i.e., the pages received in response to form submissions)
to check whether the submission yielded valid search results. Use
this feedback to tune the value assignments in Step 2. Crawl the
hypertext links in the response page to some pre-specified depth.
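
Step 2 is the most algorithmic part, so here is a minimal sketch of it in Python. Everything in it is an assumption made for illustration: the `LVS` dictionary layout (label mapped to weighted values), the sample data, the 0.6 similarity threshold, the use of `difflib.SequenceMatcher` as the approximate string matcher, and `min` as the fuzzy aggregation function. The helper names (`best_label`, `candidate_assignments`, `rank`, `top_n`) are hypothetical and not taken from the original system.

```python
from difflib import SequenceMatcher
from itertools import product

# Hypothetical LVS table: label -> list of (value, weight in [0, 1]).
LVS = {
    "company name": [("IBM", 0.9), ("Microsoft", 0.8)],
    "state": [("California", 1.0), ("Texas", 0.6)],
}


def best_label(form_label, lvs, threshold=0.6):
    """Return the LVS label most similar to form_label, or None if no
    label reaches the (assumed) similarity threshold."""
    def score(candidate):
        return SequenceMatcher(None, form_label.lower(), candidate.lower()).ratio()

    best = max(lvs, key=score)
    return best if score(best) >= threshold else None


def candidate_assignments(form_labels, lvs):
    """Build every candidate value assignment: one value per form element.
    Returns an empty list if any element cannot be matched to an LVS label."""
    per_element = []
    for label in form_labels:
        lvs_label = best_label(label, lvs)
        if lvs_label is None:
            return []
        per_element.append([(label, value, weight) for value, weight in lvs[lvs_label]])
    return [list(combo) for combo in product(*per_element)]


def rank(assignments, aggregate=min):
    """Combine the per-value weights with a fuzzy aggregation function
    (min, i.e. fuzzy conjunction, by default) and sort best-first."""
    scored = [
        (aggregate(weight for _, _, weight in assignment),
         {label: value for label, value, _ in assignment})
        for assignment in assignments
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def top_n(form_labels, lvs, n):
    """Return the n highest-ranked assignments, ready for Step 3."""
    return rank(candidate_assignments(form_labels, lvs))[:n]


if __name__ == "__main__":
    for weight, assignment in top_n(["Company", "State"], LVS, 2):
        print(weight, assignment)
```

Swapping the `aggregate` argument (for example, passing a function that averages the weights) changes how strongly a single low-weight value penalizes the whole assignment. Enumerating the full cross product is only practical for small forms; a real crawler would likely prune candidates before ranking.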
