4424892 posted on 2010-10-31 14:02:53

Page order scrambled when collecting paginated articles

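I'm using the 2010 version of 火车头 (LocoySpider) to collect paginated articles. When an article has many pages, the collected page order no longer matches the original article: page 28 ends up in the second position, as the list below shows. How can this be fixed? Could an expert or the official team provide a fix?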

█This page contains multiple sub-pages:
█1:http://news.cs.soufun.com/2010-10-31/3973260.htm
█2:http://news.cs.soufun.com/2010-10-31/3973260_28.html
█3:http://news.cs.soufun.com/2010-10-31/3973260_2.html
█4:http://news.cs.soufun.com/2010-10-31/3973260_3.html
█5:http://news.cs.soufun.com/2010-10-31/3973260_4.html
█6:http://news.cs.soufun.com/2010-10-31/3973260_5.html
█7:http://news.cs.soufun.com/2010-10-31/3973260_6.html
█8:http://news.cs.soufun.com/2010-10-31/3973260_7.html
█9:http://news.cs.soufun.com/2010-10-31/3973260_8.html
█10:http://news.cs.soufun.com/2010-10-31/3973260_9.html
█11:http://news.cs.soufun.com/2010-10-31/3973260_10.html
█12:http://news.cs.soufun.com/2010-10-31/3973260_11.html
█13:http://news.cs.soufun.com/2010-10-31/3973260_12.html
█14:http://news.cs.soufun.com/2010-10-31/3973260_13.html
█15:http://news.cs.soufun.com/2010-10-31/3973260_14.html
█16:http://news.cs.soufun.com/2010-10-31/3973260_15.html
█17:http://news.cs.soufun.com/2010-10-31/3973260_16.html
█18:http://news.cs.soufun.com/2010-10-31/3973260_17.html
█19:http://news.cs.soufun.com/2010-10-31/3973260_18.html
█20:http://news.cs.soufun.com/2010-10-31/3973260_19.html
█21:http://news.cs.soufun.com/2010-10-31/3973260_20.html
█22:http://news.cs.soufun.com/2010-10-31/3973260_21.html
█23:http://news.cs.soufun.com/2010-10-31/3973260_22.html
█24:http://news.cs.soufun.com/2010-10-31/3973260_23.html
█25:http://news.cs.soufun.com/2010-10-31/3973260_24.html
█26:http://news.cs.soufun.com/2010-10-31/3973260_25.html
█27:http://news.cs.soufun.com/2010-10-31/3973260_26.html
█28:http://news.cs.soufun.com/2010-10-31/3973260_27.html
█For tags that must match across sub-pages, such as the content tag, be sure to tick [Match this tag in sub-pages] in the tag editor

kuhabe posted on 2010-10-31 14:23:26

Try switching to single-threaded collection.

专业收费采集 posted on 2010-10-31 14:32:43

Double-check your collection rules as well.

yerencao posted on 2010-11-8 15:31:43

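Some sites show only part of the pagination navigation at a time, and that is when this happens. Is there any way to sort the pagination link URLs?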
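
On the sorting question: the thread does not say whether 火车头 itself offers a sorting option, so the following is only a standalone post-processing sketch in Python, not a feature of the tool. It assumes the URL pattern seen in the list above, where the page number is the `_N` suffix before `.htm`/`.html` and a URL without a suffix is page 1. The names `page_number` and `sort_pages` are hypothetical helpers made up for this illustration.

```python
import re

# Assumption: sub-pages look like ".../3973260_28.html"; the first page
# has no "_N" suffix (".../3973260.htm") and is treated as page 1.
_PAGE_SUFFIX = re.compile(r"_(\d+)\.html?$")


def page_number(url):
    """Return the page number encoded in the URL, or 1 if there is none."""
    match = _PAGE_SUFFIX.search(url)
    return int(match.group(1)) if match else 1


def sort_pages(urls):
    """Drop duplicate URLs, then sort numerically by page number."""
    unique = list(dict.fromkeys(urls))  # keeps the first occurrence of each URL
    return sorted(unique, key=page_number)


urls = [
    "http://news.cs.soufun.com/2010-10-31/3973260.htm",
    "http://news.cs.soufun.com/2010-10-31/3973260_28.html",
    "http://news.cs.soufun.com/2010-10-31/3973260_2.html",
    "http://news.cs.soufun.com/2010-10-31/3973260_3.html",
]

for url in sort_pages(urls):
    print(url)
```

Sorting on the extracted integer rather than on the URL string matters here: a plain string sort would place `_28` between `_2` and `_3`. URLs that carry a query string or use a different numbering scheme would need a different pattern.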