The collector cannot fetch pages for a certain type of URL
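Last edited by zouqqz on 2013-4-7 12:52

Some web pages cannot be fetched correctly: the HTTP request tool provided by 火车头 returns a 404 error, or returns content that differs from what a browser shows, even though all of these URLs open directly in a browser. Of the three URLs below, the first cannot be fetched by the collector but opens in a desktop browser; the second returns a result in the collector that differs from the browser's; the third is fetched normally. From what I can see, could the cause be the characters "./" in the URL? The first two URLs contain "EB./", while the third contains "EB.643/". This looks like a bug. How can it be solved?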
http://www.hotcourses.com/uk-courses/postgraduate-Economics-courses-in-the-UK/hc2_search.adv_col_do/16180339/90904/search_category/EB./qualification/L/town_city/United+Kingdom/page.htm
http://www.hotcourses.com/uk-courses/postgraduate-economics-courses-london-metropolitan-university/16180339/90904/15574/EB./L/any/county/united+kingdom/all/list.htm
http://www.hotcourses.com/uk-courses/postgraduate-Agricultural-Economics-courses-in-the-UK/hc2_search.adv_col_do/16180339/90904/search_category/EB.643/qualification/L/town_city/United+Kingdom/page.htm
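To test the "./" guess against more URLs, a small check like the one below might help. It is only a sketch: it assumes the problem is a path segment that ends in a dot (such as "EB."), which some HTTP clients or URL libraries may strip or rewrite before sending the request. The helper name `segments_with_trailing_dot` is made up for illustration and is not part of 火车头.

```python
from urllib.parse import urlsplit


def segments_with_trailing_dot(url):
    """Return path segments that end in '.', skipping the special '.' and '..' segments.

    Assumption: such segments are what the collector's HTTP client mishandles.
    """
    path = urlsplit(url).path
    return [
        seg
        for seg in path.split("/")
        if seg.endswith(".") and seg not in (".", "..")
    ]


# The three URLs from this post.
urls = [
    "http://www.hotcourses.com/uk-courses/postgraduate-Economics-courses-in-the-UK/hc2_search.adv_col_do/16180339/90904/search_category/EB./qualification/L/town_city/United+Kingdom/page.htm",
    "http://www.hotcourses.com/uk-courses/postgraduate-economics-courses-london-metropolitan-university/16180339/90904/15574/EB./L/any/county/united+kingdom/all/list.htm",
    "http://www.hotcourses.com/uk-courses/postgraduate-Agricultural-Economics-courses-in-the-UK/hc2_search.adv_col_do/16180339/90904/search_category/EB.643/qualification/L/town_city/United+Kingdom/page.htm",
]

for url in urls:
    # An empty list means the URL has no dot-terminated path segment.
    print(segments_with_trailing_dot(url))
```

The first two URLs are flagged because of the "EB." segment and the third is not, which matches the behaviour described above. This only identifies candidate URLs; it does not confirm where in the collector the path gets altered.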
Issue received. There are quite a few of these; more than a dozen similar URLs can be found.