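A simple website that seems to block 火车头 (LocoySpider) scraping
http://www.pacszhanghong.cn/front2/Knowledge.aspx This site is very simple, but 火车头 just cannot scrape it. Why is that? Does it have anti-scraping protection?
No, it doesn't...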
http://www.pacszhanghong.cn/front2/Knowledge.aspx?pageindex=1
to
http://www.pacszhanghong.cn/front2/Knowledge.aspx?pageindex=10
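For anyone reproducing this outside 火车头, the page range above can be generated with a few lines of Python. This is only a sketch: the `page_urls` helper is a made-up name, and the only thing taken from the thread is the `pageindex` parameter running from 1 to 10.

```python
# Base URL of the listing page, as posted in the thread.
BASE = "http://www.pacszhanghong.cn/front2/Knowledge.aspx"

def page_urls(first=1, last=10):
    """Build the paginated URLs from pageindex=first to pageindex=last."""
    return [f"{BASE}?pageindex={i}" for i in range(first, last + 1)]

urls = page_urls()
for url in urls:
    print(url)
```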
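Tested it, no problem. Heh, 火车头 is really quick.
That can't be right. I downloaded the latest version and tried many times, and it still fails.
Fetching the page source returns: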
/indentify.aspx
Response headers:
HTTP/1.1 302 Found
Content-Length:132
Cache-Control:private
Content-Type:text/html; charset=utf-8
Date:Mon, 03 Aug 2009 07:56:32 GMT
Location:/indentify.aspx
Set-Cookie:ASP.NET_SessionId=fl3rlbvbchzokz2jikbyty55; path=/; HttpOnly
Server:Microsoft-IIS/6.0
X-AspNet-Version:2.0.50727
X-Powered-By:ASP.NET
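The headers above show a 302 redirect to /indentify.aspx together with a new ASP.NET_SessionId cookie. As a rough sketch, the snippet below pulls those two pieces out of the pasted text. The function names `parse_response_head` and `session_cookie` are made up for this example and are not part of 火车头.

```python
# Response head exactly as pasted in the post above.
RAW = """HTTP/1.1 302 Found
Content-Length:132
Cache-Control:private
Content-Type:text/html; charset=utf-8
Date:Mon, 03 Aug 2009 07:56:32 GMT
Location:/indentify.aspx
Set-Cookie:ASP.NET_SessionId=fl3rlbvbchzokz2jikbyty55; path=/; HttpOnly
Server:Microsoft-IIS/6.0
X-AspNet-Version:2.0.50727
X-Powered-By:ASP.NET"""

def parse_response_head(raw):
    """Split a raw response head into (status code, lower-cased header dict)."""
    lines = raw.strip().splitlines()
    status = int(lines[0].split()[1])
    headers = {}
    for line in lines[1:]:
        # Split at the first colon only, so values like dates stay intact.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers

def session_cookie(headers):
    """Return the name=value part of Set-Cookie, dropping path/HttpOnly."""
    return headers.get("set-cookie", "").split(";")[0].strip()

status, headers = parse_response_head(RAW)
print(status, headers["location"], session_cookie(headers))
```

A client that wants to pass the check presumably has to send that cookie back on the request to the Location target.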
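As in the screenshot: [screenshot not preserved]
I'll try an older version.
Well, the fact that you can't scrape it doesn't prove that 火车头 is unable to scrape this page. Keep studying.
Could you tell me how you configured it? I've been at this for a long time and still can't get it to work. The rule test in the site settings can fetch the page,
but the task itself cannot.
Packet capture:
Working case: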
Received: GET /front2/Knowledge.aspx?pageindex=1 HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Host: www.pacszhanghong.cn
Connection: Close
Sent: HTTP/1.1 302 Found
Connection: close
Date: Tue, 04 Aug 2009 00:58:15 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: /indentify.aspx
Set-Cookie: ASP.NET_SessionId=zgpdvu45z1xfrr45d4axbz55; path=/; HttpOnly
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 132
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="/indentify.aspx">here</a>.</h2>
</body></html>
Received: GET /indentify.aspx HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Host: www.pacszhanghong.cn
Cookie: ASP.NET_SessionId=zgpdvu45z1xfrr45d4axbz55
Connection: Close
Sent: HTTP/1.1 302 Found
Connection: close
Date: Tue, 04 Aug 2009 00:58:15 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: /front2/Knowledge.aspx?pageindex=1
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 690
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="/front2/Knowledge.aspx?pageindex=1">here</a>.</h2>
</body></html>
Failing case:
Received: GET /front2/Knowledge.aspx?pageindex=1 HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Referer: http://www.pacszhanghong.cn/front2/Knowledge.aspx?pageindex=1
Host: www.pacszhanghong.cn
Connection: Close
Sent: HTTP/1.1 302 Found
Connection: close
Date: Tue, 04 Aug 2009 00:56:19 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: /indentify.aspx
Set-Cookie: ASP.NET_SessionId=ve4s4l45m3e1u5555neschva; path=/; HttpOnly
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 132
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="/indentify.aspx">here</a>.</h2>
</body></html>
Received: GET /indentify.aspx HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Referer: http://www.pacszhanghong.cn/front2/Knowledge.aspx?pageindex=1
Host: www.pacszhanghong.cn
Connection: Close
Sent: HTTP/1.1 302 Found
Connection: close
Date: Tue, 04 Aug 2009 00:56:19 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: /front2/index.aspx
Set-Cookie: ASP.NET_SessionId=4wajusy1xp1zpxigmcfueyme; path=/; HttpOnly
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 674
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="/front2/index.aspx">here</a>.</h2>
</body></html>
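To see where the two captures diverge, it may help to diff the headers of the second request (the GET to /indentify.aspx) in each case. The values below are copied from the captures. The `header_diff` helper is a made-up name for this sketch.

```python
UA = ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; "
      ".NET CLR 2.0.50727)")

# Second request in the working capture.
working_second = {
    "Accept": "*/*",
    "User-Agent": UA,
    "Host": "www.pacszhanghong.cn",
    "Cookie": "ASP.NET_SessionId=zgpdvu45z1xfrr45d4axbz55",
    "Connection": "Close",
}

# Second request in the failing capture.
failing_second = {
    "Accept": "*/*",
    "User-Agent": UA,
    "Referer": "http://www.pacszhanghong.cn/front2/Knowledge.aspx?pageindex=1",
    "Host": "www.pacszhanghong.cn",
    "Connection": "Close",
}

def header_diff(a, b):
    """Return (names only in a, names only in b, shared names whose values differ)."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_a, only_b, changed

only_working, only_failing, changed = header_diff(working_second, failing_second)
print("only in working:", only_working)
print("only in failing:", only_failing)
print("changed values:", changed)
```

The working request sends back the ASP.NET_SessionId cookie and has no Referer. The failing one has a Referer but no Cookie, and the server answers it with a fresh Set-Cookie and a redirect to /front2/index.aspx. That suggests, though it does not prove, that the check depends on the session cookie being returned.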
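The last packet is different, so the two places probably fetch the page in different ways.
I don't know how you configured it above, but it doesn't work here.
网站万能信息采集器 can fetch it without trouble, but unfortunately it is paid software.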
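Judging from the captures, one plausible reading is that the fetch only succeeds when the ASP.NET_SessionId cookie from the first 302 is sent back on the next request. The sketch below follows redirects by hand with and without cookies, against a fake in-memory server so it runs offline. Everything here is hypothetical: `follow`, `make_fake_site`, and the 200 responses are assumptions loosely modelled on the captures, not the real site's logic or 火车头's implementation.

```python
from urllib.parse import urljoin, urlsplit

REDIRECTS = (301, 302, 303, 307)

def follow(url, send, keep_cookies=True, max_hops=5):
    """Follow redirects by hand; send(url, headers) -> (status, headers)."""
    cookies = {}
    for _ in range(max_hops):
        request_headers = {}
        if keep_cookies and cookies:
            request_headers["Cookie"] = "; ".join(
                f"{name}={value}" for name, value in cookies.items())
        status, response_headers = send(url, request_headers)
        set_cookie = response_headers.get("Set-Cookie")
        if set_cookie:
            # Keep only name=value, not attributes such as path or HttpOnly.
            name, _, value = set_cookie.split(";")[0].partition("=")
            cookies[name.strip()] = value.strip()
        if status not in REDIRECTS or "Location" not in response_headers:
            return url, status
        url = urljoin(url, response_headers["Location"])
    raise RuntimeError("too many redirects")

def make_fake_site():
    """Hypothetical stand-in for the server, loosely modelled on the captures."""
    pending = {}      # session id -> path that triggered the check
    verified = set()  # session ids that came back through /indentify.aspx

    def send(url, headers):
        parts = urlsplit(url)
        session_id = headers.get("Cookie", "").partition("=")[2]
        if parts.path == "/front2/index.aspx":
            return 200, {}  # assumption: the home page is served directly
        if parts.path == "/indentify.aspx":
            if session_id in pending:
                verified.add(session_id)
                return 302, {"Location": pending.pop(session_id)}
            return 302, {"Location": "/front2/index.aspx"}
        if session_id in verified:
            return 200, {}  # assumption: verified sessions get the page
        new_id = f"fake{len(pending) + len(verified)}"
        pending[new_id] = parts.path + ("?" + parts.query if parts.query else "")
        return 302, {
            "Location": "/indentify.aspx",
            "Set-Cookie": f"ASP.NET_SessionId={new_id}; path=/; HttpOnly",
        }

    return send

START = "http://www.pacszhanghong.cn/front2/Knowledge.aspx?pageindex=1"
with_cookies = follow(START, make_fake_site())
without_cookies = follow(START, make_fake_site(), keep_cookies=False)
print(with_cookies)
print(without_cookies)
```

Against this fake server, the cookie-carrying run ends back on the Knowledge page, and the cookie-less run ends on /front2/index.aspx, which matches the last redirect in the failing capture. If the real site behaves the same way, the thing to check in the task settings would be whether cookies are kept between requests.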