LocoySpider (火车采集器) Official Software Forum

Views: 3367 | Replies: 8

A simple website that manages to block LocoySpider scraping

Posted on 2009-8-3 14:48:22
http://www.pacszhanghong.cn/front2/Knowledge.aspx

This site is very simple, yet LocoySpider just cannot scrape it. Why is that?
Posted on 2009-8-3 15:32:05
Does it have any anti-scraping protection?
No.....

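Posted on 2009-8-3 15:34:42
Posted on 2009-8-3 15:35:46
Haha, the 火车头 team moves fast..
OP | Posted on 2009-8-3 16:04:03
That can't be right. I downloaded the latest version and tried many times, and it still fails.
Fetching the page source returns:
/indentify.aspx

Response headers: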
HTTP/1.1 302 Found
Content-Length:132
Cache-Control:private
Content-Type:text/html; charset=utf-8
Date:Mon, 03 Aug 2009 07:56:32 GMT
Location:/indentify.aspx
Set-Cookie:ASP.NET_SessionId=fl3rlbvbchzokz2jikbyty55; path=/; HttpOnly
Server:Microsoft-IIS/6.0
X-AspNet-Version:2.0.50727
X-Powered-By:ASP.NET

As shown in the attached screenshot.

I'll download an older version and try that.
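
The response headers quoted in this post already show what the server is doing: it answers with a 302 redirect to /indentify.aspx and hands out a fresh ASP.NET_SessionId cookie. A minimal Python sketch, assuming the header dump is available as plain text, that pulls those fields out. The helper names `parse_response_head` and `session_cookie` are hypothetical and have nothing to do with LocoySpider itself.

```python
# Header dump copied from the post (note: no space after the colons).
RAW = """HTTP/1.1 302 Found
Content-Length:132
Cache-Control:private
Content-Type:text/html; charset=utf-8
Date:Mon, 03 Aug 2009 07:56:32 GMT
Location:/indentify.aspx
Set-Cookie:ASP.NET_SessionId=fl3rlbvbchzokz2jikbyty55; path=/; HttpOnly
Server:Microsoft-IIS/6.0
X-AspNet-Version:2.0.50727
X-Powered-By:ASP.NET"""


def parse_response_head(raw):
    """Split a raw response head into (status_code, {lowercased-name: value})."""
    lines = [ln for ln in raw.strip().splitlines() if ln.strip()]
    status = int(lines[0].split(" ", 2)[1])
    headers = {}
    for ln in lines[1:]:
        name, _, value = ln.partition(":")  # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return status, headers


def session_cookie(headers):
    """Return (name, value) of the cookie in Set-Cookie, or None if absent."""
    pair = headers.get("set-cookie", "").split(";", 1)[0]
    name, _, value = pair.partition("=")
    return (name.strip(), value.strip()) if value else None


status, headers = parse_response_head(RAW)
print(status, headers["location"], session_cookie(headers))
```

A client that wants the real page has to follow the `Location` value and send the cookie back on that next request.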

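Posted on 2009-8-3 18:22:24
Well, the fact that you can't scrape it doesn't prove that LocoySpider can't scrape this page. Keep studying.
OP | Posted on 2009-8-3 21:34:04
Could you tell me how you configured it? I've been at this for a long time and still can't get it to work.
OP | Posted on 2009-8-4 09:10:45
The rule test under the site fetches the page fine,
but inside a task the fetch fails.
Packet capture
Working case: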
Request (received):  GET /front2/Knowledge.aspx?pageindex=1 HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Host: www.pacszhanghong.cn
Connection: Close

Response (sent):  HTTP/1.1 302 Found
Connection: close
Date: Tue, 04 Aug 2009 00:58:15 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: /indentify.aspx
Set-Cookie: ASP.NET_SessionId=zgpdvu45z1xfrr45d4axbz55; path=/; HttpOnly
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 132

<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="/indentify.aspx">here</a>.</h2>
</body></html>
Request (received):  GET /indentify.aspx HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Host: www.pacszhanghong.cn
Cookie: ASP.NET_SessionId=zgpdvu45z1xfrr45d4axbz55
Connection: Close

Response (sent):  HTTP/1.1 302 Found
Connection: close
Date: Tue, 04 Aug 2009 00:58:15 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: /front2/Knowledge.aspx?pageindex=1
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 690

<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="/front2/Knowledge.aspx?pageindex=1">here</a>.</h2>
</body></html>


Failing case:
GET /front2/Knowledge.aspx?pageindex=1 HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Referer: http://www.pacszhanghong.cn/front2/Knowledge.aspx?pageindex=1
Host: www.pacszhanghong.cn
Connection: Close

Response (sent):  HTTP/1.1 302 Found
Connection: close
Date: Tue, 04 Aug 2009 00:56:19 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: /indentify.aspx
Set-Cookie: ASP.NET_SessionId=ve4s4l45m3e1u5555neschva; path=/; HttpOnly
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 132

<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="/indentify.aspx">here</a>.</h2>
</body></html>
Request (received):  GET /indentify.aspx HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Referer: http://www.pacszhanghong.cn/front2/Knowledge.aspx?pageindex=1
Host: www.pacszhanghong.cn
Connection: Close

Response (sent):  HTTP/1.1 302 Found
Connection: close
Date: Tue, 04 Aug 2009 00:56:19 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: /front2/index.aspx
Set-Cookie: ASP.NET_SessionId=4wajusy1xp1zpxigmcfueyme; path=/; HttpOnly
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 674

<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="/front2/index.aspx">here</a>.</h2>
</body></html>
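
The two captures can also be compared mechanically rather than by eye. A small Python sketch, using the second request (GET /indentify.aspx) from each capture as pasted in this post, that lists the headers present on one side only. The function names are illustrative.

```python
# Second request of each capture, copied from the post.
WORKING = """GET /indentify.aspx HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Host: www.pacszhanghong.cn
Cookie: ASP.NET_SessionId=zgpdvu45z1xfrr45d4axbz55
Connection: Close"""

FAILING = """GET /indentify.aspx HTTP/1.1
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)
Referer: http://www.pacszhanghong.cn/front2/Knowledge.aspx?pageindex=1
Host: www.pacszhanghong.cn
Connection: Close"""


def parse_request(raw):
    """Return {header-name: value} for one raw request, skipping the request line."""
    headers = {}
    for line in raw.strip().splitlines()[1:]:
        name, _, value = line.partition(":")  # first colon separates name and value
        headers[name.strip()] = value.strip()
    return headers


def diff_headers(a, b):
    """Return (headers only in a, headers only in b)."""
    only_a = {k: v for k, v in a.items() if k not in b}
    only_b = {k: v for k, v in b.items() if k not in a}
    return only_a, only_b


only_working, only_failing = diff_headers(parse_request(WORKING), parse_request(FAILING))
print("only in working request:", only_working)
print("only in failing request:", only_failing)
```

The working request sends the session cookie back and has no Referer. The failing request sends a Referer and no cookie.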

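The last packet is different, so the two places presumably fetch the page in different ways.
I don't know how you configured it in the replies above, but it doesn't work here.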
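
One plausible reading of the captures is that the task-mode fetch drops the ASP.NET_SessionId cookie when it follows the redirect, so the server treats the second request as a new visitor. The Python sketch below reproduces that behaviour against a local stand-in server. The `FakeSite` logic is an assumption modelled on the captured redirects, not the real site's code, and `fetch` and `run_demo` are hypothetical names.

```python
import http.cookiejar
import threading
import urllib.request
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer

VERIFIED = set()  # session ids that came back to /indentify.aspx (assumed logic)


class FakeSite(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the site; the real server logic is unknown."""

    def session_id(self):
        for part in self.headers.get("Cookie", "").split(";"):
            name, _, value = part.strip().partition("=")
            if name == "ASP.NET_SessionId":
                return value
        return None

    def redirect(self, location, set_cookie=False):
        self.send_response(302)
        self.send_header("Location", location)
        if set_cookie:
            sid = uuid.uuid4().hex
            self.send_header("Set-Cookie", f"ASP.NET_SessionId={sid}; path=/; HttpOnly")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def page(self, text):
        body = text.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        sid = self.session_id()
        path = self.path.split("?", 1)[0]
        if path == "/indentify.aspx" and sid:
            VERIFIED.add(sid)  # cookie came back: send the client to the content
            self.redirect("/front2/Knowledge.aspx?pageindex=1")
        elif path == "/indentify.aspx":
            self.redirect("/front2/index.aspx", set_cookie=True)  # no cookie: bounce
        elif path == "/front2/Knowledge.aspx" and sid not in VERIFIED:
            self.redirect("/indentify.aspx", set_cookie=True)
        elif path == "/front2/Knowledge.aspx":
            self.page("knowledge list")
        else:
            self.page("index page")

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(url, keep_cookies):
    """GET url and follow redirects, with or without a cookie jar."""
    handlers = [urllib.request.ProxyHandler({})]  # ignore any system proxy settings
    if keep_cookies:
        handlers.append(urllib.request.HTTPCookieProcessor(http.cookiejar.CookieJar()))
    opener = urllib.request.build_opener(*handlers)
    with opener.open(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def run_demo():
    """Fetch the same URL with and without cookies from a local FakeSite."""
    server = HTTPServer(("127.0.0.1", 0), FakeSite)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/front2/Knowledge.aspx?pageindex=1"
    try:
        return fetch(url, keep_cookies=True), fetch(url, keep_cookies=False)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` returns the knowledge page for the client with a cookie jar and the index page for the client without one, mirroring the working and failing captures. Whether LocoySpider's task mode has a setting to keep cookies across redirects is not stated in this thread.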
OP | Posted on 2009-8-4 10:06:02
网站万能信息采集器 (a different scraping tool) fetches the page normally; unfortunately it is paid software.