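火车头, please take a look: a site that cannot be scraped, error message attached
I have run into a site that cannot be scraped: www.westca.com. The target page apparently never loads. During collection it never reports that the page download has finished, and testing the rule throws an error. Please see the screenshot: http://www.twin-lotus.com/images/error.gif
Full text of the error message: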
See the end of this message for details on invoking
just-in-time (JIT) debugging instead of this dialog box.
************** Exception Text **************
System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
at System.Collections.ArrayList.get_Item(Int32 index)
at LocoySpider.LocoyArticle.GetContent(String lname, Int32 arthtmlNum)
at LocoySpider.LocoyArticle.Parse(Boolean ISDownload)
at LocoySpider.LocoyRuleNew.btnPageTest_Click(Object sender, EventArgs e)
at System.Windows.Forms.Control.OnClick(EventArgs e)
at System.Windows.Forms.Button.OnClick(EventArgs e)
at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
at System.Windows.Forms.Control.WndProc(Message& m)
at System.Windows.Forms.ButtonBase.WndProc(Message& m)
at System.Windows.Forms.Button.WndProc(Message& m)
at System.Windows.Forms.ControlNativeWindow.OnMessage(Message& m)
at System.Windows.Forms.ControlNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
************** Loaded Assemblies **************
mscorlib
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.573
CodeBase: file:///c:/windows/microsoft.net/framework/v1.1.4322/mscorlib.dll
----------------------------------------
LocoySpider
Assembly Version: 1.2.0.0
Win32 Version: 1.2.0.0
CodeBase: file:///D:/Web_Project/Resources/%5BLocoySpider%5D%202006-02-07V1.2.0%20Build/LocoySpider.exe
----------------------------------------
System.Windows.Forms
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.573
CodeBase: file:///c:/windows/assembly/gac/system.windows.forms/1.0.5000.0__b77a5c561934e089/system.windows.forms.dll
----------------------------------------
System
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.573
CodeBase: file:///c:/windows/assembly/gac/system/1.0.5000.0__b77a5c561934e089/system.dll
----------------------------------------
System.Drawing
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.573
CodeBase: file:///c:/windows/assembly/gac/system.drawing/1.0.5000.0__b03f5f7f11d50a3a/system.drawing.dll
----------------------------------------
Sloppycode.UI.DragDropTreeView
Assembly Version: 1.0.0.0
Win32 Version: 1.0.0.0
CodeBase: file:///D:/Web_Project/Resources/%5BLocoySpider%5D%202006-02-07V1.2.0%20Build/Sloppycode.UI.DragDropTreeView.DLL
----------------------------------------
CustomControls
Assembly Version: 1.0.2038.24855
Win32 Version: 1.0.2038.24855
CodeBase: file:///D:/Web_Project/Resources/%5BLocoySpider%5D%202006-02-07V1.2.0%20Build/CustomControls.DLL
----------------------------------------
XMLconfig
Assembly Version: 1.0.2163.27571
Win32 Version: 1.0.2163.27571
CodeBase: file:///D:/Web_Project/Resources/%5BLocoySpider%5D%202006-02-07V1.2.0%20Build/XMLconfig.DLL
----------------------------------------
System.Xml
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.573
CodeBase: file:///c:/windows/assembly/gac/system.xml/1.0.5000.0__b77a5c561934e089/system.xml.dll
----------------------------------------
Microsoft.VisualBasic
Assembly Version: 7.0.5000.0
Win32 Version: 7.10.3052.4
CodeBase: file:///c:/windows/assembly/gac/microsoft.visualbasic/7.0.5000.0__b03f5f7f11d50a3a/microsoft.visualbasic.dll
----------------------------------------
************** JIT Debugging **************
To enable just in time (JIT) debugging, the config file for this
application or machine (machine.config) must have the
jitDebugging value set in the system.windows.forms section.
The application must also be compiled with debugging
enabled.
For example:
<configuration>
<system.windows.forms jitDebugging="true" />
</configuration>
When JIT debugging is enabled, any unhandled exception
will be sent to the JIT debugger registered on the machine
rather than being handled by this dialog.
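Thanks for taking a look.
"Testing the rule throws an error": did you write the rule yourself?
Could someone else test www.westca.com and see whether the problem shows up?
I'm going through a proxy, so it doesn't work well for me. Testing the page gives an error: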
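For details on invoking just-in-time (JIT) debugging instead of this dialog box,
see the end of this message.
************** Exception Text **************
System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index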
at System.Collections.ArrayList.get_Item(Int32 index)
at LocoySpider.LocoyArticle.GetContent(String lname, Int32 arthtmlNum)
at LocoySpider.LocoyArticle.Parse(Boolean ISDownload)
at LocoySpider.LocoyRuleNew.btnPageTest_Click(Object sender, EventArgs e)
at System.Windows.Forms.Control.OnClick(EventArgs e)
at System.Windows.Forms.Button.OnClick(EventArgs e)
at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
at System.Windows.Forms.Control.WndProc(Message& m)
at System.Windows.Forms.ButtonBase.WndProc(Message& m)
at System.Windows.Forms.Button.WndProc(Message& m)
at System.Windows.Forms.ControlNativeWindow.OnMessage(Message& m)
at System.Windows.Forms.ControlNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
************** Loaded Assemblies **************
mscorlib
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.2300
CodeBase: file:///c:/windows/microsoft.net/framework/v1.1.4322/mscorlib.dll
----------------------------------------
LocoySpider
Assembly Version: 1.2.0.0
Win32 Version: 1.2.0.0
CodeBase: file:///C:/Documents%20and%20Settings/Administrator/桌面/火车采集器V1.2.0%20_QQ群测试版/LocoySpider.exe
----------------------------------------
System.Windows.Forms
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.2300
CodeBase: file:///c:/windows/assembly/gac/system.windows.forms/1.0.5000.0__b77a5c561934e089/system.windows.forms.dll
----------------------------------------
System
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.2300
CodeBase: file:///c:/windows/assembly/gac/system/1.0.5000.0__b77a5c561934e089/system.dll
----------------------------------------
System.Drawing
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.2300
CodeBase: file:///c:/windows/assembly/gac/system.drawing/1.0.5000.0__b03f5f7f11d50a3a/system.drawing.dll
----------------------------------------
Sloppycode.UI.DragDropTreeView
Assembly Version: 1.0.0.0
Win32 Version: 1.0.0.0
CodeBase: file:///C:/Documents%20and%20Settings/Administrator/桌面/火车采集器V1.2.0%20_QQ群测试版/Sloppycode.UI.DragDropTreeView.DLL
----------------------------------------
CustomControls
Assembly Version: 1.0.2038.24855
Win32 Version: 1.0.2038.24855
CodeBase: file:///C:/Documents%20and%20Settings/Administrator/桌面/火车采集器V1.2.0%20_QQ群测试版/CustomControls.DLL
----------------------------------------
System.Windows.Forms.resources
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.573
CodeBase: file:///c:/windows/assembly/gac/system.windows.forms.resources/1.0.5000.0_zh-chs_b77a5c561934e089/system.windows.forms.resources.dll
----------------------------------------
mscorlib.resources
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.573
CodeBase: file:///c:/windows/assembly/gac/mscorlib.resources/1.0.5000.0_zh-chs_b77a5c561934e089/mscorlib.resources.dll
----------------------------------------
XMLconfig
Assembly Version: 1.0.2163.27571
Win32 Version: 1.0.2163.27571
CodeBase: file:///C:/Documents%20and%20Settings/Administrator/桌面/火车采集器V1.2.0%20_QQ群测试版/XMLconfig.DLL
----------------------------------------
System.Xml
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.2300
CodeBase: file:///c:/windows/assembly/gac/system.xml/1.0.5000.0__b77a5c561934e089/system.xml.dll
----------------------------------------
Microsoft.VisualBasic
Assembly Version: 7.0.5000.0
Win32 Version: 7.10.6310.4
CodeBase: file:///c:/windows/assembly/gac/microsoft.visualbasic/7.0.5000.0__b03f5f7f11d50a3a/microsoft.visualbasic.dll
----------------------------------------
System.resources
Assembly Version: 1.0.5000.0
Win32 Version: 1.1.4322.573
CodeBase: file:///c:/windows/assembly/gac/system.resources/1.0.5000.0_zh-chs_b77a5c561934e089/system.resources.dll
----------------------------------------
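************** JIT Debugging **************
The computer's configuration file (machine.config) must have the
jitDebugging value set in the system.windows.forms section.
The application must also be compiled with debugging
enabled.
For example:
<configuration>
<system.windows.forms jitDebugging="true" />
</configuration>
When JIT debugging is enabled, any unhandled exception
will be sent to the JIT debugger registered on this computer
rather than being handled by this dialog.
Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
"Testing the rule throws an error": did you write the rule yourself?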
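The rule should be fine. The error appears even when the rule is defined to extract nothing but the page title.
Originally posted by scorpion on 2006-2-15 22:18
Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
What kind of error does this indicate, and what should I change in the settings?
Thanks to 火车头 and moderator scorpion.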
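As a rough illustration of what the quoted message means: the exception is raised when code asks a list for an element at a position that does not exist, for example the first match on a page that produced no matches at all. LocoySpider's source is not shown in this thread, so the following is only a Python analogy, not its actual code. The names `get_content` and `title_rule` are hypothetical.

```python
import re


def get_content(html, pattern, index=0):
    """Return the index-th regex match in html, or None if it does not exist.

    Hypothetical helper: it mirrors the kind of lookup that can fail with
    an out-of-range index. It is not LocoySpider's real implementation.
    """
    matches = re.findall(pattern, html, flags=re.S | re.I)
    # Guard the lookup: an empty page (or a rule that matches nothing)
    # yields an empty list, and matches[index] would raise IndexError.
    if not 0 <= index < len(matches):
        return None
    return matches[index]


title_rule = r"<title>(.*?)</title>"
print(get_content("<html><title>Demo</title></html>", title_rule))  # Demo
print(get_content("", title_rule))  # None: there is nothing to index into
```

Read this way, the error says less about the rule itself than about the input: if the downloaded page is empty, even a title-only rule has nothing to pick from.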
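[ Last edited by polarstar9 on 2006-2-16 07:49 ]
I tried it last night too and indeed could not collect the article title links. The test page reported the same error. scorpion, have you tested it? Could my settings for collecting the URLs be wrong? Strangely, BFC handles this site with ease.
I noticed that the markup of its article title list contains an extra <li>********</li>. Could that be the cause? I wonder whether the experts here have a good solution.
[ Last edited by netdream on 2006-2-16 11:55 ]
Originally posted by netdream on 2006-2-16 11:27
I tried it last night too and indeed could not collect the article title links. The test page reported the same error. scorpion, have you tested it? Could my settings for collecting the URLs be wrong? Strangely, BFC handles this site with ease.
I noticed that the markup of its article title list contains an extra <li>******* ...
Is that really the cause? How can it be fixed?
I have run into this problem too. How can it be solved?
The problem does exist.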
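Monitoring with winsock expert shows that only a GET line and a Host line are sent. Monitoring with wkiller gives the result below. The likely cause is that the returned body has length 0:
Sent:
GET / HTTP/1.1
Host: www.westca.com
Received: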
HTTP/1.1 200 OK
Date: Wed, 19 Jul 2006 15:49:10 GMT
Server: Apache/2.0.46 (Red Hat)
Accept-Ranges: bytes
X-Powered-By: PHP/4.3.2
Content-Length: 0
Content-Type: text/html
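The diagnosis above can be checked mechanically. Below is a minimal sketch, assuming the response head has been captured as plain text like the trace above. `parse_response_head` and `has_empty_body` are hypothetical helper names, and the check only looks at the status line and the Content-Length header.

```python
def parse_response_head(raw):
    """Split a captured HTTP response head into (status_code, headers)."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    # The status line looks like: HTTP/1.1 200 OK
    status_code = int(lines[0].split()[1])
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so values such as dates stay intact.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def has_empty_body(raw):
    """True when the server answered 200 but declared a zero-length body."""
    status_code, headers = parse_response_head(raw)
    return status_code == 200 and headers.get("content-length") == "0"


captured = """
HTTP/1.1 200 OK
Date: Wed, 19 Jul 2006 15:49:10 GMT
Server: Apache/2.0.46 (Red Hat)
Accept-Ranges: bytes
X-Powered-By: PHP/4.3.2
Content-Length: 0
Content-Type: text/html
"""
print(has_empty_body(captured))  # True
```

A result of True means the request succeeded but delivered no HTML, so any rule, even one that only extracts the title, has nothing to match. Whether sending additional request headers would change the server's answer is not established in this thread.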