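polarstar9 posted on 2006-2-15 16:39:38

火车头, please take a look: a site that cannot be scraped, error message attached

I ran into a site that cannot be scraped: www.westca.com. The target page apparently never loads. During collection the "page download complete" status never appears, and testing the rule throws an error. See the screenshot: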

http://www.twin-lotus.com/images/error.gif


Full error message:


See the end of this message for details on invoking
just-in-time (JIT) debugging instead of this dialog box.

************** Exception Text **************
System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
   at System.Collections.ArrayList.get_Item(Int32 index)
   at LocoySpider.LocoyArticle.GetContent(String lname, Int32 arthtmlNum)
   at LocoySpider.LocoyArticle.Parse(Boolean ISDownload)
   at LocoySpider.LocoyRuleNew.btnPageTest_Click(Object sender, EventArgs e)
   at System.Windows.Forms.Control.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
   at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
   at System.Windows.Forms.Control.WndProc(Message& m)
   at System.Windows.Forms.ButtonBase.WndProc(Message& m)
   at System.Windows.Forms.Button.WndProc(Message& m)
   at System.Windows.Forms.ControlNativeWindow.OnMessage(Message& m)
   at System.Windows.Forms.ControlNativeWindow.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)


************** Loaded Assemblies **************
mscorlib
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.573
    CodeBase: file:///c:/windows/microsoft.net/framework/v1.1.4322/mscorlib.dll
----------------------------------------
LocoySpider
    Assembly Version: 1.2.0.0
    Win32 Version: 1.2.0.0
    CodeBase: file:///D:/Web_Project/Resources/%5BLocoySpider%5D%202006-02-07V1.2.0%20Build/LocoySpider.exe
----------------------------------------
System.Windows.Forms
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.573
    CodeBase: file:///c:/windows/assembly/gac/system.windows.forms/1.0.5000.0__b77a5c561934e089/system.windows.forms.dll
----------------------------------------
System
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.573
    CodeBase: file:///c:/windows/assembly/gac/system/1.0.5000.0__b77a5c561934e089/system.dll
----------------------------------------
System.Drawing
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.573
    CodeBase: file:///c:/windows/assembly/gac/system.drawing/1.0.5000.0__b03f5f7f11d50a3a/system.drawing.dll
----------------------------------------
Sloppycode.UI.DragDropTreeView
    Assembly Version: 1.0.0.0
    Win32 Version: 1.0.0.0
    CodeBase: file:///D:/Web_Project/Resources/%5BLocoySpider%5D%202006-02-07V1.2.0%20Build/Sloppycode.UI.DragDropTreeView.DLL
----------------------------------------
CustomControls
    Assembly Version: 1.0.2038.24855
    Win32 Version: 1.0.2038.24855
    CodeBase: file:///D:/Web_Project/Resources/%5BLocoySpider%5D%202006-02-07V1.2.0%20Build/CustomControls.DLL
----------------------------------------
XMLconfig
    Assembly Version: 1.0.2163.27571
    Win32 Version: 1.0.2163.27571
    CodeBase: file:///D:/Web_Project/Resources/%5BLocoySpider%5D%202006-02-07V1.2.0%20Build/XMLconfig.DLL
----------------------------------------
System.Xml
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.573
    CodeBase: file:///c:/windows/assembly/gac/system.xml/1.0.5000.0__b77a5c561934e089/system.xml.dll
----------------------------------------
Microsoft.VisualBasic
    Assembly Version: 7.0.5000.0
    Win32 Version: 7.10.3052.4
    CodeBase: file:///c:/windows/assembly/gac/microsoft.visualbasic/7.0.5000.0__b03f5f7f11d50a3a/microsoft.visualbasic.dll
----------------------------------------

************** JIT Debugging **************
To enable just in time (JIT) debugging, the config file for this
application or machine (machine.config) must have the
jitDebugging value set in the system.windows.forms section.
The application must also be compiled with debugging
enabled.

For example:

<configuration>
    <system.windows.forms jitDebugging="true" />
</configuration>

When JIT debugging is enabled, any unhandled exception
will be sent to the JIT debugger registered on the machine
rather than being handled by this dialog.

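polarstar9 posted on 2006-2-15 16:46:40

Thanks for taking a look.

火车头 posted on 2006-2-15 21:24:10

The error appears as soon as you test the rule? Did you write the rule yourself?

Could someone else test www.westca.com and see whether there is a problem?
I'm behind a proxy, so it doesn't work well for me.

scorpion posted on 2006-2-15 22:14:52

I get an error when testing the page: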

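See the end of this message for details on invoking
just-in-time (JIT) debugging instead of this dialog box.

************** Exception Text **************
System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index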
   at System.Collections.ArrayList.get_Item(Int32 index)
   at LocoySpider.LocoyArticle.GetContent(String lname, Int32 arthtmlNum)
   at LocoySpider.LocoyArticle.Parse(Boolean ISDownload)
   at LocoySpider.LocoyRuleNew.btnPageTest_Click(Object sender, EventArgs e)
   at System.Windows.Forms.Control.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnClick(EventArgs e)
   at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
   at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
   at System.Windows.Forms.Control.WndProc(Message& m)
   at System.Windows.Forms.ButtonBase.WndProc(Message& m)
   at System.Windows.Forms.Button.WndProc(Message& m)
   at System.Windows.Forms.ControlNativeWindow.OnMessage(Message& m)
   at System.Windows.Forms.ControlNativeWindow.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)


************** Loaded Assemblies **************
mscorlib
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.2300
    CodeBase: file:///c:/windows/microsoft.net/framework/v1.1.4322/mscorlib.dll
----------------------------------------
LocoySpider
    Assembly Version: 1.2.0.0
    Win32 Version: 1.2.0.0
    CodeBase: file:///C:/Documents%20and%20Settings/Administrator/桌面/火车采集器V1.2.0%20_QQ群测试版/LocoySpider.exe
----------------------------------------
System.Windows.Forms
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.2300
    CodeBase: file:///c:/windows/assembly/gac/system.windows.forms/1.0.5000.0__b77a5c561934e089/system.windows.forms.dll
----------------------------------------
System
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.2300
    CodeBase: file:///c:/windows/assembly/gac/system/1.0.5000.0__b77a5c561934e089/system.dll
----------------------------------------
System.Drawing
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.2300
    CodeBase: file:///c:/windows/assembly/gac/system.drawing/1.0.5000.0__b03f5f7f11d50a3a/system.drawing.dll
----------------------------------------
Sloppycode.UI.DragDropTreeView
    Assembly Version: 1.0.0.0
    Win32 Version: 1.0.0.0
    CodeBase: file:///C:/Documents%20and%20Settings/Administrator/桌面/火车采集器V1.2.0%20_QQ群测试版/Sloppycode.UI.DragDropTreeView.DLL
----------------------------------------
CustomControls
    Assembly Version: 1.0.2038.24855
    Win32 Version: 1.0.2038.24855
    CodeBase: file:///C:/Documents%20and%20Settings/Administrator/桌面/火车采集器V1.2.0%20_QQ群测试版/CustomControls.DLL
----------------------------------------
System.Windows.Forms.resources
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.573
    CodeBase: file:///c:/windows/assembly/gac/system.windows.forms.resources/1.0.5000.0_zh-chs_b77a5c561934e089/system.windows.forms.resources.dll
----------------------------------------
mscorlib.resources
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.573
    CodeBase: file:///c:/windows/assembly/gac/mscorlib.resources/1.0.5000.0_zh-chs_b77a5c561934e089/mscorlib.resources.dll
----------------------------------------
XMLconfig
    Assembly Version: 1.0.2163.27571
    Win32 Version: 1.0.2163.27571
    CodeBase: file:///C:/Documents%20and%20Settings/Administrator/桌面/火车采集器V1.2.0%20_QQ群测试版/XMLconfig.DLL
----------------------------------------
System.Xml
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.2300
    CodeBase: file:///c:/windows/assembly/gac/system.xml/1.0.5000.0__b77a5c561934e089/system.xml.dll
----------------------------------------
Microsoft.VisualBasic
    Assembly Version: 7.0.5000.0
    Win32 Version: 7.10.6310.4
    CodeBase: file:///c:/windows/assembly/gac/microsoft.visualbasic/7.0.5000.0__b03f5f7f11d50a3a/microsoft.visualbasic.dll
----------------------------------------
System.resources
    Assembly Version: 1.0.5000.0
    Win32 Version: 1.1.4322.573
    CodeBase: file:///c:/windows/assembly/gac/system.resources/1.0.5000.0_zh-chs_b77a5c561934e089/system.resources.dll
----------------------------------------

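************** JIT Debugging **************
The jitDebugging value must be set in the system.windows.forms
section of the machine's config file (machine.config).
The application must also be compiled with debugging
enabled.

For example:

<configuration>
    <system.windows.forms jitDebugging="true" />
</configuration>

When JIT debugging is enabled, any unhandled exception
will be sent to the JIT debugger registered on this machine
rather than being handled by this dialog.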

scorpion posted on 2006-2-15 22:18:34

Index was out of range. Must be non-negative and less than the size of the collection.

Parameter name: index
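
For illustration, here is a minimal Python sketch of how an error of this kind can arise. The LocoySpider source is not shown in this thread, so this only assumes that the extraction step indexes into a list of matches that turned out to be empty. `extract_between` and `get_content` are hypothetical names, not LocoySpider functions.

```python
from typing import List, Optional


def extract_between(html: str, start: str, end: str) -> List[str]:
    """Return every substring of html found between the start and end markers."""
    results: List[str] = []
    pos = 0
    while True:
        i = html.find(start, pos)
        if i == -1:
            break
        j = html.find(end, i + len(start))
        if j == -1:
            break
        results.append(html[i + len(start):j])
        pos = j + len(end)
    return results


def get_content(html: str, start: str, end: str, index: int = 0) -> Optional[str]:
    """Return the index-th match, or None when there are not enough matches."""
    matches = extract_between(html, start, end)
    # Without this bounds check, matches[index] raises IndexError on an empty
    # page, which is Python's counterpart of ArgumentOutOfRangeException.
    if not 0 <= index < len(matches):
        return None
    return matches[index]
```

With the guard in place, an empty or unexpected page yields `None` instead of an unhandled exception, so the caller can report "nothing matched" rather than crash.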

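polarstar9 posted on 2006-2-16 07:35:27

The error appears as soon as you test the rule? Did you write the rule yourself?

The rule should be fine. Even when the rule only extracts the page title, it still throws the error.


Originally posted by scorpion on 2006-2-15 22:18
Index was out of range. Must be non-negative and less than the size of the collection.

Parameter name: index

What does this error mean, and how should I configure things?

Thanks to 火车头 and moderator scorpion.

[ Last edited by polarstar9 on 2006-2-16 07:49 ]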

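netdream posted on 2006-2-16 11:27:42

I tried it last night too, and indeed I couldn't collect the article title links. The page test reported the same error. scorpion, have you tested it? Could my URL collection settings be wrong? Strangely, BFC grabs the site with no trouble.
I noticed that the markup of its article title list contains an extra <li>********</li>. Could that be the cause? I wonder whether the experts here have a good solution.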
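
Whether the extra <li> is really the cause is not confirmed in this thread. As a rough sketch of how a link extractor can tolerate such an item, the Python below collects links from list items and simply skips any <li> that has no link inside. The class name, function name, and sample markup are made up for illustration and are not taken from www.westca.com or LocoySpider.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class ListLinkParser(HTMLParser):
    """Collect (href, text) for each <a href> inside an <li>; link-less items are skipped."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._in_li = False
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_li = True
        elif tag == "a" and self._in_li:
            # Stays None when the anchor has no href, so it is ignored.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only gather text while inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None
        elif tag == "li":
            self._in_li = False


def extract_list_links(html: str) -> List[Tuple[str, str]]:
    """Return all (href, text) pairs found in list items of the given HTML."""
    parser = ListLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links
```

Fed a list such as `<li><a href="/a1.html">First</a></li><li>********</li>`, this returns only the first item's link and ignores the decorative one.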

[ Last edited by netdream on 2006-2-16 11:55 ]

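polarstar9 posted on 2006-2-19 07:41:15

Originally posted by netdream on 2006-2-16 11:27
I tried it last night too, and indeed I couldn't collect the article title links. The page test reported the same error. scorpion, have you tested it? Could my URL collection settings be wrong? Strangely, BFC grabs the site with no trouble.
I noticed that the markup of its article title list contains an extra <li>******* ...

Is that really the cause? How can it be fixed?

mentor posted on 2006-7-2 11:38:25

I've run into this problem too. How can it be solved?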

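dr5d posted on 2006-7-19 23:54:49

The problem does exist.
Monitoring with WinSock Expert shows that only a GET line and a Host line are sent. Monitoring with wkiller gives the result below. My guess is that the error occurs because the returned content has length 0:
Sent: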
GET / HTTP/1.1
Host: www.westca.com




Received:
HTTP/1.1 200 OK
Date: Wed, 19 Jul 2006 15:49:10 GMT
Server: Apache/2.0.46 (Red Hat)
Accept-Ranges: bytes
X-Powered-By: PHP/4.3.2
Content-Length: 0
Content-Type: text/html
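
A 200 status with Content-Length: 0 means the client receives no HTML at all. That would fit an extraction step failing on an empty result, though nothing in this thread confirms how LocoySpider handles it. As a small sketch, the Python below parses a captured response head like the one above and flags a zero-length body before any extraction is attempted. `parse_response_head` and `has_empty_body` are hypothetical helper names.

```python
from typing import Dict, Tuple


def parse_response_head(raw: str) -> Tuple[int, Dict[str, str]]:
    """Parse the status line and headers of a raw HTTP response head."""
    # Normalise line endings and keep only the part before the first blank line.
    head = raw.replace("\r\n", "\n").strip("\n").split("\n\n", 1)[0]
    lines = head.split("\n")
    # The status line looks like: HTTP/1.1 200 OK
    status = int(lines[0].split(" ", 2)[1])
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return status, headers


def has_empty_body(headers: Dict[str, str]) -> bool:
    """Return True when the server declared a zero-length body."""
    return headers.get("content-length") == "0"
```

A scraper could call `has_empty_body` right after the download and report "empty page" to the user instead of passing an empty string on to the rule test.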