火车采集器 (LocoySpider) official software forum

Tips for scraping content and importing it into Dvbbs

Posted on 2006-12-3 01:23:13
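Among the bundled modules, the web publishing module for Dvbbs (动网) already works well. There are still a few things to watch for when importing content into Dvbbs: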

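1. Dvbbs runs an XHTML validation check when a post is submitted, so all scraped markup must conform to XHTML or the post will fail. The sample content used by the module test does not itself pass XHTML validation, which is why publishing fails during testing.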

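Fix: add two replacement rules to the scraped content: <br> → <br /> and <hr> → <hr />. Apply the same <br> replacement to the test content and try again; publishing should then succeed.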
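
For anyone cleaning scraped content in a script instead of the collector's replacement rules, the same two replacements can be sketched in Python. This is only an illustration: the function name `to_xhtml_void_tags` is made up here, and the regular expression is a slightly generalized version of the two rules (it also handles uppercase tags and attributes). It does not make arbitrary HTML fully XHTML-compliant.

```python
import re

# Matches <br> / <hr> in any case, with optional attributes, whether or not
# the tag is already self-closed. "\b" keeps tags such as <break> untouched.
_VOID_TAG = re.compile(r"<(br|hr)\b([^<>]*?)\s*/?>", re.IGNORECASE)


def to_xhtml_void_tags(html: str) -> str:
    """Rewrite <br> and <hr> as the self-closing forms <br /> and <hr />."""

    def fix(match: re.Match) -> str:
        tag = match.group(1).lower()
        attrs = match.group(2).strip()
        return f"<{tag} {attrs} />" if attrs else f"<{tag} />"

    return _VOID_TAG.sub(fix, html)


if __name__ == "__main__":
    print(to_xhtml_void_tags("line one<br>line two<HR>end"))
```

Tags that are already self-closed pass through in normalized form, so running the function twice gives the same result as running it once.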

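2. Duplicates are hard to avoid when scraping. I was scraping content from 163, and every single item came through duplicated. Dvbbs does not check for duplicates when posting, which is a real headache.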

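Fix: modify the forum code to add a duplicate check, so that when a topic title already exists, the later content is posted as a reply to the earlier topic. The changes are as follows: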

Modify SavePost.asp at line 836 as follows. The original code:
  SQL="insert into Dv_topic (Title,Boardid,PostUsername,PostUserid,DateAndTime,Expression,LastPost,LastPostTime,PostTable,locktopic,istop,TopicMode,isvote,PollID,Mode,GetMoney,UseTools,GetMoneyType,isSmsTopic,HideName) values ('"&topic&"',"&Dvbbs.boardid&",'"&username&"',"&Dvbbs.userid&",'"&DateTimeStr&"','"&Expression(0)&"|"&Expression(1)&"','$$"&DateTimeStr&"$$$$','"&MyLastPostTime&"','"&TotalUseTable&"',"&locktopic&","&Myistop&","&MyTopicMode&","&isvote&","&voteid&","&TopicMode&","&ToMoney&",'"&UseTools&"',"&GetMoneyType&","&isAlipayTopic&","&hidename&")"
  Dvbbs.Execute(sql)


Replace it with:


  ' Modified section: skip the topic insert when the title already exists
  Dim rsreg
  Set rsreg = Server.CreateObject("ADODB.Recordset")
  ' 1 = adOpenKeyset, 1 = adLockReadOnly (the recordset is only read, never updated)
  rsreg.Open "select * from [Dv_topic] where Title='" & topic & "'", conn, 1, 1
  If rsreg.EOF Then
    ' No topic with this title yet, so insert it as before
    SQL="insert into Dv_topic (Title,Boardid,PostUsername,PostUserid,DateAndTime,Expression,LastPost,LastPostTime,PostTable,locktopic,istop,TopicMode,isvote,PollID,Mode,GetMoney,UseTools,GetMoneyType,isSmsTopic,HideName) values ('"&topic&"',"&Dvbbs.boardid&",'"&username&"',"&Dvbbs.userid&",'"&DateTimeStr&"','"&Expression(0)&"|"&Expression(1)&"','$$"&DateTimeStr&"$$$$','"&MyLastPostTime&"','"&TotalUseTable&"',"&locktopic&","&Myistop&","&MyTopicMode&","&isvote&","&voteid&","&TopicMode&","&ToMoney&",'"&UseTools&"',"&GetMoneyType&","&isAlipayTopic&","&hidename&")"
    Dvbbs.Execute(sql)
  End If
  rsreg.Close
  Set rsreg = Nothing
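
The check-then-insert pattern used above can be sketched in isolation. The Python example below is not Dvbbs code: it uses the standard `sqlite3` module as a stand-in for the forum database, a cut-down two-column `Dv_topic` table, and hypothetical helper names (`make_demo_db`, `insert_topic_if_new`). It also passes the title as a query parameter instead of concatenating it into the SQL string. Whether SavePost.asp has already escaped `topic` by line 836 is not shown in this thread, so treat that as something to verify in your own copy.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a reduced, illustrative Dv_topic table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Dv_topic (Title TEXT, Boardid INTEGER)")
    return conn


def insert_topic_if_new(conn: sqlite3.Connection, title: str, board_id: int) -> bool:
    """Insert a topic only if no row with the same title exists.

    Returns True when a row was inserted, False when the title was a duplicate.
    """
    row = conn.execute(
        "SELECT 1 FROM Dv_topic WHERE Title = ? LIMIT 1", (title,)
    ).fetchone()
    if row is not None:
        return False  # duplicate title: skip the insert
    conn.execute(
        "INSERT INTO Dv_topic (Title, Boardid) VALUES (?, ?)", (title, board_id)
    )
    conn.commit()
    return True


if __name__ == "__main__":
    db = make_demo_db()
    print(insert_topic_if_new(db, "Sample headline", 47))  # first insert
    print(insert_topic_if_new(db, "Sample headline", 47))  # duplicate
```

Note that a separate check followed by an insert can still let a duplicate through if two posts arrive at the same moment; a unique index on the title column would close that gap, at the cost of rejecting legitimately repeated titles.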


Posted on 2006-12-3 11:39:27
I haven't used this myself, but the approach looks well thought out. Nice work.
Posted on 2006-12-4 01:01:42
I hope you can share the rest of your experience too.
Posted on 2006-12-4 23:42:49
Thanks for this! I was wondering why some of my scraped content would not post. So it was failing validation!

Thanks
Posted on 2007-1-11 22:08:14
I support this. Hoping a DVBBS tutorial comes out soon.
Posted on 2007-1-12 00:44:22
Great tips. I suggest marking this thread as featured.
Posted on 2007-1-16 22:34:19
Bump.
Posted on 2007-1-17 17:15:26
Why does mine say I don't have permission?
Posted on 2007-1-25 12:21:37
This is the response I get when I run the publish test. Can anyone help?

<html>
<head>
<meta http-equiv="Content-Type" c>
<title>OtherErr-错误信息[北漂论坛]</title>
<meta name="generator" c />
<meta name="keywords" c />
<!--北漂论坛致力为漂泊在北京的朋友提供一个温馨网上家园-->
<meta name="MSSmartTagsPreventParsing" c />
<meta http-equiv="MSThemeCompatible" c />
<link rel="SHORTCUT ICON" href="favicon.ico" />
<link rel="stylesheet" type="text/css" href="skins/aspsky_5.css" />
<link title="北漂论坛-频道列表" type="application/rss+xml" rel="alternate" href="rssfeed.asp" />
<link title="北漂论坛-最新20篇论坛主题" type="application/rss+xml" rel="alternate" href="rssfeed.asp?rssid=4" />
<script language = "javaScript" src = "inc/Main.js" type="text/javascript"></script>
</head>
<body >
<div class="menuskin" id="popmenu"   style="z-index:100;"></div><script language="javascript" type="text/javascript">var boardxml='<?xml version="1.0" encoding="gb2312"?><BoardList><board boardid="42" boardtype="北漂社区" parentid="0" depth="0" rootid="1" child="4" hidden="0" nopost="0"><board boardid="45" boardtype="焦点话题" parentid="42" depth="1" rootid="1" child="0" hidden="0" nopost="0"/><board boardid="43" boardtype="混在北京" parentid="42" depth="1" rootid="1" child="0" hidden="0" nopost="0"/><board boardid="44" boardtype="跳蚤市场" parentid="42" depth="1" rootid="1" child="0" hidden="0" nopost="0"/><board boardid="47" boardtype="超级贴图" parentid="42" depth="1" rootid="1" child="0" hidden="0" nopost="0"/></board><board boardid="49" boardtype="同乡论坛" parentid="0" depth="0" rootid="2" child="11" hidden="0" nopost="0"><board boardid="50" boardtype="河北" parentid="49" depth="1" rootid="2" child="2" hidden="0" nopost="0"><board boardid="52" boardtype="石家庄" parentid="50" depth="2" rootid="2" child="0" hidden="0" nopost="0"/><board boardid="51" boardtype="张家口" parentid="50" depth="2" rootid="2" child="0" hidden="0" nopost="0"/></board><board boardid="53" boardtype="河南" parentid="49" depth="1" rootid="2" child="0" hidden="0" nopost="0"/><board boardid="56" boardtype="湖北" parentid="49" depth="1" rootid="2" child="0" hidden="0" nopost="0"/><board boardid="57" boardtype="湖南" parentid="49" depth="1" rootid="2" child="0" hidden="0" nopost="0"/><board boardid="58" boardtype="广东" parentid="49" depth="1" rootid="2" child="0" hidden="0" nopost="0"/><board boardid="59" boardtype="广西" parentid="49" depth="1" rootid="2" child="0" hidden="0" nopost="0"/><board boardid="54" boardtype="山西" parentid="49" depth="1" rootid="2" child="0" hidden="0" nopost="0"/><board boardid="55" boardtype="山东" parentid="49" depth="1" rootid="2" child="1" hidden="0" nopost="0"><board boardid="60" boardtype="青岛" parentid="55" depth="2" rootid="2" child="0" hidden="0" nopost="0"/></board></board><board boardid="3" 
boardtype="北漂房产" parentid="0" depth="0" rootid="3" child="3" hidden="0" nopost="0"><board boardid="8" boardtype="购房攻略" parentid="3" depth="1" rootid="3" child="0" hidden="0" nopost="0"/><board boardid="39" boardtype="装修装饰" parentid="3" depth="1" rootid="3" child="0" hidden="0" nopost="0"/><board boardid="33" boardtype="地产沙龙" parentid="3" depth="1" rootid="3" child="0" hidden="0" nopost="0"/></board></BoardList>';</script>
<div class="mainbar" id="topbar_top">dvbbs</div>
<div class="mainbar" id="topbar_mid">
<div id="topbar_mid_r">
<div style="cursor:hand"   >收藏本页</div>
<div><a href="http://bbs.china010.com.cn/contact.asp" target="_blank">联系我们</a></div>
<div><a href="boardhelp.asp?boardID=0">论坛帮助</a></div>
</div>
<div id="topbar_mid_l">
<a href="北漂论坛"><img border="0" src="images/logo.gif" alt="" /></a>
</div>
<div id="topbar_mid_m"><iframe marginwidth="0" marginheight="0" src="" frameborder="0" width="630" scrolling="no" height="60"></iframe></div>
</div>
<div class="mainbar" id="topbar_bottom">dvbbs</div>
<div class="mainbar" id="topbar_menu">
<!--顶部用户导航栏:客人菜单-->
<div class="menu_popup" id="stylemenu">
<div class="menuitems">
<div class="menuitems"><a href="javascript:getskins(1,0);">恢复默认设置</a></div>
</div>
</div>
<div class="menudiv2"><a href="login.asp" >登录</a></div>
<div class="menudiv1"><a href="reg.asp">注册</a> </div>
<div class="menudiv1"><a href="query.asp?boardid=0">搜索</a></div>
<div class="menudiv1"><a   class="ImgOnclick">风格</a></div>

<div class="menudiv1"><a  class="ImgOnclick">论坛状态</a></div>
<div class="menudiv1"><a href="show.asp?boardid=0"  class="ImgOnclick">论坛展区</a></div>


<div class="menudiv1"><a href="BoardPermission.asp?boardid=0&action=Myinfo">我能做什么</a> </div></div>
<br />
<div class="mainbar0" style="text-align : left;">
<div style="float:right;height:22px"></div>
<div style="text-align : left;width:80%; ">>> 欢迎光临 <b>北漂论坛</b> </div>
</div>
<div class="tableborder2" style="text-align : left;height:25px;line-height:25px;">
<div style="float:left;"><img src="skins/default/Forum_nav.gif" style="margin:8px 4px;" alt="" /></div>
<div style="float:right;"> </div>
<a href="index.asp">北漂论坛</a> → <a href=>错误信息</a> → OtherErr-错误信息 <a name="top"></a> </div>
<br /><br/><table cellpadding="3" cellspacing="1" align="center" class="tableborder1" style="width:600">
<tr align="center">
<th width="100%" height="25" colspan="2">北漂论坛-OtherErr-Error Message</th>
</tr><tr><td width="100%" class="tablebody1" colspan="2">
   <b>An error occurred while "<font color="#FF0000">accessing the forum</font>". 1 error in total; details follow</b></td></tr>
<tr><td width="100%" class="tablebody1" colspan="2">
<li>You do not have permission to post new topics
</td></tr>
<tr><td width="100%" class="tablebody1" colspan="2">
<li>Please read the forum help file carefully and make sure you have the required permissions.
</td></tr>
<tr><td class="tablebody2" valign="middle" colspan="2" align="center">
<a href="javascript:history.go(-1)"><< Back to previous page</a>
</td></tr>
</table><form action="login.asp?action=chk" method="post">
<input type="hidden" value="post.asp?action=new&boardid=47" name="comeurl"/>
<table cellpadding="3" cellspacing="1" align="center" class="tableborder1" style="width:600">
<tr>
<th valign="middle" colspan="2" align="center" height="25">You are not logged in. Enter your user name and password to log in and continue.</th></tr>
<tr>
<td valign="middle" class="tablebody1">请输入您的用户名</td>
<td valign="middle" class="tablebody1"><INPUT name="username" type="text"/>   <a href="reg.asp">没有注册?</a></td></tr>
<tr>
<td valign="middle" class="tablebody1">请输入您的密码</td>
<td valign="middle" class="tablebody1"><INPUT name="password" type="password"/>   <a href="lostpass.asp"> 忘记密码?</a></td></tr>
<tr>
<td valign="middle" class="tablebody1">请输入验证码</td>
<td valign="middle" class="tablebody1"><!--验证码表单-->
<input type="text" name="codestr"  size="4" /> <img src="DV_getcode.asp" alt= "验证码,看不清楚?请点击刷新验证码" style="cursor : pointer;height : 20px;" /> </td></tr>
<tr>
<td class="tablebody1" valign="top" width="30%"><b>Cookie 选项</b><BR/>请选择你的 Cookie 保存时间,<br/>下次访问可以方便输入。</td>
<td valign="middle" class="tablebody1"><input type="radio" name="CookieDate" value="0" checked="checked"/>不保存,关闭浏览器就失效<br/>
<input type="radio" name="CookieDate" value="1"/>保存一天<br/>
<input type="radio" name="CookieDate" value="2"/>保存一月<br/>
<input type="radio" name="CookieDate" value="3"/>保存一年<br/></td></tr>
<tr>
<td valign="top" width="30%" class="tablebody1"><b>隐身登录</b><br/> 您可以选择隐身登录,论坛会员将在用户列表看不到您的信息。</td>
<td valign="middle" class="tablebody1"><input type="radio" name="userhidden" value="2" checked="checked"/>正常登录<br>
<input type="radio" name="userhidden" value="1"/>隐身登录<br/>
</td></tr>
<tr>
<td class="tablebody2" valign="middle" colspan="2" align="center"><input type="submit" name="submit" value="登 录"/></td></tr></table>
</form></body></html>
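
The response above is a Dvbbs error page followed by a login form, which suggests (an inference, not confirmed anywhere in this thread) that the publish request reached the forum without a logged-in session that is allowed to post. A rough Python sketch for spotting this kind of response automatically is shown below. The markers it looks for, `OtherErr` and the `login.asp?action=chk` form action, are taken from this one page only, and the function name `summarize_publish_response` is hypothetical.

```python
import re

# Captures the text that follows each <li> on the same line; the error page
# above puts one message after each <li>.
_LI_ITEM = re.compile(r"<li>\s*([^<\r\n]+)", re.IGNORECASE)


def summarize_publish_response(html: str) -> dict:
    """Report whether a response looks like a Dvbbs error page or login prompt."""
    return {
        # Assumption: error pages carry "OtherErr" as in the page title above.
        "is_error_page": "OtherErr" in html,
        # Assumption: the login form posts to login.asp?action=chk as above.
        "needs_login": 'action="login.asp?action=chk"' in html,
        "messages": [m.strip() for m in _LI_ITEM.findall(html)],
    }


if __name__ == "__main__":
    sample = (
        "<title>OtherErr-demo</title>\n"
        "<li>no permission to post\n</td>\n"
        '<form action="login.asp?action=chk" method="post">'
    )
    print(summarize_publish_response(sample))
```

If `needs_login` comes back true, the first things to check are the login step of the publishing module and the posting permissions of the account it uses.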
Posted on 2007-1-25 16:04:46
Posting works even without the br replacement. The second tip is good, though. Learned something.