火车采集器 official software discussion forum


Fixing full-text search errors caused by Japanese characters in scraped content

Posted on 2007-4-13 16:37:08
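[The following applies only to ASP + Access sites]
Today I found that when scraped data contains Japanese characters, the site's full-text search stops working (it reports a memory overflow). Searching online turned up the cause: "26 Japanese characters are to blame", which is a bug in the Access database. Following the guidance I found online, I have solved the problem and am sharing the fix here.
1. Create content replacement rules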
ゴ,ガ,ギ,グ,ゲ,ザ,ジ,ズ,ヅ,デ,ド,ポ,ベ,プ,ビ,パ,ヴ,ボ,ペ,ブ,ピ,バ,ヂ,ダ,ゾ,ゼ

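Replace the characters above, in order, with Jn0; ... Jn25; (the "Jn" prefix can be anything you choose). For example, ゴ becomes Jn0; and ゼ becomes Jn25;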

2. In the data browsing page, add <!--#include file="jp_code.asp"-->
jp_code.asp is as follows:
<%
' Encode: replace each of the 26 problematic katakana with its placeholder
Function Jencode(byVal iStr)
    if isnull(iStr) or isEmpty(iStr) then
        Jencode=""
        Exit function
    end if
    dim F,i,E
    ' Literal form of F, kept for reference:
    ' F=array("ゴ","ガ","ギ","グ","ゲ","ザ","ジ","ズ","ヅ","デ",_
    ' "ド","ポ","ベ","プ","ビ","パ","ヴ","ボ","ペ","ブ","ピ","バ",_
    ' "ヂ","ダ","ゾ","ゼ")
    E=array("Jn0;","Jn1;","Jn2;","Jn3;","Jn4;","Jn5;","Jn6;","Jn7;","Jn8;","Jn9;","Jn10;","Jn11;","Jn12;","Jn13;","Jn14;","Jn15;","Jn16;","Jn17;","Jn18;","Jn19;","Jn20;","Jn21;","Jn22;","Jn23;","Jn24;","Jn25;")
    ' The same 26 characters given as GBK (code page 936) character codes
    F=array(chr(-23116),chr(-23124),chr(-23122),chr(-23120),_
    chr(-23118),chr(-23114),chr(-23112),chr(-23110),_
    chr(-23099),chr(-23097),chr(-23095),chr(-23075),_
    chr(-23079),chr(-23081),chr(-23085),chr(-23087),_
    chr(-23052),chr(-23076),chr(-23078),chr(-23082),_
    chr(-23084),chr(-23088),chr(-23102),chr(-23104),_
    chr(-23106),chr(-23108))
    Jencode=iStr
    for i=0 to 25
        Jencode=replace(Jencode,F(i),E(i))
    next
End Function

' Decode: turn each placeholder back into the original katakana
Function Juncode(byVal iStr)
    if isnull(iStr) or isEmpty(iStr) then
        Juncode=""
        Exit function
    end if
    dim F,i,E
    ' Literal form of F, kept for reference:
    ' F=array("ゴ","ガ","ギ","グ","ゲ","ザ","ジ","ズ","ヅ","デ",_
    ' "ド","ポ","ベ","プ","ビ","パ","ヴ","ボ","ペ","ブ","ピ","バ",_
    ' "ヂ","ダ","ゾ","ゼ")
    E=array("Jn0;","Jn1;","Jn2;","Jn3;","Jn4;","Jn5;","Jn6;","Jn7;","Jn8;","Jn9;","Jn10;","Jn11;","Jn12;","Jn13;","Jn14;","Jn15;","Jn16;","Jn17;","Jn18;","Jn19;","Jn20;","Jn21;","Jn22;","Jn23;","Jn24;","Jn25;")
    ' The same 26 characters given as GBK (code page 936) character codes
    F=array(chr(-23116),chr(-23124),chr(-23122),chr(-23120),_
    chr(-23118),chr(-23114),chr(-23112),chr(-23110),_
    chr(-23099),chr(-23097),chr(-23095),chr(-23075),_
    chr(-23079),chr(-23081),chr(-23085),chr(-23087),_
    chr(-23052),chr(-23076),chr(-23078),chr(-23082),_
    chr(-23084),chr(-23088),chr(-23102),chr(-23104),_
    chr(-23106),chr(-23108))
    Juncode=iStr
    for i=0 to 25
        Juncode=replace(Juncode,E(i),F(i))
    next
End Function
%>
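To check that the two functions really are inverses of each other, a small test page along these lines might help. This is only a sketch: the page itself, the variable names, and the sample string are made up for illustration, and it assumes the page runs under code page 936 (GBK), which is what the chr() values in jp_code.asp imply.

```vbscript
<!--#include file="jp_code.asp"-->
<%
' Hypothetical check page; not part of the original post.
' Assumption: the page runs under code page 936 (GBK), as the chr() values in jp_code.asp imply.
Dim sample, encoded, decoded

' chr(-23116) is F(0) (ゴ) and chr(-23108) is F(25) (ゼ) in jp_code.asp
sample  = "abc" & Chr(-23116) & Chr(-23108) & "xyz"
encoded = Jencode(sample)   ' under the assumption above: abcJn0;Jn25;xyz
decoded = Juncode(encoded)  ' placeholders turned back into katakana

Response.Write Server.HTMLEncode(encoded) & "<br>"
If decoded = sample Then
    Response.Write "round trip OK"
Else
    Response.Write "round trip FAILED"
End If
%>
```

One limitation follows directly from how Juncode works: any stored text that already contains a literal string such as Jn0; will also be turned into a katakana character on decoding, so it is worth choosing a prefix that is unlikely to occur in real content.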



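3. Encode and decode wherever needed.
For example: <%=Juncode(rs("username"))%> <---- decodes the field value read from the database back to the original characters

Search example (table_name and field_name stand for your own table and field):
sql="select * from table_name where (field_name like '%"&Jencode(kword)&"%') order by id desc"   <-- encodes the keyword the user entered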
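Putting the search and display steps together, a page might look roughly like the sketch below. It is a variant of the search example, not the original poster's code: it binds the encoded keyword as an ADO parameter instead of concatenating it into the SQL string, so a quote character in the keyword cannot break the query. It assumes conn is an already-open ADODB.Connection, and table_name, field_name, and the kword request field are placeholders.

```vbscript
<!--#include file="jp_code.asp"-->
<%
' Sketch only. Assumptions: conn is an already-open ADODB.Connection;
' table_name, field_name and the "kword" request field are placeholders.
Const adVarWChar = 202   ' ADO DataTypeEnum value
Const adParamInput = 1   ' ADO ParameterDirectionEnum value

Dim kword, cmd, rs
kword = Trim(Request("kword"))

Set cmd = Server.CreateObject("ADODB.Command")
Set cmd.ActiveConnection = conn
cmd.CommandText = "select * from table_name where (field_name like ?) order by id desc"

' Encode the keyword the same way the stored data was encoded, then bind it
cmd.Parameters.Append cmd.CreateParameter("kw", adVarWChar, adParamInput, 255, _
    "%" & Jencode(kword) & "%")

Set rs = cmd.Execute
Do Until rs.EOF
    ' Decode on output so visitors see the original katakana
    Response.Write Server.HTMLEncode(Juncode(rs("field_name").Value)) & "<br>"
    rs.MoveNext
Loop

rs.Close
Set rs = Nothing
Set cmd = Nothing
%>
```

The point to keep from the original example is the symmetry: the keyword goes through Jencode before it reaches the database, and every field value goes through Juncode before it reaches the page.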



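The code above is not my own work; it is just a solution I found and summarized after running into scraped content containing Japanese characters. I hope it is useful to everyone. Thanks!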

[ Last edited by teamoustar on 2007-4-13 16:46 ]


Original poster | Posted on 2007-4-13 20:26:59
Thanks, boss~
Posted on 2007-4-13 23:28:13
Yes, using Access has many advantages, but I had never noticed this problem before. Thanks, I'm saving this for later.