zyicdijx posted on 2009-12-4 13:14:31

Asking the experts: how do I scrape this code?

This post was last edited by zyicdijx on 2009-12-4 13:16

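This is the markup from the page I am scraping. I set the start string to <P align=center> and the end string to </P>, and enabled both "loop-match this tag" and "match this tag across pagination".
But part of the content is not captured, and I cannot think of any other approach. Could someone experienced take a look? The code is: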
<P align=center>
<TABLE style="WIDTH: 344pt; BORDER-COLLAPSE: collapse" border=0 cellSpacing=0 cellPadding=0 width=459>
<COLGROUP>
<COL style="WIDTH: 72pt; mso-width-source: userset; mso-width-alt: 3072" width=96>
<COL style="WIDTH: 77pt; mso-width-source: userset; mso-width-alt: 3296" width=103>
<COL style="WIDTH: 72pt; mso-width-source: userset; mso-width-alt: 3072" width=96>
<COL style="WIDTH: 69pt; mso-width-source: userset; mso-width-alt: 2944" width=92>
<COL style="WIDTH: 54pt" width=72>
<TBODY>
<TR style="HEIGHT: 13.5pt" height=18>
<TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 72pt; HEIGHT: 13.5pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" height=18 width=96><FONT face=宋体>上装</FONT></TD>
<TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 77pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" width=103><FONT face=宋体></FONT></TD>
<TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 72pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" width=96><FONT face=宋体></FONT></TD>
<TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 69pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" width=92><FONT face=宋体></FONT></TD>
<TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: #f0f0f0; BACKGROUND-COLOR: transparent; WIDTH: 54pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: #f0f0f0" width=72><FONT face=宋体>单位:cm</FONT></TD></TR>
<TR style="HEIGHT: 13.5pt" height=18>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 13.5pt; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65 height=18>
<P align=center><FONT face=宋体>尺寸</FONT></P></TD>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65>
<P align=center><FONT face=宋体>肩宽</FONT></P></TD>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65>
<P align=center><FONT face=宋体>胸围</FONT></P></TD>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65>
<P align=center><FONT face=宋体>袖长</FONT></P></TD>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65>
<P align=center><FONT face=宋体>总长</FONT></P></TD></TR>
<TR style="HEIGHT: 13.5pt" height=18>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 13.5pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65 height=18>
<P align=center><FONT face=宋体>FREE SIZE</FONT></P></TD>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65>
<P align=center><FONT face=宋体>33.5 </FONT></P></TD>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65>
<P align=center><FONT face=宋体>35 </FONT></P></TD>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65>
<P align=center><FONT face=宋体>58 </FONT></P></TD>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65>
<P align=center><FONT face=宋体>62 </FONT></P></TD></TR>
<TR style="HEIGHT: 13.5pt" height=18>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 13.5pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65 height=18>
<P align=center><FONT face=宋体>颜色</FONT></P></TD>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65 colSpan=4>
<P align=center><FONT face=宋体>米白色아이보리,薄荷色민트 </FONT></P></TD></TR>
<TR style="HEIGHT: 13.5pt" height=18>
<TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 13.5pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl66 height=18>
<P align=center><FONT face=宋体>材质</FONT></P></TD>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65 colSpan=4>
<P align=center><FONT face=宋体>针织 </FONT></P></TD></TR>
<TR style="HEIGHT: 13.5pt" height=18>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 13.5pt; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65 height=18>
<P align=center><FONT face=宋体>官网编号</FONT></P></TD>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65 colSpan=4>
<P align=center><FONT face=宋体>227 </FONT></P></TD></TR>
<TR style="HEIGHT: 13.5pt" height=18>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext 0.5pt solid; BACKGROUND-COLOR: transparent; HEIGHT: 40.5pt; BORDER-TOP: windowtext; BORDER-RIGHT: windowtext 0.5pt solid" class=xl65 height=54 rowSpan=3>
<P align=center><FONT face=宋体>注意事项</FONT></P></TD>
<TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; BORDER-TOP: windowtext 0.5pt solid; BORDER-RIGHT: black 0.5pt solid" class=xl67 colSpan=4><FONT face=宋体>1.由于测量方法的不同,可能存在1-2cm的误差。</FONT></TD></TR>
<TR style="HEIGHT: 13.5pt" height=18>
<TD style="BORDER-BOTTOM: #f0f0f0; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; HEIGHT: 13.5pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: black 0.5pt solid" class=xl70 height=18 colSpan=4><FONT face=宋体>2.请注意不同显示器下显示颜色可能有所差异。</FONT></TD></TR>
<TR style="HEIGHT: 13.5pt" height=18>
<TD style="BORDER-BOTTOM: windowtext 0.5pt solid; BORDER-LEFT: windowtext; BACKGROUND-COLOR: transparent; HEIGHT: 13.5pt; BORDER-TOP: #f0f0f0; BORDER-RIGHT: black 0.5pt solid" class=xl72 height=18 colSpan=4><FONT face=宋体>3.以上是根据韩国官方信息翻译过来的尺寸信息。</FONT></TD></TR></TBODY></TABLE></P>
<P align=center><IMG border=0 src="http://pic.kro-pic.com/2009124_picture_images/2009124103317265_28.jpg" width=720 height=6987></P>

都市乞丐 posted on 2009-12-4 14:59:15

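Loop matching pairs each start string with the nearest end string that follows it. Your table cells contain their own <P align=center>...</P> pairs, so with this setup you will definitely not get everything.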
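To make the nearest-match behaviour concrete, here is a rough Python model of it. This is only a sketch of the behaviour described in the reply, not the scraping tool's actual implementation; the shortened `html` string and the helper name `nearest_match` are made up for illustration.

```python
import re

# Hypothetical, shortened stand-in for the posted markup: an outer
# <P align=center> wraps a table whose cells contain their own <P> tags.
html = (
    '<P align=center><TABLE><TR><TD>'
    '<P align=center>尺寸</P></TD><TD>'
    '<P align=center>肩宽</P></TD></TR></TABLE></P>'
    '<P align=center><IMG src="a.jpg"></P>'
)

START, END = '<P align=center>', '</P>'

def nearest_match(text, start, end):
    """Pair each start string with the nearest end string after it."""
    # .*? is non-greedy, so each match stops at the first end string.
    pattern = re.escape(start) + r'(.*?)' + re.escape(end)
    return re.findall(pattern, text, flags=re.S)

matches = nearest_match(html, START, END)
for m in matches:
    print(repr(m))
```

The first match is cut off at the first inner </P>, and the text between one match's end and the next start string (here the closing </TD></TR></TABLE> markup) lands in no match at all.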

面向大海 posted on 2009-12-4 17:07:41

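It definitely cannot be captured this way: anything that does not fall between a matched start string and end string is not captured.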

zyicdijx posted on 2009-12-4 17:59:08

Then is there any way to scrape it?
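One possible approach outside the scraping tool is to count nested <P> tags, so that each outer block ends at its own closing </P> rather than at the first one. The Python below is a minimal sketch under that assumption. The function name `balanced_blocks`, its parameters, and the shortened sample are hypothetical, and it assumes every <P> in the page is explicitly closed, as in the markup posted above.

```python
import re

def balanced_blocks(text, start='<P align=center>',
                    open_re=r'<P[\s>]', close='</P>'):
    """Return the content of each outermost start...close block,
    tracking nesting depth so inner <P>...</P> pairs stay inside."""
    token = re.compile(f'({open_re})|({re.escape(close)})', re.I)
    blocks = []
    pos = 0
    while True:
        begin = text.find(start, pos)
        if begin == -1:
            break
        body_from = begin + len(start)
        depth = 1          # we are inside one open <P>
        end_at = None
        for m in token.finditer(text, body_from):
            if m.group(1):         # another <P ...> opens
                depth += 1
            else:                  # a </P> closes
                depth -= 1
                if depth == 0:
                    end_at = m.start()
                    pos = m.end()
                    break
        if end_at is None:
            break                  # unbalanced markup: stop rather than guess
        blocks.append(text[body_from:end_at])
    return blocks

# Hypothetical, shortened stand-in for the posted markup.
html = (
    '<P align=center><TABLE><TR><TD>'
    '<P align=center>尺寸</P></TD><TD>'
    '<P align=center>肩宽</P></TD></TR></TABLE></P>'
    '<P align=center><IMG src="a.jpg"></P>'
)

blocks = balanced_blocks(html)
for b in blocks:
    print(repr(b))
```

On this sample the first block is the whole table and the second is the image tag. Inside a tool that only supports start and end strings, a simpler alternative may be to pick delimiters that do not repeat inside the block, for example starting at <TABLE and ending at </TABLE>, and to capture the image with a separate rule.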

zyicdijx posted on 2009-12-4 19:02:25

I will pay whoever can scrape this for me. My QQ is 26315115