Fetching a web page's TITLE from its URL: some sites can't be fetched, my approach is clumsy, advice wanted


Last edited by u012716911 on 2013-11-04 11:25:29

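I wrote this code the way I thought it should work, and I don't know whether there is a better approach. Any pointers would be appreciated.
Some sites can be fetched, such as Baidu, but others cannot, such as the home page of PCauto (太平洋汽车).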

public function set_title()
{
	// Get the incoming URL
	$url = $_POST['url'];
	// $url = "www.pcauto.com.cn"; cannot be fetched!
	// A series of curl settings
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_HEADER, 0);
	curl_setopt($ch, CURLOPT_ENCODING, 'gzip');
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	$content_source = curl_exec($ch);
	curl_close($ch);

	// Detect the encoding of the fetched content
	$encode = mb_detect_encoding($content_source, array('GB2312','GBK','UTF-8','ASCII'));

	// Convert to UTF-8
	$content_source = iconv($encode, 'utf-8//IGNORE', $content_source);

	// Extract <title>
	if (preg_match("/<title>(.*?)<\/title>/i", $content_source, $title))
	{
		echo $title[1];
	}
	else
	{
		echo 'Failed to fetch the title';
	}
}


-------- Solution --------
The problem is in the regular expression match. Add the s modifier and it will work:
if(preg_match("/<title>(.*?)<\/title>/is",$content_source,$title))

s: if this modifier is set, the dot metacharacter (.) in the pattern matches all characters, including newlines. Without it, newlines are excluded. So a <title> element whose text spans more than one line is only matched when s is set.
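To see the effect of the s modifier in isolation, here is a minimal sketch that runs without any network access. The helper name `extract_title` and the sample HTML string are made up for illustration and are not part of the original post; only the regular expression comes from the answer above.

```php
<?php
// Hypothetical helper (not from the original post): returns the trimmed
// <title> text, or null when no <title> element is found.
function extract_title(string $html): ?string
{
    // "i" ignores case; "s" lets "." match newlines as well.
    if (preg_match('/<title>(.*?)<\/title>/is', $html, $m)) {
        return trim($m[1]);
    }
    return null;
}

// Made-up sample page whose title spans several lines.
$html = "<html><head><title>\n  Example page\n</title></head></html>";

// Without "s" the dot cannot cross the newline, so nothing matches.
var_dump(preg_match('/<title>(.*?)<\/title>/i', $html));  // int(0)

// With "s" the match succeeds; trim() drops the surrounding whitespace.
var_dump(extract_title($html));  // string(12) "Example page"
```

The trim() call is an addition of this sketch: a title that spans several lines usually carries leading and trailing whitespace, which the original code would echo unchanged.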