Fixing 301 and 403 Errors When Downloading Images with Scrapy

2018-04-16 17:16 Source: 清屏网

1. 301 error

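A 301 is a redirect. Scrapy's media pipelines do not follow redirects by default, so add the following line to settings.py (the default value is False):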

MEDIA_ALLOW_REDIRECTS = True
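
For context, here is a minimal sketch of how this setting might sit in settings.py next to Scrapy's built-in image pipeline. The storage directory `images` and the pipeline priority `1` are illustrative assumptions, not values from the original project.

```python
# settings.py (sketch; IMAGES_STORE and the priority value are assumptions)

# Enable Scrapy's built-in image pipeline.
ITEM_PIPELINES = {
    "scrapy.pipelines.images.ImagesPipeline": 1,
}

# Directory where downloaded images are stored (hypothetical path).
IMAGES_STORE = "images"

# Let the media pipelines follow redirects such as 301; the default is False.
MEDIA_ALLOW_REDIRECTS = True
```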

2. 403 error

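A 403 means access is forbidden. In my case the remote site checks the Referer header and returns 403 when it is empty. The fix is to set Referer on the request inside the downloader middleware's process_request method. Replace xxxxx below with the target site's URL.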

import random

# `agents` is assumed to be a list of User-Agent strings defined elsewhere
# in the middleware module.

def process_request(self, request, spider):
        # Called for each request that goes through the downloader
        # middleware.

        # Must either:
        # - return None: continue processing this request
        # - or return a Response object
        # - or return a Request object
        # - or raise IgnoreRequest: process_exception() methods of
        #   installed downloader middleware will be called

        # Pick a random User-Agent for this request.
        agent = random.choice(agents)
        request.headers["User-Agent"] = agent

        #request.meta["proxy"] = proxyServer
        #request.headers["Proxy-Authorization"] = proxyAuth

        # Send a non-empty Referer; replace xxxxx with the target site's URL.
        request.headers['Referer'] = 'xxxxx'

        return None
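
The Referer fix can also be isolated into a small middleware of its own. The following is only a sketch: the class name `ImageRefererMiddleware`, the `TARGET_REFERER` placeholder value and the module path in the trailing comment are hypothetical, and it assumes the site accepts any non-empty Referer pointing at its own domain.

```python
# middlewares.py (sketch; all names here are illustrative assumptions)

# Placeholder value: replace with the target site's URL.
TARGET_REFERER = "https://example.com/"


class ImageRefererMiddleware:
    """Downloader middleware that fills in Referer when the request has none."""

    def process_request(self, request, spider):
        # Keep a Referer that a spider or pipeline has already set.
        if "Referer" not in request.headers:
            request.headers["Referer"] = TARGET_REFERER
        # Returning None tells Scrapy to continue processing the request.
        return None


# To enable it, register the class in settings.py, for example:
# DOWNLOADER_MIDDLEWARES = {
#     "myproject.middlewares.ImageRefererMiddleware": 543,  # hypothetical path
# }
```

Requests issued by the image pipeline pass through the downloader middlewares, so the header set here should reach the image downloads as well as ordinary page requests.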

