从零安装Scrapy心得 | Install Python Scrapy from scratch

1. 介绍

Scrapy,是基于python的网络爬虫框架,它能从网络上爬下来信息,是data获取的一个好方式。于是想安装下看看。

进到它的官网,安装的介绍页面

https://docs.scrapy.org/en/latest/intro/install.html

2. 失败的安装过程

有3种装法,一个是从pip,一个是从源码编译,一个是从conda

根据之前的知识,pip就已经是集成在python中的包管理工具,最简单明了,就忽视了官网介绍界面的一句话

Note that sometimes this may require solving compilation issues for some Scrapy dependencies depending on your operating system

结果在编译阶段报了很多错误,解决一个还有一个。

然后就放弃了,从源码编译,跟pip一样,也是一堆编译错误。

3. conda方式安装

没办法,就去看conda,下载了个miniconda,60多M吧。仔细一研究结果爽死了。

可能python也注意到了它的包下载下来需要编译,编译的话需要依赖自己OS的环境配置,经常出错的这个问题。

miniconda是一个已经安装好了python的集成环境,等于下载安装好了miniconda也就下载好了基本的python核心程序,然后可以通过conda命令来下载conda已经编译好的包来做功能扩展。也就是说,scrapy包以及它依赖的lxml,twisted等编译得我半死的包都是已经编译好的,下载下来直接用就可以了。

conda install -c conda-forge scrapy

https://conda.io/docs/install/quick.html

https://conda.io/miniconda.html

English Version

1. Introduction

Scrapy is a Python-based web crawling framework that can download information from the Internet, which makes it a good way to obtain raw data.

To get to know Scrapy better, I found the installation instructions on the official website below and tried to install the framework.

https://docs.scrapy.org/en/latest/intro/install.html

2. Failed installation attempts

Before installing Scrapy, you need a Python environment on your computer. From Python's point of view, Scrapy is just one of its extension packages.

With Python in place, there are 3 ways to install the Scrapy package: 1 is through pip, 2 is to build it from source code, 3 is through conda.

From what I already knew, pip is the package management tool that comes bundled with Python, so it looked like the simplest and most straightforward way to install. However, I overlooked one important note on the official website:
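A quick way to see which of these routes is even available on a machine is to check what is on the PATH. The snippet below is only a minimal sketch using the standard library; the helper name `available_installers` is my own invention and is not part of pip, conda, or Scrapy.

```python
import shutil
import sys


def available_installers(candidates=("pip", "conda")):
    """Map each installer name to its executable path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in candidates}


if __name__ == "__main__":
    # Show which Python interpreter is running this script.
    print("Python", sys.version.split()[0], "at", sys.executable)
    # Report whether each installer can be found on PATH.
    for name, path in available_installers().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If conda shows up as not found, the conda route needs Miniconda (or a similar distribution) installed first.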

Note that sometimes this may require solving compilation issues for some Scrapy dependencies depending on your operating system

As a result, the dependency installation hit many compilation errors, and every time I solved one, another appeared. So I gave up on pip and tried the second method, building from source, but ended up with the same pile of compilation errors.

3. Install through conda

The last option left was to install Scrapy through conda. I went to the official conda website and downloaded Miniconda as instructed, around 60 megabytes. After installing it and looking into it more closely, I found it really cool, and it makes things simple. Its makers probably noticed that compiling dependencies always drives people crazy, because the build depends on how each OS environment is set up.

Miniconda is an integrated environment that ships with Python already installed, so installing Miniconda also gives you the core Python program. To add features, you use the conda command to download packages that have already been compiled, instead of compiling them locally. That covers Scrapy and its dependencies such as lxml and Twisted, the packages that had nearly killed me when I tried to compile them. Because these packages come prebuilt, the problem described above is avoided.

The command to download (that is, install) the package is:

conda install -c conda-forge scrapy

https://conda.io/docs/install/quick.html

https://conda.io/miniconda.html
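Once the conda install command has finished, one way to confirm that Scrapy and the dependencies mentioned above really landed in the environment is to ask Python for their installed versions. This is a rough sketch that assumes Python 3.8 or newer (for `importlib.metadata`) and that it is run with the Python from the conda environment; the function name `installed_versions` is hypothetical.

```python
from importlib import metadata


def installed_versions(packages=("scrapy", "lxml", "twisted")):
    """Return {package: version string, or None when the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # No installed distribution with this name in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any line prints NOT INSTALLED, the script is probably being run by a different Python than the one Miniconda installed.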