在python中使用aiohttp获取多个url

在python中使用aiohttp获取多个url

问题描述:

在之前的问题中,用户建议使用以下方法使用 aiohttp 获取多个 url(API 调用):

In a previous question, a user suggested the following approach for fetching multiple urls (API calls) with aiohttp:

import asyncio
import aiohttp


url_list = ['https://api.pushshift.io/reddit/search/comment/?q=Nestle&size=30&after=1530396000&before=1530436000', 'https://api.pushshift.io/reddit/search/comment/?q=Nestle&size=30&after=1530436000&before=1530476000']

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.json()['data']


async def fetch_all(session, urls, loop):
    results = await asyncio.gather(*[loop.create_task(fetch(session, url)) for url in urls], return_exceptions= True)
    return results

if __name__=='__main__':
    loop = asyncio.get_event_loop()
    urls = url_list
    with aiohttp.ClientSession(loop=loop) as session:
        htmls = loop.run_until_complete(fetch_all(session, urls, loop))
    print(htmls)

然而,这只会导致返回属性错误:

However, this results in only returning Attribute errors:

[AttributeError('__aexit__',), AttributeError('__aexit__',)]

(我启用了它,否则它就会中断).我真的希望这里有人可以帮助解决这个问题,仍然很难找到 asyncio 等的资源.返回的数据是 json 格式.最后我想把所有的 json dicts 放在一个列表中.

(which I enabled, otherwhise it would just break). I really hope there is somebody here, who can help with this, it is still kind of hard to find resources for asyncio etc. The returned data is in json format. In the end I would like to put all json dicts in a list.

工作示例:

import asyncio
import aiohttp
import ssl

url_list = ['https://api.pushshift.io/reddit/search/comment/?q=Nestle&size=30&after=1530396000&before=1530436000',
            'https://api.pushshift.io/reddit/search/comment/?q=Nestle&size=30&after=1530436000&before=1530476000']


async def fetch(session, url):
    async with session.get(url, ssl=ssl.SSLContext()) as response:
        return await response.json()


async def fetch_all(urls, loop):
    async with aiohttp.ClientSession(loop=loop) as session:
        results = await asyncio.gather(*[fetch(session, url) for url in urls], return_exceptions=True)
        return results


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    urls = url_list
    htmls = loop.run_until_complete(fetch_all(urls, loop))
    print(htmls)