An Intro to aiohttp
Python 3.5 added new syntax that makes it easier for developers to create asynchronous applications and packages. One such package is aiohttp, which is an HTTP client/server for asyncio. Basically, it allows you to write asynchronous clients and servers. The aiohttp package also supports Server WebSockets and Client WebSockets. You can install aiohttp using pip:
pip install aiohttp
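Since aiohttp is a server as well as a client, it's worth a quick look at the server side before we dive into the client examples. The following is just a minimal sketch of my own (not from the aiohttp documentation), assuming aiohttp 3.x, that serves a single route:
from aiohttp import web

async def handle(request):
    # Respond to GET / with a plain-text greeting
    return web.Response(text='Hello from aiohttp!')

app = web.Application()
app.add_routes([web.get('/', handle)])

if __name__ == '__main__':
    # run_app() starts the server on port 8080 by default
    web.run_app(app)
If you run this and browse to http://localhost:8080/, you should see the greeting.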
Now that we have aiohttp installed, let’s take a look at one of its examples!
Fetching a Web Page
The documentation for aiohttp has a fun example that shows how to grab a web page’s HTML. Let’s take a look at it and see how it works:
import aiohttp
import asyncio
import async_timeout
async def fetch(session, url):
    with async_timeout.timeout(10):
        async with session.get(url) as response:
            return await response.text()


async def main(loop):
    async with aiohttp.ClientSession(loop=loop) as session:
        html = await fetch(session, 'http://www.blog.pythonlibrary.org')
        print(html)


loop = asyncio.get_event_loop()
loop.run_until_complete(main(loop))
Here we just import aiohttp, Python’s asyncio, and async_timeout, which gives us the ability to time out a coroutine. We create our event loop at the bottom of the code and call the main() function. It will create a ClientSession object that we pass to our fetch() function along with the URL to fetch. Finally, in the fetch() function, we set our timeout and attempt to get the URL’s HTML. If everything works without timing out, you will see a bunch of text spewed into stdout.
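One thing worth knowing: if the request takes longer than the ten seconds we allowed, async_timeout cancels it and raises asyncio.TimeoutError. Here is a small sketch (my own addition, not part of the original example) of a main() that reuses the fetch() coroutine above and catches the timeout:
async def main(loop):
    async with aiohttp.ClientSession(loop=loop) as session:
        try:
            html = await fetch(session, 'http://www.blog.pythonlibrary.org')
        except asyncio.TimeoutError:
            # Raised when fetch() exceeds the 10 second limit
            print('The request timed out')
        else:
            print(html)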
Downloading Files With aiohttp
A fairly common task that developers will do is download files using threads or processes. We can download files using coroutines too! Let’s find out how:
import aiohttp
import asyncio
import async_timeout
import os
async def download_coroutine(session, url):
    with async_timeout.timeout(10):
        async with session.get(url) as response:
            filename = os.path.basename(url)
            with open(filename, 'wb') as f_handle:
                while True:
                    chunk = await response.content.read(1024)
                    if not chunk:
                        break
                    f_handle.write(chunk)
            return await response.release()


async def main(loop):
    urls = ["http://www.irs.gov/pub/irs-pdf/f1040.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040a.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040ez.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040es.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040sb.pdf"]

    async with aiohttp.ClientSession(loop=loop) as session:
        for url in urls:
            await download_coroutine(session, url)


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main(loop))
You will notice here that besides aiohttp, we import async_timeout and os. The async_timeout package is actually one of aiohttp’s dependencies and allows us to create a timeout context manager. Let’s start at the bottom of the code and work our way up. In the bottom conditional statement, we start our asynchronous event loop and call our main function. In the main function, we create a ClientSession object that we pass on to our download coroutine function for each of the URLs we want to download. In the download_coroutine, we call our session’s get() method, which gives us a response object. Now we get to the part that is a bit magical. When you use the content attribute of the response object, it returns an instance of aiohttp.StreamReader, which allows us to download the file in chunks of whatever size we’d like. As we read the file, we write it out to local disk. Finally, we call the response’s release() method, which will finish the response processing.
According to aiohttp’s documentation, because the response object was created in a context manager, it technically calls release() implicitly. But in Python, explicit is usually better and there is a note in the documentation that we shouldn’t rely on the connection just going away, so I believe that it’s better to just release it in this case.
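As a side note, the response’s content attribute also has an iter_chunked() method, which lets you write the read loop as an async for instead of a while loop. Here is a sketch of that variant (using the same imports as the example above); it does the same thing, it just reads a bit more cleanly:
async def download_coroutine(session, url):
    with async_timeout.timeout(10):
        async with session.get(url) as response:
            filename = os.path.basename(url)
            with open(filename, 'wb') as f_handle:
                # iter_chunked() yields the body in pieces of the given size
                async for chunk in response.content.iter_chunked(1024):
                    f_handle.write(chunk)
            return await response.release()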
There is one part that is still blocking here, and that is the portion of the code that actually writes to disk. While we are writing the file, we are still blocking. There is another library called aiofiles that we could use to try to make the file writing asynchronous too. We will take a look at that next.
Note: The section above came from one of my previous articles.
Using aiofiles for Asynchronous Writing
You will need to install aiofiles to make this work. Let’s get that out of the way:
pip install aiofiles
Now that we have all the items we need, we can update our code!
import aiofiles
import aiohttp
import asyncio
import async_timeout
import os
async def download_coroutine(session, url):
    with async_timeout.timeout(10):
        async with session.get(url) as response:
            filename = os.path.basename(url)
            async with aiofiles.open(filename, 'wb') as fd:
                while True:
                    chunk = await response.content.read(1024)
                    if not chunk:
                        break
                    await fd.write(chunk)
            return await response.release()


async def main(loop):
    urls = ["http://www.irs.gov/pub/irs-pdf/f1040.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040a.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040ez.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040es.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040sb.pdf"]

    async with aiohttp.ClientSession(loop=loop) as session:
        for url in urls:
            await download_coroutine(session, url)


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main(loop))
The only change is adding an import for aiofiles and then changing how we open the file. You will note that it is now:
async with aiofiles.open(filename, 'wb') as fd:
And that we use await for the writing portion of the code:
await fd.write(chunk)
Other than that, the code is the same. There are some portability issues mentioned in the aiofiles documentation that you should be aware of.
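One last thought before we wrap up: both versions of main() above await each download one at a time. If you wanted the files to download concurrently, one option (just a sketch, reusing the download_coroutine() above) is to hand all of the coroutines to asyncio.gather():
async def main(loop):
    urls = ["http://www.irs.gov/pub/irs-pdf/f1040.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040a.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040ez.pdf"]

    async with aiohttp.ClientSession(loop=loop) as session:
        # gather() schedules all of the downloads at once and waits
        # for every one of them to finish
        await asyncio.gather(
            *(download_coroutine(session, url) for url in urls))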
Wrapping Up
Now you should have some basic understanding of how to use aiohttp and aiofiles. The documentation for both projects is worth a look as this tutorial really only scratches the surface of what you can do with these libraries.