
Downloading Files

In this tutorial, we will learn how to download files using the Python Requests library. We will cover several types of files, such as images, PDFs, and large archives, as well as basic error handling.

What is Python Requests?

Python Requests is a popular and user-friendly library used for making HTTP requests. It abstracts the complexities of making requests behind a beautiful, simple API, allowing you to send HTTP/1.1 requests.

Installation

Before we begin, ensure that the Requests library is installed on your system. If not, you can install it using pip:

pip install requests
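
To confirm the installation worked, a quick check like the one below may help. It only imports the library and prints its version string; the exact value printed depends on your environment.

```python
import requests

# Print the installed version of Requests to confirm the import works
print(requests.__version__)
```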

Downloading Files

Let's start with a basic example of downloading an image from the web.

import requests

url = 'https://example.com/image.jpg'
response = requests.get(url)

with open('image.jpg', 'wb') as f:
    f.write(response.content)

In the example above, we first import the requests module, then define the URL of the file we want to download. We use the requests.get() method to send a GET request to the specified URL. The response from the server is stored in the response object.

Then we open a file in write-binary mode (wb) and write the content of the response to this file. The response.content attribute holds the raw bytes of the downloaded file, all of which are loaded into memory at once.
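
The same steps can be wrapped in a small reusable function. The sketch below is one possible arrangement, not part of the Requests API: the helper names filename_from_url and download_file, the 'download.bin' fallback name, and the 10-second timeout are all assumptions chosen for illustration.

```python
import posixpath
from urllib.parse import urlparse

import requests


def filename_from_url(url, default='download.bin'):
    """Return the last path segment of the URL, or a fallback name."""
    # Hypothetical helper: 'https://example.com/image.jpg' -> 'image.jpg'
    name = posixpath.basename(urlparse(url).path)
    return name or default


def download_file(url, filename=None, timeout=10):
    """Download url and save it under filename (hypothetical helper)."""
    filename = filename or filename_from_url(url)
    response = requests.get(url, timeout=timeout)
    # Write the whole response body to disk in binary mode
    with open(filename, 'wb') as f:
        f.write(response.content)
    return filename
```

Calling download_file('https://example.com/image.jpg') would then save the response body as image.jpg in the current directory.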

Downloading Large Files

The method above works fine for small files. However, if we need to download large files, we should use a different approach to save memory. We can download these large files in chunks.

import requests

url = 'https://example.com/largefile.zip'
response = requests.get(url, stream=True)

with open('largefile.zip', 'wb') as f:
    for chunk in response.iter_content(chunk_size=1024):
        if chunk:
            f.write(chunk)

In this example, we add an extra parameter stream=True to the get() method. This ensures that only the response headers are downloaded and the connection remains open. The content is not downloaded until you access the Response.content attribute or iterate over Response.iter_content() or Response.iter_lines().

The iter_content() method iterates over the content of the response in chunks (of size 1024 bytes in this case). Each chunk is written to the file as it arrives, so only one chunk needs to be held in memory at a time. The if chunk check skips any empty chunks.
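
A slightly more defensive variant of the chunked download is sketched below. It uses the response as a context manager so the connection is released when the block exits, and it stops early if the server returns an error status. The helper names save_chunks and download_large_file, the 8192-byte chunk size, and the 30-second timeout are assumptions chosen for illustration.

```python
import requests


def save_chunks(chunks, path):
    """Write an iterable of byte chunks to path; return total bytes written."""
    total = 0
    with open(path, 'wb') as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                total += len(chunk)
    return total


def download_large_file(url, path, chunk_size=8192, timeout=30):
    """Stream url to path in chunks (hypothetical helper)."""
    # The with block closes the response, releasing the connection
    with requests.get(url, stream=True, timeout=timeout) as response:
        # Stop on 4xx/5xx responses instead of saving an error page
        response.raise_for_status()
        return save_chunks(response.iter_content(chunk_size=chunk_size), path)
```

Separating save_chunks from the network call keeps the file-writing logic easy to check on its own with any iterable of bytes.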

Error Handling

When downloading files, it's important to handle errors. The Requests library makes error handling simple.

import requests

url = 'https://example.com/file.pdf'

response = requests.get(url)
response.raise_for_status()  # Raise an HTTPError for 4xx/5xx responses

with open('file.pdf', 'wb') as f:
    f.write(response.content)

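In the above code, raise_for_status() checks whether the request was successful. If the server returned a 4xx (client error) or 5xx (server error) status code, it raises requests.exceptions.HTTPError, so the script stops before anything is written to disk. Otherwise it returns None and the download proceeds.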
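
In practice you may want to catch these exceptions rather than let the script crash. The following is a minimal sketch of one way to do that; the function name safe_download, its True/False return convention, and the 10-second timeout are assumptions for illustration, while the exception classes come from requests.exceptions.

```python
import requests


def safe_download(url, filename, timeout=10):
    """Return True if the file was saved, False if the request failed."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.exceptions.HTTPError as err:
        # The server answered, but with a 4xx or 5xx status code
        print(f'HTTP error: {err}')
        return False
    except requests.exceptions.RequestException as err:
        # Base class for other Requests errors: connection failures,
        # timeouts, malformed URLs, and so on
        print(f'Request failed: {err}')
        return False

    # Only reached when the request succeeded
    with open(filename, 'wb') as f:
        f.write(response.content)
    return True
```

HTTPError is caught first because it is a subclass of RequestException; listing the base class first would hide the more specific handler.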

This is a basic introduction to downloading files using the Python Requests library. This library has many more features and options that you can explore to fit your specific needs. Happy coding!