Everything is accessible on the Web through requests. If you need information from a web page in your Python application, you need a web request. In this article, we’ll dig into Python requests. We’ll look at how a web request is structured and how to make a Python request. By the end, you’ll be able to use the Python requests library, which makes the whole process easier.
Key Takeaways
- HTTP (Hypertext Transfer Protocol) is a client-server protocol used for exchanging data on the web. It uses TCP as a transport protocol for reliable transport. The HTTP request is initiated by the client and processed by the server, which returns an appropriate response. HTTP is stateless, meaning there is no link between two requests served one after the other.
- The Python requests library simplifies the process of making HTTP requests in Python. It abstracts the complexities of making requests, providing an easy-to-use interface. The library allows sending Python HTTP requests from basic to complex ones. It can be installed using pip and used for making GET requests, handling status codes, reading the body of the response and interacting with APIs.
- HTTP headers provide additional information in an HTTP conversation. They can be customized in the Python requests library to provide additional information about the sender or the message. For instance, the User-Agent header gives information about the client making the request, while the Accept-Language header communicates the languages the client can understand.
An Introduction to HTTP Requests
To exchange data on the Web, we first need a communication protocol. The protocol used when we browse the Web is the Hypertext Transfer Protocol, or HTTP. HTTP traditionally uses TCP as its transport protocol, because it needs reliable delivery and TCP guarantees that.
Let’s say there’s a resource we need, such as an HTML page, on a web server located somewhere in the world. We want to access this resource or, in other words, we want to look at that page in our web browser. The first thing we have to do is make an HTTP request. HTTP is a client–server protocol, which means that the requests are initiated by the client.
After the server receives the requests, it processes them and returns an appropriate response.
The server might reply in different ways. It might send the resource we requested, or reply with an error status code if something doesn’t go as expected.
In every communication protocol, the information needs to be in specific fields. That’s because both the client and the server should know how to interpret the request or response. In the next sections, we’ll look at how an HTTP request and an HTTP response are built. We’ll also discuss the role of the most important fields.
The HTTP request
One of the most important design features of HTTP is that it’s human readable. This means that, when we look at an HTTP request, we can easily read everything, even if there’s a lot of complexity under the hood. Another feature of HTTP is that it is stateless. This means that there’s no link between two requests served one after the other. The HTTP protocol doesn’t remember anything of the previous request. This implies that each request must contain everything that the server needs to carry out the request.
A valid HTTP request must contain the following elements:
- an HTTP method, such as `GET` or `POST`
- the version of the HTTP protocol
- the path of the resource to fetch
Then, we can also add some optional headers that specify additional information about the sender or the message. Two common HTTP request headers are `User-Agent`, which identifies the client software, and `Accept-Language`, which states the natural language the client prefers. Both of those optional headers give information about the client that’s making the request.
This is an example of an HTTP message, and we can clearly understand all the fields specified:
~~~http
GET / HTTP/1.1
Host: www.google.com
Accept-Language: en-GB,en;q=0.5
~~~
The first line specifies the request method (`GET`), the path of the resource (`/`) and the version of the HTTP protocol. Then we specify the `Host` and the language accepted by the client that’s sending the request. Usually, the messages are much longer, but this gives a hint of what they look like.
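To make the structure concrete, here is a minimal sketch that splits the example message above into its parts using plain string handling. The helper name `parse_request` is made up for illustration, and a real server would use a proper HTTP parser rather than this simplified approach.

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP/1.1 request head into its request line and headers."""
    lines = raw.strip().splitlines()
    # The request line holds the method, the path and the protocol version
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line would end the header section
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {"method": method, "path": path, "version": version, "headers": headers}


# The same message as in the example above, with explicit line endings
RAW_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "Host: www.google.com\r\n"
    "Accept-Language: en-GB,en;q=0.5\r\n"
    "\r\n"
)

parsed = parse_request(RAW_REQUEST)
print(parsed["method"], parsed["path"], parsed["version"])
print(parsed["headers"])
```

Because the format is plain text, a few string operations are enough to recover every field.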
The HTTP response
Now that we have an idea of what an HTTP request looks like, we can go on and see the HTTP response.
An HTTP response usually contains the following elements:
- the version of the HTTP protocol
- a status code, with a short descriptive message
- a list of HTTP headers
- a message body containing the requested resource
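As a rough illustration of these four elements, the sketch below takes a small hand-written response and separates it into its parts. The sample response and the helper name `parse_response` are assumptions made for this example, not output captured from a real server.

```python
def parse_response(raw: bytes) -> dict:
    """Split a raw HTTP/1.1 response into status line, headers and body."""
    # A blank line separates the head (status line + headers) from the body
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {
        "version": version,
        "status": int(status),
        "reason": reason,
        "headers": headers,
        "body": body,
    }


# Hypothetical response, written by hand for illustration
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Content-Length: 15\r\n"
    b"\r\n"
    b"<!doctype html>"
)

response = parse_response(RAW_RESPONSE)
print(response["version"], response["status"], response["reason"])
print(response["headers"])
print(response["body"])
```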
Now that we’ve introduced the basic elements you need, it’s worth making a summary before taking the next step. It should be clear by now that, whenever a client wants to communicate with an HTTP server, it must create and send an HTTP request. Then, when the server receives it, it creates and sends an HTTP response.
We’re finally ready to introduce the Python requests library.
The Python requests Library
The Python requests library allows you to send HTTP requests, from basic to complicated ones. It hides much of the complexity of making requests behind an easy-to-use interface. In the next sections, we’ll see how to create simple requests and interpret the response. We’ll also see some of the features provided by the library.
Installing Python requests
First, we need to install the Python requests library. Let’s install it using `pip`:
~~~bash
$ pip install requests
~~~
Once the Python requests library is installed correctly, we can start using it.
Our first GET request with Python requests
The first thing we have to do is to create a Python file. In this example, we call it `web.py`. Inside this source file, insert this code:
~~~python
import requests

URL = "https://www.google.com"

# Send a GET request and print the Response object
resp = requests.get(URL)
print(resp)
~~~
This program makes a GET request to Google. If we run this program, we’ll probably get this output:
~~~bash
$ python web.py
<Response [200]>
~~~
So, what does this mean?
We talked about the status code earlier. This output is telling us that our request has been received, understood and processed successfully. There are other codes as well, and we can list a few of the most common:
- `301 Moved Permanently`. This is a redirection message. The resource we were looking for has been moved to a new URL, which comes with the response.
- `401 Unauthorized`. This indicates a client error response. In this case, the server is telling us that we must authenticate before proceeding with the request.
- `404 Not Found`. This indicates a client error response too. In particular, this means that the server can’t find the resource we were looking for.
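The first digit of a status code tells us which family it belongs to. The sketch below uses the standard library's `http.HTTPStatus` to look up the short message for each code mentioned above and to name its family. The helper name `describe` and the wording of the family labels are choices made for this example.

```python
from http import HTTPStatus

# Status code families, keyed by the first digit of the code
FAMILIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code, its standard short message and its family."""
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({FAMILIES[code // 100]})"


for code in (200, 301, 401, 404):
    print(describe(code))
```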
What if we want to conditionally check the status, and provide different actions based on the status code? Well, we can easily do this:
~~~python
import requests

URL = "https://www.google.com/blah"

# requests follows redirects by default, so a 301 only shows up here
# if the call is made with allow_redirects=False
resp = requests.get(URL)

if resp.status_code == 200:
    print("Okay, all good!")
elif resp.status_code == 301:
    print("Oops, the resource has been moved!")
elif resp.status_code == 404:
    print("Oh no, the resource wasn't found!")
else:
    print(resp.status_code)
~~~
If we run the script now, we’ll get something different. Have a try and see what we get. 😉
If we also want the short descriptive message that comes with each status code, we can use `resp.reason`. In the case of a 200 status code, we’ll simply get `OK`.
Inspecting the response of the Python request
At this point, we know how to make a basic Python request. After the request, we want the response, right?
In the previous section, we saw how to get the status code of the response. Now, we want to read the body of the response, which is the actual resource we requested. To do this, we need to use `resp.content`. Let’s say that we’re looking for the Google home page and print `resp.content` in our script.
This is what we get when we run the script:
b'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content="text/html; [...]
I’ve added `[...]` above because the resource we get, which is a `text/html` document, is too long to be printed. By how much? We can use `len(resp.content)` to get this information. In the case above, it was 13931 bytes, definitely too much to be printed here!
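The `b'...'` prefix in the output shows that `resp.content` is a bytes object, so `len()` counts bytes rather than characters. The requests library also exposes the decoded body as `resp.text`. The sketch below uses a small hand-written byte string in place of a real response body (an assumption, so that the example runs offline) to show the difference between the two.

```python
# Stand-in for resp.content: a short UTF-8 encoded HTML fragment (hypothetical)
body = "<!doctype html><title>Café</title>".encode("utf-8")

# Stand-in for resp.text: the same body decoded into a string
text = body.decode("utf-8")

print(body[:15])   # slicing bytes works just like slicing a string
print(len(body))   # number of bytes
print(len(text))   # number of characters ("é" takes two bytes in UTF-8)
```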
Making use of APIs
One of the reasons why the Python requests library became so popular is that it makes interacting with APIs very easy. For this example, we’ll use a simple API for predicting a person’s age, given their name. This API is called Agify.
This is the code for the example:
~~~python
import requests

URL = "https://api.agify.io/?name=Marcus"

resp = requests.get(URL)

if resp.status_code == 200:
    # Parse the JSON body into a Python dictionary
    data = resp.json()
    print(data['age'])
else:
    print(resp.status_code)
~~~
In this case, we want to know the age of a person whose name is Marcus. Once we have the response, if the status code is 200, we parse the JSON body using `resp.json()`. At this point, we have a Python dictionary, and we can print the estimated age.
The estimated age of Marcus is 41 years old.
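The parsing step can be tried without calling the API. The sketch below feeds a hand-written JSON string through the standard `json` module, which does roughly what `resp.json()` does with the response body. The sample string is only assumed to resemble the API's reply: apart from `age`, the field names and all the values are illustrative, and the helper name `extract_age` is made up for this example.

```python
import json

# Hypothetical response body, written by hand for illustration
SAMPLE_BODY = '{"count": 1000, "name": "Marcus", "age": 41}'


def extract_age(body: str):
    """Parse a JSON body and return the 'age' field, or None if it is absent."""
    payload = json.loads(body)
    # .get() avoids a KeyError when the field is missing
    return payload.get("age")


print(extract_age(SAMPLE_BODY))
print(extract_age('{"name": "Unknown"}'))
```

Using `.get()` instead of indexing means a missing field yields `None` rather than an exception, which is often easier to handle when the shape of an API response is not guaranteed.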
Customizing the headers
HTTP headers provide additional information to both parties of an HTTP conversation. In the following example, we’ll see how we can change the headers of an HTTP GET request. In particular, we’ll change the `User-Agent` and the `Accept-Language` headers. The `User-Agent` tells the server some information about the application, the operating system and the vendor of the requesting agent. The `Accept-Language` header communicates which languages the client is able to understand.
This is our simple code snippet:
~~~python
import requests

URL = "https://www.google.com"
custom_headers = {
    'Accept-Language': 'fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5',
    'User-Agent': 'Mozilla/5.0 (Linux; Android 12; SM-S906N Build/QP1A.190711.020; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/80.0.3987.119 Mobile Safari/537.36',
}
resp = requests.get(URL, headers=custom_headers)

if resp.status_code == 200:
    # handle the response
    print(resp.content[:100])
else:
    print(resp.status_code)
~~~
If everything goes right, you should get something like this:
b'<!doctype html><html lang="fr"><head><meta charset="UTF-8"><meta content="width=device-width,mini [...]
In this example, we’ve changed the `User-Agent`, pretending that our request comes from a Chrome-based mobile browser (the leading `Mozilla/5.0` token is a historical convention that most browsers send). We’re also saying that our operating system is Android 12 and that our device is a Samsung Galaxy S22 series phone.
Since we’ve printed the first 100 bytes of the response above, we can see that the HTML page we’ve received is in French.
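The `q` values in the `Accept-Language` header are weights: a language without a `q` has weight 1, and higher weights are preferred. As a rough illustration of how a server might read the header, the sketch below sorts the languages by weight. The function name `parse_accept_language` is made up for this example, and real servers handle more edge cases than this.

```python
def parse_accept_language(value: str) -> list:
    """Return (language, weight) pairs, most preferred first."""
    preferences = []
    for item in value.split(","):
        tag, _, params = item.strip().partition(";")
        params = params.strip()
        # A missing q parameter means the default weight of 1.0
        quality = float(params[2:]) if params.startswith("q=") else 1.0
        preferences.append((tag.strip(), quality))
    # sorted() is stable, so equal weights keep their original order
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)


header = "fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5"
for language, weight in parse_accept_language(header):
    print(language, weight)
```

With the header used above, Swiss French comes first, which is consistent with the French page we received.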
Conclusion
In this article, we talked about the HTTP protocol, with a brief theoretical introduction. Then we looked at the Python requests library. We saw how to write basic Python HTTP requests and how to customize them according to our needs.
I hope you’ll find this library and this article useful for your projects.
Frequently Asked Questions about HTTP Requests in Python
What is the `requests` library in Python?
The `requests` library is a popular third-party Python library for making HTTP requests. It provides a simple and Pythonic way to interact with web services and APIs by abstracting the complexities of making HTTP requests.
How do I install the `requests` library?
You can install the `requests` library using the following command: `pip install requests`
How do I make a GET request using `requests`?
Use the `get` method from the `requests` module to make a GET request. You can include query parameters in the request by passing a `params` dictionary to the `get` method.
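A `params` dictionary ends up as the query string of the URL. The standard-library sketch below shows that encoding step on its own, as an approximation of what happens when a dictionary is passed as `params` to `requests.get`. The variable names are illustrative.

```python
from urllib.parse import urlencode

base_url = "https://api.agify.io/"
params = {"name": "Marcus"}

# Encode the dictionary as key=value pairs and append it to the base URL
url = f"{base_url}?{urlencode(params)}"
print(url)  # https://api.agify.io/?name=Marcus
```

The result is the same URL that was written out by hand in the Agify example. Passing a dictionary is less error-prone, because special characters are escaped automatically.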
How do I make a POST request using `requests`?
Use the `post` method to make a POST request. You can include data in the request body using the `data` parameter.
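When `data` is a dictionary, it is sent as a form-encoded request body rather than in the URL. The sketch below shows roughly what such a body looks like, using only the standard library. The form fields are hypothetical and chosen to show how spaces and non-ASCII characters are escaped.

```python
from urllib.parse import urlencode

# Hypothetical form fields, for illustration only
form = {"name": "Marcus", "city": "São Paulo"}

# Form encoding: key=value pairs joined by "&", with special characters escaped
body = urlencode(form)
print(body)  # name=Marcus&city=S%C3%A3o+Paulo
```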
I am a computer science student fond of asking questions and learning new things. Traveller, musician and occasional writer. Check me out on my website.