Recently, I stumbled upon an exciting API: the Transport for London Unified API.
This API provides insightful data on different aspects of transportation in London.
However, as I began to use it, I discovered that some of the responses came back as bytes, others as strings, and none as the JSON I expected, which made parsing the responses a challenge.
Then I decided to build on top of this API, making it more developer friendly.
In this article, I will talk about how I built an API on top of an existing API and used Linode to deploy it.
Let's get right into it.
Building the API
In this section, I will explain how I built the most important parts of the API. I won't walk through every piece of the project, only a few key parts.
Here are the tools I used for this project:
- Django rest framework
- MongoDB
- Postman
- Linode Linux server
Project Setup
First, I set up my local development environment by creating a virtual environment in my working directory:
mkdir api-dir
cd api-dir
virtualenv env
Next, I activated it and installed the necessary dependencies, like so:
source env/bin/activate
pip install django djangorestframework djangorestframework_simplejwt pymongo
I used pymongo to connect to my MongoDB cluster in Linode.
All I needed to do at this point was create my Django project and app, and I was all set.
django-admin startproject main .
python manage.py startapp api
The first command creates a new Django project called main, and the second creates an app called api.
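With the project created, the new app and Django REST framework need to be registered in settings.py. Here is a minimal sketch of the relevant fragment (the default apps may differ slightly depending on your Django version):

```python
# main/settings.py (fragment)
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "rest_framework",   # Django REST framework
    "api",              # the app created above
]
```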
Database Setup
I created my database cluster using Linode and provisioned it with a single node.
You can create your own database cluster with any database engine. Head on to Linode's cloud manager to get started.
In Linode's cloud management console, there is a side panel on the left that you can hover over to access different resources.
Near the bottom of it, you will see the Databases option. Click on it to get to the page where you can create a database cluster with the engine of your choice; I used MongoDB.
- I gave my database a reasonable name (best practice)
- Selected MongoDB as my database engine
- Chose the nearest availability zone for my database cluster
- Chose the smallest size for my database cluster to reduce billing costs :)
- I put only one database node in the cluster, because that's all I needed
- I added the IP address 0.0.0.0/0, so my database cluster is accessible from anywhere. (Don't do this in production.)
After creating my MongoDB cluster, I needed to connect to it so I could populate it with the necessary information. This is where pymongo comes in: with it, you can connect to your MongoDB instance and perform CRUD operations.
To get started, I needed my connection URL. The connection URL has a format that goes like this:
mongodb://username:password@host/?authMechanism=DEFAULT&tls=true&tlsCAFile=path-to-file
You can get the information needed from your cluster's summary, which appears as soon as you click on your database cluster once it is done provisioning.
I then connected to my mongo-db in the cloud and performed a few operations to make sure there were no errors.
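The connection URL above can be assembled with a small helper. This is an illustrative sketch, not code from the project, and all the argument values below are made up:

```python
import urllib.parse

def build_mongo_url(username, password, host, ca_file):
    # Percent-encode the credentials in case they contain reserved characters
    user = urllib.parse.quote_plus(username)
    pwd = urllib.parse.quote_plus(password)
    return (
        f"mongodb://{user}:{pwd}@{host}/"
        f"?authMechanism=DEFAULT&tls=true&tlsCAFile={ca_file}"
    )

# Hypothetical values for illustration only
url = build_mongo_url("linroot", "s3cret", "lin-db-node.example.com", "/etc/ssl/db-ca.crt")
```

You would then pass the resulting URL to `pymongo.MongoClient` to open the connection.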
Creating API Endpoints
Moving on: with my development environment and database set up, the next thing on my agenda was to create the API endpoints. I will only go over three of them here.
The First Endpoint
The first endpoint I worked on was the AccidentStats endpoint.
This endpoint returns the details for accidents that occurred within a specific year.
I encountered some issues when working on this endpoint. First, the return type was a string, and depending on the year, the response was huge, ranging from 20 MB to 40 MB.
I converted the string response into a list using the .json() method, like so:
import requests

url = "https://api.tfl.gov.uk/AccidentStats/2005"
r = requests.get(url)
main_list = r.json()
I later populated my database with the accident statistics data. I made each year that has accident statistics (not all years do) a collection, and the response data the documents for the respective year, so when you hit an endpoint like api/accident-stat/2005, it returns the documents in the 2005 collection.
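The collection-per-year layout can be sketched in memory like this; the dictionary keys stand in for collections, and the sample documents are made up:

```python
# In-memory sketch of the collection-per-year layout (sample data is invented)
accident_db = {
    "2005": [{"id": 1, "severity": "Slight"}, {"id": 2, "severity": "Serious"}],
    "2006": [{"id": 3, "severity": "Fatal"}],
}

def get_accident_stats(year):
    # Mirrors the endpoint's behaviour: unknown years yield an error message
    if year not in accident_db:
        return {"Message": f"There is no accident stat in the year {year}"}
    return accident_db[year]
```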
Here is the first API endpoint; it handles getting the accident statistics for a given year.
@api_view(['GET'])
def get_accidents_stats(request, year):
    connect_to_mongo()
    if request.method != 'GET':
        return Response({"Error": "Invalid Request Type"})
    if year not in db.list_collection_names():
        return Response({"Message": f"There is no accident stat in the year {year}"}, status=status.HTTP_400_BAD_REQUEST)
    main_cursor = db[f'{year}']
    main_list = []
    for ele in main_cursor.find({}):
        del ele['_id']
        main_list.append(ele)
    return Response(main_list)
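The same `_id`-stripping loop shows up in every endpoint below. If I were refactoring, it could be pulled into a small helper; this is a sketch of that idea, not code from the project:

```python
def strip_object_ids(docs):
    """Return copies of the documents without MongoDB's internal _id field."""
    cleaned = []
    for doc in docs:
        doc = dict(doc)        # copy, so the original document is untouched
        doc.pop('_id', None)   # tolerate documents that lack _id
        cleaned.append(doc)
    return cleaned
```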
The Second Endpoint
The second endpoint I worked on was the BikePoint endpoint.
This endpoint returns all available BikePoints in London.
I sent a request to this endpoint and got a strange return type: bytes. I thought this was going to be a hard one, converting from bytes to a string and then to a list or a dictionary.
It turned out to be a fairly easy task. I looped through the returned bytes, joined them into a string, and then parsed the string with json.loads(string).
Here is the code snippet:
import json
import requests

url = 'https://api.tfl.gov.uk/BikePoint/'
r = requests.get(url)

main_string = ""
for ele in r:  # iterating the response yields byte chunks
    main_string += ele.decode("utf-8")

new_list = json.loads(main_string)
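The decode-then-parse step can be seen in isolation on a small byte string (the sample payload below is made up, but has the same shape as a BikePoint response):

```python
import json

raw_bytes = b'[{"id": "BikePoints_1", "commonName": "River Street"}]'
decoded = raw_bytes.decode("utf-8")   # bytes -> str
bike_points = json.loads(decoded)     # str -> list of dicts
```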
With this, I could now work with the resulting list. I inserted it into the bike-point collection I had created in the database, like so:
collection = db['bike-point']
main_list = []
for ele in new_list:
    del ele['$type']
    main_list.append(ele)
collection.insert_many(main_list)
Now I can query the bike-point collection in the database and return all the available BikePoints in London.
Here is the second API endpoint:
@api_view(['GET'])
def get_bike_points(request):
    if request.method != 'GET':
        return Response({"Error": "Invalid Request Type"}, status=status.HTTP_400_BAD_REQUEST)
    connect_to_mongo()
    cursor = db['bike-point']
    main_list = []
    for ele in cursor.find({}):
        del ele['_id']
        main_list.append(ele)
    return Response(main_list, status=status.HTTP_200_OK)
The Third Endpoint
The third endpoint I worked on was BikePoint/id.
This endpoint returns information on a specific BikePoint given its id.
Since I already had all the BikePoints in my database, all I had to do was a database lookup with the id passed in through the request URL.
Here is the third API endpoint:
@api_view(['GET'])
def get_bike_point_id(request, bike_point_id):
    connect_to_mongo()
    if request.method != 'GET':
        return Response({"Error": "Invalid Request Type"}, status=status.HTTP_400_BAD_REQUEST)
    cursor = db['bike-point']
    bike_point = cursor.find_one({"id": bike_point_id})
    if bike_point is None:
        return Response({"Message": f"Could not get BikePoint with id {bike_point_id}"}, status=status.HTTP_404_NOT_FOUND)
    bike_point.pop('_id')
    return Response(bike_point, status=status.HTTP_200_OK)
The code for the remaining endpoints can be found on my GitHub.
Deployment Setup
At this point, I was done with building out most of the API endpoints, and it was time to deploy my API. I chose to use Linode to deploy my project. I provisioned my Linode server and deployed my project.
If you decide to use Linode servers to deploy your projects, here is a quick guide on how to get started.
Head over to the Linode management console; you'll see a button on the console page that says Create Linode. Click on it.
On the next page, you'll be greeted with different options for defining your server: the Linux distribution it should run, the region where it should be launched, and more.
When you scroll down, you will see the options to choose the amount of computing power you want on your server.
It's best to choose the least expensive compute option (Shared CPU) for your server if you are still running on Linode credits.
Next, give your server a name in the Linode Label input field; you also have the option of adding tags and SSH keys.
There are additional configurations for your Linode server, such as adding a VLAN, backups for your server, and the option of a private IP address.
Back to my story :).
I deployed my API on a Linode server. I won't go through the whole deployment process here because it's too long, but you can find a detailed explanation of how to deploy a Django project to a virtual machine with Nginx and Gunicorn in this DigitalOcean article.
I followed the steps in the article and deployed my API successfully.
Caching API Responses
The API response time was still very slow: the responses are large, which increased the response time, and that was a concern. I decided to implement a caching layer with Redis.
I found amazing resources online on how to go about this; shoutout to Sagar Yadav for his article on implementing caching in Django with Redis, which I found really helpful.
However, there was still an issue: the article only explained how to set up the cache against localhost. I needed to set up a remote Redis server and cache the API responses there.
Luckily for me, I already had a server (where I deployed my API), so I set up my Redis server on my Linux server.
First, I had to install Redis on my server like so:
sudo apt install -y redis
Next, I had to bind my host IP to the Redis server by editing the redis.conf file, like so:
nano /etc/redis/redis.conf
On the line that reads bind 127.0.0.1 ::1, I added my host IP next to it and saved the file. Then I restarted my Redis server like so:
systemctl restart redis
I tested my connection to my remote Redis server with this command:
redis-cli -h <host-ip>
Thankfully, there were no errors; all that was left was to set up Redis with Django REST framework.
First, I had to install django-redis:
pip install django-redis
Next, I needed to configure the cache connection between Django and Redis in my settings.py file:
CACHES = {
"default": {
"BACKEND": "django_redis.cache.RedisCache",
"LOCATION": "redis://<host-ip>",
"OPTIONS": {
"CLIENT_CLASS": "django_redis.client.DefaultClient",
}
}
}
Followed by specifying my Session Engine
like so:
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
SESSION_CACHE_ALIAS = "default"
Then I added a TTL for the response caches:
CACHE_TTL = 60 * 1
Finally, I imported the cache_page decorator and added it to my API endpoints, like so:
from django.views.decorators.cache import cache_page

@cache_page(CACHE_TTL)
@api_view(['GET'])
def function_to_handle_endpoint(request):
    ...
Conclusion
This project is open source and open to contributions; I have not yet replicated all the API endpoints from the Transport for London Unified API. With the help of other developers, I know we can build something much better and more efficient.
For more information on this project, check out the links below: