How We Implemented Audit in Our SaaS Django Platform
We needed a system on our platform to keep track of every change happening in it, so that we could respond easily to support requests from our clients.
Context: what is Audit?
A while back, my team and I needed to add audit features to a platform we were working on, so that we could trace what happened in the app, show it to end users, and use it for customer support. At the time we were receiving a lot of requests that required us to know what had changed on various objects in the database, so we needed a system to help with that. We also knew that when you build a web application, especially one with business value, it is important to let the end user see who performed which action, what changed on a specific object in the database, and who made that change.
For example, if an invoice has been created, processed and validated, and we later need to display its full history of changes, we need a system of record that holds that history, much like logs do. This is called maintaining an audit trail.
The basic principle
Regardless of the framework you are using, the principle remains the same: you need a way to listen to everything that happens in the application and log it somewhere. That could be a file, a database table or anything else, as long as it can keep all the data you send to it.
With Django, the simple way to do that is to connect to signals such as post_save or m2m_changed on all the models at once (it's possible) and process each signal's data to save it somewhere as an event. Ideally this is done in a dedicated thread or asynchronously, to avoid slowing down the application.
It is then also possible to choose which types of events should be logged or which models should be tracked. You get the idea ;).
The available solutions in Django packages
There are several packages that achieve this, but I will showcase only two of them here because I have actually used them both and they work differently internally.
The django-easy-audit package
github.com/soynatan/django-easy-audit
This package is installed via the command pip install django-easy-audit
and added to the project's settings like this:
INSTALLED_APPS = [
    # ...
    'easyaudit',
]
MIDDLEWARE = (
    # ...
    'easyaudit.middleware.easyaudit.EasyAuditMiddleware',
)
It can watch many kinds of events in the project, such as login, CRUD and HTTP request events, and saves them into a dedicated set of database tables (models): CRUDEvent, LoginEvent and RequestEvent.
There is a set of settings to change how it works or what models it tracks, such as:
DJANGO_EASY_AUDIT_WATCH_MODEL_EVENTS
DJANGO_EASY_AUDIT_WATCH_AUTH_EVENTS
DJANGO_EASY_AUDIT_WATCH_REQUEST_EVENTS
These settings enable or disable each type of event tracking.
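As an illustration, a settings fragment like the one below would keep model and login auditing while skipping request logging. It assumes the three settings take plain booleans, which is how the package's README describes them; check the documentation for the version you install.

```python
# settings.py (sketch): keep model and login auditing, skip request logging.
DJANGO_EASY_AUDIT_WATCH_MODEL_EVENTS = True
DJANGO_EASY_AUDIT_WATCH_AUTH_EVENTS = True
DJANGO_EASY_AUDIT_WATCH_REQUEST_EVENTS = False
```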
The django-simple-history package
This package provides the same result but behaves slightly differently. It mirrors each tracked table in the database and stores every change to an object in the mirror table related to its model. It also provides an attribute that you can add to each model to access each instance's history. To add this package to the project, run: pip install django-simple-history
and set it up in the project like this:
INSTALLED_APPS = [
    # ...
    'simple_history',
]
MIDDLEWARE = [
    # ...
    'simple_history.middleware.HistoryRequestMiddleware',
]
Then add the history attribute to all the models you need to track:
from django.db import models
from simple_history.models import HistoricalRecords

class SomeModel(models.Model):
    # Each save or delete of SomeModel is recorded as a historical row.
    history = HistoricalRecords()
Then create the migrations and apply them:
python manage.py makemigrations
python manage.py migrate
This gives the SomeModel class its history attribute and creates app_historicalsomemodel, the table where all the changes happening on the model will appear. It's well explained here: django-simple-history.readthedocs.io/en/lat...
This means the number of tables in the database will probably double, at least for the models you wrote, if you want to track them all. To access the history (audit trail) of a specific model instance, use someModel.history.all(). It returns a QuerySet of historical records for SomeModel, one per version of the object over time (since its creation), and you can use queryset filters to get whichever version you want.
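As a small sketch of what you can do with those records, the helper below turns them into readable lines. describe_history is a hypothetical name, not part of the package; it only relies on the history_date, history_type and history_user attributes that django-simple-history puts on each historical record. The stand-in records at the bottom exist only so the snippet runs without a database.

```python
from types import SimpleNamespace

# Labels for the history_type codes stored by django-simple-history.
HISTORY_LABELS = {"+": "created", "~": "changed", "-": "deleted"}


def describe_history(records):
    """Return one readable line per historical record."""
    lines = []
    for record in records:
        user = record.history_user or "unknown user"
        action = HISTORY_LABELS.get(record.history_type, record.history_type)
        lines.append(f"{record.history_date}: {action} by {user}")
    return lines


# Stand-in records for the demo; in a project you would pass
# someModel.history.all() instead.
demo_records = [
    SimpleNamespace(history_date="2024-01-02", history_type="~", history_user=None),
    SimpleNamespace(history_date="2024-01-01", history_type="+", history_user="alice"),
]
lines = describe_history(demo_records)
```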
Summary, what to remember
We tested both and chose django-easy-audit for a few reasons. Here are some pros and cons of the two libraries.
django-easy-audit
Pros:
Very simple to install in the project
Adds only 3 tables to the database
Requires no changes to the models or the project
Cons:
It saves all the models' data in the same table, which makes the number of rows grow fast (in proportion to the number of models being monitored)
Provides no functions or utilities to browse an object's history easily
Provides a really simple admin integration: just a regular list of events containing JSON objects that the person using the admin has to process
django-simple-history
Pros:
Provides a simple API to navigate the model's history via .history.all()
Avoids storing too much data in the same table
Provides a good admin integration to navigate the object's history
Cons:
Creates a copy of each tracked table, which can almost double the number of tables if you track all the data
In some cases migrations were not applied, or not applied correctly; the package seems to be the cause, since this stopped happening when we removed it from the project.
What I think
I don't like the idea of having too many tables in a database, so most of the time I will go with django-easy-audit. But I also think audits (a model's history) don't need to be stored in the database: since they are not used much most of the time, it makes more sense to me to store them in another system, such as a file, object storage or a stream processing system, just as you would with logs and log files.
One improvement I think could be made to these packages is to send the generated data to an external system, such as a data stream, and retrieve it on demand, rather than putting too much data into the database.
Edit: after a comment by Tom Dyson, it turns out there is a way to achieve this in django-easy-audit, simply by using the DJANGO_EASY_AUDIT_LOGGING_BACKEND setting and implementing a class whose methods send the data to another logging system, as in this example:
import logging

class PythonLoggerBackend:
    logging.basicConfig()
    logger = logging.getLogger('your-kibana-logger')
    logger.setLevel(logging.DEBUG)

    def request(self, request_info):
        return request_info  # if you don't need it

    def login(self, login_info):
        self.logger.info(msg='your message', extra=login_info)
        return login_info

    def crud(self, crud_info):
        self.logger.info(msg='your message', extra=crud_info)
        return crud_info
Learn more here: https://github.com/soynatan/django-easy-audit#settings
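To activate a backend like the one above, the setting should point at the class. The fragment below is a sketch that assumes the setting takes a dotted import path, as the package's README describes; myproject.audit_backends is a hypothetical module name for wherever you put the class.

```python
# settings.py (sketch)
# "myproject.audit_backends" is a hypothetical module holding the backend class.
DJANGO_EASY_AUDIT_LOGGING_BACKEND = "myproject.audit_backends.PythonLoggerBackend"
```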
Resources
https://github.com/soynatan/django-easy-audit
https://django-simple-history.readthedocs.io/en/latest/index.html