Context: what is Audit?
A while back, my team and I needed to add audit features to a platform we were working on, so that we could trace what happened in the app, show it to end users, and use it for customer support. At the time we were receiving a lot of requests that required us to know what had changed on various objects in the database, so we needed a system to help with that. We also knew that when building a web application, especially one with business value, it is important to let the end user know who did what, or which changes occurred on a specific object in the database and who made them.
For example, if an invoice has been created, processed and validated, and later on we need to display its full history of changes, we need a system of record that holds that history, much like logs. This is called maintaining an audit trail.
The basic principle
Regardless of the framework you are using, the principle remains the same: you need a way to listen to everything that happens in the application and log it somewhere. That can be a file, a database table or anything else, as long as it can keep all the data you send to it.
With the Django framework, the simple way to do that is to connect to various signals like m2m_changed on all the models at once (it's possible) and process each signal's data to save it somewhere as events. Ideally this is done in a dedicated thread or asynchronously, to avoid slowing down the application.
It is then also possible to choose which types of events get logged or which models get tracked; you get the idea ;).
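The principle above can be sketched without any framework. This is only an illustration, not Django's signal API: connect, send and AuditLogger are hypothetical names, and an in-memory list stands in for the file or table where events would really go. Receivers are notified of every change, events for untracked models are dropped, and the actual write happens in a background thread.

```python
import queue
import threading
from datetime import datetime, timezone

# Hypothetical stand-in for a framework signal: a plain list of callbacks.
_receivers = []


def connect(receiver):
    """Register a callback that is called for every change event."""
    _receivers.append(receiver)


def send(model_name, action, data):
    """Notify every registered receiver that something changed."""
    for receiver in _receivers:
        receiver(model_name, action, data)


class AuditLogger:
    """Queues events and stores them from a background thread."""

    def __init__(self, tracked_models=None):
        self.tracked_models = tracked_models  # None means "track everything"
        self.events = []                      # stand-in for a file or table
        self._queue = queue.Queue()
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def handle(self, model_name, action, data):
        # Skip models that were not selected for tracking.
        if self.tracked_models is not None and model_name not in self.tracked_models:
            return
        # Only enqueue here, so the caller is not blocked by the write.
        self._queue.put({
            "model": model_name,
            "action": action,
            "data": data,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def _drain(self):
        # Runs in the worker thread: store events in arrival order.
        while True:
            event = self._queue.get()
            self.events.append(event)
            self._queue.task_done()

    def flush(self):
        """Block until every queued event has been stored."""
        self._queue.join()


audit = AuditLogger(tracked_models={"Invoice"})
connect(audit.handle)

send("Invoice", "created", {"id": 1, "status": "draft"})
send("Invoice", "updated", {"id": 1, "status": "validated"})
send("Session", "created", {"id": 99})  # ignored: not a tracked model
audit.flush()
```

In a Django project the send side would be the framework's own signals, and the receiver would be connected to them instead of to this toy dispatcher.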
The available solutions in Django packages
There are several packages that achieve this, but I will showcase only two of them here because I have actually used both and they work differently internally.
The django-easy-audit package
This package is installed with the command pip install django-easy-audit and added to the project's settings like this:

    INSTALLED_APPS = [
        # ...
        'easyaudit',
    ]

    MIDDLEWARE = (
        # ...
        'easyaudit.middleware.easyaudit.EasyAuditMiddleware',
    )
It can watch a lot of events in the project, such as model changes, logins and HTTP requests, and save them into a dedicated set of database tables (models).
A set of settings changes how it works or which models it tracks. Some of these settings enable or disable a specific type of event tracking.
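As a sketch of what such settings can look like, the fragment below uses setting names as I recall them from the django-easy-audit README; check them against the version you install. The 'billing.Invoice' entry is a hypothetical app and model name.

```python
# settings.py (fragment)

# Turn each kind of event tracking on or off.
DJANGO_EASY_AUDIT_WATCH_MODEL_EVENTS = True     # create/update/delete on models
DJANGO_EASY_AUDIT_WATCH_AUTH_EVENTS = True      # logins and logouts
DJANGO_EASY_AUDIT_WATCH_REQUEST_EVENTS = False  # skip HTTP request logging

# Models to leave out of the audit, in "app_name.model_name" form.
# 'billing.Invoice' is a made-up example.
DJANGO_EASY_AUDIT_UNREGISTERED_CLASSES_EXTRA = ['billing.Invoice']
```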
The django-simple-history package
This package provides the same result but behaves slightly differently: it mirrors each tracked table in the database and stores every change to an object in the mirror table related to its model. It also provides an attribute that you can add to each model to access each instance's history. To add this package to the project, run pip install django-simple-history and set it up in the project like this:

    INSTALLED_APPS = [
        # ...
        'simple_history',
    ]

    MIDDLEWARE = [
        # ...
        'simple_history.middleware.HistoryRequestMiddleware',
    ]
Then add the history attribute to every model you need to track:

    from django.db import models
    from simple_history.models import HistoricalRecords


    class SomeModel(models.Model):
        # Keeps a historical record of every change to this model
        history = HistoricalRecords()

Then create the migrations and apply them:

    python manage.py makemigrations
    python manage.py migrate
This adds the history attribute to the SomeModel class and creates app_historicalsomemodel, the table where all the changes happening on the model will appear. It's well explained here: django-simple-history.readthedocs.io/en/lat...
This means the number of tables in the database will roughly double, at least for the models you wrote, if you want to track them all. To access the history (audit trail) of a specific model instance, use someModel.history.all(). It returns a QuerySet of historical records for SomeModel, one per version of the object over time (since its creation), and you can use queryset filters to get whatever version you want.
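To make the mirror-table idea concrete, here is a minimal sketch that uses sqlite3 directly instead of Django. It is not how django-simple-history is implemented; the table names, the save_invoice and invoice_history helpers and the '+' / '~' markers are assumptions chosen for illustration. Every save writes the current row and also appends a snapshot to the mirror table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_invoice (
        id INTEGER PRIMARY KEY,
        status TEXT NOT NULL
    );
    -- Mirror table: same columns plus bookkeeping about each change.
    CREATE TABLE app_historicalinvoice (
        history_id INTEGER PRIMARY KEY AUTOINCREMENT,
        id INTEGER NOT NULL,
        status TEXT NOT NULL,
        history_type TEXT NOT NULL  -- '+' created, '~' changed
    );
""")


def save_invoice(invoice_id, status):
    """Insert or update an invoice and record a snapshot of its new state."""
    exists = conn.execute(
        "SELECT 1 FROM app_invoice WHERE id = ?", (invoice_id,)
    ).fetchone()
    if exists:
        conn.execute(
            "UPDATE app_invoice SET status = ? WHERE id = ?", (status, invoice_id)
        )
    else:
        conn.execute(
            "INSERT INTO app_invoice (id, status) VALUES (?, ?)", (invoice_id, status)
        )
    # The main table holds only the latest state; the mirror keeps every version.
    conn.execute(
        "INSERT INTO app_historicalinvoice (id, status, history_type) VALUES (?, ?, ?)",
        (invoice_id, status, "~" if exists else "+"),
    )


def invoice_history(invoice_id):
    """Return every recorded version of one invoice, newest first."""
    return conn.execute(
        "SELECT status, history_type FROM app_historicalinvoice "
        "WHERE id = ? ORDER BY history_id DESC",
        (invoice_id,),
    ).fetchall()


save_invoice(1, "created")
save_invoice(1, "processed")
save_invoice(1, "validated")
history = invoice_history(1)
```

The trade-off discussed in this post is visible here: one extra table per tracked model, in exchange for a history that can be queried per object with ordinary filters.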
Summary, what to remember
We tested both and chose django-easy-audit. Here are some pros and cons of the two libraries.
django-easy-audit, pros:
Very simple to install in the project
Adds only 3 tables to the database
Requires no changes to the models or the project
django-easy-audit, cons:
Saves the data of all models in the same table, which makes the number of rows grow fast (depending on the number of models being monitored)
Provides no functions or utilities to browse an object's history easily
Provides a very simple admin integration: just a regular list of events containing JSON objects that the person using the admin has to process
django-simple-history, pros:
Provides a simple API to navigate the model's history via the history attribute
Avoids storing too much data in the same table
Provides a good admin integration to navigate the object's history
django-simple-history, cons:
Creates a copy of each table, which can almost double the number of tables if you track all the data
In some cases migrations were not applied, or not applied correctly. The package seems to be the cause, since this stopped happening when we removed it from the project.
What I think
I don't like the idea of having too many tables in a database, so most of the time I will go with django-easy-audit. But I also think audits (a model's history) don't need to be stored in the database: since they are not used much most of the time anyway, it makes more sense to me to store them in another system, such as a file, object storage or a stream processing system, just like you would do with logs and log files.
An improvement I think could be made to these packages is to send the generated data to an external system, like a data stream, to avoid putting too much data into the database, and to retrieve it on demand.
Edit: after a comment by Tom Dyson, it turns out there is a way to achieve this in django-easy-audit, simply by using the DJANGO_EASY_AUDIT_LOGGING_BACKEND setting and implementing methods that send the data to another logging system, as in this example:

    import logging


    class PythonLoggerBackend:
        logging.basicConfig()
        logger = logging.getLogger('your-kibana-logger')
        logger.setLevel(logging.DEBUG)

        def request(self, request_info):
            return request_info  # if you don't need it

        def login(self, login_info):
            self.logger.info(msg='your message', extra=login_info)
            return login_info

        def crud(self, crud_info):
            self.logger.info(msg='your message', extra=crud_info)
            return crud_info

Learn more here: https://github.com/soynatan/django-easy-audit#settings
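The same three-method shape (request, login, crud) could also target a file instead of a logger, which matches the "store it outside the database" idea above. This is a sketch under assumptions: JsonLinesBackend is a hypothetical name, the payload keys in the usage lines are made up, and I have not verified how the package instantiates its backend, so the constructor takes no required arguments.

```python
import io
import json
import sys


class JsonLinesBackend:
    """Writes each audit event as one JSON line to a file-like object."""

    def __init__(self, stream=None):
        # Default to stdout so the class can be created without arguments.
        self.stream = stream if stream is not None else sys.stdout

    def _write(self, kind, info):
        # default=str keeps values such as datetimes serializable.
        self.stream.write(json.dumps({"kind": kind, **info}, default=str) + "\n")
        return info

    def request(self, request_info):
        return self._write("request", request_info)

    def login(self, login_info):
        return self._write("login", login_info)

    def crud(self, crud_info):
        return self._write("crud", crud_info)


# Usage with an in-memory stream; the payload keys are made up.
stream = io.StringIO()
backend = JsonLinesBackend(stream)
backend.crud({"event_type": 1, "object_id": 42})
backend.login({"username": "alice"})
lines = stream.getvalue().splitlines()
```

One JSON object per line is easy to ship to object storage or a stream processor later, and easy to read back on demand.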