To Delete or to Soft Delete, That is the Question!

Andrey Belyaev

Soft deletion is a widely used pattern applied for business applications. It allows you to mark some records as deleted without actual erasure from the database. Effectively, you prevent a soft-deleted record from being selected, meanwhile all old records can still refer to it.

When would it make sense to use Soft Delete?

If you ask a few developers randomly, most probably you will hear that soft deletion is used to be able to restore deleted records. However, if this is the only business requirement that pushes you towards integrating soft-deletion - think twice.
Consider the following requirements in addition to “simple data restore” before making a decision on using soft deletion in your project:

History tracking and audit (e.g. for legal reasons)
Keeping reference integrity and avoid cascade delete
You need “Graceful” delete. E.g. a long-running business process may need data that might be “deleted”, but still required for this particular process to finish

If you see that your business requirements match the list above, it means that you may need to implement the “Soft Delete” feature in your application.

And implementing soft delete is easy! You just add the “Is_Deleted” or “Delete_Date” column to your tables (or attributes to your Document) and replace all delete statement invocations in your application code to updates. And yes, you need to modify all retrieve statements to take into account this new attribute. Or is it better to create a view? Or a materialized view? In this case, we may need to implement “instead of” triggers or stored procedures API to deal with views. And by deleting an entity you may want to delete some attributes it refers to. In this case, to be able to restore it is also required to store deletion context. And… this is where a developer’s life gets complicated.

Challenges of Soft Deletion

In fact, implementing “proper” soft delete is not an easy task. Sometimes developers choose NOT to implement soft delete just because it takes a lot of efforts. You might think about other options like table mirroring with the special “trash can” table that contains deleted records from the “original” table.

But if it is decided to go forward with the soft delete implementation, you need to take into account some conditions.

Database support. One of the issues with records that were marked as deleted is their unique keys. As an example - we need to store users along with their emails. Emails must be unique, therefore, we need to create a unique index for the table that stores users. And if a user is deleted, we should be able to insert a new user with the same email. So, your database should be able to handle such cases. One of the options - include a soft delete marker field into the unique key. If your marker field is not a simple boolean value, but delete timestamp, it may do the trick.

Another option - special unique indexes. For example, in PostgreSQL it is called partial unique indexes, in MSSQL - filtered indexes. They allow you not to take some rows into account when building indexes by specifying a predicate. So, for the soft delete feature implementation, we can exclude “deleted” records from the index.

If your database supports triggers for views, there is a good chance that you can create your data model using views and instead of type of triggers. But this approach causes performance impact and does not resolve the issue with unique constraints applied to the underlying table(s).

Application Support. Make sure ALL queries mentioning soft-deleted entities are double-checked, otherwise it can lead to unexpected data leaks and critical performance issues. In the ideal world, a developer should not be aware about soft delete existence. So, if possible, try to introduce your data access API and prevent developers from using other facilities to access a database (like direct JDBC connection). You might even want to override standard APIs like JPA’s Entity Manager to support soft delete properly.

Revisit standard APIs usage. First of all, forget about using some of the standard DB and JPA features. If you have your database table and constraint model carefully engineered and tuned, as well as you JPA layer, introducing soft delete requires you to consider the following:

Standard cascade policies will not work. Remember that in case of soft delete, you don’t delete records, you update them. If you still need cascade delete, triggers may help you to implement this behavior.
As for the triggers - you’ll need to review them, too, to ensure that the code from pre- and post- delete triggers is executed in pre- post- update triggers during update-as-soft-delete and not executed during “common” update process.
JPA lifecycle callbacks @PreRemove and @PostRemove will not work. Again, look for other options to implement the functionality that is coded in those callbacks. The rule here is similar to the one applied to triggers - in update-related callbacks you need to distinguish between update and soft delete processes.

You still need to meet regulatory requirements. As an example - GDPR. The regulation says that the data must be completely removed if it is required by a user. To comply with this you still need to implement a “purge” process in addition to soft delete. And this purge procedure affects related data in your database in most cases. So, you need to be careful and not to delete business-critical data.

Soft Delete implementation in Jmix

Soft deletion mechanism in Jmix is transparent for application developers. All you have to do to make an entity “soft deletable” is to add a couple of attributes with @DeletedDate and @DeletedBy annotations. The framework will recognize them as a “soft delete trait”, and behave appropriately: it will mark deleted records instead of actually removing them, and ignore soft deleted records when loading entities by Id or by any JPQL query. Moreover, when loading object graphs, deleted records will be transparently filtered out from to-many collections!

To cascade soft deletion, Jmix offers @OnDelete and @OnDeleteInverse annotations that can be used with reference attributes. They allow you to express cascade operations on the application level instead of the database-level ON DELETE CASCADE or ON DELETE SET NULL rules.

When it comes to uniqueness, Jmix still relies on database constraints, and soft-deleted entities with unique attributes require creation of partial unique indexes mentioned above.

The magic of transparent handling of soft-deleted records in Jmix is based on the own implementation of JPA EntityManager, which transforms queries automatically adding conditions to filter out deleted records. And the soft deletion logic can be turned off for a particular query, so the application can get the records marked for deletion, if needed. It can be used to review, restore, or wipe out the soft deleted data.

Watch how to make a simple working application in 12 minutes.

Try Jmix

Conclusion

Using soft-deletion even for one entity will lead you to a bunch of new development conventions. You need to keep in mind that the delete process is not just delete, but sometimes the update, and not all data in a database table should be available for the application. It means that your code should be engineered in a way to take all that into account in all application tiers - from database schema (document structure, etc.) to user interface (even that).

With a proper framework you will be able to resolve all these problems, but you should still keep in mind that you have “deleted” data that still exists, and build coding standards and conventions that take this into account.