Cuba framework persistence memory leak, improper transaction handling

The code below does not work: it runs out of memory when it should not. (Note that clear() is not exposed in the top-level EntityManager, so you must go through the delegate.)

The only option is to use the commented-out code, which commits the transaction and creates a new one. That has completely different semantics: the state could become inconsistent, and it would require a lot more synchronization code to ensure correctness.

The problem is that the CUBA EntityManagerImpl is also retaining all of the objects during the transaction (not sure why). It should let the delegate retain them as needed, since that is part of the spec.

    public void createItems() {
        Transaction tx = persistence.createTransaction();
        for (int i = 0; i < 1000000; i++) {
            Item item = new Item();
            item.setName("Item #" + i);
            item.setUpc((long) i);
            persistence.getEntityManager().persist(item);
            if (i % 10000 == 0) {
                // the following code will not work
                persistence.getEntityManager().flush();
                persistence.getEntityManager().getDelegate().clear();
                // need to use this instead
                // tx.commit();
                // tx = persistence.createTransaction();
            }
        }
        tx.commit();
    }

CUBA EntityManager is not designed for this use case, hence there are no clear() and detach() methods in its API.

As a workaround, you can use JPA directly via persistence.getEntityManager().getDelegate().persist(), but CUBA mechanisms such as entity listeners, entity log, and FTS indexing won’t work for the saved instances. Or better, use SQL INSERTs via QueryRunner.batch() for maximum performance.
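
For illustration, a minimal sketch of that delegate workaround, reusing the Item entity and the chunk size from the example above (both are just carried over, not prescribed). Because the CUBA EntityManager never sees these instances, entity listeners, entity log and FTS indexing are skipped for them:

    public void createItemsViaDelegate() {
        Transaction tx = persistence.createTransaction();
        try {
            // the delegate is the underlying javax.persistence.EntityManager,
            // which does expose clear()
            javax.persistence.EntityManager jpaEm =
                    persistence.getEntityManager().getDelegate();
            for (int i = 0; i < 1000000; i++) {
                Item item = new Item();
                item.setName("Item #" + i);
                item.setUpc((long) i);
                jpaEm.persist(item);    // bypasses CUBA mechanisms for this instance
                if (i > 0 && i % 10000 == 0) {
                    jpaEm.flush();      // write the chunk to the database
                    jpaEm.clear();      // detach it so memory stays bounded
                }
            }
            tx.commit();
        } finally {
            tx.end();
        }
    }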

So, is the only way to perform a bulk insert and have sub-systems like FTS work correctly to use the commit/new-transaction approach?

What other subsystems are affected? I would guess that, since the entity listeners are bypassed, it would be several?

Also, if CUBA just used the standard JPA entity listener framework, wouldn’t things work as expected using flush() and clear()?

So, is the only way to perform a bulk insert and have sub-systems like FTS work correctly to use the commit/new-transaction approach?

I think so.
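
For completeness, a sketch of that commit/new-transaction approach, assuming the same Item entity and chunk size as in the original example. Each commit flushes the chunk and lets CUBA run its mechanisms for it, while the persistence context is discarded with the old transaction; keep in mind that chunks are committed independently, so a failure part-way leaves earlier chunks in the database (the different semantics mentioned above):

    public void createItemsInChunks() {
        Transaction tx = persistence.createTransaction();
        try {
            for (int i = 0; i < 1000000; i++) {
                Item item = new Item();
                item.setName("Item #" + i);
                item.setUpc((long) i);
                persistence.getEntityManager().persist(item);
                if (i > 0 && i % 10000 == 0) {
                    // commit the current chunk and start a fresh transaction
                    tx.commit();
                    tx = persistence.createTransaction();
                }
            }
            tx.commit();
        } finally {
            tx.end();   // rolls back the last chunk if it was not committed
        }
    }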

What other subsystems are affected? I would guess that, since the entity listeners are bypassed, it would be several?

The following subsystems are bypassed if you use JPA directly instead of the CUBA API:

  • Before* entity listeners
  • EntityLog
  • FTS indexing
  • Advanced entity cache eviction (see com.haulmont.cuba.core.sys.persistence.OrmCacheSupport)
  • Delete policy processing for soft-deleted entities (OnDelete, OnDeleteInverse)

Also, if CUBA just used the standard JPA entity listener framework, wouldn’t things work as expected using flush() and clear()?

Probably yes, but then some of the above-mentioned mechanisms would not be possible. Look at what the JPA spec says about restrictions on lifecycle callbacks (section 3.5.2):

In general, the lifecycle method of a portable application should not invoke EntityManager or query operations, access other entity instances, or modify relationships within the same persistence context. A lifecycle callback method may modify the non-relationship state of the entity on which it is invoked.

It renders the “standard” JPA entity listeners completely useless for anything beyond changing immediate attributes of the saved entity. JPA implementations differ in what they actually allow you to do in listeners, but EclipseLink is unfortunately very close to the spec in this respect.
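
To make the restriction concrete, here is a minimal sketch of a standard JPA lifecycle listener (ItemJpaListener is a hypothetical name). Per the quoted section 3.5.2, the callback may only touch the entity’s own non-relationship state:

    import javax.persistence.PrePersist;

    public class ItemJpaListener {

        @PrePersist
        public void beforePersist(Item item) {
            // allowed: modifying non-relationship attributes of the entity itself
            if (item.getName() == null) {
                item.setName("unnamed");
            }
            // not portable: invoking EntityManager or query operations, accessing
            // other entity instances, or modifying relationships from this callback
        }
    }

    // registered on the entity with @EntityListeners(ItemJpaListener.class)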