[ACCEPTED] Find or insert based on unique key with Hibernate
I had a similar batch processing requirement, with processes running on multiple JVMs. The approach I took for this was as follows. It is very much like jtahlborn's suggestion. However, as vbence pointed out, if you use a NESTED transaction, when you get the constraint violation exception, your session is invalidated. Instead, I use REQUIRES_NEW, which suspends the current transaction and creates a new, independent transaction. If the new transaction rolls back it will not affect the original transaction.
I am using Spring's TransactionTemplate but I'm sure you could easily translate it if you do not want a dependency on Spring.
public T findOrCreate(final T t) throws InvalidRecordException {
    // 1) look for the record
    T found = findUnique(t);
    if (found != null)
        return found;
    // 2) if not found, start a new, independent transaction
    TransactionTemplate tt = new TransactionTemplate(
            (PlatformTransactionManager) transactionManager);
    tt.setPropagationBehavior(TransactionDefinition.PROPAGATION_REQUIRES_NEW);
    try {
        found = tt.execute(new TransactionCallback<T>() {
            @Override
            public T doInTransaction(TransactionStatus status) {
                try {
                    // 3) store the record in this new transaction
                    return store(t);
                } catch (ConstraintViolationException e) {
                    // another thread or process created this already, possibly
                    // between 1) and 2)
                    status.setRollbackOnly();
                    return null;
                }
            }
        });
        // 4) if we failed to create the record in the second transaction, found will
        // still be null; however, this would happen only if another process
        // created the record. let's see what they made for us!
        if (found == null)
            found = findUnique(t);
    } catch (RuntimeException e) {
        // handle exceptions as appropriate
    }
    return found;
}
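If you want to avoid the Spring dependency, a rough translation of the same idea is to open a second, independent Session just for the insert attempt. This is only a sketch; it assumes a sessionFactory field and the same findUnique/store-style helpers, which are not spelled out in the original answer.
// Hypothetical non-Spring variant: the "new, independent transaction" is an
// entirely separate Session used only for the insert attempt.
public T findOrCreateWithoutSpring(final T t) {
    // 1) look for the record
    T found = findUnique(t);
    if (found != null)
        return found;
    // 2) try to create it in its own session/transaction
    Session insertSession = sessionFactory.openSession();
    Transaction tx = insertSession.beginTransaction();
    try {
        insertSession.save(t);
        tx.commit();
        return t;
    } catch (ConstraintViolationException e) {
        // another thread or process created it first; discard this unit of work
        tx.rollback();
    } finally {
        insertSession.close();
    }
    // 3) re-read the record the other process created
    return findUnique(t);
}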
You need to use UPSERT or MERGE to achieve this goal.
However, Hibernate does not offer support for this construct, so you need to use jOOQ instead.
private PostDetailsRecord upsertPostDetails(
DSLContext sql, Long id, String owner, Timestamp timestamp) {
sql
.insertInto(POST_DETAILS)
.columns(POST_DETAILS.ID, POST_DETAILS.CREATED_BY, POST_DETAILS.CREATED_ON)
.values(id, owner, timestamp)
.onDuplicateKeyIgnore()
.execute();
return sql.selectFrom(POST_DETAILS)
.where(field(POST_DETAILS.ID).eq(id))
.fetchOne();
}
Calling this method on PostgreSQL:
PostDetailsRecord postDetailsRecord = upsertPostDetails(
sql,
1L,
"Alice",
Timestamp.from(LocalDateTime.now().toInstant(ZoneOffset.UTC))
);
Yields the following SQL statements:
INSERT INTO "post_details" ("id", "created_by", "created_on")
VALUES (1, 'Alice', CAST('2016-08-11 12:56:01.831' AS timestamp))
ON CONFLICT DO NOTHING;
SELECT "public"."post_details"."id",
"public"."post_details"."created_by",
"public"."post_details"."created_on",
"public"."post_details"."updated_by",
"public"."post_details"."updated_on"
FROM "public"."post_details"
WHERE "public"."post_details"."id" = 1
On Oracle and SQL Server, jOOQ will use MERGE, while on MySQL it will use ON DUPLICATE KEY.
Concurrency is ensured by the row-level locking mechanism employed when inserting, updating, or deleting a record.
Code available on GitHub.
Two solutions come to mind:
That's what TABLE LOCKS are for
Hibernate does not support table locks, but this is the situation where they come in handy. Fortunately, you can use native SQL through Session.createSQLQuery(). For example (on MySQL):
// no access to the table for any other clients
session.createSQLQuery("LOCK TABLES foo WRITE").executeUpdate();
// safe zone
Foo foo = session.createCriteria(Foo.class).add(eq("name", name)).uniqueResult();
if (foo == null) {
foo = new Foo();
foo.setName(name);
session.save(foo);
}
// releasing locks
session.createSQLQuery("UNLOCK TABLES").executeUpdate();
This way when a session (client connection) gets the lock, all the other connections are blocked until the operation ends and the locks are released. Read operations are also blocked for other connections, so needless to say use this only in case of atomic operations.
What about Hibernate's locks?
Hibernate uses row level locking. We can not use it directly, because we can not lock non-existent rows. But we can create a dummy table with a single record, map it to the ORM, then use SELECT ... FOR UPDATE style locks on that object to synchronize our clients. Basically we only need to be sure that no other clients (running the same software, with the same conventions) will do any conflicting operations while we are working.
// begin transaction
Transaction transaction = session.beginTransaction();
// blocks until any other client holds the lock
session.load("dummy", 1, LockOptions.UPGRADE);
// virtual safe zone
Foo foo = session.createCriteria(Foo.class).add(eq("name", name)).uniqueResult();
if (foo == null) {
foo = new Foo();
foo.setName(name);
session.save(foo);
}
// ends transaction (releasing locks)
transaction.commit();
Your database has to know the SELECT ... FOR UPDATE syntax (Hibernate is going to use it), and of course this only works if all your clients have the same convention (they need to lock the same dummy entity).
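For completeness, the dummy entity this trick relies on could be mapped roughly as follows; the entity name "dummy", the table name app_lock and the single pre-inserted row with id = 1 are assumptions of this sketch, not part of the original answer.
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical single-row lock table; insert exactly one row (id = 1) into
// app_lock up front so there is always something to SELECT ... FOR UPDATE.
@Entity(name = "dummy")
@Table(name = "app_lock")
public class Dummy {

    @Id
    private Integer id;

    public Integer getId() { return id; }
    public void setId(Integer id) { this.id = id; }
}
With this mapping in place, session.load("dummy", 1, LockOptions.UPGRADE) in the snippet above resolves to a SELECT ... FOR UPDATE on that single row.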
The Hibernate documentation on transactions and exceptions states that all HibernateExceptions are unrecoverable and that the current transaction must be rolled back as soon as one is encountered. This explains why the code above does not work. Ultimately you should never catch a HibernateException without exiting the transaction and closing the session.
The only real way to accomplish this, it would seem, would be to manage the closing of the old session and reopening of a new one within the method itself. Implementing a findOrCreate method which can participate in an existing transaction and is safe within a distributed environment would seem to be impossible using Hibernate based on what I have found.
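For what it's worth, a minimal sketch of "managing the session within the method itself" might look like the following. It assumes the method owns a SessionFactory, that Foo.name carries the unique constraint, and that the caller accepts that its own session is not reused; none of this is from the original answer.
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.criterion.Restrictions;
import org.hibernate.exception.ConstraintViolationException;

public Foo findOrCreate(SessionFactory sessionFactory, String name) {
    // every call uses its own short-lived session, so a constraint violation
    // never poisons a session the caller is still relying on
    Session session = sessionFactory.openSession();
    Transaction tx = null;
    try {
        tx = session.beginTransaction();
        Foo existing = (Foo) session.createCriteria(Foo.class)
                .add(Restrictions.eq("name", name))
                .uniqueResult();
        if (existing == null) {
            existing = new Foo();
            existing.setName(name);
            session.save(existing);
        }
        tx.commit();
        return existing;
    } catch (ConstraintViolationException e) {
        if (tx != null) tx.rollback();   // lost the race; this session is unusable now
    } finally {
        session.close();
    }
    // re-read the row the other client created, using a brand new session
    Session retry = sessionFactory.openSession();
    try {
        return (Foo) retry.createCriteria(Foo.class)
                .add(Restrictions.eq("name", name))
                .uniqueResult();
    } finally {
        retry.close();
    }
}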
a couple people have mentioned different parts of the overall strategy. assuming that you generally expect to find an existing object more often than you create a new object:
- search for existing object by name. if found, return
- start nested (separate) transaction
- try to insert new object
- commit nested transaction
- catch any failure from nested transaction, if anything but constraint violation, re-throw
- otherwise search for existing object by name and return it
just to clarify, as pointed out in another answer, the "nested" transaction is actually a separate transaction (many databases don't even support true, nested transactions).
The solution is in fact really simple. First perform a select using your name value. If a result is found, return that. If not, create a new one. If the creation fails (with an exception), it is because another client added this very same value between your select and your insert statement. It is then logical that you get an exception. Catch it, roll back your transaction and run the same code again. Because the row already exists, the select statement will find it and you'll return your object.
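In JPA terms, that select/insert/retry loop might look roughly like this; the query string, the two-attempt limit and the exception type caught are illustrative assumptions, not part of the original answer.
import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.PersistenceException;

public Foo findOrCreate(EntityManagerFactory emf, String name) {
    // attempt at most twice: if the insert loses the race, the second pass
    // will find the row the other client just created
    for (int attempt = 0; attempt < 2; attempt++) {
        EntityManager em = emf.createEntityManager();
        try {
            em.getTransaction().begin();
            List<Foo> found = em
                    .createQuery("select f from Foo f where f.name = :name", Foo.class)
                    .setParameter("name", name)
                    .getResultList();
            if (!found.isEmpty()) {
                em.getTransaction().commit();
                return found.get(0);
            }
            Foo foo = new Foo();
            foo.setName(name);
            em.persist(foo);
            em.getTransaction().commit();
            return foo;
        } catch (PersistenceException e) {
            // another client inserted the same name first; roll back and retry
            if (em.getTransaction().isActive()) {
                em.getTransaction().rollback();
            }
        } finally {
            em.close();
        }
    }
    throw new IllegalStateException("findOrCreate kept failing for name " + name);
}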
You can find an explanation of strategies for optimistic and pessimistic locking with Hibernate here: http://docs.jboss.org/hibernate/core/3.3/reference/en/html/transactions.html
Well, here's one way to do it - but it's not appropriate for all situations.
- In Foo, remove the "unique = true" attribute on name. Add a timestamp that gets updated on every insert.
- In findOrCreate(), don't bother checking if the entity with the given name already exists - just insert a new one every time.
- When looking up Foo instances by name, there may be 0 or more with a given name, so you just select the newest one.
The nice thing about this method is that it doesn't require any locking, so everything should run pretty fast. The downside is that your database will be littered with obsolete records, so you may have to do something somewhere else to deal with them. Also, if other tables refer to Foo by its id, then this will screw up those relations.
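As a sketch of that variant (the createdOn property, the method names and the criteria query are illustrative assumptions):
import java.util.Date;
import org.hibernate.Session;
import org.hibernate.criterion.Order;
import org.hibernate.criterion.Restrictions;

// Foo.name is no longer unique: every findOrCreate() inserts a fresh row and
// readers simply take the most recently created one.
public Foo findOrCreate(Session session, String name) {
    Foo foo = new Foo();
    foo.setName(name);
    foo.setCreatedOn(new Date());   // timestamp set on every insert
    session.save(foo);
    return foo;
}

public Foo findNewestByName(Session session, String name) {
    return (Foo) session.createCriteria(Foo.class)
            .add(Restrictions.eq("name", name))
            .addOrder(Order.desc("createdOn"))
            .setMaxResults(1)
            .uniqueResult();
}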
Maybe you should change your strategy: first find the user with the name and only if the user does not exist, create it.
I would try the following strategy:
A. Start a main transaction (at time 1)
B. Start a sub-transaction (at time 2)
Now, any object created after time 1 will not be visible in the main transaction. So when you do
C. Create new race-condition object, commit sub-transaction
D. Handle conflict by starting a new sub-transaction (at time 3) and getting the object from a query (the sub-transaction from point B is now out-of-scope).
In D, only return the object's primary key and then use EntityManager.getReference(..) to obtain the object you will be using in the main transaction. Alternatively, start the main transaction after D; it is not totally clear to me how many race conditions you will have within your main transaction, but the above should allow for n times B-C-D in a 'large' transaction.
Note that you might want to do multi-threading (one thread per CPU) and then you can probably reduce this issue considerably by using a shared static cache for these kinds of conflicts - and point 2 can be kept 'optimistic', i.e. not doing a .find(..) first.
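Such a shared static cache could be as simple as a ConcurrentHashMap keyed by name, so that within one JVM at most one thread per name ever races against the database; this is purely an illustration (assuming a findOrCreate(String) method like the ones sketched earlier) and not from the original answer.
import java.util.concurrent.ConcurrentHashMap;

// Per-JVM de-duplication of create attempts; races between different JVMs
// still have to be handled by the database-level strategy described above.
private static final ConcurrentHashMap<String, Foo> CACHE = new ConcurrentHashMap<>();

public Foo findOrCreateCached(String name) {
    // computeIfAbsent blocks other threads asking for the same name until the
    // first one has finished its findOrCreate round-trip
    return CACHE.computeIfAbsent(name, n -> findOrCreate(n));
}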
Edit: For a new transaction, you need an EJB interface method call annotated with transaction type REQUIRES_NEW.
Edit: Double check that the getReference(..) works as I think it does.
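In EJB terms, the REQUIRES_NEW call mentioned in the edit might look roughly like this; the bean and method names are made up for illustration, and the call has to go through the EJB proxy (not a local this-call) for the new transaction to actually start.
import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class FooCreator {

    @PersistenceContext
    private EntityManager em;

    // suspends the caller's transaction and runs the insert in its own,
    // independent one, so a rollback here does not touch the caller
    @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
    public Foo createInNewTransaction(String name) {
        Foo foo = new Foo();
        foo.setName(name);
        em.persist(foo);
        return foo;
    }
}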