Speaker: Burt Beckwith, SpringSource
Overview
- demo of potential performance issues with mapped collections in GORM
- using the Hibernate 2nd-level cache
- monitoring and managing 2nd-level caches
- app info plugin
Standard Grails One-to-Many
- library has many visits
- visit class has person name and date, with backreference to library
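A minimal sketch of the mapping described above. The field names `name`, `personName`, and `visitDate` are assumptions; the notes give only the class names and the general shape.

```groovy
// Sketch of the standard one-to-many mapping; field names are assumed.
// In a Grails app each class lives in its own file under grails-app/domain.
class Library {
    String name

    // Declares a Set<Visit> property named 'visits'
    static hasMany = [visits: Visit]
}

class Visit {
    String personName
    Date visitDate

    // Backreference to the owning Library; also enables cascading
    static belongsTo = [library: Library]
}
```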
What's the problem?
- hasMany = [visits: Visit] creates a Set
- Sets guarantee uniqueness
- adding to the Set requires loading all existing instances from the database to guarantee uniqueness, even if you know the new item is unique
- likewise for a mapped List: Lists don't guarantee uniqueness, but they do guarantee order, so Hibernate still has to pull all records from the db to get the order right
- the collection is lazy-loaded, which gives a false sense of security; lazy loading is only partially helpful here
- works fine in development when you only have a few visits, but imagine when you deploy to production and you have 1,000,000 visits and want to add one more
- risk of artificial optimistic locking exceptions: altering a mapped collection bumps the owner's version, so simultaneous visit creations can fail even though they don't actually conflict
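The costly operation is the ordinary way of adding to the collection. A sketch, assuming the Library and Visit classes described earlier, hypothetical `personName` and `visitDate` properties on Visit, and a Library with the hypothetical id 1; `get`, `addToVisits`, and `save` are standard GORM methods.

```groovy
// Runs inside a Grails app (for example in a service or the Grails console).
def library = Library.get(1)   // hypothetical id

// Looks cheap, but adding to the 'visits' Set initializes it, so Hibernate
// loads every existing Visit for this Library before adding the new one.
library.addToVisits(new Visit(personName: 'Some Visitor', visitDate: new Date()))

// Per the talk, changing the collection also increments library.version,
// which is what exposes concurrent requests to optimistic locking failures.
library.save()
```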
What's the Solution?
- don't use collections
- instead of visit belonging to a library, visit HAS a library
- persisting a visit uses different syntax: you create and save the Visit directly instead of adding it to the library's collection
- no cascading; to delete a library you need to delete its visits first in a transactional service method
- you also lose the collection, so you can't do library.visits.size(), etc., but you can still use dynamic finders, which is better anyway since you only pull what you need
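A sketch of the collection-free alternative. The field names and the `LibraryService` class with its method names are hypothetical; `findAllByLibrary`, `countByLibrary`, and `executeUpdate` are standard GORM methods, and `static transactional` is the service setting used by Grails versions of this era.

```groovy
// Each class lives in its own file in a Grails app.
class Library {
    String name            // no hasMany, so no 'visits' collection
}

class Visit {
    String personName
    Date visitDate
    Library library        // plain reference instead of belongsTo
}

// Hypothetical service: nothing cascades, so deletes are handled by hand.
class LibraryService {
    static transactional = true   // the default for services, shown for emphasis

    Visit addVisit(Library library, String personName) {
        // Inserts one row; no existing visits are loaded
        new Visit(library: library, personName: personName, visitDate: new Date()).save()
    }

    int countVisits(Library library) {
        // Replaces library.visits.size() with a count query
        Visit.countByLibrary(library)
    }

    List<Visit> recentVisits(Library library, int max) {
        // Pulls only the rows that are needed
        Visit.findAllByLibrary(library, [max: max, sort: 'visitDate', order: 'desc'])
    }

    void deleteLibrary(Library library) {
        // Remove the children first, then the parent, in one transaction
        Visit.executeUpdate('delete from Visit v where v.library = :library', [library: library])
        library.delete()
    }
}
```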
Standard Grails Many-to-Many
- user has many roles, role has many users
- problem is that if all new users are granted a particular role, you get into scaling issues quickly
- with many to many you have an intermediate table with pointers to both tables and can map the join table
- the belongsTo in a many to many can go in either class since it's bidirectional
- but this is the problem: the same amount of data gets loaded either way
- more efficient to treat it like the one-to-many case: create the user, then grant the role as a separate step
- this way you add and delete single records in a single table, because a domain class now describes the relationship
- important to have well-defined equals() and hashCode() methods in your domain classes, and to implement Serializable so you can use second-level caching
- wind up with user.addToRole() or role.addToUsers()
- no cascading, like before; you have to manage this yourself
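A sketch of the domain class that describes the relationship. The class name `UserRole` and the static helpers `create` and `remove` are assumptions (the notes don't name them); `User` and `Role` are taken to be ordinary domain classes with no `hasMany` between them.

```groovy
// Maps the join table directly; class and helper names are assumptions.
// Serializable is also what Hibernate requires for a composite-id class.
class UserRole implements Serializable {
    User user
    Role role

    static mapping = {
        id composite: ['user', 'role']   // the pair is the primary key
        version false                    // no version column on the join row
    }

    // Granting a role inserts exactly one row
    static UserRole create(User user, Role role, boolean flush = false) {
        new UserRole(user: user, role: role).save(flush: flush)
    }

    // Revoking a role deletes exactly one row
    static boolean remove(User user, Role role, boolean flush = false) {
        UserRole instance = UserRole.findByUserAndRole(user, role)
        if (!instance) {
            return false
        }
        instance.delete(flush: flush)
        true
    }

    // Value-based equality on the two ids
    @Override
    boolean equals(Object other) {
        if (!(other instanceof UserRole)) {
            return false
        }
        other.user?.id == user?.id && other.role?.id == role?.id
    }

    @Override
    int hashCode() {
        int result = 17
        result = 31 * result + (user?.id?.hashCode() ?: 0)
        result = 31 * result + (role?.id?.hashCode() ?: 0)
        result
    }
}
```

With this in place, a user's roles can be listed with a query such as `UserRole.findAllByUser(user)*.role` rather than by loading a collection.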
So never use mapped collections?
- no, you need to examine each case
- standard approach is fine if the collections are reasonably small (on both sides, in the case of many-to-many)
- the collections will contain proxies, so they're smaller than real instances until initialized, but still a memory concern
- great example of something that's convenient and easy out of the box, but when it becomes a problem, you just do it a different way
Using Hibernate 2nd-level Caching
- great, but have to be careful because it can burn you in the same way that a query cache in a database can bite you
- great candidates for caching: anything that's read-only or doesn't change often
- can overuse the cache to the point where you spend more cycles flushing it than you save in db traffic, which can actually make things worse than just hitting the db
Caching Usage Notes
- 1st-level cache is the Hibernate session itself
- DomainClass.get() always checks the cache
- can significantly reduce db load by keeping instances in memory
- can be distributed between multiple servers to let one instance load from the db and share updated instances, avoiding extra db trips
- "cache true" creates a read-write cache, best for read-mostly objects, since frequently updated objects will result in excessive cache invalidation (and network traffic when distributed)
- "cache usage: 'read-only'" creates a read-only cache, best for lookup data (countries, zip codes, etc.) that never changes
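Both settings go in a domain class's `mapping` block. A sketch, assuming two hypothetical domain classes, `Author` (read-mostly) and `Country` (lookup data), and assuming the second-level cache is enabled in the `hibernate` block of `DataSource.groovy`.

```groovy
// Hypothetical read-mostly class: 'cache true' gives a read-write cache
class Author {
    String name

    static mapping = {
        cache true
    }
}

// Hypothetical lookup class whose rows never change: read-only cache
class Country {
    String isoCode
    String name

    static mapping = {
        cache usage: 'read-only'
    }
}
```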
Query Cache
- can set cacheable true on all the query options; this caches the query result, and the class instances can then be retrieved from it
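A sketch of how query caching is requested, assuming a hypothetical `Country` domain class with a `name` property. In Grails the option is spelled `cache: true` on `list` and dynamic finders and `cache true` inside a criteria closure; the query cache itself also has to be switched on with `cache.use_query_cache = true` in the `hibernate` block of `DataSource.groovy`.

```groovy
// Runs inside a Grails app; Country is a hypothetical domain class.

// list() with the query cache
def all = Country.list(cache: true)

// Dynamic finder: the options map goes last
def one = Country.findByName('Canada', [cache: true])

// Criteria query
def matches = Country.createCriteria().list {
    like('name', 'C%')
    cache true
}
```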
Hibernate query cache considered harmful?
- most queries are not good candidates for caching; a cached result is only reused for the same query with the same parameters
- updates to domain classes will pessimistically flush all potentially affected cache results
- DomainClass.list() is a decent candidate if there aren't any (or many) updates and the total number isn't huge
- great blog post by Alex Miller at http://tech.puredanger.com/2009/07/10/hibernate-query-cache
2nd Level Cache API
- evict one instance: sessionFactory.evict(DomainClass, id)
- can get stats (hits/misses, etc.)
- can look at the stats and get a good sense of whether or not you're caching effectively
- e.g. if the miss count is high relative to the hit count, then your cache strategy isn't effective
- appinfo plugin gives you tons of information about what's going on in your app, what's going on in hibernate, etc.
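A sketch of eviction and statistics using the Hibernate 3 `SessionFactory` API. The `CacheAdminService` class and its method names are hypothetical; `sessionFactory` is the standard bean that Grails injects. The region name is assumed to be the default, which is the domain class's fully qualified name. Later Hibernate versions move eviction to `sessionFactory.cache`, so treat the eviction calls as specific to the Hibernate version of this era.

```groovy
// Hypothetical service wrapping the cache API mentioned above.
class CacheAdminService {
    def sessionFactory   // injected by Grails

    void evictOne(Class domainClass, Serializable id) {
        sessionFactory.evict(domainClass, id)   // one cached instance
    }

    void evictAll(Class domainClass) {
        sessionFactory.evict(domainClass)       // every cached instance of the class
    }

    Map cacheStats(String regionName) {
        def stats = sessionFactory.statistics
        // Statistics are off unless enabled; counts accumulate only from this point
        stats.statisticsEnabled = true

        def region = stats.getSecondLevelCacheStatistics(regionName)
        if (!region) {
            return [:]                          // no such cache region
        }
        [hits: region.hitCount, misses: region.missCount, puts: region.putCount]
    }
}
```

Comparing `hits` with `misses` over time gives the sense of cache effectiveness that the notes describe.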