Scott Davis – thirstyhead.com
NoSQL Databases in General
- given the number of big companies using them, clearly they're ready to use today
- time to re-examine our unnatural obsession for relational databases
- rdbms has been around for 50 years now–well understood, great tooling, lots of information
- rdbmses are silos
- still good at what they do, but aren't necessarily well-suited to all data
- as developers we're being forced to use sql to express something that's crucial to the success of our application
- not our native language, kind of foreign when it comes down to it
- we use orm to insulate ourselves from sql
- express yourself in the native language of your choice instead of in sql
Is ORM State of the Art?
- really just a bridge
- why aren't there pure java or groovy datastores?
- persistence is pretty uninteresting to developers
- orm is a reasonable bridge, but a rather leaky abstraction as well
- ted neward: orm is the vietnam of computer science
- "[ORM] represents a quagmire which starts well, gets more complicated as time passes, and before long entraps its users in a commitment that has no clear demarcation point, no clear win conditions, and no clear exit strategy."
What Drew Me to CouchDB
- what if i didn't have to bridge technologies anymore?
- what if i could save my objects in their native format?
- couchdb is actually a json datastore, but grails makes it trivial to transfer pogo <-> json
- just need a thin translation layer
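The "thin translation layer" idea can be sketched in a few lines. The notes are about Groovy objects (POGOs) in Grails; this sketch uses plain JavaScript instead, and the `Album` class with its `toDoc`/`fromDoc` helpers are hypothetical names for illustration, not part of Grails or CouchDB.

```javascript
// Hypothetical domain object; the fields mirror the album examples in these notes.
class Album {
  constructor(title, artist) {
    this.title = title;
    this.artist = artist;
  }

  // Object -> JSON text, ready to hand to a JSON datastore.
  toDoc() {
    return JSON.stringify({ title: this.title, artist: this.artist });
  }

  // JSON text -> object, e.g. after reading a document back.
  static fromDoc(json) {
    const data = JSON.parse(json);
    return new Album(data.title, data.artist);
  }
}

const original = new Album("Revolver", "The Beatles");
const json = original.toDoc();
const restored = Album.fromDoc(json);
console.log(json);
```

The whole layer is two small functions because the storage format and the object shape are nearly the same thing, which is the appeal described above.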
NoSQL Solutions
- Google BigTable
- mongoDB
- CouchDB
- Cassandra
- "this is the future, but no one believes us"
- each one of these is a bit different and has its own strengths and weaknesses
- NoSQL = "not only SQL"
- don't think of nosql solutions as just another database; truly different way to think about persistence
- if you think of it as just another database, it'll be the worst database you've ever used
- need to get out of the mindset of "spreadsheet" type format for data
- start thinking more about the right tool for the job
CouchDB History
- starting point was Lotus Notes
- largely ahead of its time
- document database
- not brand-new stuff–ideas and foundation have been around for a very long time
- Apache project
RDBMS vs. CouchDB
- rdbms
  - row/column oriented
  - language: sql
  - insert, select, update, delete
- CouchDB
  - suits data with a more vertical orientation than a horizontal (row/column) one; records start to look like documents with attachments
  - email is a good example: to, from, body, attachment
  - language: javascript (map/reduce functions)
  - put, get, post, delete (REST)
- "Django may be built for the Web, but CouchDB is built of the Web." — Jacob Kaplan-Moss, Django Developer
- can build entire apps in CouchDB
- Couch = acronym for "cluster of unreliable commodity hardware"
- clustering is much more difficult to do with an rdbms; couch was built from the ground up to be massively distributed and clusters out of the box
- O'Reilly book available — free online
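To make the "vertical" shape concrete, here is a sketch of the email example as a single JSON document. The field names and values are invented for illustration, and the nested `attachments` array is just ordinary JSON (CouchDB has its own `_attachments` mechanism, which this sketch does not use). The point is only that everything belonging to the message travels together instead of being split across joined tables.

```javascript
// One email as one self-contained document (all values are made up).
const email = {
  to: ["paul@example.com"],
  from: "john@example.com",
  body: "Mix notes attached.",
  // Nested data lives inside the document rather than in a child table.
  attachments: [
    { name: "notes.txt", contentType: "text/plain", length: 1024 },
  ],
};

// What would go over the wire is just the JSON text of the object.
const wire = JSON.stringify(email);

// Reading it back yields the whole message, attachments included, with no joins.
const roundTripped = JSON.parse(wire);
console.log(roundTripped.attachments.map((a) => a.name));
```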
Using CouchDB With Grails
- grails has native json support out of the box
    import grails.converters.*

    class AlbumController {
        def scaffold = true

        def listAsJson = {
            render Album.list() as JSON
        }

        def listAsXml = {
            render Album.list() as XML
        }
    }
CouchDB 101
- json up and down
- restful interface
- no drivers since it's just http
- written in erlang
- incredibly fast
- designed for scalability and parallel processing
Installing CouchDB
- sudo apt-get install couchdb
- windows installer available
Kicking the Tires
- ping
- curl http://localhost:5984
{"couchdb":"Welcome","version":"1.0.1"}
- can also hit this in a browser, but of course can't do a POST from a URL in a browser
- curl http://localhost:5984
- get databases
- curl -X GET http://localhost:5984/_all_dbs
- create a database
- curl -X PUT http://localhost:5984/albums
- delete a database
- curl -X DELETE http://localhost:5984/albums
- uses standard HTTP response codes, e.g. a 201 response code for a database create
- web UI available – "Futon"
- http://localhost:5984/_utils
- everything you can do from the UI you can also do from the command line
- create a document
- curl -X PUT http://localhost:5984/albums/2 -d '{"title":"Revolver","artist":"The Beatles"}'
- create a document from a file
- curl -X PUT http://localhost:5984/albums/3 -d @abbeyRoad.json
- URIs for documents are essentially your primary key–unique way of representing the document
- don't have to create schemas — just start throwing documents at the database
- documents get etags so they're very cache friendly
- documents also get revisions; couch keeps track of multiple versions of the document
- have to provide the current revision when updating
- revision ids are the revision number (integer), then -, then an md5 hash of the document itself
- can explicitly compact the database to get rid of old revisions and reduce its size
- couch prefers uuids for the ids, but you can use anything you want
- get UUID(s) from couch
- curl -X GET http://localhost:5984/_uuids
- can tack on ?count=N to get multiple uuids
- curl -X GET 'http://localhost:5984/_uuids?count=N'
- to update a document, get its latest version, make your changes, then send them back to couchdb along with the revision number
- because couchdb is document based, each document is a self-contained snapshot: its data stays accurate as of the point in time it was saved
- in an rdbms, if the underlying data changes later, the old record picks up the new data
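The revision rules above (a number, a dash, a hash; updates must carry the current revision) can be tried out without a running server. What follows is a toy in-memory stand-in, not CouchDB itself: `ToyStore`, `parseRev` and `toyDigest` are hypothetical names, and `toyDigest` is a trivial placeholder rather than the md5-based digest CouchDB computes. The 201 and 409 status values imitate the HTTP codes CouchDB returns for a successful write and for a conflict.

```javascript
// Placeholder content hash (NOT md5); it only needs to vary with the content.
function toyDigest(text) {
  let h = 0;
  for (const ch of text) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep an unsigned 32-bit value
  }
  return h.toString(16).padStart(8, "0");
}

// Split a revision such as "2-0a1b2c3d" into its number and hash parts.
function parseRev(rev) {
  const dash = rev.indexOf("-");
  return { generation: Number(rev.slice(0, dash)), hash: rev.slice(dash + 1) };
}

// In-memory stand-in for one database that enforces the revision check.
class ToyStore {
  constructor() {
    this.docs = new Map();
  }

  get(id) {
    const doc = this.docs.get(id);
    return doc ? { ...doc } : null; // hand out a copy, like a fresh GET
  }

  // Returns 201 on success, 409 when the supplied _rev is stale or missing.
  put(id, doc) {
    const current = this.docs.get(id);
    if (current && doc._rev !== current._rev) {
      return { status: 409, error: "conflict" };
    }
    const generation = current ? parseRev(current._rev).generation + 1 : 1;
    const { _id, _rev, ...body } = doc; // hash the content only
    const rev = `${generation}-${toyDigest(JSON.stringify(body))}`;
    this.docs.set(id, { ...body, _id: id, _rev: rev });
    return { status: 201, rev };
  }
}

const store = new ToyStore();
const created = store.put("2", { title: "Revolver", artist: "The Beatles" });

// Update flow from the notes: fetch the latest, change it, send it back with its _rev.
const latest = store.get("2");
latest.year = 1966;
const updated = store.put("2", latest);

// A writer still holding the first revision is rejected.
const stale = store.put("2", { title: "Revolver (mono)", _rev: created.rev });

console.log(created.status, updated.status, stale.status);
```

The rejected write is the interesting part: the second writer has to fetch the newer revision and reapply its change, rather than silently overwriting someone else's update.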
CouchDB With Grails
- domain class–id and _rev as properties
- can add couchdb stuff to Config.groovy to do stuff like create-drop for couchdb databases
- add stuff to BootStrap.groovy
- showing CouchDBService that has convenience methods around a lot of the URL calls to couch
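The service shown in the talk is not reproduced in these notes, so the following is only a guess at the general shape of such a wrapper, written in JavaScript rather than Groovy. `CouchService`, its method names and the injected `send` function are all hypothetical; the URLs and HTTP verbs are the ones used in the curl commands earlier. Injecting `send` keeps the class free of I/O, so the sketch runs without a CouchDB instance.

```javascript
// Hypothetical wrapper; `send` is injected so the class does no I/O itself.
class CouchService {
  constructor(baseUrl, send) {
    this.baseUrl = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
    this.send = send;
  }

  createDatabase(name) {
    return this.send({ method: "PUT", url: `${this.baseUrl}/${name}` });
  }

  deleteDatabase(name) {
    return this.send({ method: "DELETE", url: `${this.baseUrl}/${name}` });
  }

  listDatabases() {
    return this.send({ method: "GET", url: `${this.baseUrl}/_all_dbs` });
  }

  saveDocument(db, id, doc) {
    return this.send({
      method: "PUT",
      url: `${this.baseUrl}/${db}/${encodeURIComponent(id)}`,
      body: JSON.stringify(doc),
    });
  }

  // "create-drop" style reset: drop the database, then create it again.
  recreateDatabase(name) {
    this.deleteDatabase(name);
    return this.createDatabase(name);
  }
}

// A recording transport stands in for real HTTP here.
const sent = [];
const service = new CouchService("http://localhost:5984/", (req) => {
  sent.push(req);
  return req;
});

service.recreateDatabase("albums");
service.saveDocument("albums", "2", { title: "Revolver", artist: "The Beatles" });
console.log(sent.map((r) => `${r.method} ${r.url}`));
```

With a real asynchronous transport, `recreateDatabase` would need to wait for the delete to finish before issuing the create; the synchronous recorder used here hides that detail.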
Map/Reduce
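- rough sql analogy: select firstname, lastname from foo where state = 'NE' corresponds to map (choose which documents and fields to emit); aggregates such as count(*) with group by correspond to reduce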
- map and reduce are stored in 2 separate javascript functions
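A minimal sketch of what the two functions might look like, run locally against made-up data. In CouchDB the functions live in a design document and `emit` is supplied by the view server; here a tiny stand-in `emit` collects rows so the sketch runs on its own. The real reduce function also receives a third `rereduce` argument and its keys come paired with document ids, both of which are left out here. In this sketch the state check sits inside the map function and reduce only aggregates, which is how CouchDB views are usually written.

```javascript
// Sample documents (made up); "NE" matches the state used in the notes.
const people = [
  { firstname: "Ann", lastname: "Lee", state: "NE" },
  { firstname: "Bob", lastname: "Ray", state: "CO" },
  { firstname: "Cy", lastname: "Poe", state: "NE" },
];

// Tiny local stand-in for the view server: collects whatever map emits.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map: called once per document; emits a row only for matching documents.
function map(doc) {
  if (doc.state === "NE") {
    emit(doc.lastname, { firstname: doc.firstname, lastname: doc.lastname });
  }
}

// Reduce: collapses the emitted values into one result (here, a count).
function reduce(keys, values) {
  return values.length;
}

people.forEach((doc) => map(doc));
const total = reduce(rows.map((r) => r.key), rows.map((r) => r.value));
console.log(rows.map((r) => r.key), total);
```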