CFML XMLTransform() and Character Encoding

Quick tip on using CFML’s XMLTransform() — if you see fun weird characters in the output of the transformation like  and you’ve checked to make sure the response headers from the web server are correctly returning UTF-8, you probably just need to specify the charset of the CFFILE operations when you read the XML and XSLT files from disk.

In my case I was seeing non-breaking spaces being rendered as   which outputs a capital ‘A’ with a circumflex before the non-breaking space. At first I thought maybe the response from the web server was ISO-8559 for some reason instead of UTF-8 but after verifying that was correct, adding charset=”utf-8″ to the CFFILE tags that read the XML and XSLT files from disk, all was right with the world.

Setting MySQL to Allow Multiple Statements In a Single Query In JDBC/OpenBD

This came up in a discussion on the OpenBD Google Group today so I figured it was worth a quick blog post.

Depending on what you’re doing with MySQL you may be in a situation where you need to run multiple SQL statements separated by a ; as part of a single query. By default this isn’t enabled in the JDBC connection with MySQL, but it’s easy to enable in your JDBC connection string.

When creating a MySQL datasource in the OpenBD admin, click on the “Advanced Settings” button, and in the “Connection String” box input the following text:
allowMultiQueries=true

If you’re creating the datasource for the first time this will be all you need to do, but if you’re adding this to an existing datasource you’ll want to bounce OpenBD to make sure the connection pool is cleared out.

To test that the setting change worked, simply write a CFQUERY that contains two statements:
<cfquery name="foo" datasource="foo">
SELECT * FROM foo;
SELECT * FROM bar;
</cfquery>

If that doesn’t throw an error, then you’re all set to run multiple statements in the same CFQUERY tag.
Note that you can enable this setting in bluedragon.xml as well, just make sure your <connectstring> node for your datasource looks like this:
<connectstring>allowMultiQueries=true</connectstring>

And your <hoststring> node should look something like this:
<hoststring>jdbc:mysql://server:port/database?cacheResultSetMetaData=false&amp;autoReconnect=true&amp;allowMultiQueries=true</hoststring>

Using BLOB Images in Open BlueDragon

I’ve been helping my friend Brandon a bit today with Open BlueDragon, specifically with manipulating and displaying images that are stored in a database as BLOB data. Not terribly tricky stuff but it is a bit different than how you might deal with this in Adobe ColdFusion so I thought I’d write up a little how to.

In case you want to experiment from the ground up, and this also helps illustrate everything in detail, let’s start by creating a database to hold BLOB data. I’ll use MySQL, but this works exactly the same (as far as I know anyway) with any database. First, create a new database in MySQL, and then create a files table:


CREATE TABLE files
(id int unsigned auto_increment not null primary key,
file longblob not null);

Now we have a basic table with an auto-increment integer ID and our file data as a LONGBLOB. With your database in place, create a datasource in the OpenBD admin console called blobTest and point to this database.

Next let’s create a simple file upload page called imagetest.cfm:


<form action="imagetest2.cfm"
        method="post"
        enctype="multipart/form-data">
    File: <input type="file" name="theFile" /><br />
    <input type="submit" />
</form>

Nothing fancy there. This form submits to imagetest2.cfm which looks like this:


<!--- upload the file --->
<cffile action="upload"
        filefield="theFile"
        destination="#ExpandPath('.')#"
        nameconflict="makeunique" />

<!--- read the file into a variable --->
<cffile action="readbinary"
        file="#ExpandPath('./#CFFILE.ServerFile#')#"
        variable="fileBinary" />

<!--- delete the file from disk --->
<cffile action="delete"
        file="#ExpandPath('./#CFFILE.ServerFile#"')#" />

<!--- insert the binary data into the db --->
<cfquery name="insertBlob" datasource="blobTest">
    INSERT INTO files (file)
    VALUES(<cfqueryparam value="#fileBinary#"
                    cfsqltype="cf_sql_blob" />)
</cfquery>

File inserted.

In imagetest2.cfm we upload the file, read the file from disk using the READBINARY action of CFFILE, delete the uploaded file from disk (since we’re storing the image in the database we don’t need the file on disk any longer), and then insert the binary data into the database using CFQUERYPARAM and a CFSQLTYPE of CF_SQL_BLOB. So it’s clear, the file field in the database contains the binary representation of the file that was uploaded.

Finally let’s take a look at how to pull the binary data for the image back out of the database, resize it, and then display it in the browser, which we do in a file called imagetest3.cfm:


<!--- get the most recently inserted image out of the db --->
<cfquery name="getimages" datasource="blobTest">
    SELECT id, file
    FROM files
    ORDER BY id DESC
    LIMIT 1
</cfquery>

<!--- resize the image to a width of 200 pixels --->
<cfset imgNew = ImageNew(getimages.file) />
<cfset ImageResize(imgNew, 200) />

<!--- output the image to the browser (assuming image is jpg) --->
<cfoutput>
    <img src="#ToDataUri(imgNew, 'image/jpg')#" />
</cfoutput>

Here we first grab the most recently inserted file from MySQL. If you want to display the image in the browser without manipulating it, meaning what’s in the database is exactly what you want to show to the user, you can skip down to the CFOUTPUT section and use getimages.file as the first argument to the ToDataUri() function. I’ll elaborate on ToDataUri() in a second.

In Brandon’s case he was asking how he would resize the image and do some other manipulation after getting it out of the database, which luckily is pretty simple. First we use the ImageNew() function to create a new image object since this is the type of object that all the image manipulation functions work with.

Next we resize the image to a width of 200 pixels using ImageResize().This function takes an image object as the first argument, and a width as the second argument. An optional third argument is height, and an optional fourth argument is quality. For this example we’re just blindly resizing to a width of 200 pixels no matter what the original size was (which of course isn’t likely what you’d be doing in a real world app), and if you omit height it will change the height proportional to the width. At this point we have the variable imgNew as binary image data and the image has been resized to a width of 200 pixels.

Now we want to output the image to the browser. Adobe ColdFusion accomplishes this via the CFIMAGE tag with an action of WriteToBrowser, but WriteToBrowser is not implemented in Open BlueDragon. Instead, Open BlueDragon introduced the ToDataUri() function, which is a more flexible, standardized way of writing binary data to the browser. ToDataUri() is not specific to image data and can be used with data of any valid MIME type, and the MIME type is specified in the second argument of the ToDataUri() function.

The data URI scheme allows you to call inline data on a page as if it were an external HTTP resource. You can see what the URI looks like by right-clicking on the image displayed in your version of imagetest3.cfm and choosing “view image” in Firefox (or the equivalent in another browser). Mine looks like this:


{base64_data_here}

Where {base64_data_here} is the base 64-encoded version of the image in question. ToDataUri() converts binary data to base 64 on the fly so you don’t need to do that yourself ahead of time using functions like ToBase64() or ImageReadBase64().

That’s it for this little tutorial. Hope it helps others dealing with BLOB data in OpenBD.

Leveraging Google App Engine Mail and XMPP Services with CFML

In my previous few posts on Open BlueDragon for Google App Engine (GAE) I’ve done a basic introduction to GAE, discussed the Google Datastore, and dug into how to use the virtual file system and the GAE Image Service to handle file uploads and simple image processing.

In this post we’ll pick back up with my sample app (.tar.gz, 15.3 MB) from OpenCF Summit 2011 and look into two more of the excellent services available to you on GAE, specifically the mail and XMPP services.

Mail? So What?

Sending email is of course a big yawn–anyone can do that from any platform quite easily. But what if you want your application to be able to receive mail and take actions based on the mail it receives?

The standard solution is typically to set up a dummy email address (myapp@foo.com) and then have your application poll the mailbox for new mail, which in the case of a CFML app means setting up a scheduled task to do a CFPOP call to the mailbox in question. You then get back a list of messages and have to handle them as needed for your application.

This works, but it’s a pretty passive, braindead approach to solving the problem, and you wind up having to write a bunch of code to manage a mailbox designed for humans that’s being used by your application.

With GAE you can have your application respond to incoming email directly without having to set up any mailboxes, because the ability for your app to receive mail is built right into the GAE platform. You simply enable mail services in your appengine-web.xml file, configure specific email addresses or wildcard email addresses in web.xml, and your application can receive mail.

Configuring Inbound Services

Let’s take a look at appengine-web.xml, which if you’re following along with the sample app is in the war/WEB-INF directory. Towards the bottom of the file you’ll see a block of inbound services defined:


<inbound-services>
    <service>xmpp_message</service>
    <service>xmpp_presence</service>
    <service>xmpp_subscribe</service>
    <service>mail</service>
</inbound-services>

Note that by default there are no inbound services enabled on your GAE application, so you do have to specify these. We’ll be covering the XMPP services in this post as well, so you can see in the block above this is where specific granular XMPP services such as messaging, presence, and subscription are enabled.

That’s it! With that simple block in appengine-web.xml when you deploy your application to GAE, it will be ready to receive email and interact with XMPP messages.

At this point you’re probably wondering basic things like the address to which you can send email, how to handle the email once it’s received, how to configure specific email addresses, and a host of other things, so let’s dig in a bit more.

How the Magic Works

Once you have mail defined as an inbound service, your application essentially gets a mail server on the internet at the address YOUR_APP_ID.appspotmail.com. So with mail in the inbound services block in your config file, once you fire up your app on GAE mail can be sent to ANY_ADDRESS@YOUR_APP_ID.appspotmail.com. We’ll talk more about how the ANY_ADDRESS bit works in a minute when we look at writing code to handle incoming mail.

Google very cleverly designed many of their services to send messages to your application via an HTTP POST to a specific URL. In the case of inbound mail, any mail sent to ANY_ADDRESS@YOUR_APP_ID.appspotmail.com generates an HTTP POST to the URL /_ah/mail/{EMAIL_ADDRESS} with email address representing the full email address including the YOUR_APP_ID.appspotmail.com domain.

Since there’s a lot of magic behind the scenes that you don’t have to configure, let’s review how this works before moving on to the code for handling incoming mail.

  1. You enable mail as an inbound service in appengine-web.xml
  2. When deployed to GAE, your app can then receive email at ANY_ADDRESS@YOUR_APP_ID.appspotmail.com
  3. When mail is received at YOUR_APP_ID.appspotmail.com, this generates an HTTP POST to the url /_ah/mail/{EMAIL_ADDRESS} within your application

Now let’s look at how to write code to process incoming mail.

Processing Inbound Mail

With very little effort on our part our application can receive mail. That’s cool in theory, but how do we write code to react to this inbound mail?

Since all mail comes in as an HTTP POST, the way you handle inbound mail is more or less the same way you’d handle any POST request to your application. This makes handling all of this very simple since handling POST requests is something with which we’re all too familiar.

From a CFML standpoint there’s a bit of a disconnect here since, at least until I figure out a clever hack, the thing that receives the mail has to be a Java servlet. Not to worry, the code’s quite straight-forward and we can hand off any real work we want to do to CFML once we get the mail in the servlet.

Bit of a detour to explain why (at this point) CFML can’t handle the inbound mail directly. In order for your application to take action when mail is received, you have to define a servlet mapping for the special incoming mail URL of /_ah/mail/{EMAIL_ADDRESS}. Servlet mappings do support wildcards, so for the sample application I’ve defined a servlet mapping of /_ah/mail/* that will route all inbound mail, regardless of the specific email address, to the same servlet.

You could of course define specific servlet mappings for specific addresses, e.g. one servlet mapping for /_ah/mail/user1@yourappid.appspotmail.com and another for /_ah/mail/user2@yourappid.appspotmail.com if you needed two completely separate servlets to handle the different users. I hinted at this above, but I’ll point out here again that you do not need to configure mailboxes or anything along those lines for specific users. Your application will receive absolutely any mail sent to ANYONE@yourappid.appspotmail.com and how you handle the ANYONE part is up to you to define in your code.

Back to the CFML discussion. When I was first messing with the in the local GAE environment, I first thought I’d be clever and just define a real directory path in my application of /_ah/mail, throw an index.cfm file in there to act as a controller, and I’d be off to the races. That actually kind of worked locally, but the issue is when you deploy to GAE it won’t let you deploy your application if you have /_ah as a real directory since that’s a reserved directory for things like the admin console.

With that door shut, I next figured I could define the servlet mapping for /_ah/mail/* to point to the CFML servlet like so:


<servlet-mapping>
    <servlet-name>cfmServlet</servlet-name>
    <url-pattern>/_ah/mail/*</url-pattern>
</servlet-mapping>

This kind of worked as well, meaning I could at least see it was hitting the CFML servlet since I could dump things in Application.cfc and see what was going on. The issue here turned out to be that I coudln’t really leverage anything like onRequestStart in Application.cfc and take any action there, because when a POST comes in to /_ah/mail/* there’s no page argument, which is what onRequestStart expects.

I’m convinced there’s a way to get this to work (or I’m at least convinced I haven’t yet exhausted all avenues of failure here), but since I was under a time crunch to get this rolling for OpenCF Summit, I decided to write a simple servlet to handle inbound mail and then hand things off to CFML for some additional work. I haven’t had time yet to revisit this so if anyone has any ideas based on what I’m describing of how we could get mail to hit CFML directly I’d be very interested to hear them.

The MailReceiverServlet

In the sample app project the code for the MailReceiverServlet is in the src directory, and then in the package org.opencfsummit. As I mentioned above it’s pretty straight-forward, but let’s go over the mail specific parts since this will help shed some light on how mail comes in via the POST to /_ah/mail.

Inside the servlet’s doPost method we first declare some variables and set some defaults starting on line 23:


Properties props = new Properties();
Session session = Session.getDefaultInstance(props, null);
MimeMessage message;
Address[] toAddresses = null;
Address[] fromAddresses = null;
ArrayList<String> recipients = null;
ArrayList<String> senders = null;

Couple of things to note here. First, just so there’s no confusion the Session in this case is a javax.mail.Session, not an HttpSession. So you can see here we’re dealing with actual Java mail stuff when the POST hits our servlet, which is also evidenced by the MimeMessage and javax.mail.Address arrays. The two string ArrayLists at the end we’ll use in a minute as an easy way to pass off the data we want to be available in CFML.

Next up in MailReceiverServlet we get the data from the inbound mail and parse out the to and from addresses:


try {
    message = new MimeMessage(session, request.getInputStream());

    toAddresses = message.getAllRecipients();
    recipients = new ArrayList<String>();

    fromAddresses = message.getFrom();
    senders = new ArrayList<String>();

    for (Address address : toAddresses) {
        recipients.add(address.toString());
    }

    for (Address address : fromAddresses) {
        senders.add(address.toString());
    }
} catch (MessagingException me) {
    // gulp
}

Let’s step through that code briefly.

First, we instantiate a new MimeMessage using a mail session and the data coming from the request’s input stream. Once we have a handle on the message, we can get all of the to addresses contained in the message (message.getAllRecipients()) and all the from addresses as well (message.getFrom()).

We then loop over each address array (toAddresses and fromAddresses) and add each address as a String to the ArrayLists (recipients and senders).

With the simple string versions of the addresses in ArrayLists, we’ll then set these as attributes in the request object and forward the request to a CFML page:


request.setAttribute("senders", senders);
request.setAttribute("recipients", receipients);
getServletContext().getRequestDispatcher("/mailhandler.cfm").forward(request, response);

What this last chunk of code does it puts the senders and recipients ArrayLists in the request object, which translates into the request scope on the CFML side, and then forwards the request to mailhandler.cfm for some additional work.

And of course we have to define this servlet and a servlet mapping in our web.xml:


<servlet>
    <servlet-name>mailReceiver</servlet-name>
    <servlet-class>org.opencfsummit.MailReceiverServlet</servlet-class>
</servlet>
<servlet-mapping>
    <servlet-name>mailReceiver</servlet-name>
    <url-pattern>/_ah/mail/*
</servlet-mapping>

mailhandler.cfm

With the initial incoming mail request handled by our MailReceiverServlet and the request forwarded to mailhandler.cfm (which is in the war directory if you’re following along in the sample app), we can make quick work of the rest of our task.

This might not be the most practical example, but for the purposes of demonstration what we’re doing in mailhandler.cfm is if anyone sends an email to stallmanbook@opencfsummit2011.appspotmail.com the application will respond by emailing a copy of Richard Stallman’s Free Software, Free Society as a PDF attachment. Of course you can do anything you want when your app receives an email but this is the little sample I dreamed up. And this is live by the way, so go ahead and try sending an email to stallmanbook@opencfsummit2011.appspotmail.com to get your free gift. 😉

In mailhandler.cfm we first check to see if the recipients array (which is the ArrayList of strings that we created in MailReceiverServlet and then put in the request object) contains the appropriate email address for this particular action:


<cfif request.recipients.contains("stallmanbook@opencfsummit2011.appspotmail.com")>

Remember that when we configured the servlet mapping for MailReceiverServlet we used a wildcard in the URL pattern (/_ah/mail/*), so all inbound mail to any address will hit MailReceiverServlet. We only want to our app to send a copy of Stallman’s book when one of the “to” addresses on the inbound email is stallmanbook@opencfsummit2011.appspotmail.com

After we check to see if the correct email address is one of the “to” addresses, we loop over the “from” addresses and use our old friend CFMAIL to send an email with the PDF attachment to each “from” address:


<cfloop index="i" from="1" to="#ArrayLen(request.senders)#">
    <cfmail from="stallmanbook@opencfsummit2011.appspotmail.com"
                to="#request.senders[i]#"
                subject="Stallman Book!"
                mimeattach="#ExpandPath('/pdfs/fsfs-ii-2.pdf')#">
Here is the Stallman book you asked for. Enjoy!
    </cfmail>
</cfloop>

Pretty slick, huh?

Now that we’re well-versed in receiving mail, let’s move on to XMPP. Luckily since many of the basic concepts are the same we’ll be able to pick up the pace a bit.

Google App Engine XMPP Service

The GAE XMPP service works very similarly to the mail service, with a few more options specific to XMPP. If you’re not familiar with XMPP, XMPP stands for “Extensible Messaging and Presence Protocol” and while it can be used in a variety of ways, the usage people are most familiar with is for instant messaging. If you use Google Talk, you use XMPP. Also you may be running something like your own Jabber or OpenFire server in your organizaion for non-public IM, and these are based on XMPP.

As a quick reminder, when we configured inbound services for XMPP on our app we enabled:

  • xmpp_message: allows the app to receive XMPP messages
  • xmpp_presence: allows other XMPP users to see your applicaton’s IM user presence (online, offline, busy, etc.)
  • xmpp_subscribe: allows other XMPP users to “friend” your application’s IM user as well as exchange message and status data

As with inbound mail, in order to get this running on your app all you need to do is specify XMPP inbound services in appengine-web.xml. With that in place once you deploy your app it is XMPP enabled.

Also as with inbound mail, your application gets its own XMPP IDs automatically. So once you enable XMPP your application can interact with the XMPP world using one of these two ID formats:

(And yes, the domain is different between these two ID formats. The first is appspot.com, the second is appspotchat.com.)

Yet again as with inbound mail (I hope you’re sensing a pattern), when an XMPP message hits your application it generates an HTTP POST to /_ah/xmpp/message/chat so all you have to do is write whatever code you want to handle incoming XMPP messages.

The XmppReceiverServlet

As discussed above it isn’t really possible (yet) to have a POST to /_ah/xmpp/message/chat hit CFML directly, so we need a servlet that will handle POSTS to this URL pattern. The code for XmppReceiverServlet in the sample app projects is in the src directory, in the org.opencfsummit package.

First we declare some static final service type stuff in our servlet starting on line 21:


private static final Logger log = Logger.getLogger(XmppReceiverServlet.class.getName());
private static final XMPPService xmppService = XMPPServiceFactory.getXMPPService();

This gives us a logger so we can log incoming messages, as well as the service we’ll be using to parse the inbound XMPP message. Side note: logging on GAE is a bit quirky so make sure and check my post about configuring logging on GAE if you run into issues.

In the doPost method in XmppReceiverServlet we’ll parse the message, do some logging, parrot the message back to the sender (just to show we can send XMPP), and then we’ll again forward the request on to a CFML page to do some additional stuff.


Message message = xmppService.parseMessage(request);
JID jid = message.getFromJid();
String body = message.getBody();
log.info(jid.getId() + " --> opencfsummit2011: " + body);

message = new MessageBuilder().withRecipientJids(jid).withBody(body).build();
xmppService.sendMessage(message);

request.setAttribute("toJID", "opencfsummit2011@appspot.com");
request.setAttribute("fromJID", jid.getId());
request.setAttribute("messageType", message.getMessageType().toString());
request.setAttribute("body", body);

getServletContext().getRequestDispatcher("/xmppdump.cfm").forward(request, response);

That’s a bit of a mouthful so let’s walk through this.

First we use the xmppService to parse the XMPP message out of the request object. This gives us an object of type com.google.appengine.api.xmpp.Message and represents the incoming XMPP message. Next, we get the user ID (JID) of the XMPP user who sent the message as an object of type com.google.appengine.api.xmpp.JID, and get the body of the message as a string. We then log this info which I was simply using for debugging purposes as I was working on the sample app.

Next, we build a new XMPP message using the MessageBuilder, and use the xmppService to send the user’s message back to them. Obviously this isn’t a great real-world example but getting the message you send bounced back to you is a nice way to make sure things are working properly.

After that, we put all the relevant information from the XMPP message into the request object so we’ll have it available in the request scope on the CFML side. We then forward the request to xmppdump.cfm.

As with any servlet we need to add this to our web.xml and set up a servlet mapping so this servlet will respond to POSTs to /_ah/xmpp/message/chat:


<servlet>
    <servlet-name>xmppReceiver</servlet-name>
    <servlet-class>org.opencfsummit.XmppReceiverServlet</servlet-class>
</servlet>
<servlet-mapping>
    <servlet-name>xmppReceiver</servlet-name>
    <url-pattern>/_ah/xmpp/message/chat/</url-pattern>
</servlet-mapping>

xmppdump.cfm

Once the request is forwarded to xmppdump.cfm (which is in the war directory of the project) we can do anything we want with this information on the CFML side. In this case I’m first emailing myself the contents of CFDUMP for debugging/testing purposes:


<cfsavecontent variable="foo">
    <cfdump />
</cfsavecontent>

<cfmail from="mpwoodward@gmail.com"
            to="matt@mattwoodward.com"
            subject="foo" type="html">
#foo#
</cfmail>

One thing I didn’t mention earlier about sending mail from your GAE app is that the from address must be either:

  • An administrator/developer on the application, or
  • The address of the user who’s currently signed in with their Google Account, or
  • A valid “receiving” email address for the application (e.g. whatever@myappid.appspotmail.com)

Next, I decided just for fun I’d leverage CFHTTP to create a bit of an XMPP-to-SMS service. Any IMs that come into the application are turned into an SMS and sent to a (in this case hard-coded) mobile number using Twilio.


<cfhttp result="smsTestSend"
        method="post"
        url="https://MY_TWILIO_URL"
        username="bar"
        password="baz">
    <cfhttpparam type="formfield" name="From" value="5551234567" />
    <cfhttpparam type="formfield" name="To" value="5559876543" />
    <cfhttpparam type="formfield" name="Body" value="#request.body#" />
</cfhttp>

As you can see Twilio just takes an HTTP POST and turns it into an SMS message, and here since I have the body of the inbound XMPP message available as request.body, that’s what becomes the body of the SMS message.

Yet another example that’s of dubious use in the real world, but I hope it illustrates the points.

Try It Out!

This sample is live as well. I removed the part that sends text messages but you can log into Google Talk and add opencfsummit2011@appspot.com as a buddy, and any IMs you send will be bounced back to you. OK, enough testing. Stop talking to yourself. 😉 As a next step I was going to try to implement an Eliza-type bot but as with most things these days just didn’t yet have the time.

Conclusion

In this post you learned how to leverage two more services available to you in the amazing Google App Engine platform, specifically the inbound mail and XMPP services. They’re extremely easy to set up, incredibly powerful, and with just a bit of servlet code in the middle this all plays very nicely with Open BlueDragon for GAE.

If you have comments or questions fire away, and if there are specific aspects of GAE you’d like to know more about (I have some additional posts in mind) please let me know. Thanks!

Using the Virtual File System and Image Service in Google App Engine with CFML

In my previous two posts in this series I offered up my introduction to Google App Engine and covered using the Google Datastore from CFML. In this post I’m going to cover using the Virtual File System (VFS) in Open BlueDragon for Google App Engine (GAE), and talk a bit about the GAE Image Service and how to leverage it in your CFML applications.

Uploading Files in Google App Engine Applications

One of the limitations in GAE is that your applications don’t have the ability to write to the file system. Not only was this a bit of a challenge to get CFML applications running at all on GAE since on-the-fly compiling and writing to the file system is at the heart of how CFML engines tend to work, this also prevents you from doing things like allowing your users to upload files to your application.

Necessity is the mother of invention, as the saying goes, and this case was no exception. To get around the inability to write to the file system, Vince Bonfanti created the virtual file system (VFS) project for GAE, which is an Apache Commons VFS plugin that creates a writeable filesystem on GAE by leveraging GAE’s datastore and memcache APIs. Note that the GAE VFS project is independent of OpenBD so if you are developing GAE apps in Java you can leverage the VFS as well.

In my previous post I went over pieces of the sample application I created for OpenCF Summit 2011, and in this post we’ll dig into one aspect of that application in detail so you’ll want to download the sample app (.tar.gz file, 15.3 MB) if you haven’t already. Unzip this file and import as a new project in Eclipse and you’ll be able to follow along as well as run the app from inside Eclipse.

Uploading Photos in the Sample Application

With the sample project running, pull up http://localhost:8888/photos.cfm in your browser.

Ocfs_sample_app_photos

Now in the sample application code, let’s take a look at the photos.cfm file in the war directory. Starting on line 60 you’ll see the photo upload form:


<form name="photoForm"
        action="_uploadPhoto.cfm"
        method="post"
        enctype="multipart/form-data">
    Photo: <input type="file" name="photo" /><br />
    <input type="submit" value="Upload" />
</form>

This is all completely standard stuff, and as you can see the form posts to _uploadPhoto.cfm. The first couple of chunks of _uploadPhoto.cfm look like this:


<cfif not DirectoryExists(ExpandPath('photos'))>
    <cfdirectory action="create" directory="photos" />
</cfif>

<cffile action="upload"
        destination="photos"
        filefield="photo" />

First we check to see if the photos directory exists and if it doesn’t, we use CFDIRECTORY to create it. Then we use CFFILE to upload the file.

At this point you’re probably scratching your head wondering what’s different from how you’d normally do things. That’s OK; we’ll get to the VFS magic in a bit. Just remember that without the VFS you wouldn’t be able to write to the GAE filesystem so you should be impressed that nothing looks different so far.

We’ll go over the rest of the code in _uploadPhoto.cfm in a bit, but go ahead and upload a photo on http://localhost:8888/photos.cfm so you have at least one photo in the applicaton. Once you upload a photo it just outputs all photos on that same screen so you should see something like this:

Ocfs_sample_app_photo_uploaded

 

 

Where Do Files Get Stored?

Now that we have a photo uploaded into the application, let’s take a look at where these files actually get stored and how to configure things so the VFS works in our CFML application running on GAE.

As mentioned above, the VFS uses the Google Datastore and Memcache APIs, so let’s open up our local Datastore browser and see what’s going on there. In your browser go to http://localhost:8888/_ah/admin and click on Datastore Viewer on the left.

Ocfs_sample_app_photo_entity

You’ll notice that in the Entity Kind drop-down there is an entry for /photos/your_file_name.jpg which indicates the photo is being stored in the Datastore, even though in the CFML code we’re doing what we’d normally do to write an uploaded file to the filesystem. If you click on “List Entities” for your photo’s Entity Kind you’ll notice that there are four records (at least in my case) for the file.

Ocfs_sample_app_photo_file_records

Why there are four records for this one file frankly I don’t know, and it’s not all that important to this discussion. At least that’s my excuse for not knowing at this point. 😉 Just realize that when you upload a file this is what happens in the Datastore as the GAE VFS handles the writing of the file.

Next select GaeFileObject as the Entity Kind in the Datastore viewer and click List Entities.

Gae_file_object_entities

Here you’ll see listed some of the directories that OpenBD uses to do its work behind the scenes, but you’ll also notice that there’s an entity with an ID/Name of /photos that has a filetype of “folder,” and there’s also an entity for the photo itself, and for that entity you’ll see that it includes the content size of the file and it has a filetype of “file.”

By default the VFS isn’t configured to allow you to upload to arbitrary directories, so let’s take a look at how to enable this in our application.

Configuring the VFS for File System Access

To enable the VFS for file uploads and other file system interaction, you need to add a servlet-mapping in your application’s web.xml file. This is how you tell your application that when certain URL patterns are used in your application that it’s going to be interacting with the VFS.

In Eclipse open up war/WEB-INF/web.xml and scroll down to line 87. You should see a comment block about the VFS:


<!--
    Configuring and mapping the GaeVfsServlet allows the Servlet/JSP engine (not the CFML engine) to
    "see" and serve application-generated static content (generated images, PDFs, uploaded files, etc...) that
    resides in a virtual file system location.
    See http://code.google.com/p/gaevfs
-->

By default the VFS servlet is commented out, so if you want to use the VFS in your applications you’ll need to uncomment the servlet block for the VFS, which in the sample application is lines 92 – 103 in web.xml

The next step is to add virtual directories as servlet mappings to web.xml so that the application knows to use the VFS servlet for specific URL patterns. On line 104 in web.xml you see the following:


<servlet-mapping>
    <servlet-name>gaevfs</servlet-name>
    <url-pattern>/photos/*</url-pattern>
</servlet-mapping>

In case you were wondering how we could reference a /photos director in our CFML code when that directory doesn’t actually exist on the file system, this is how. When a pattern of /photos/* is matched that work is handed off to the VFS servlet, which does its magic behind the scenes to allow you to use CFML tags such as CFFILE, CFDIRECTORY, CFLOG, and others without actually having write access to the file system.

Take another look at photos.cfm, and on line 48 you’ll see a CFDIRECTORY call:


<cfdirectory action="list" directory="photos" name="photos" />

<cfif photos.RecordCount gt 0>
    <h2>Smile! You're on OpenBD on GAE!</h2>
    <cfloop query="photos">
        <img src="photos/#photos.name#" /><br />
    </cfloop>
<cfelse>
    <p>No photos. Why don't you upload one?</p>
</cfif>

As you can see from that code CFDIRECTORY works exactly as you’d expect, returning a query object containing the directory listing from the VFS directory “photos.” If any photo files are found, they are then displayed using an image tag which also references the virtual photos directory.

Image Manipulation With the GAE Image Service

If you’ve perused the GAE documentation at all you’ll see that one of the many benefits of running your applciations on GAE is the amazing array of services avaialble on the GAE platform, one of which is an Image Service. (I’ll cover the mail and XMPP services in my next post.)

One of the interesting things to note about OpenBD for GAE is that it leverages many of these underlying services without you having to worry about it. For example, another restriction on GAE is that your application cannot make HTTP calls other than through the GAE URL Fetch API. What this means is that the CFHTTP tag on OpenBD for GAE uses the URL Fetch service under the hood without you having to do anything differently in your CFML code.

Let’s go a bit further in the _uploadPhoto.cfm file in the sample application. Since we don’t have any idea what size of image will be uploaded to the application, I decided to arbitrarily resize all images coming in to 300 x 200 pixels so they fit nicely on the page. Obviously if this were a real-world application we’d do some calculations so we could retain the correct aspect ratio for uploaded images and not resize at all unless they went beyond our desired boundaries, but for the purposes of the sample application I wanted to keep things simple.

After the file is uploaded using the CFFILE tag, the file is immediately read back from the VFS as binary data on line 9, which is necessary in order to manipulate the image using the GAE image service.


<cffile action="readbinary"
        file="#CFFILE.ServerFile#"
        variable="imgOrigData" />

In the next block of code we leverage the GAE image service to resize the image to 300 x 200 pixels:


<cfscript>
    isf = CreateObject("java", "com.google.appengine.api.images.ImageServiceFactory");

    imgSvc = isf.getImageService();
    imgOrig = isf.makeImage(imgOrigData);

    resize = isf.makeResize(300,200);

    imgNew = imgSvc.applyTransform(resize, imgOrig);

    imgNewData = imgNew.getImageData();
</cfscript>

At this point we have the resized image in the variable imgNewData, so the final step is to write that to disk using CFFILE, which of course is again leveraging the VFS:


<cffile action="write"
        file="#CFFILE.ServerFile#"
        output="#imgNewData#" />

We then use a CFLOCATION tag to send the user page to the photos.cfm page, which displays all the images in the /photos VFS directory.

That’s all for this post! I hope it provides some good information about the GAE VFS that’s included with OpenBD for GAE as well as giving a bit of an introduction to the services avaiable on GAE using the Image Service as an example.

As always if you have questions, corrections, etc. please post a comment. Next up is a post on the mail and XMPP services which I’ll have ready soon.

Using the Google App Engine Datastore with CFML

In my previous post I gave an intro on setting up the Google App Engine (GAE) plugin for Eclipse as well as installing Open BlueDragon for GAE. In this post I’m going to go over some of my sample application from OpenCF Summit 2011, specifically how CFML applications running on GAE interact with the Google Datastore.

I was initially going to cover the entirety of my sample application in one post but I think it’s going to be better to split this into smaller, more digestible chunks. I’ll start with the Google Datastore, in the next post I’ll address file uploads and image manipulation, and in yet another post I’ll cover the ability for your application to respond to inbound email as well as receive and send XMPP messages.

Let me preface all of this by stating unequivocally that I am not an expert in how the Google Datastore works, and that I’m going to focus on using it from OpenBD for GAE. What I hope this post will do is not only educate people who are just learning OpenBD for GAE, but also prompt discussions about features and functionality that doesn’t yet exist in OpenBD for GAE as it relates to the Datastore, or perhaps there is functionality that is present in OpenBD for GAE that people would like to see work differently.

The Sample Application

If you want to follow along and run things locally you’ll want to download the sample application Eclipse project (.tar.gz file, 15.3 MB). You should be able to unzip the file and then import it into Eclipse but please comment if you have any problems with the application files.

The application itself is pretty simple. It’s a quick session attendee application, and also has photo upload functionality but we’ll cover that in the next post. For now we’re going to focus on how to read and write data from the Google Datastore using the GAE-specific functionality that’s baked into OpenBD for GAE.

Once you import the project into Eclipse, right-click and choose Run As -> Web Application. After a few seconds you’ll see “The server is running” in the Eclipse console, at which point you can go to http://localhost:8888 in your browser and see this:

Ocfs_sample_app_home

Let’s add an attendee so you can get a feel for the very simple functionality of the application. Enter a first and last name (I’ll use my own name) and click the “Add Attendee” button. You’ll then see a bullet list at the top of the screen containing the names of attendees you add. You can clean everything out by clicking the “Delete All Attendees” link, and you can also search for attendees by first name, last name, or both.

As I mentioned above, we’ll cover the photo functionality in a future post.

Google Datastore Overview

As you add, delete, and search for attendees, as you might imagine this data is getting stored somewhere. We didn’t create a database or configure any datasources, however, so where is the data going?

The answer is the Google Datastore, which is the default data storage functionality for GAE. When you run a GAE project locally in Eclipse you get more or less the entire GAE environment, including a local version of the Google Datastore. As we’ll see in a moment, some GAE-specific functionality in OpenBD for GAE allows you to easily interact with the Google Datastore.

The Google Datastore is a schemaless, key/value datastore built on Google Bigtable. At a high level the Datastore can be lumped into the poorly named “NoSQL” database category. What this means is that as opposed to the relational databases and SQL queries that you may be used to, you’ll be storing what are called entities in the Datastore, which are objects organized by type or “kind,” and entities contain name/value pairs of data called properties. Properties in entities can be of various data types, including integer, float, string, date, binary data, and others. Entities also have an ID or key that uniquely identifies the entity in the Datastore.

Because the Datastore is schemaless, entities of the same kind do not necessarily have to contain all of the same properties. Using contact data as an example, you might store two entities that are both of kind “contact.” One contact might have a fax number where another might not. When storing these entities in the Datastore both entities could be defined with a kind (type) of “contact,” but the contact with the fax number would have a fax property, whereas the contact without a fax number wouldn’t have a fax property at all. In other words, the contact without a fax number would not have a fax property that was empty; rather, the fax property doesn’t necessarily exist in the contact without a fax number even though both entities are of kind “contact.”

Entities can be retrieved from the Datastore by key, by property values, or by querying the Datastore using GQL. GQL is a query language with some similarities to SQL, but given the nature of how the Datastore stores data there are of course important differences. We’ll be looking at some basic ways of interacting with the Datastore in your CFML applications in the next section of this tutorial.

For additional information about the Google Datastore, refer to:

Writing to the Google Datastore

Saving New Entities

In the same application we’ll be writing ColdFusion Components (CFCs) to the Datastore. OpenBD for GAE can save both CFCs and structs to the Datastore, and the simple datatypes contained in the value of struct keys, or in the variables scope of CFCs, are mapped to entity properties in the Datastore. Specifically, numbers, booleans, strings, dates, or arrays of these datatypes are what is currently able to be persisted to the Datastore by OpenBD for GAE.

Let’s take a look at how the simple attendee form data is stored to the Datastore. The form on the index page of the application is a standard HTML form with form fields firstName and lastName. The form posts to _addAttendee.cfm. On _addAttendee.cfm an instance of an Attendee CFC is created (you can see the code for the Attendee CFC in the sample application’s model directory), the form data is set to the CFC’s variables scope via the CFC’s init() method, and the CFC is then saved to the Datastore.


<!--- create attendee CFC and save it to google datastore --->
<cfset attendee = CreateObject("component",
            "model.Attendee").init(form.firstName, form.lastName) />
<cfset GoogleWrite(attendee) />

<!--- could also have done:
<cfset attendee.GoogleWrite() />
--->

<cflocation url="index.cfm" addtoken="false" />

What’s noteworthy in the code above is the use of the GoogleWrite() method, which is baked into OpenBD for GAE. This can be used as a top-level function to which a CFC or struct can be passed, or the base CFC in OpenBD for GAE includes a GoogleWrite() method so GoogleWrite() may be called on the CFC instance itself (as shown in the commented-out chunk of code above).

What happens behind the scenes when GoogleWrite() is called is the CFC becomes an entity in the Datastore, the data in the CFC’s variables scope–in this case firstName and lastName strings–become the entity’s properties, the entity is assigned a “kind” of attendee based on the CFC type (N.B. this is the CFC type, not the name of the variable being stored), and the entity is assigned a key as well as an ID/Name by the Datastore when it is saved.

You can view data stored in the Datastore using the GAE dashboard. The dashboard is available both locally as well as on the production GAE platform, but the live (non-local) version has a lot more options and information than the local version.

If you haven’t already done so, or if you deleted all your data, save a new attendee to the Datastore by filling out the form on the index page of the application and click the “Add Attendee” button.

Next, browse to the GAE dashboard at http://localhost:8888/_ah/admin :

Gae_local_dashboard_home

Click on the Datastore Viewer link on the left, then in the Entity Kind drop-down choose “attendee” and click List Entries:

Gae_datastore_entity_list

As mentioned above, the CFC type of attendee has become an entity kind in the Datastore, so any attendee CFCs you save will be put in this same bucket. A key and ID/Name has been assigned to the object, the firstName and lastName properties have been stored, and an additional field of org.openbluedragon.gae.keyname.cfc has been added, which has a value of model.Attendee. We’ll see this in action when we pull data from the Datastore in a moment, but this value is stored so the CFC can be reconstituted in your CFML code when it’s pulled from the Datastore. In other words when you pull this record back into CFML, you’ll get an instance of the Attendee CFC, and if you pull back multiple attendees, you’ll get an array of CFCs as opposed to a query object.

You can also save entities to the Datastore by explicitly specifying a kind and a key name. If you specify the key name when saving you will be able to pull the entity back using a combination of kind and key name as we’ll see when we get into retrieving data from the Datastore. If you don’t specify a key name when you save an entity to the Datastore, it’s auto-assigned an integer value but note that this auto-assigned value cannot be used to pull back the entity. Unless you specify a key name (which shows up in the ID/Name field in the Datastore viewer) you’ll have to use the unique key or query the Datastore to retrieve the entity.

Here’s a quick example of saving an Attendee CFC to the Datastore and explicitly specifying the kind and key name:


<cfset attendee = CreateObject("component", "model.Attendee").init("Homer", "Simpson") />
<cfset GoogleWrite(attendee, "attendee", "MyKeyName") />

This will save the attendee CFC as an entity of kind attendee with a key of “MyKeyName” in the Datastore. Note that if you do another GoogleWrite() using the same kind and key name, this will update the record with the key name specified. (More on updating entities in a moment.)

Now let’s write up a test page that saves a struct to the Datastore. This code is not included in the sample application download so you’ll be creating this file from scratch.

Right-click on the war directory in the project in Eclipse and choose New -> File. Call the file structtest.cfm (or whatever you want to call it), and enter the following code:


<cfscript>
foo = StructNew();
foo.foo = "bar";
foo.bar = "baz";

googleKey = GoogleWrite(foo, "MyNewKind");
</cfscript>

<cfoutput>googleKey = #googleKey#</cfoutput>

Since we’re not saving a CFC in this case, we have to provide a kind under which the Datastore will store this data. This is the second argument in the GoogleWrite() method, this first of which is obviously the data you wish to save. GoogleWrite() returns the key of the entity that was just saved and this is output in the final line.

With this struct data saved, go back to the dashboard, click on Datastore Viewer, and look at the Entity Kind drop-down. You’ll see your new “MyNewKind” available as an entity kind:

Gae_datastore_new_entity_kind

Choose MyNewKind in the entity kind drop-down and click “List Entities,” and you’ll see the struct data has been saved.

Gae_new_entity_kind_display

Because this is a struct and not a CFC, you’ll notice that there is no org.openbluedragon.gae.keyname.cfc property in this entity, since that isn’t needed to create a CFC instance when the data is returned.

Updating Existing Entities

Updating existing entities in the Datastore is done in one of two ways:

  1. Read the existing entity, update the properties that need to be updated, and write the modified entity back to the Datastore
  2. Write an entity directly to the Datastore (without reading first) using the entity kind and ID/name property

The way updates are handled in the Datastore is a bit different than the typical SQL update statements since you’re saving the entire entity again as opposed to updating discrete properties on the entity directly in the Datastore. When updating an entity in the Datastore, you’ll typically first want to read the entity from the Datastore, make the necessary updates to the entity’s properties, and then write the entity back to the Datastore.

For example, if I need to update the first name on an attendee entity with a key of ahBvcGVuY2ZzdW1taXQyMDExcg4LEghhdHRlbmRlZRgNDA I run the following code:


<cfscript>
    attendee = GoogleRead("ahBvcGVuY2ZzdW1taXQyMDExcg4LEghhdHRlbmRlZRgNDA");

    attendee.setFirstName("NewFirstName");

    GoogleWrite(attendee);
</cfscript>

The read/update/write steps may seem redundant but this does ensure that you’re updating the most recent version of the record, at least as of when you did the GoogleRead(). Many other “NoSQL” databases such as CouchDB take this same approach. CouchDB is also a schemaless, document-based datastore and it requires that you pass a version number of the document you are updating to ensure that you’re updating the latest revision of the document.

If you have written an entity and set an explicity ID/Name property you can update the entity by doing a GoogleWrite() and using this same ID/Name value:


<cfscript>
    attendee = CreateObject("component", "model.Attendee").init("NewFirstName", "NewLastName");

    GoogleWrite(attendee, "attendee", "MyExistingKeyName");
</cfscript>

Before moving on to retrieving data from the Datastore, a quick reminder about the schemaless nature of the Datastore. We could have specified a kind of “attendee” when saving the struct above, and it would have saved the struct as the same kind of entity as our attendee CFC, even though the data is not of the same type on the CFML side, and the two entities would not have any properties in common. Here again if you’re familiar with CouchDB, another schemaless, document-based datastore, this will sound familiar.

Reading From the Google Datastore

Retrieving Entities by Key

Now that we have some data in the Datastore, let’s look at the various ways in which the data can be retrieved, some of which were already introduced above.

The simplest data retrieval method is obviously to use a specific key to pull back a specific record. Since there isn’t an example of pulling back a specific record by key in the sample application, we’ll create another quick test file.

Right-click on the project’s war directory in Eclipse and choose New -> File. Call the file retrievebykey.cfm (or whatever you want). Next, open your GAE dashboard, go to the Datatstore Viewer, choose an Entity Kind of attendee, and click on List Entries. Copy the Key property from one of the entities and paste that value into the code below. My key is ahBvcGVuY2ZzdW1taXQyMDExcg4LEghhdHRlbmRlZRgNDA in the example.


<cfset attendee = GoogleRead('ahBvcGVuY2ZzdW1taXQyMDExcg4LEghhdHRlbmRlZRgNDA') />

<cfoutput>#attendee.getFirstName()# #attendee.getLastName()#</cfoutput>

The GoogleRead() function of OpenBD for GAE is used to retrieve entities by key. GoogleRead() takes a single key, an array of keys, or a kind and ID/name as arguments. Note that if kind and ID/name are used this is instead of using the key since the key is unique across entity kinds, whereas the ID/name value (an integer) is not.

Let’s play with this a bit more. If you only have one attendee in your Datastore add another, and view the attendee entities in your GAE dashboard again. Copy two keys and paste into this code (you can create a new file or use the same new file you created earlier):


<cfscript>
    keys = ArrayNew(1);
    keys[1] = 'ahBvcGVuY2ZzdW1taXQyMDExcg4LEghhdHRlbmRlZRgNDA';
    keys[2] = 'ahBvcGVuY2ZzdW1taXQyMDExcg4LEghhdHRlbmRlZRgQDA';
   
    attendees = GoogleRead(keys);
</cfscript>

<cfdump var="#attendees#" />

This will return an array of Attendee CFCs that match the keys in the array.

Retrieving Entities by Kind and Key Name (ID/Name)

Before moving on to querying the Datastore, let’s look at an example of pulling an entity back by kind and ID/name. When you look at the Datastore Viewer in the GAE dashboard, you’ll notice that all your entities are given an integer ID/name by default, which we discussed above. We also discussed the fact that you can explicitly assign an ID/Name value when you write an entity to the Datastore using GoogleWrite().

When retrieving entities by kind and ID/name, if you have explicitly assigned a value for the ID/name property you can retrieve by this value.


<cfset attendee = GoogleRead("attendee", "MyExistingKeyName") />
<cfdump var="#attendee#" />

The ID/name property is case insensitive, so MyExistingKeyName and myexistingkeyname (and all variants thereof) are the same as far as the Datastore is concerned.

Querying the Google Datastore

In the overview section above I mentioned that the Google Datastore uses a SQL-like language called GQL for Datastore queries. This is at the most raw level, however. When you’re using the Datastore through OpenBD for GAE, you’ll be querying the Datastore using a slightly different syntax than the raw GQL syntax, and it’s one based on JDOQL (Java Data Objects Query Language), which is not unlike other ORM-style languages such as Hibernate Query Language (HQL).

As explained on the Google Datastore wiki page on the OpenBD wiki, the syntax supported is actually a subset of JDOQL as follows:


SELECT FROM <kind>
[WHERE <filter>]
[ORDER BY <sort clause>]
[RANGE <from index>, <to index>]

All queries against the Datastore must start with SELECT FROM <kind> but all other parameters of the query are optional.

As an example, let’s first retrieve all entities of kind attendee from the Datastore. This example in a slightly different guise is contained in the sample application’s services.AttendeeService CFC, but here’s the very simple code:


<cfset allAttendees = GoogleQuery("SELECT FROM attendee") />

Note that the kind value (attendee in this case) is case-insensitive. This will return an array of attendee CFCs, not a query object. You may also use the CFQUERY tag and specify DBTYPE=”GOOGLE” to query the Datastore:


<cfquery name="allAttendees" dbtype="google">
  SELECT FROM attendee
</cfquery>

Here too the variable allAttendees will be an array of attendee CFCs, not a query object, even though the CFQUERY tag is being used.

Next let’s take a look at how to retrieve all attendees with the last name Smith.


<cfset theSmiths = GoogleQuery("SELECT FROM attendee WHERE lastName == 'Smith'") />

This should look rather familiar to those of us accustomed to SQL, but note the double equal sign (==) instead of the single equal sign you may be used to.

Multiple where clauses can be strung together using boolean operators. In this case we’ll get back any attendees with a firstName of John and a lastName of Smith:


<cfset allJohnSmiths = GoogleQuery("SELECT FROM attendee WHERE firstName == 'John' && lastName == 'Smith'") />

Note the && operator for “and,” and also be aware that the values of firstName and lastName must match exactly and are case sensitive.

Additional operators are available on numeric datatypes. Let’s assume our attendee entities included the attendees age, and we wanted to get an array of all attendees who are 40 or older:


<cfset oldsters = GoogleQuery("SELECT FROM attendee WHERE age >= 40") />

Finally, we can use the RANGE clause to limit the number of results we get back. Let’s say we only want to get back the first five John Smiths in the Datastore:


<cfset topFiveJohnSmiths = GoogleQuery("SELECT FROM attendee WHERE firstName == 'John' && lastName == 'Smith' RANGE 1, 6") />

The RANGE is one-based and the TO value in the range is not inclusive, hence the use of 1, 6 above to get the first 5 John Smiths in the Datastore.

Limitations

Limitations to querying the Datastore include:

  • Operators in the WHERE clause are limited to ==, >, >=, <, <=
  • Operators for or (||) and not equal to (!=) are not supported.
  • “Like” queries are not supported, meaning for example you cannot run a query that will return all attendees with a last name starting with S (in SQL terms, WHERE lastName LIKE ‘S%’). There are some efforts for creating full-text searching against the Datastore but none of these seem particularly mature at this point.

Deleting Entities From the Google Datastore

Deleting from the Datastore is simple. You use the GoogleDelete() function and pass in one of the following:

  • The key of the entity you want to delete
  • An array of keys of entities you want to delete
  • The object (CFC or struct) you want to delete
  • An array of objects (CFCs or structs) you want to delete

The base CFC in OpenBD for GAE also includes a GoogleDelete() function, so you may call GoogleDelete() directly on an instance of a CFC that has been persisted in the Datastore (e.g. myAttendee.GoogleDelete()).

Let’s look at a couple of examples.


<cfscript>
  // delete by key
  GoogleDelete("");

  // delete multiple records by key
  keys = ArrayNew(1);
  keys[1] = "";
  keys[2] = "";
  GoogleDelete(keys);

  // delete an attendee CFC; this assumes you have
  // done a read so the google key is in the CFC
  GoogleDelete(attendee);

  // delete multiple CFCs; assume attendees is an array
  // of attendee CFCs that have been read so the key is
  // included in each
  GoogleDelete(attendees);

  // call GoogleDelete() directly on an attendee CFC
  attendee.GoogleDelete();
</cfscript>

Examples From the Sample Application

I wound up spending a lot more time discussing the Google Datastore in general and its basic use from OpenBD for GAE than I originally intended, but I think going through things blow-by-blow is pretty important for both learning and reference purposes.

At this point the examples in the sample application are probably self-explanatory, so I’ll leave studying those and tracing the actions in the application and the interaction with the Google Datastore as an exercise for the reader.

As I said above what I’m really interested in doing with this post in addition to educating people is hearing from people who are using OpenBD for GAE if there is new functionality you’d like to see implemented, or existing functionality you’d like to see work differently.

In my next post, I’ll cover doing file uploads and using the Google App Engine image manipulation functionality, which will be a much shorter post than this one. 🙂

 

Intro to Google App Engine for Java and CFML Developers

At OpenCF Summit 2011 we were very lucky to have Chris Schalk from Google come present on Google App Engine. If you’re not familiar with Google App Engine (GAE) you should be! It’s an absolutely fantastic application platform as a service offering from Google with great functionality, very slick features, and incredibly generous quotas for free application hosting. And if you need to go beyond these quotas, you simply configure billing and pay nominal fees for what you use over the free quotas.

GAE lets you deploy Python and Java applications, but one of the most interesting things going on with Java these days is the numerous different languages that run on the JVM. The Java platform being available on GAE opens up some very cool options.

At OpenCF Summit I followed Chris’s presentation with one specific to Open BlueDragon on Google App Engine (GAE). OpenBD is a Java-based CFML runtime engine that allows you to deploy CFML applications to any standard servlet container, and also allows you to deploy your CFML applications to GAE. This is a great option for CFML developers since it’s a quick and easy–not to mention free for many apps!–way to get your CFML applications online without having to worry about setting up a server yourself or getting a hosting account. Not to mention that if you deploy your CFML apps on GAE you get the benefits of running on Google’s infrastructure and have on-demand scalability for your apps.

If you’re not familiar with CFML, it’s an incredibly powerful dynamic scripting language and framework that runs on the JVM. Think of it as a Java Tag Library on steroids. Even if you choose to build the backend of your applications in Java, CFML is a fantastic view layer language that’s a great alternative to JSP, it interacts seamlessly with Java code, and it makes a lot of things that are quite verbose in Java extremely quick and easy. Of course CFML is a full-fledged language as well so you can build entire applications in it quickly.

The tools available for GAE make it very easy to work with. If you use Eclipse, a great option is to grab the GAE plugin for Eclipse. This gives you the entire GAE environment that will run right inside Eclipse and let you develop and test locally. Then when you’re ready to deploy to GAE, it’s a right-click away.

We also have great tools for OpenBD. In addition to being able to grab the GAE edition of OpenBD and drop that into an Eclipse project, you can use the new OpenBD Desktop. This is a desktop application that runs on GNU/Linux, Mac, and Windows, and lets you set up a local development server in seconds. Once development is complete you can then deploy your CFML application to a standard JEE WAR, or you can deploy straight to GAE from OpenBD Desktop.

Openbd_desktop

In this post I’m going to cover getting up and running with GAE in Eclipse, and in my next post I’ll go over the demo application I built for my presentation at OpenCF Summit, and in that post I’ll highlight some of the cool features not only of OpenBD for GAE but GAE itself.

I’m going to cover how to set things up in Eclipse in this blog post, but I’ll have another how to and screencast covering OpenBD Desktop soon.

Installing the Google App Engine Plugin for Eclipse

I’m going to assume my audience is mostly Java or CFML developers who are already somewhat familiar with Eclipse, but if you need assistance with this piece of things please leave me a comment and I will be happy to help.

The GAE plugin for Eclipse installs in the same way as any other add-on for Eclipse. You simply open Eclipse, go to Help -> Install New Software and paste in the appropriate update site URL for your version of Eclipse. This will download everything you need to work with GAE from within Eclipse, including the GAE for Java SDK.

Once the plugin installs and you restart Eclipse, you’ll notice new Google icons in your Eclipse toolbar:

Gae_icons

As well as a new Google right-click menu:

Gae_right_click_menus

To create a new GAE project, you go to File -> New -> Project, and in the Google folder choose Web Application Project.

New_gae_project

You can then run your GAE application from within Eclipse by right-clicking the project and choosing Run As -> Web Application. Running in debug mode and all the other Java functionality with which you may be familiar is of course also available.

Note that because GAE is a platform as a service offering, the entirety of the Java world isn’t necessarily available to you. If you’re curious what is and isn’t available check the JRE Class Whitelist in the GAE docs.

Installing Open BlueDragon for GAE

With a new GAE project created in Eclipse, installing OpenBD for GAE is as easy as downloading a zip file, unzipping, and copying files into your GAE project’s war directory.

When you download and unzip OpenBD for GAE you’ll see these contents:

Openbd_gae_contents

If you’re new to OpenBD for GAE you’ll want to read the README files included.

To add OpenBD to your GAE project, go into the war directory in the unzipped OpenBD for GAE directory, and copy all the files in the OpenBD war directory into the war directory in your Eclipse GAE project. Note that you will overwrite any files with the same name in the Eclipse project, which is what you want to do.

What you do not want to do is delete all of the existing files in the war directory in the Eclipse project and have only the OpenBD GAE war files in the Ecliipse project. To put it another way, you are merging the OpenBD for GAE files with the files that are already in the Eclipse project, and any files with the same name will be replaced by the OpenBD for GAE files.

With these files in place, right-click on the project in Eclipse and choose Run As -> Web Application. You should see something similar to this in the Eclipse console:

Openbd_gae_console_output

You may see some warnings as well but these are typically harmless, and if you already have something running on port 8888 you’ll want to shut that down before launching the OpenBD GAE application.

If everything started up successfully you can then navigate to http://localhost:8888 in a browser and see this:

Openbd_gae_welcome_page

You’re now all set to build CFML applications for GAE!

Working with CFML Code

Working with CFML code in an OpenBD for GAE project is no different that typical CFML development. The only real thing you need to be aware of is that your CFML code must be placed in the war directory in your Eclipse project. This is the root of your application. (Note that if you haven’t worked with CFML or perhaps use a CFML editor other than Eclipse, you’ll want to install CFEclipse, which is a great open source CFML plugin for Eclipse.)

Let’s add a CFML file to our project so you can get a feel for working with CFML code in the context of an OpenBD for GAE project.

In your Eclipse project right-click the war directory and choose New -> File and name the file test.cfm. In this newly created file, add the following code:

<cfset name = "Matt" /> <cfoutput>Hello #name#! Today is #DateFormat(Now())#.</cfoutput>

Save the file, and then go to http://localhost:8888/test.cfm in your browser. You should see this:

Openbd_gae_test_file

That’s all there is to it. You can now build CFML applications as usual using OpenBD for GAE.

Up until recently there have been some differences in supported syntax and functionality between “regular” OpenBD and OpenBD for GAE, but as of the next release of OpenBD the regular Java edition and the GAE edition will have the same exact functionality available, other than where specific functionality is not allowed on the GAE platform. The current nightly builds of OpenBD are based on this new unified codebase between the two editions of OpenBD.

Most CFML code will work fine on OpenBD for GAE. For example, the Mach-II framework as well as ColdSpring both work perfectly, and these frameworks are being used for the open source Enlist application that we started developing during the hackfest at OpenCF Summit.

What’s Next?

Probably the major thing developers will run into immediately when building apps for GAE is that a traditional relational database is not available other than through Google App Engine for Business. If you aren’t on the GAE for Business platform, you’ll be using the Google Datastore, which is a high-performance, highly scalable key-value (“NoSQL”) datastore that can be accessed via JDO or JPA, and also in CFML via GAE-specific functionality built into OpenBD for GAE.

I’ll cover the Google Datastore as well as some of the amazing features of the GAE platform (including receiving mail and receiving/sending XMPP messages) in my next post.

Dynamically Invoking Method Names On a Java Object From CFML

A co-worker contacted me today asking how he might go about solving what turned out to be a rather interesting issue.

From a CFML application (running on Open BlueDragon) he’s calling a .NET web service and getting an array of objects back. By the time all the SOAP magic happens the objects in the array on the CFML side are Java objects.

What he wanted to do next was loop over this array of Java objects and, for each object in the array, call every getXXX() method contained in the object. But the application is getting numerous different types of objects back, some of which have a large number of get methods in them, and he didn’t want to have to hard-code a method call for each one. In addition, the get methods may change from time to time and while we’re supposed to be notified when changes occur, we didn’t want to rely on that.

So consider the following pseudocode:


<cfinvoke webservice="url to webservice"
    method="methodname"
    returnvariable="myJavaObjects" />

So at this point the variable myJavaObjects is an array of homogeneous Java objects.

Next, we want to loop over that array and for each Java object, call all of its get methods.

My first thought was that this is one of those rare cases where Evaluate() might be justified. But I also thought there had to be a better way, so perhaps against my co-worker’s will we spent about an hour hammering through some experiments. I’ll spare you the various things we tried and cut to the chase of the final solution.

One thing I learned while working through this is that CFINVOKE works on Java objects. Who knew? OK, maybe you knew, but I hadn’t ever had cause to try it before so I didn’t know.

So step one is once we get the array of Java objects back from the web service, since we know they’re homogeneous objects, we’ll just use some Java reflection magic on the first one to create an array of the method names beginning with get:


<!--- this returns an array of java.lang.reflect.Method objects --->
<cfset methods = myJavaObjects[1].getClass().getMethods() />
<!--- now we'll create an array of method names starting with get --->
<cfset methodNames = [] />
<cfloop array="#methods#" index="method">
    <cfif Left(method.getName(), 3) == "get">
        <cfset ArrayAppend(methodNames, method.getName()) />
    </cfif>
</cfloop>

So now we have an array of strings that are the method names from the Java object that start with get. In the actual application we’re actually omitting some of the methods starting with get because they’re not relevant (e.g. getClass(), getSerializer(), etc.) so I’m just keeping it simple for the purposes of illustration.

The next step is to loop over the array of Java objects, and on each loop iteration, call each get method and for demo purposes simply output the results. Here’s where we use CFINVOKE to call methods dynamically on the Java objects:

<cfloop array="#myJavaObjects#" index="javaObject">
    <cfloop array="#methodNames#" index="methodName">
        <cfinvoke component="#javaObject#" method="#methodName#" returnvariable="foo" />
        <cfoutput>#foo#<br /></cfoutput>
    </cfloop>
</cfloop>

And with that, we’re getting an array of Java objects back from a .NET web service (with the .NET object to Java object translation being handled transparently by the web service engine of course), we’re using a bit of Java reflection to get a list of the getters from the Java object, and we’re then looping over the array of Java objects and calling the get methods on each Java object.

As an aside, during experimentation we went down the path of using Java reflection directly, but that got pretty messy and didn’t seem to offer any benefit over doing things at a higher level in CFML. Interestingly, while we were messing with some things we had an error generated from CFINVOKE reminding me that under the hood, CFINVOKE is doing all the Java reflection nastiness for you.

Not sure how handy a tip that will be for others but I wanted to blog it while it was still fresh in my mind. There’s probably several other ways to solve this problem so if others have approached this differently I’d love to hear about it.

CFML and Oracle Stored Procedures: Experimentation and Solutions

I love a challenge. Most of the time when someone tells me something can’t be done, I have to prove it to myself even if these attempts are frequently exercises in futility. And even if said something actually cannot be done, I always feel like I learn a ton during the process.

Such was the case when I was working with a coworker recently on getting data back from some Oracle stored procedures into either Open BlueDragon or Railo. Before I proceed, let me state very clearly that I hardly ever touch Oracle so if I’m off-base on any of this I’m happy for an Oracle expert to educate me. In my defense, I will say I did quite a bit of searching on this topic, and more hours of experimentation than I care to admit, so I wasn’t just screaming “Oracle sucks!” and not actually trying to intelligently and logically get things to work. Also bear in mind that in this situation we didn’t write the storedprocs and we do not have the ability to alter them. (Necessity is the mother of invention and all that.)

Back to the task at hand. The storedprocs in question return Oracle REF CURSORs, which is basically Oracle’s way of thumbing their ever-expanding nose at the world by doing things differently than everyone else simply because they want to. (Oracle fans, if there’s a legitimate explanation for this utter nonsense I’m all ears, because I certainly couldn’t find one.)

If your first thought is, “But this just works on Adobe ColdFusion!” you are in fact correct. This is because Adobe CF ships with DataDirect drivers that handle getting an Oracle REF CURSOR into a format that can be used by CF. Because REF CURSORs ain’t your plain old Java ResultSets like the entire rest of the known database universe uses, my assumption is that there’s some translation that goes on either in the driver, or on the CF side, or both, to make this work. But in other CFML engines if you’re using the plain old JDBC drivers–yes, even Oracle’s own JDBC drivers–this doesn’t “just work.”

Let’s assume it’s entirely in the driver. I could either purchase the DataDirect drivers (expensive, and silly to have to do this for this one seemingly small thing), or I could use the ones that ship with CF. Well, a little birdie told me (ahem) that if you try to use the ones that ship with CF anywhere but with CF you’ll get a licensing error when trying to run queries. So that solution’s out.

You might also think you could simply call the storedproc from within a normal CFQUERY tag as opposed to using CFSTOREDPROC and an out parameter. Good thought, but even if you get the syntax figured out, since the storedprocs in question are looking for an out parameter to be passed into the storedproc (figure that one out), there’s no real way to get the data returned from the storedproc into a variable within the SQL statement that you can actually access. I spent a lot of time on this and got the storedproc itself executing fine, but getting at the data was where I spent the bulk of my experimentation time, and I finally gave up.

So where does that leave us? Long and short of it is that without using the DataDirect drivers (even if that would in fact work), there is no way that I could come up with that would allow calling an Oracle storedproc that returns a REF CURSOR from OpenBD or Railo that would work.

At that point do we throw up our hands in despair and state that it simply isn’t possible? Certainly not! We expand our thought process, put all options on the table, and forge ahead because we must not let technology, particularly of the Oracle variety, defeat us!

Since I was at the point where there was no way to have OpenBD or Railo deal with what the Oracle storedproc was returning (and believe me I’m happy to be proven wrong here), I decided the most expedient route would be to write the database access piece in Java and have CFML call that Java object. Once I’m in Java I can work with the Oracle and JDBC datatypes directly and have more control over what I will return to the CFML engine once I get the REF CURSOR back from Oracle.
The question then becomes if it’s better to convert the REF CURSOR into a native CFML query object on the Java side, or to return something like a standard Java ResultSet to CFML and use the ResultSet directly. There are numerous other options as well, particularly once we’re dealing with this in Java and have the freedom to do pretty much whatever we want.

I won’t go into the gory details, but in one of my early attempts I ran into a problem where I couldn’t access the ResultSet because by the time the Oracle delegate class was delegated to the Tomcat delegate class, I had to hack around a bit and get the underlying delegate on the Oracle side just to display the data. Neat problem to solve, but not exactly the most straight-forward solution.

I then decided to try creating a query object native to the CFML engine from the Java ResultSet. This worked well, and is certainly a valid approach, particularly if you know you’re going to deploy to a specific CFML engine. If you want to build something that will work on any CFML engine things get more complicated because your Java methods have to return a specific datatype, so you can’t have a single method in your Java class that will return the various underlying CFML query object types to the respective engines.

But as a wise man once said, all problems in computer science can be solved by another level of indirection (except, of course, the problem of too many levels of indirection). So I created the Java class to talk to Oracle and return the REF CURSOR, and then wrapped that puppy in a CFC that has conditional logic to return a CFML query object native to the engine on which it’s running (i.e. either OpenBD or Railo). Works great!

The only outstanding issue at this point that makes me a hair uncomfortable is that in order for all this to work, I can’t explicitly close my callable statements, ResultSets, and datasource connections on the Java side. If I close those items in Java and then return the ResultSet to the CFML engine, then I can’t actually iterate over the ResultSet since it’s closed.

I could of course close the ResultSet from CFML (I wouldn’t have access to the callable statement or the database connection directly), but I’m not nearly as concerned about the ResultSet as I am about the database connection. At least I’m using the Tomcat datasource connection pooling so that should mitigate this concern a bit. In theory Tomcat and the JVM will handle closing this stuff when it’s no longer in use, but I need to do a bit of load testing to be sure it’s going to work well under load.
There’s also some attributes in the Tomcat <Resource> XML definition that you can leverage to alleviate potential problems. Specificaly the maxActive, maxIdle, removeAbandoned, and removeAbandonedTimeout attributes can help keep things cleaned up since I’m in situation where I can’t clean things up manually.

As far as the database connection pooling goes, thus far we’ve seen problems on Windows with Tomcat 6.0.26. It works for a while but then starts throwing database connection pooling errors and all the connections time out. Drop this all on Red Hat Enterprise Linux with Tomcat 7, and things are rock solid. Go figure.

I’m happy to share the Java and CFC code if anyone’s interested, and I’m really curious to hear how others have solved this issue because I’m still not convinced what I came up with is the best solution, even though it’s certainly working well. So far, anyway (knock on wood).

CFML Developer Position – Smithsonian Institution

Contact Loren Scherbak (contact info below) if interested.

========================================
Job Link on USAJOBS: http://cot.ag/h7MFf1

The Archives of American Art at the Smithsonian Institution is offering a rare, Federal Grade 11 programming position. We are a Coldfusion 9 shop with a lot of creative opportunities to work with XML. We only have one dedicated programmer position, but this position gets to work with a great team comprising a web/usability specialist, a metadata specialist, and a database administrator. The Archives is in the forefront of the archival community in using library catalog data (MARC) and Encoded Archival Description (EAD XML) to describe fully digitized collections on our website.

I can talk with anyone at great length about our work, but I want to get this announcement out to as many people as possible as we have a very short window for the job opening. The application is due by Dec. 3rd.

Loren Scherbak
Archives of American Art
Smithsonian Institution
202-633-7968
scherbakl@si.edu