Rick Wagner's Blog: 2008

Monday, October 20, 2008

Great Erlang podcast with Joe Armstrong

I've recently taken an interest in Erlang, as a way of preparing for massive multicore machines. So far, I've found the language interesting and somewhat easy to use. (This is based on not much practice and some of the excellent tutorials available on the web.) Today, I'd like to share a top-notch resource for others in the same boat: an interview with Joe Armstrong on Software Engineering Radio.

For those who've never heard it before, SER is a series of podcasts on all sorts of interesting topics. The URL is here:

http://www.se-radio.net/

Joe's interview is #89. Happy listening!

Rick

Thursday, October 16, 2008

Tech update-- New Tools in practice

Wow, it's been a while since I've written. That's because it's been busy, busy, busy working on that very data-intensive application I wrote about last spring.

Here's an update on some lessons learned:

ELT (Doing transformations in the database instead of marshalling data in and out) is working out well. It really does save lots of time and makes the whole process more reliable.

Message oriented middleware (through Apache's ActiveMQ) is working out nicely. I really like using the 'Competing Consumer' model to scale services horizontally. Just have your service nodes read from a single queue, add more services to add more horsepower!
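
As a rough sketch of the 'Competing Consumer' idea (this is not the ActiveMQ API), the Python below uses an in-process queue as a stand-in for the broker queue and threads as stand-ins for service nodes. The function name, the worker names, and the squaring "work" are all made up for illustration.

```python
import queue
import threading


def run_competing_consumers(jobs, worker_count):
    """Drain one shared queue with `worker_count` consumers.

    Returns a list of (worker_name, job, result) tuples, one per job.
    """
    work = queue.Queue()
    for job in jobs:
        work.put(job)

    processed = []
    lock = threading.Lock()

    def consume(worker_name):
        while True:
            try:
                job = work.get_nowait()  # each job is handed to exactly one consumer
            except queue.Empty:
                return  # nothing left, so this consumer stops
            result = job * job  # stand-in for the real service work
            with lock:
                processed.append((worker_name, job, result))

    workers = [
        threading.Thread(target=consume, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return processed


if __name__ == "__main__":
    for worker_name, job, result in run_competing_consumers(range(10), worker_count=3):
        print(worker_name, job, result)
```

Raising `worker_count` is the in-process analogue of adding service nodes: the producer side and the queue stay the same. Which worker picks up which job will vary from run to run.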

Mule has allowed us to easily switch back and forth between desktop and distributed modes. This helps a lot in development. Note that we still have some sore spots-- our database isn't one that lends itself to 'one per developer', so development and testing aren't as easy as we'd like in that respect.

Automated unit testing is great. There are some parts of the application (especially those around external data sources) that aren't so easy to test. There are various techniques (e.g. DbUnit, mock objects) that can be of use, but they all seem a little cumbersome. We need a better answer here.
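
For anyone who hasn't seen the mock-object technique, here is a minimal sketch in Python using the standard library's `unittest.mock`, rather than any Java tooling from the project. The `count_active_customers` function and the `fetch_customers` method are hypothetical names invented for this example.

```python
from unittest.mock import Mock


def count_active_customers(data_source):
    """Code under test: it only needs something with a fetch_customers() method."""
    rows = data_source.fetch_customers()
    return sum(1 for row in rows if row["active"])


def test_count_active_customers():
    # The mock replaces the external data source, so no database is needed.
    fake_source = Mock()
    fake_source.fetch_customers.return_value = [
        {"name": "Ann", "active": True},
        {"name": "Bob", "active": False},
        {"name": "Cy", "active": True},
    ]

    assert count_active_customers(fake_source) == 2
    # Also verify the code asked the data source exactly once, with no arguments.
    fake_source.fetch_customers.assert_called_once_with()


if __name__ == "__main__":
    test_count_active_customers()
    print("ok")
```

The cumbersome part tends to be the canned data: every test has to describe what the external source would have returned.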

Agile development has worked out well. This was a first exposure to rapid iterative development for some team members, so it was a bit of a slow start for some of us. Not everybody is convinced yet-- some of our developers like more complete specs before starting their work. I like it, though.

That's it for today!

Tuesday, May 27, 2008

Book Review: Java EE 5 Development using GlassFish Application Server by David R. Heffelfinger

When I first picked up this book, I wasn’t sure what to expect. I’m usually an Eclipse/JBoss kind of guy, but occasionally I’ll take a look at other App Servers and IDEs just to see where things are going. It had been a while since I looked at Sun’s application server, so I admit I had preconceived ideas about GlassFish when I started. Boy, was I in for a surprise.

The book starts out explaining the installation process for GlassFish, then how to deploy and undeploy applications. After that, it goes into administration of domains, database connectivity, and connection pools. Not too shabby for the first chapter, especially since the book makes generous use of screen shots, something I always appreciate when I’m doing administrative work via GUI.

I tore into the next section and was rewarded with a re-introduction to servlets, HTML forms, HTTP redirection, and other similarly useful topics. At one time I did a fair amount of work in this domain, so there really wasn’t anything astonishing for me. I was pleasantly surprised by the clarity and brevity of the author’s style, though: not too much detail, just enough to quickly get me back up to speed and feeling like I understood the core mechanisms of the area I was working in. This excellent style was repeated throughout the book, another big plus. The third chapter was similar, except that it gave the same once-through, big-pieces introduction to JSPs instead of servlets. I made use of much of the sample source code and found it very easy to work with.

The next chapter covered JDBC, JPA, and JPQL. Once again, just the no frills basics, with excellent and clear examples of how to use the tools under examination. By this point I was beginning to understand how the book came to be titled as it was—it really is a book that gives an excellent overview of the JEE 5 stack, as utilized on GlassFish. In my humble opinion, Sun would be well served to study this author’s manner of presentation—the text doesn’t drown you in corner cases and obscure functionality, but shows how to use the meat and potatoes of every important component of the stack. Comparing the JEE tutorial to this book would be like comparing a meandering trail through a forest to an empty 8 lane superhighway. This book takes you exactly where you want to go, pronto.

The next two chapters covered the JSP standard tag libraries and JavaServer Faces. I haven’t done any production UI work since the advent of JSF, but chapter 6 did a more than adequate job of explaining how it works. I left the chapter confident I could whip up a decent UI in little time if I was pressed to do so. This reinforces my impression of the book—there are enough simple examples to lend the reader confidence that they can implement a working example quickly.

Chapter 7 covered JMS, something I’ve been doing some work in lately. No surprises in the core technology here, but plenty of helpful screen shots showing how to establish your environment. There were excellent minimalist code snippets, of course.

Security was up next, demonstrating how to set up various realms, including custom ones. Of course there was sample source code to validate your environment, something that’s not always present in explanations of this type.

Chapter 9 was EJBs. All the types are covered, as well as transactions, security, the bean lifecycle, and authentication of a client. If anyone’s studying for the Sun Certified Enterprise Architect exam, this chapter could be of use.

Web Services were covered next. It was in this chapter that I had the thought “Wow, Sun really gets ease-of-use!” Whipping up web services with GlassFish is so easy it’s laughable, especially when using NetBeans. (By this time I was so wrapped up in running samples of things that I started using NetBeans, something I do from time to time if I’m working on a Swing UI. This time around I was picking it up to make use of the excellent IDE/Application Server integration. The build cycle, complete with free build scripts, is really handy.) After reading this chapter, I realized my only remaining gripe against GlassFish was use of the command-line interface, which I credited to my own lack of practice. This app server is for real, and given the simplicity of use I have to expect it’s going to be gaining market share. Back to the chapter review: we were shown how to quickly produce web services and clients to consume them, all very easily.

Next up was a chapter on technologies complementary to but outside JEE, including Facelets, Ajax4jsf, and Seam. By this time I was anticipating the result-- another chapter that lent me confidence that all of these were within a few hours’ reach if ever I needed to get started. This is no small feat, considering I don’t consider myself a client-UI coder in the least, at least not for now.

The book includes two appendices that tell how to configure GlassFish for email and how to get started using either NetBeans or Eclipse.

My overall impression of this book is extremely favorable. I’m an IT generalist, which means I’m called upon to work all over the JEE stack, but more often in some places than others. With a book like this on the shelf, I’ll never have to worry about getting blindsided by a request to work on some part of JEE that I haven’t visited for a while. Kudos to the author and to Packt publishing—this book has earned a spot on my bookshelf.

Tuesday, February 26, 2008

Great new book for ESB users

I've recently had the opportunity to look over a yet-to-be-published book from Manning titled "Open Source ESBs in Action". If you're an ESB user, you owe it to yourself to check this one out as soon as it's available.

In an unusual twist, the authors of this book presented ESB best practices and techniques using *two* open source ESBs, Mule and ServiceMix. I'm already a Mule user, so I was pretty happy to see several new ideas to employ in my Mule use. I haven't tried ServiceMix yet, but may someday, time allowing. (I'm confident I can run several scenarios right out of the box, armed with this book.) The authors cover more than a few patterns straight out of "Enterprise Integration Patterns". Kudos to Manning for allowing their authors to acknowledge this excellent resource, even if it comes from a different publisher. I really do appreciate that!

I liked several ideas I think I'll apply somewhere down the line-- including use of XML on Mule to facilitate validation and transformation, use of JiBX to help make object-to-XML transformations, and use of a BPM product to oversee non-integration parts of my ESB. (I suspect that'll de-tangle the configuration a great deal, or at least make it easier to understand.)

By using two ESBs to implement every use case, I think the authors give you some insights you might otherwise have missed if you saw just one side of the story. As I said previously, unusual, but interesting.

I won't go on and on-- suffice it to say this book will have a spot on my bookshelf when it becomes available in the near future. If you use an ESB, or think you might like to, you might consider looking at this informative work.

Wednesday, February 20, 2008

If it ain't got "Hello World", and it ain't got a cookbook...

It might as well not be downloadable. Open source projects need to be user-friendly to gain acceptance, and if they don't gain acceptance they're doomed to be buried by a rival that does. (Notable exception: if the project is alone in its space. There don't seem to be many of those around, though.)

Case in point-- the ESB market. Near as I can tell, Mule is running away with that game, largely because of its ease-of-use model. Anyone reading the help docs can run about a half dozen excellent example configurations not long after downloading Mule. Compare that to some of the other ESBs out there-- they may or may not be able to compete on other merits, but they may never get the chance because they don't get a second look after the initial few hours of frustration. (Another exception is anything with enough critical mass that it's on everybody's short list sight unseen.)

Examples of doing it right:
Hibernate
Tomcat
Terracotta

Examples of not doing it right:
JBoss WS
Jess
Spring Batch

I'd love to see some of those 'not right' projects do better (especially Spring Batch). Let's see how they fare in the long run-- check back in a year or so.

Sunday, February 17, 2008

The Rules and Data shortcut-- doing things the ELT way

Lately I've been working on a very data-intensive project, one that needs to run hundreds of business rules against millions (maybe billions) of rows of data kept in a database. There's a constraint issue here-- our rule engine likes all the facts (data) in 'Working Memory' before it gives you the results you want, so we can't run the rules against *all* the data all at once. This means either serializing the process (which would take much too long, and would leave us vulnerable to changes to the database as time goes by) or distributing the task (which leaves us to work out recovery and restart schemes and to look for ways to manage timing issues, as we'll probably do this asynchronously).

An idea has emerged, though, that might help us in many ways. If we resort to doing 'ELT' processing (Extract, Load, Transform-- *not* ETL), we can use the database to do our heavy lifting for us. A really great side-benefit is that we can do this with a small number of SQL statements instead of a whole bunch of data access classes and a bunch of business rules. We also get the benefit of having the database state much less vulnerable to partially successful operations, as the SQL can easily be made transactional.

Here's a quick 'method 1 vs. method 2' comparison:

Method 1:
Determine what kind of data the rules require.
Pull the data (using Hibernate generated DAOs), put it in Working Memory.
Fire the Rules (which have to know the particulars of the Entity objects).
Put the rule results back in the database.
*Note: The data is 'pulled out' of the db, then the results are 'put back in'.

Method 2:
Using temporary tables and SQL, do as much of the rule work as possible in the database.
Do simple calculations, putting results in temporary tables that are created and destroyed as often as needed. Make small jumps in state from 'step' to 'step'. This will make more SQL, but simpler SQL.
After an 'End state' temp table is populated with the results of the rule evaluation, put the rule results back in the database. (In this case, an update OR insert, depending on whether we already had a result for some of the Entities under evaluation.) Some databases (including MySQL) give you convenience functions to handle an insert (if no row has this key) or an update (if the key exists).
*Note: The data never 'leaves' the database. The database doesn't necessarily work any harder, because it's doing a few compute-intensive large joins, selects, etc. instead of a whole lot more run-of-the-mill DAO operations (gets).

Since the db is doing all the work, all we need is a single client to make the SQL requests instead of the army of distributed clients in Method 1. This makes for much simpler recovery and restart in the event something goes wrong (and fewer machines and instructions working, so fewer opportunities for things to go wrong).
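
To make Method 2 a little more concrete, here is a small sketch with a single client driving the steps. It uses Python's built-in `sqlite3` as a stand-in for the real database, and the `orders` and `rule_results` tables, the 'big spender' rule, and the threshold of 100 are all invented for illustration. SQLite's `INSERT OR REPLACE` plays the insert-or-update role here; on MySQL the analogous statement would be something like `INSERT ... ON DUPLICATE KEY UPDATE`.

```python
import sqlite3


def apply_rule_in_database(conn):
    """Evaluate a made-up 'big spender' rule in SQL, then upsert the results."""
    # `with conn` commits the write to rule_results on success
    # and rolls it back if any statement raises.
    with conn:
        # Step 1: a simple aggregate into a temporary table.
        conn.execute(
            "CREATE TEMP TABLE step_totals AS "
            "SELECT customer_id, SUM(amount) AS total "
            "FROM orders GROUP BY customer_id"
        )
        # Step 2: a small jump in state, from totals to the rule outcome (0 or 1).
        conn.execute(
            "CREATE TEMP TABLE end_state AS "
            "SELECT customer_id, total >= 100 AS big_spender FROM step_totals"
        )
        # Step 3: replace existing results and insert the missing ones.
        conn.execute(
            "INSERT OR REPLACE INTO rule_results (customer_id, big_spender) "
            "SELECT customer_id, big_spender FROM end_state"
        )
        # The temporary tables are throwaway working state.
        conn.execute("DROP TABLE step_totals")
        conn.execute("DROP TABLE end_state")


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER)")
    conn.execute(
        "CREATE TABLE rule_results "
        "(customer_id INTEGER PRIMARY KEY, big_spender INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)", [(1, 60), (1, 70), (2, 20)]
    )
    conn.execute("INSERT INTO rule_results VALUES (1, 0)")  # stale result to overwrite
    conn.commit()

    apply_rule_in_database(conn)
    return conn.execute(
        "SELECT customer_id, big_spender FROM rule_results ORDER BY customer_id"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

Each step is deliberately dull SQL. The row data itself never crosses into the client; only the statements do.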

I'm much more of an application developer than a DBA, so this logic-in-SQL idea is not my first choice, but I think I like this ELT stuff. I'll just have to be careful to keep the SQL as simple as possible, as this seems to be the strongest negative that I can find in this situation.

Thursday, January 3, 2008

Welcome to my blog!

Welcome to Rick Wagner's Blog. I'm a software developer, working in Little Rock, Arkansas. I expect I'll be writing a lot about software development challenges in weeks to come.

For starters, I'll let you know what's on my mind today:

- Building dynamic queries can be tricky. I'm trying to build an application that will pull data from different tables and columns based on records in an input file. If the input file has a record for George Washington, I may have to go pull data from the GeneralsTbl and the PresidentsTbl. If the input file has a record for George Curious, I may have to go pull from the PrimatesTbl and the CartoonsTbl. I'm in need of some sort of MetaData facility, but am not sure what quite yet.

- Once I have my data gathered, I have to present it to a rules engine. (I'm using JBoss' Drools, which I like very much.) We need the rules and the data to be flexible (e.g. the GeneralsTbl might not be available tomorrow, but we might have a new one, MilitaryPersonnelTbl, to pull from). I think I can get away with using lists (and some handy reserved words Drools gives us to use collections), but will want to be careful. Writing the rules templates to enable plain-speak rules can be difficult, but should pay off if our rules are reasonably stable.

- I'm using the Mule ESB to act as a component model and service framework. Mule allows easy use of Spring, and I think is often used with ActiveMQ as a Messaging provider. I think both Spring and ActiveMQ are starting to feature-creep towards Mule's core business, so it will be interesting to see how things play out in the long run.
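
One possible shape for the MetaData facility mentioned in the first item above, sketched in Python: a lookup table maps each kind of input record to the table and columns to read, and the queries are generated from it. The record layout (`name`, `kinds`), the column names, and the `build_queries` helper are assumptions made up for this example; only the table names come from the text above.

```python
# Hypothetical metadata: which table and columns to read for each kind of record.
TABLE_METADATA = {
    "general": ("GeneralsTbl", ["name", "rank"]),
    "president": ("PresidentsTbl", ["name", "term_start"]),
    "primate": ("PrimatesTbl", ["name", "species"]),
    "cartoon": ("CartoonsTbl", ["name", "first_appearance"]),
}


def build_queries(record):
    """Return a list of (sql, params) pairs for one input record.

    Table and column names come only from TABLE_METADATA, never from the
    input file, and the looked-up value is passed as a bound parameter.
    """
    queries = []
    for kind in record["kinds"]:
        table, columns = TABLE_METADATA[kind]
        sql = f"SELECT {', '.join(columns)} FROM {table} WHERE name = ?"
        queries.append((sql, (record["name"],)))
    return queries


if __name__ == "__main__":
    washington = {"name": "George Washington", "kinds": ["general", "president"]}
    for sql, params in build_queries(washington):
        print(sql, params)
```

If GeneralsTbl goes away and MilitaryPersonnelTbl takes its place, only the metadata entry changes; the query-building code stays put.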

That's it for today! In future entries I hope to provide tales of what worked well, what didn't, how obstacles were overcome, etc.

Thanks for reading!

Rick
