Today kicks off the first day of talks for Rails Underground in London. Thankfully they've got usable wifi, and two days of what sound like mostly excellent talks. So over the next couple of days I'll hopefully be able to post summaries of the various talks I'm able to squeeze into.
So first up...
Taking a cue from Martin Fowler's "Patterns of Enterprise Application Architecture", Fred goes on to discuss the various trade-offs of different application styles. Essentially, approaches that get you going quickly usually become exponentially more difficult as the complexity of your problem increases. Rails, with its "convention over configuration", offers a great way to prototype very quickly, but by Fowler's reasoning this should get slower and slower to extend as your app's complexity grows. So Fred decided to start from scratch, take his learnings from Smalltalk, and see how things would turn out on a new project.
Models were just pure Ruby, storing to a YAML persistence layer. Sinatra took control of the controllers, with HAML/SASS for the view layer. It turns out that with very little code, YAML provided an easy way to persist the objects quickly to disk. Incrementing filenames on each save also meant that object state was versioned, and actions were simple to undo/redo (and note we're talking full objects being serialised to disk with full state, not just instance variables/attributes).
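A minimal sketch of the idea, using only the standard library (the class and method names are my own illustration, not Fred's code): serialise the whole object with YAML, bumping a version number in the filename on each save, so any earlier state can be reloaded for undo.

```ruby
require "yaml"
require "fileutils"

# Hypothetical model showing YAML-on-disk persistence with versioned saves.
class Note
  attr_accessor :title, :body

  def initialize(title, body)
    @title, @body = title, body
  end

  # Write the full object (all state, not just attributes) to the next
  # numbered file: notes/note-1.yml, notes/note-2.yml, ...
  def save(dir = "notes")
    FileUtils.mkdir_p(dir)
    version = Dir[File.join(dir, "note-*.yml")].size + 1
    File.write(File.join(dir, "note-#{version}.yml"), YAML.dump(self))
    version
  end

  # Undo/redo is just loading an earlier or later version back from disk.
  def self.load_version(version, dir = "notes")
    data = File.read(File.join(dir, "note-#{version}.yml"))
    # Psych 4 (Ruby 3.1+) needs unsafe_load to rebuild arbitrary classes.
    YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(data) : YAML.load(data)
  end
end
```

Every save is a new file, so "rollback" is nothing more than reading an older filename.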
Then to link the model with the view, in comes Sinatra. It's light and easy and does nothing more than it needs to: essentially just a set of regexps that push a request to the appropriate code. And how much more do you really need to do in most controllers?
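That "set of regexps" idea can be shown in a few lines of plain Ruby. This is a toy dispatcher of my own to illustrate the point, not Sinatra's actual internals:

```ruby
# A toy router: each route is a regexp plus a handler block, and
# dispatching just finds the first pattern that matches the path.
class TinyRouter
  def initialize
    @routes = []
  end

  def get(pattern, &handler)
    @routes << [pattern, handler]
  end

  def dispatch(path)
    @routes.each do |pattern, handler|
      if (m = pattern.match(path))
        return handler.call(*m.captures)
      end
    end
    "404 Not Found"
  end
end

router = TinyRouter.new
router.get(%r{\A/hello/(\w+)\z}) { |name| "Hello, #{name}!" }

router.dispatch("/hello/rails")  # => "Hello, rails!"
```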
Next was the view, and for anyone using HAML there's nothing groundbreaking to learn here. It's awesome, heaps better than the prescribed ERB, and you should be using it (my conclusion from Fred's discussion and my own experience, so don't flame him directly). It's not just less typing: the prescriptive structure makes the code more readable, and being able to use variables in your stylesheets via SASS is incredibly liberating and DRY.
Some good debate post-keynote on whether it was really a comparison of Rails vs the alternatives, or just a test project using stuff Rails already offers you, without using Rails.
A high-paced and amusing presentation about trying to refactor old apps. Why would you need to consider a rewrite? "Maybe it was PHP on Rails, or some guy's first Rails project". The options are to burn it to the ground and start again, or to try and refactor it. The default option should always be to try and refactor. The existing app is usually tested, and it works; if you rewrite it you're basically shutting down and delivering absolutely nothing for an extended period of time. At which point you eventually release, with the same feature set. And even if you're rewriting in an agile fashion, your first releases will still have only a minor percentage of the features expected by the users and will be considered a failure.
You refactor unless:
You rewrite if:
When going through the estimation phase, a cursory glance at the code to see how easy it is to understand is often not an accurate guide. So much can be hidden under the surface that it's usually best to try a spike at implementing something new and see how easy it really is.
If you're considering a re-write, you need to approach it as a complete outsider and ask "I know nothing, what does this thing need to do?". Otherwise you open yourself up to losing the stealth feature that was hidden deep in the code. Boot it up, use the application... and watch the client use the application.
To make it work and keep the client happy, make sure that any re-write is actually adding value to the existing system. Deliver early, deliver often.
That's it on re-writing, next is how to make refactoring manageable.
Gwyn also added in a shameless plug for his "noisy partials" plugin which will insert HTML comments in the rendered views to help you identify where exactly things are being rendered. The "partial dependencies" plugin will draw a nice pretty dependency graph showing the links between all the partials in a project.
Split the long methods into smaller, workable, and meaningful chunks. You can then refactor these bite-sized pieces while you're working through the codebase. If there's code you don't understand, delete it and see what breaks. If nothing breaks, kiss it goodbye... "Deleted code is debugged code". Make sure that you're only working on the small bites, and get back to a working state before you continue. Otherwise you'll forget what it was you were originally working on, and before you know it you'll have upgraded your entire Rails stack and plugins and nothing works. And don't refactor code just because you hate it; make sure you hate it and it's in your way.
Check out Sequel as the database adapter if you're trying to move legacy data between SQL database schemas.
Another talk on dealing with old Rails projects and code; am I detecting a general theme here? So what constitutes a legacy app here?
Essentially, code that is difficult to change. It all boils down to technical debt.
At Hash Rocket they have rescue missions, which they run for clients whose problem apps they inherit. First off they'll spend a couple of days doing a code audit. As Gwyn pointed out though, much of the detail isn't uncovered until you try and implement something.
So what to do and how to make it workable?
The specific things Hash Rocket do in their approach are:
Obie got introduced to Ruby via Aslak of Cucumber fame back in 2004, and to Rails in 2005 for a client in London.
First piece of advice: controversy can be useful. One of the first things Obie did was write the top 10 reasons why Java sucks. Just make sure you can take the abuse, especially if you deserve it... he's willing to admit his initial ideas for the Rails Maturity Model sucked. Confidence always wins. On the flip side of that, don't ever seem desperate. The moment you go in and convey, even subconsciously, that you need the work, you make the client wonder why you need the work. They'd much rather work with someone who is in demand. Referrals will be the lifeblood of your business, so go ahead and videotape your clients. Get them used to it: video meetings, testimonials, and help documenting the project. It's a tactical nuclear weapon.
Make it personal, pick up the phone and call a potential client. It's way too easy to just fire off an email or IM.
You need to write bulletproof contracts. Keeping them simple is not ideal; there are too many potential problems. Master services agreements, warranties, digital signing/acceptance, etc. all need to be included. Don't negotiate under pressure. You'll end up agreeing to things that you shouldn't, and it will cost you more in the long run. And when things go wrong, lawyers cost a lot of money.
Don't undercapitalise. Don't do fixed-bid contracts. None of Hash Rocket's contracts state a deliverable; you're buying time and expertise. There is no definition of scope. By documenting scope you're breaking the philosophy of agile and you lose the ability to adjust scope. You'll probably need to gather some momentum before these are applicable. Make sure you charge what you are worth (a slide showed that Obie billed out at $250/hr, other devs at $190/hr for short-term work and $150/hr for anything greater than 6 months). Allow some budget for non-paying clients. When you're extending credit, sometimes they're just not going to pay (either because they're bastards, they ran out of funding, they've drained the kids' college fund, etc.). Allow some budget for product development; everyone wants to be 37signals eventually. Don't invoice manually; Harvest is a good example of how to handle it easily.
You shouldn't have to defend agile. Just start from the assumption that "this is how we work" and that it's the normal case; then you're not having to justify a change in approach. Always do agile by the book first; don't try and tweak it and then wonder why it doesn't work. Storycard the work, either on index cards or in something like Pivotal Tracker.
Execution is absolutely critical. If you're not consistently striving for excellence then it's all worthless. Hand in hand with that: perception is reality. If you can't convey to the client that you're doing an excellent job then it's all worthless too. Bend the rules; the client just wants a result and doesn't really care how you go about it. In some cases you should even break the rules. Another simple one is to establish contact: especially with big-value contracts, call them every day. Mind their budget, especially if you've made the jump to time-based contracts. The responsibility falls back on you to make sure you're not burning through all their cash, so have transparency on who's working and what the cash burn rate is. Do what you can to win their people over. Fire clients if they deserve to be fired. Hire people on a contract-to-perm basis (bah! I say they should all be contractors if they're any good). Keep your employees constantly learning, and make them pair all the time. Empower your employees for change; be open for them to challenge the generally accepted practices (like pairing all the time). You need to actively work to make that possible. Make the work environment appealing and always have fun. Keep everyone in the loop all the time.
The current version of JRuby is 1.3.1; it's Ruby 1.8.6 compatible (give or take: it can't do continuations and some other things). It has some Ruby 1.9 support (somewhere between 75% and 90% done).
It's roughly equivalent in performance to Ruby 1.9, has real native threads, and runs Rails fine. It takes at least 0.5 seconds to start up, and can take several seconds, so that's not great. Once up, it should be faster than 1.9 in almost all cases. Most Ruby application bottlenecks are in the core classes and not Ruby itself (string manipulation, working with hashes, etc.), and JRuby's performance in these classes is mixed, which makes providing meaningful benchmarks almost impossible. Some cases are great, others are bad. Also, because of the iterative optimisations the JVM does internally, a single run of an application doesn't show real-world performance. To demonstrate, Charles ran a fractal generation program 5 times and you could see how much quicker the later runs were.
It's the only Ruby implementation with true native threads. Another demonstration showed how with JRuby you can effectively max out two cores, yet with regular Ruby (even 1.9) you only ever really get the equivalent of one core working (spread across the two cores). Next was a really cool demo that I'll do no justice to here: using FFI to call C functions and have them then execute Ruby callbacks.
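The demo used the FFI gem; the same flavour of thing can be done with Ruby's bundled Fiddle library. Here's my own minimal example (not Charles's demo) calling C's strlen from a typical Unix Ruby; Fiddle::Closure goes the other way, wrapping a Ruby block as a C callback:

```ruby
require "fiddle"

# Look up strlen in the libraries already loaded into this process
# (libc is always there) and wrap it as a callable function object.
strlen = Fiddle::Function.new(
  Fiddle::Handle::DEFAULT["strlen"],
  [Fiddle::TYPE_VOIDP],   # const char *
  Fiddle::TYPE_SIZE_T     # return type
)

strlen.call("Rails Underground")  # => 17
```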
Next was a demo of starting an app (Typo) using Glassfish. Just download the gem, go into the app and:
"glassfish -e production"
If you're running threadsafe rails it will start just one instance of the JVM and keep memory usage low. In any event, it takes care of concurrency and all the usual problems for you.
It's at about maybe 80% of 1.9.1. The 1.9.2 release adds a bunch more, but they'd like some help (so get in touch if you're interested/able). He also gave a cool example of a change to regexps in 1.9 that I'd not seen. If you want to extract a grouped match from a regexp you'd normally do something like:
str = "Welcome to Rails Underground"
matched = str.match(/Welcome to Rails (.*)/)
matched[1] == "Underground"
The problem is if you change the regexp to:
matched = str.match(/(Welcome) to Rails (.*)/)
matched[1] now becomes "Welcome" rather than "Underground", and your code breaks. In Ruby 1.9 you can name grouped matches:
str = "Welcome to Rails Underground"
matched = str.match(/Welcome to Rails (?<conf>.*)/)
matched[:conf] == "Underground"
One thing that isn't missing is 1.8 compatibility; it's a mostly solved problem. There are some edge cases that won't work, but it's got the best coverage of all the alternative implementations. On the todo list are better performance and better Java integration. To highlight some of the upcoming performance improvements, Charles went through the intermediate code the new compiler generates, which was quite fascinating to see. Makes me glad I don't have to build compilers for a living though. As a result there is also a "ruby2java" executable which generates Java classes from Ruby code now. And to further help with the Java and Ruby integration there is a "become_java" method you can call on a Ruby class to turn it into an object that can be used natively within Java.
The other tasks are rubifying the Java libraries: Hibernate, Ant, Maven, etc. It's basically an attempt to keep all of the rich functionality and performance that Java offers, without the inherent Java ugliness. The big news is that Hibernate is basically done now and just needs to be wrapped up in a nice Ruby DSL. But you can use Hibernate as a persistence layer in JRuby now.
You're going to need a lot of servers, so you may as well make the job as easy as possible. You've got lots of serving options; you should be using Passenger though. There's now Ruby Intelligent Packaging, called RIP, which is great and you should use it, except it doesn't work... so just keep an eye on it. You also need to have continuous integration set up and working. CruiseControl.rb isn't great but it works.
As far as actually deploying goes, Capistrano is there. But use Webistrano; it's a thin layer over Capistrano, and the main benefit is you get an audit history so you can see who deployed what, where, and when. Next is Puppet to manage your server config. It's not as quick as making disk images, but it's more flexible and much easier to maintain (personally, I'd advocate Chef instead of Puppet).
You need multiple servers (either actual dedicated iron for each, or virtualised environments) to deploy to. A test server for external testers and/or clients to run through the completed development prior to a production deployment. A pre-production server that closely mirrors the production server: it's there as a final sanity check prior to going live, and to check there isn't some configuration difference on the production box that will trip you up. Make sure you test both the up and down migrations; they should work, but often don't. And there is nothing worse than discovering the rollback won't work once you get to production.
On production, make sure you've got notifications set up. "Exception Notifier", "Get Exceptional", "HopToad", "New Relic": test them all, make a call, and just use one. Service monitoring: make sure things stay up with monit or god. To help with peak loads, consider a Content Delivery Network like Limelight Networks.
You'll probably need a debug server. If you push things all the way to production and you get errors, it's quite possibly data related. You need to copy the logs, assets, and all other data over to try and work through. In practice, New Bamboo don't run a separate debug server but just re-provision pre-production very quickly (it's there, it's got the production code already, it's almost ready to go).
The Ruby Invoicing Framework tries to give you a solid foundation on which to build billing into a web application. It offers a few classes:
They all inherit from LedgerItem, which is a naming throwback to accounting terms, but it essentially gives you access to discrete billing line items and two companies, one on each side of the transaction. What you get for free is an automatically rendered invoice (an actual invoice you can post/email to the client) with all the legally required fields included.
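To make the shape of that concrete, here's a toy sketch of the ledger-item idea in plain Ruby. The class and method names are mine, not the gem's actual API: a ledger item links two parties and derives its total from its line items.

```ruby
# A billing line: net amount plus tax, gross derived from the two.
LineItem = Struct.new(:description, :net_amount, :tax_amount) do
  def gross_amount
    net_amount + tax_amount
  end
end

# Toy ledger item: one party on each side of the transaction,
# with the total always computed from the discrete line items.
class LedgerItem
  attr_reader :sender, :recipient, :line_items

  def initialize(sender:, recipient:)
    @sender, @recipient = sender, recipient
    @line_items = []
  end

  def add_line(description, net, tax)
    @line_items << LineItem.new(description, net, tax)
    self
  end

  def total
    @line_items.sum(&:gross_amount)
  end
end

# Invoices, credit notes, and payments are then just specialised
# ledger items with a direction/sign.
class Invoice < LedgerItem; end
```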
Accountants need to know the exact dates of transactions, need bank statements that reconcile, and need VAT/sales or other applicable taxes handled.
[Insert various slides about accounting practices and how ledger items are calculated. I'd expect anybody who runs their own business already knows all this stuff.]
I've had to roll most of this stuff myself over the past 12-18 months, and for the most part our schemas and approach are almost identical. Just to save myself the maintenance headache, I'll probably look at porting over to this invoicing gem. I also need to look at the existing open standards (UBL, XBRL-GL, OAccounts, and OASIS) and see how they fit in.
I've covered the background on what CouchDB is in previous posts, or it's otherwise readily available with a quick search so I'll leave you to find the really high level stuff out yourself. Documents are stored as JSON, you get subsets of documents via views. Done.
Unlike relational databases, where you pay a performance hit on indexes when you insert data, CouchDB makes you pay the penalty on the first read of a view. That means if you batch insert a bunch of data, the next request for that information is going to be slow. In a typical read-heavy web application this isn't that noticeable, and if you're inserting data often, there are ways to mitigate the problem.
So when should you use CouchDB? If you want a schema-less database. FriendFeed was used as an example: given the amount of data they store, adding a new column in MySQL could take several hours. Often it also relates more directly to your real-world models. George went through an example from constructing his "5ft Shelf" site and the complexities associated with books (numerous ISBNs, different titles in different countries, different retail prices in different locales, different editions, etc.) and how difficult it is to map all these permutations in a relational system. The final scenario is when you know you're going to need replication or sharding, either for offline capability or scaling.
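The book example maps naturally onto a single JSON document: nested arrays hold the per-country permutations instead of a tangle of join tables. The structure and data below are my guess at the shape, not 5ft Shelf's actual schema:

```ruby
require "json"

# One document per book; each edition carries its own ISBN, title
# variant, and local price. All placeholder data for illustration.
book = {
  "type"     => "book",
  "title"    => "Example Novel",
  "editions" => [
    { "isbn" => "978-0-0000-0000-1", "country" => "GB",
      "title" => "Example Novel", "price" => 17.99 },
    { "isbn" => "978-0-0000-0000-2", "country" => "US",
      "title" => "An Example Novel", "price" => 24.99 }
  ]
}

# CouchDB stores and returns exactly this JSON; no ALTER TABLE needed
# to add a third edition with a field the others lack.
doc = JSON.generate(book)
```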
You shouldn't use it when your problem domain is very fixed. Finance is a good example; real estate transactions are another.
Enter George's couch_foo plugin, which is a way of interfacing with CouchDB in an ActiveRecord fashion. Because everything is ultimately stored as JSON, there's no real concept of datatypes: so long as the attributes can be translated to and reconstructed from JSON, you're good to use any datatype/object you like. The plugin natively supports validations, associations, callbacks, and pretty much everything you'd expect from AR. One gotcha is that if you try and order by an attribute that isn't exposed by the view's key, it will return all the results and then order them in Ruby. It stings twice if you then try and limit, as you've pulled back a heap of records you never needed.
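The cost of that gotcha is easy to picture in plain Ruby; this is my own toy illustration of what happens client-side, not couch_foo code. The sort and the limit both run after every document has already come over the wire:

```ruby
# Pretend the view returned ALL matching documents, because
# "published" isn't part of the view key...
docs = [
  { "title" => "B", "published" => 2001 },
  { "title" => "A", "published" => 1999 },
  { "title" => "C", "published" => 2005 }
]

# ...so the ordering happens in Ruby, and the limit only trims the
# result AFTER the full set has been fetched.
newest = docs.sort_by { |d| -d["published"] }.first(1)
```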
Performance wise, George said there's a whole heap of naive benchmarks claiming CouchDB is faster than this or that. It's a different approach, some things are going to be quicker, some are going to be slower. The latest CouchDB release (0.9) does offer some speed improvements over previous versions though.