BarCamp Brussels 2006 – 2PM – 3:30PM

picture by pietel

Screen scraping with Ruby and IE, or the improper use of the WATIR Framework by Pascal Van Hecke

After a small coffee break I listened to an introduction to the WATIR Framework by Pascal and was pleasantly surprised. I didn’t have any notion of WATIR, and screen-scraping isn’t exactly the hottest topic around, but what I found was a great and simple tool to do some quick tests on a website. WATIR, short for Web Application Testing in Ruby, is a framework written in Ruby that can drive Internet Explorer and tell it to do things. The good thing here is that the tool sees exactly what the user sees. With today’s AJAX-y applications it’s not always easy to write a good test script for every feature of your site, so mimicking users and evaluating what the browser shows in response seems ideal, at least for quick tests.
With a few lines of Ruby you tell WATIR to go to a site, enter a value in a form field, click a button, and test and evaluate the response. Here are some lines from the sample on the WATIR homepage:

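require 'watir'                             # load the WATIR library

test_site = 'http://www.google.com'         # the sample drives Google's search form
ie = Watir::IE.new                          # open a new Internet Explorer window

ie.goto(test_site)
ie.text_field(:name, "q").set("pickaxe")    # type the query into the search box
ie.button(:name, "btnG").click              # click the search button
if ie.contains_text("Programming Ruby")
  puts "Test Passed. Found the test string: Programming Ruby. Actual Results match Expected Results."
else
  puts "Test Failed! Could not find: Programming Ruby"
end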

For the moment it only works with IE (so you’re bound to Windows), although the FAQ mentions that Firefox support will arrive in the next release. (But according to Pascal, the author’s been promising that for a long time now …)
Anyways, WATIR seems like a good way to quickly write some simple tests, and while it’s probably neither the most advanced nor the most complete testing suite around, I’ll definitely have a look at it. I believe it would come in handy for us, e.g. to automatically test some basic features of our various sites as soon as we launch a new version, instead of getting the whole tech crew gathered to test that upload functionality again. A few lines of Ruby could do that for us. And it would be a good way to finally start delving into Ruby a bit, ’cause although I like what I’ve seen from Ruby (and in one of the next sessions I saw some more), I haven’t yet found a project where some Ruby could be used.
Pascal mentioned you could also use WATIR as a screen-scraper, to reclaim the data you’ve got stored on various sites. Someone else mentioned the Selenium tool. Its platform and browser compatibility sure looks more promising than WATIR’s IE-only support, so I’ll be checking that one out too. Does anyone have experience with both tools?
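
To make the screen-scraping idea a bit more concrete, here is a minimal sketch of my own (not something Pascal showed). It assumes the browser object offers goto and links, and that each link has text and href, which is how I understand the WATIR API; treat those names as assumptions. FakeBrowser and FakeLink are hypothetical stand-ins so the sketch runs without Windows or IE, and the example.com URLs are placeholders.

```ruby
# Collect [text, href] pairs for every link on a page.
# `browser` is anything that responds to goto(url) and links; the
# assumption is that a WATIR IE object behaves the same way.
def scrape_links(browser, url)
  browser.goto(url)
  results = []
  browser.links.each do |link|
    results << [link.text.strip, link.href]
  end
  results
end

# Hypothetical stand-ins for illustration only (not part of WATIR).
FakeLink = Struct.new(:text, :href)

class FakeBrowser
  attr_reader :visited

  def initialize(links)
    @links = links
  end

  # Remember which URL we were asked to open.
  def goto(url)
    @visited = url
  end

  def links
    @links
  end
end

browser = FakeBrowser.new([
  FakeLink.new(" My photos ", "http://example.com/photos"),
  FakeLink.new("My bookmarks", "http://example.com/bookmarks")
])

scrape_links(browser, "http://example.com/account").each do |text, href|
  puts "#{text} -> #{href}"
end
```

With the real thing you would presumably pass in the IE object from the sample above instead of the fake one, and write the pairs to a file rather than print them.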

Google’s dirty little secret by Bart De Waele

Another presentation that was good for a packed room (‘pr0n’ / ‘dirty’, you get the link, right?), this talk from Bart of Netlash was a case study of why the visitor statistics of a specific site suddenly dropped terribly. I’ll quickly sketch the situation: the website gets 90% of its traffic via google.nl and 10% via google.be, and has a server based in Amsterdam. They move the server to Brussels, where it gets a new IP; all other things stay the same, but 3 days later the visitor stats drop, with the 90% of traffic from google.nl vanishing completely. This led Bart to conclude that Google decides which language/country your website belongs to, and therefore in which search results it should include the site (localisation), based on the IP of your server. While it seems impossible to me that this would be the only criterion (top-level domain name, language, links from other sites, … ?! I’m sure all of this plays its role), the IP change sure had a drastic impact on that site. Imagine your commercial site had a 90% drop in traffic … Horror.
Since we also have different sites targeted at different countries, I’d like to see some more discussion about this. I’ll start here at digg.com and continue with these: “Why isn’t my site returning when I search for results from a particular country?”, “Inside Google Sitemaps: Tips for Non-U.S. Sites”, “How search results may differ based on accented characters and interface languages”.

Ruby Live Demo by Dennis Lamotte

During these 20 minutes Dennis made a simple blog application in the Ruby on Rails framework, showing how easy he thinks it is to quickly create web applications with Ruby. While this presentation was much like the vidcasts available elsewhere on the net, I did learn a few new things, for instance about the automatic generation of unit tests and how Rails tries to keep your database in sync when you downgrade or upgrade to a different version of your application. Nice features, methinks, which make me more eager to really kickstart a Ruby project somewhere soon.
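
To picture what keeping the database in sync across versions means, here is a toy sketch in plain Ruby. It is not how Rails implements this, and ToyMigrator, Migration and the posts table are names made up for illustration; the only idea it borrows is that every schema change carries a version number plus an "up" step and a matching "down" step.

```ruby
# Toy model of versioned schema changes: each migration knows how to
# apply itself (up) and how to undo itself (down).
Migration = Struct.new(:version, :up, :down)

class ToyMigrator
  attr_reader :schema, :version

  def initialize(migrations)
    @migrations = migrations.sort_by { |m| m.version }
    @schema = {}    # table name => list of column names
    @version = 0    # 0 means "no migrations applied yet"
  end

  # Move the schema to `target`, running up or down steps as needed.
  def migrate(target)
    if target > @version
      pending = @migrations.select { |m| m.version > @version && m.version <= target }
      pending.each { |m| m.up.call(@schema) }
    else
      applied = @migrations.select { |m| m.version <= @version && m.version > target }
      applied.reverse_each { |m| m.down.call(@schema) }
    end
    @version = target
  end
end

migrations = [
  Migration.new(1,
    lambda { |s| s["posts"] = ["title", "body"] },
    lambda { |s| s.delete("posts") }),
  Migration.new(2,
    lambda { |s| s["posts"] << "published_at" },
    lambda { |s| s["posts"].delete("published_at") })
]

migrator = ToyMigrator.new(migrations)
migrator.migrate(2)            # upgrade: apply versions 1 and 2
puts migrator.schema.inspect
migrator.migrate(1)            # downgrade: undo version 2 only
puts migrator.schema.inspect
```

Because the down steps run in reverse order, rolling back to an older version of the application undoes only the changes made after it.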

Still, I have the feeling that with Rails a lot of stuff is happening behind your back; a few commands and several files are created and/or modified at once … Lots of magic involved, which is a good as well as a bad thing. I raised the obvious question: “Is Ruby on Rails ready for really big and complex sites with huge traffic?” 37signals.com was indeed the answer :) .
A video of this presentation is available at blogologie.be.
This presentation also made me install TextMate and CocoaMySQL, two tools Dennis uses for development. Since I recently made the switch from Windows, the quest for good Mac OS X apps is on again …
