Stefan Tilkov talks about REST

Old but good, this podcast with Stefan Tilkov talks through the ideas behind RESTful applications: why building applications this way makes them part of the web, rather than just “on” the web, and why the REST style exploits the existing architecture of the web.

There’s also a great introduction to REST ideas at

I like the way Stefan characterises RESTful applications as specialisations of the REST principles: an application can provide the basic operations (allowing data to be read, caching observed, MIME types honoured, etc.) but still have a level of functionality that can only be used by a client which understands the API more fully (for example, POST operations that create new domain objects and require specific inputs).

And there’s a nice write-up of some of the frequently heard objections to REST, describing how you can achieve things like asynchronous operations. Many of these techniques are things that we’re currently using on projects at Talis.

Cultural Agoraphobia

Interesting to see that John Naughton was talking about open data and “cultural agoraphobia” in The Observer this weekend.

Talis, where I work, has just announced its Talis Connected Commons, which offers free storage of semantic data sets to anyone – as long as the data is open. Peter Murray-Rust, who is quoted in the Observer article, is one of the people who will hopefully be making use of the Talis triple store.

And I’ve just been listening to Paul Miller’s recent podcast with Reuven Cohen about the Open Cloud Manifesto, which tries to create a coherent idea of what an “open cloud” might be (although that seems to be more concerned with interoperability and portability than with openness of data). The manifesto seems to have caused some controversy along the way, even making it into The Economist.

It certainly feels like there’s a tipping point approaching for the next version of data on the web, but (as usual) the barriers are more cultural than technical.

Validate your inputs

Today’s podcast listening for the commute was Bruce Sams talking about web app security, from Software Engineering Radio.

Starting with a live demo of some hacking techniques (surprisingly effective even with just the audio), it covers some of the popular attacks – SQL injection, JavaScript in input fields, cookie stealing, guessing adjacent ID numbers and so on.

Apparently about 70% of web app vulnerabilities come from the inputs to the system – we spend a lot of time worrying about things like SSL and encrypted logins, but actually the vast majority of attacks use the applications themselves.

When he’s asked for his top 10 tips for making your web app secure, Sams says:

  1. Validate your inputs properly
  2. See (1).

An interesting aspect of validation, though, is that it applies not just to the obvious things like form fields and text strings, but to all the HTTP header elements as well.

For example, WordPress MU (in versions prior to 2.7) had a function that would echo the HTTP Host header without sanitising it. An attacker can craft a request that contains some JavaScript in the Host header which, when echoed, can grab cookies (or do other evil cross-site scripting stuff).
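To make the header point concrete, here is a minimal shell sketch of the two usual defences: reject a Host value that contains anything outside an allow-list of characters, and HTML-escape anything that gets echoed back. The function names are made up for illustration, and this is not the WordPress MU fix itself.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not taken from WordPress MU.

# Succeed only if the value is non-empty and contains nothing but letters,
# digits, dots, hyphens and colons (enough for "example.com:8080").
is_plausible_host() {
        case "$1" in
                ''|*[!A-Za-z0-9.:-]*) return 1 ;;
                *) return 0 ;;
        esac
}

# Replace the characters that are special in HTML, so the value is
# displayed as text instead of being interpreted as markup.
# The ampersand is handled first so the later entities are not re-escaped.
html_escape() {
        printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' -e 's/"/\&quot;/g'
}
```

With these, `is_plausible_host '<script>'` fails, and `html_escape '<script>'` prints `&lt;script&gt;`, which a browser shows as text instead of running it.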

Resetting the oil service light on a 2006 BMW 320i

The oil service warning light on the BMW was still on after the last oil service, so I looked around for how to reset it. Lots of advice on the internet, but mainly involving special tools (aka paperclips) and dodgy shorting-out of pins in the diagnostic port.

There’s also talk of holding down the trip reset button whilst turning the ignition key to the first position, then releasing and pressing and holding the button again. But there IS no ignition key in the recent BMWs..

The answer turned out to be a bit more easily guessed (after a bit of trial and error). When you start the engine, the oil service warning shows on the dashboard, at which point you just press the BC button on the end of the indicator stalk (which is how you access most of the diagnostics anyway). The word “reset” appears, and another click and hold on the BC button makes it show “resetting” (and a little clock icon so you know it’s busy). And that’s it.

Put away the paperclip.

PhpUnit Mocks that suddenly stopped working

We came across a strange thing with the mock framework in PhpUnit.

We had some test code that created a mock for a Store class, and gave it a method to mock out:

        $mockStore = $this->getMock('Store', array('save'));

That seemed to work fine in the tests: we could set up expectations, the correct return values got returned, and so on.

Then, in some unrelated code, we changed some require_once statements, and suddenly the test with the mocks stopped working – it threw exceptions trying to create the Store mock. 

It turns out that the constructor for the Store class needed some parameters, and the mock framework needs you to supply those parameters when you create the mock, because it will call the real constructor behind the scenes.

The mock had been working OK previously because the real Store class hadn’t been loaded anywhere by the test – but then the changes to some require_once statements elsewhere in the code meant that the Store class WAS now loaded, and instantiated by the call to create a mock.


So the mock framework in PhpUnit will create a mock for you even if it has no idea what class it is that you’re trying to mock out – you could do

        $mockStore = $this->getMock('foo', array('save'));

and it would still give you a mock that worked fine. If it DOES find a class of that name, though, it’ll instantiate it.

In fact, this behaviour is probably a good thing for TDD – it means you can create the mocks you need before you’ve ever created the real classes. You just have to be aware that when you DO create the real class, the mocks will start being created based on the real thing.


The solution in our test was to use the flag that tells the mock framework not to call the original constructor. It’s a bit clumsier, because you also have to supply the parameters that come before it:

        $mockStore = $this->getMock('Store', array('save'), array(), '', FALSE);

JISC dev8D conference 2009

I spent a couple of days last week at the JISC dev8D “un-conference” and gave a talk on building semantic web applications. The slides are at

Some of the highlights for me from the lightning talks were Georgi Kobilarov of the Freie Universität Berlin, and George Kroner from Blackboard.

Georgi Kobilarov gave a great description of Linked Open Data, why it matters, and how DBpedia makes use of it, specifically for powering the tagging vocabulary of BBC data so that cross-site connections can be made automatically. Nice to have real-world examples of the strength of LOD, and it points the way for more uses of the DBpedia data set.

George Kroner talked about the IMS Learning Tools Interoperability APIs in the latest version of Blackboard. The APIs are also implemented by Sakai, so (in theory, at least) plugins for one will work in the other. I haven’t seen any examples of it in practice, but hopefully will soon. (More in-depth discussion at

Thursday 12th included some discussion sessions about VLEs (Blackboard, Moodle, Sakai) and about using collaborative tools (Google Docs, VOIP, MediaWiki). Both were enlightening, though the fact that SOAS have moved their entire email and collaboration infrastructure to Google Docs caused some heated(-ish) debate about the dangers of concentrating even more data in the hands of Google. No firm conclusions were reached..!

Versioning the Cloud

London Legion No. 33 Higman

It always amused me that when I searched for myself in Google Images (go on, everyone does it..) I’d find this unrepresentative picture of me from the days when I played ice-hockey.

But then recently it disappeared. The team website got revamped, and – since I haven’t been in the team for years – my dodgy picture vanished.

A bit of experimentation with the Wayback Machine, though, turned up lots of content from the previous incarnations of the team site. The Wayback Machine has made a valiant effort to record snapshots of the entire Internet, going back to 1996. It’s a bit hit-and-miss, but there’s enough there that you can retrieve long-deleted content, if you know what you’re looking for.

There’s a wider problem, though, that’s getting some attention at the moment – how do we preserve the Internet (or a snapshot at any given point in time) so that future enquiries about the “way things were” can be answered? (Here’s Lynne Brindley of the British Library talking about Digital Heritage).

A more subtle problem, for semantic web enthusiasts, is this: if we’re now working with a web of data, rather than a web of documents, how can we tell what the exact version was of every piece of data that contributed to the results of a query? Data may be collated from any number of sources, some of which may be more reliable than others, so that – even on a given day – you may get varying results for the same query, depending on precisely which bits of information were available at that instant.

And the ontologies that describe the data may also change, so that the relationships between bits of data will have mutated. (See SemVersion for some thoughts on the way ontologies could be versioned).

If saving snapshots of the web of documents is difficult, then doing the same for the web of data will be an order of magnitude harder.

Just in case, then, I’m preserving my dodgy old picture here, for all time (until this blog gets deleted, anyway).

CruiseControl is dead, long live Hudson

We’ve been using CruiseControl as our continuous integration system for ages, but problems with Subversion checkouts finally drove us to try Hudson as an alternative.

It’s fantastic – configurable from the UI, it archives build logs as well as artifacts, it’s got console output in the browser, and so on. Why didn’t we switch earlier?!

Anyhow, only a slight modification was needed to make the lava lamps work with Hudson.

The lamps are controlled by an IP power switch, with the 4 outlets turned on or off by hitting a URL with some GET parameters (ok, it’s not RESTful, but come on..). A cron job invokes the script every minute with the “check” command to parse the RSS feed of latest build results from Hudson, and light the lights accordingly.

Cron also calls the “off” function after 5pm, to save the lamps from burning out overnight.

The script is something like this:

check() {
        # check Hudson on localhost and switch on lamps accordingly
        # fetch the feed once so both counts come from the same snapshot
        FEED=$(wget -O - -o wget.log http://localhost:8080/rssLatest)
        FAILCOUNT=$(echo "$FEED" | grep -c FAIL)
        PASSCOUNT=$(echo "$FEED" | grep -c SUCCESS)
        if [ "$FAILCOUNT" -ne 0 ]; then
                echo "$(date) : BUILD HAS FAILED"
                fail    # assumed: the listing lost the call that lights the lamps
        elif [ "$PASSCOUNT" -ne 0 ]; then
                echo "$(date) : BUILD IS OK"
                pass    # assumed, as above
        fi
}

pass() {
        # switch outlet 1 on and 3 off
        # the URL is quoted so the shell does not treat the ? as a wildcard
        wget "http://ip-switch.local/Set.cmd?CMD=SetPower+P60=1+P62=0" -q --delete-after
}

fail() {
        # switch outlet 3 on and 1 off
        wget "http://ip-switch.local/Set.cmd?CMD=SetPower+P60=0+P62=1" -q --delete-after
}

off() {
        # switch them all off and go home
        wget "http://ip-switch.local/Set.cmd?CMD=SetPower+P60=0+P62=0" -q --delete-after
}
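The listing doesn’t show how the script picks a function when cron runs it with “check” or “off”. One way to wire that up, as a rough sketch: the `dispatch` function, the stand-in function bodies, the script name and the crontab lines in the comments are all illustrative assumptions, not part of the original script.

```shell
#!/bin/sh
# Stand-ins for the real functions, so this sketch runs without wget
# or the IP power switch.
check() { echo "check called"; }
off()   { echo "off called"; }

# Map the first command-line argument onto one of the functions above.
dispatch() {
        case "$1" in
                check) check ;;
                off)   off ;;
                *)     echo "usage: lamps.sh check|off" >&2
                       return 1 ;;
        esac
}

# In the real script the last line would be:  dispatch "$@"
#
# Hypothetical crontab entries (script path and exact times are assumed):
#   * 8-16 * * 1-5  /usr/local/bin/lamps.sh check
#   0 17   * * 1-5  /usr/local/bin/lamps.sh off
```

Keeping the “check” entry to working hours matters here: if it carried on running every minute after the “off” call, the lamps would be switched straight back on.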