Old Servers Never Die …

… they just slowly virtualise away.

‘Research servers slowly fade away’ – Part 2

This post might have to be filed under the category of “Bloomin’ Obvious” but anyway…

In a recent email I commented that “most researchers don’t tend to decommission a machine until it dies from underneath them” – my respondent replied that this is one of the findings from the KPMG Audit that the Digital Campus initiative was seeking to address.

So as I was performing a more exhaustive audit of the School’s server resources, I kept an eye out for any servers that could be put out to pasture. The answer was that perhaps 5% of them fell into this category – the Sun Solaris machines that are now well and truly past their prime.

The other servers? Well, the metal has been repurposed into virtual hosts and the instances have been converted into virtual clients. The virtual server files are held on a big, networked iSCSI array, and the instances themselves are split over a number of machines that have either been bought as bespoke virtual hosts or converted from old, ‘physical’ servers.

This means that the instances can now keep going for as long as we have the determination to maintain them. If a host fails then the client can be moved to another machine and the metal either repaired, replaced or scrapped.
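To make the “move the client to another machine” step concrete, here is a minimal sketch assuming a libvirt/KVM setup (which is one common way to run this sort of thing – the post doesn’t say which hypervisor we actually use); the guest and host names are purely illustrative:

```python
# Sketch: building the live-migration call used when a virtual host
# fails or needs repair. Assumes libvirt/KVM; the guest and host names
# below are hypothetical examples, not real machines.

def migrate_command(guest, dest_host):
    """Return the virsh invocation that would move `guest` to `dest_host`."""
    return ["virsh", "migrate", "--live", guest,
            "qemu+ssh://" + dest_host + "/system"]

# In practice this list would be handed to subprocess.run();
# here we just show the command that would be issued.
print(" ".join(migrate_command("old-research-vm", "spare-host.example.ac.uk")))
```

Because the disk images live on the shared iSCSI array rather than on the host’s local disks, a move like this only has to transfer the guest’s memory state, not its storage.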

There is really no need to retire an old virtual server if we have the capacity to keep it going alongside the newer ones.

This doesn’t mean that old servers should never be decommissioned – but virtualising them makes the process less abrupt – they can be kept running until all their users migrate to a newer instance and then the setup can be mothballed.

There are many reasons a researcher may argue for maintaining access to an old setup (see the previous post); virtualisation means that there’s one less reason to remove it within a short timescale.

 

IT Support or Supporting IT?

I was talking to a colleague the other day and he mentioned how difficult it was to get IT Support in their unit.

He said that their IT Support technician was always busy sorting out problems for the students and fixing other machines, so he was rarely able to help staff with their issues.

I was intrigued, as ‘sorting out student problems and fixing machines’ sounds like a reasonable description of a technician’s ‘IT Support’ duties to me, so I asked for an example.

“Well, the other day somebody was showing me their Word Document and it had an Excel Pie chart in it. We discussed changing a couple of parameters in the statistics and so he opened up the Excel sheet, changed the details and the chart in the Word Document updated automatically! That seemed really useful to me so I emailed our chap to ask him to help with setting something like that up. I’ve not heard anything back from him yet and it’s been two weeks!”

Now, whilst this has an element of IT in it and would, indeed, provide support to the story’s main protagonist, I don’t believe that this is what most IT Staff feel is their main remit for ‘IT Support’!

However…

What if this is what a sizeable number of non-IT staff feel *is* what is meant by ‘IT support’? That is, when asked “How good is your IT Support?” they answer “Not great” – not because the IT support is actually lacking, but because they don’t realise that their definition of ‘IT support’ has more to do with their own Digital Literacy and less to do with what their IT Support staff actually have as their main role.

If we’re trying to fix a gap in the provision of an IT service, we should always identify the true definition of the problem rather than blindly assume that we all mean the same thing by the same terms.

Preserving Murphy’s Legacy

We had a support call the other day from one of our academic researchers. He was working with one of his old postdocs and they were busy putting a project together based on one of their old papers.

The trouble was that the postdoc was having difficulty accessing the setup they’d originally used. It took a few emails to finally get to grips with the problem: he was using a deprecated server, trying to access software that had been updated and made obsolete three years ago. We had archived the software but not completely removed it – we know our staff, and the chance of something like this happening was reasonably high.

The thing is, this is nothing unusual. We try to ensure that our staff have access to the most up-to-date machines and software that their research money can buy; however, no-one likes to let go of a setup that has served them well in the past – they simply have too much time and experience invested in it.

To be honest, the only time we finally get to let go of old software is when the manufacturer no longer provides a licence for it, and of server hardware only when the cost of repair is prohibitive. Actually, these days, with virtualisation, old ‘hardware setups’ can often keep going on new metal.

Enabling our staff to use the systems that they are used to – systems that produce results in a fashion they understand and can easily work with – is a service that we provide. Having to re-learn a package every time it gets updated can be remarkably frustrating for power users (and this applies to Microsoft Office as well as some of the more esoteric suites), and sometimes an update can change the result of a simulation for the worse – not what one needs when one is trying to produce research material from that simulation data.

People can look at our servers and may question why we’re still running them and why they run packages that are several years old. The answer is that we’re trying to ensure that our staff have the services they want as well as the services they need.

It is possible to remove a live system after deprecating it: warn the users and watch the usage level; when it drops to zero for a month or so, take the system off-line. Then leave it there for a year (or some other length of time that has been communicated to its user base). Then mothball it. Then pull it out of storage and recommission it when an academic researcher and his postdoc try to develop a project based on an old paper. Or it could just be left ready to power up, because history has taught us that removing something that still works will require its reinstatement at some point – and that point is usually when it takes quite a lot of effort and knowledge-trawling to get it back to its former state. The Law of Murphy is pretty consistent in this respect…
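The ‘watch the usage level’ step above is easy to automate. Here is a minimal sketch – the function name and the 30-day quiet window are my own illustrative choices, not a stated policy:

```python
from datetime import date, timedelta

# Sketch of the deprecation watch: decide whether a system has been
# idle long enough to take off-line. The 30-day default is an assumed
# example window, not a real policy.

def ready_to_offline(last_logins, today, quiet_days=30):
    """True if no user has logged in within the last `quiet_days` days."""
    if not last_logins:
        return True  # no recorded activity at all
    cutoff = today - timedelta(days=quiet_days)
    return max(last_logins) < cutoff

# Example: last activity was roughly seven weeks before 'today'.
print(ready_to_offline([date(2014, 1, 3)], date(2014, 2, 25)))
```

A scheduled job feeding this sort of check from the server’s login records would flag when the communicated quiet period has elapsed – though, as the rest of this post suggests, acting on the flag is the hard part.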

Where do we draw the line between working with Murphy and ruthless efficiency?