Open source, analytics and the pace of change

I love spotting ironies, such as how this year’s Strata | Hadoop World conference (the UK one) spent more time discussing Apache Spark, and whether it is a successor to Hadoop or just another tool in the box, than it did discussing Hadoop and its applications. It was great to see members of the insurance industry there amongst the retailers and banks as well.
“But wait?!?!!” I hear you say, “Hadoop isn’t all that old, is it?” Herein lies the great challenge for the CIO faced with requests for open source tools. These are dynamic, social projects without the same stickiness as the legacy systems insurers spend time worrying about. Not only do the users, consumers and fans of open source software shift between projects, but the contributors and developers do too.
With the rising use of tools like R, Python, Linux, Git, Hadoop, Spark, Docker, Capistrano and all manner of wacky projects being adopted by insurers, how should a CIO respond? Prohibition tends to lead to shadow IT, and to surprises down the line that are far more unwelcome than managing some new software. The key advice is to understand that these types of projects can be more transient than other enterprise software. Experiment with them, but be careful of expensive enterprise installations that are hard to extract later. In truth, insurer adoption of some of these technologies will outlive the fashion for them, but that still requires planning for their removal or, worst case, their ongoing support.
I promised analytics in the title too, didn’t I? Well, Spark is all about real-time analytics and is having an interesting impact in the machine learning and predictive modelling space. It gets around some of the issues with interacting with Hadoop while still delivering performance. With open source projects, survival of the fittest is the order of the day, far more so than in classic insurance software markets.
Hadoop has its place, with many insurers globally investing in it. We will see new fashions in analytics approaches and more open source tools, I’m sure. Some will go the way of the dodo. For those interested in Hadoop, have a look at my report from 2011, when Hadoop was new and cutting edge. It seems it requires an update.