Is data really the new oil?
I’d say it is if you believe that data science and analytics represent the fourth industrial age. Oddly enough, when I was here last fall, I met with the minister of interior and his team, and we talked about data science and analytics – what is right and wrong about it. In the end, he said to me, ‘data is the new oil.’ I take it from a reputable source that it is, in fact, true, and I do believe it. Ultimately, without data, this world is going to be very challenging.
Data science is still something of a mystery for IT leaders because it wasn’t around when most of them graduated. Do you think that it is a research skill that CIOs don’t have?
The emergence of the citizen data scientist has changed the optics on data science in general. There was a time, even five years ago, when you couldn’t talk about algorithms to anyone but PhDs and statisticians immersed in IT. Today, we routinely see analysts building sophisticated data models to do important things within their companies.
The enigma around data science has dissipated, and it is no longer about guys in dark rooms with thick glasses doing PhDs. It has come out of the shadows, and we have built an intuitive platform that makes it easy to do the work – even guys like me can build things like regression models and deploy them without much effort. I think the market has changed and will continue to change until we see data scientists working only on edge cases while everyone else works on the data that keeps organisations profitable.
Are you suggesting that any corporate executive without coding experience can build data models?
Nike used to have an old saying – if you have a body, you are an athlete. I’d say, if you have a brain, you are a citizen data scientist. It is incumbent upon software vendors to leverage user interface and user experience to eliminate or minimise the friction between the human and the computer. If you do that, you will have amplified human intelligence.
How do you plan to democratise data science?
Most of it is about a platform like ours, which makes it easy for people to engage. After smartphones came out, many people asked me this question – ‘how do I know if I am a data scientist?’ My answer was, ‘if you are downloading apps, you are a data scientist in some ways.’ The people who are creating data are the ones asking all the questions, and if you make it as easy as this smartphone experience, they will engage. The harder part of the problem is corporate culture, which is not in our control. This is a make-or-break moment for Chief Digital Officers around the world because if they can’t change the company culture to one of collaboration between haves and have-nots, they are probably not going to last long.
Is there any reason why you use R instead of the more popular Python as the programming language?
We have pre-configured 40-odd popular algorithms and incorporated them as tools in our platform. So, when you drag down a linear regression model or k-means model, you shouldn’t have to go and write it again because it already exists. We also support Python, and over the last five years, the growth of Python has outstripped that of R. But it turns out that, for the most part, scientists use both for different use cases, and what we have delivered is a platform that allows other people to innovate by adding their own tools and algorithms.
How about other languages and platforms such as Julia and KNIME?
We won’t make the mistake of allowing those languages to reside in Alteryx until they are proven. Scala is probably the next one we will embrace, especially when it comes to large data sets through Spark. Even that is too early, and you have to remember that data science doesn’t necessarily reward the first mover.
Any plans to support natural language processing?
It is one of those emerging technologies that is exciting and cool, but it hasn’t been proven yet. I’d love to have NLP and NLG (natural language generation) translating complicated things into simple words. After all, not everyone is equally adept at understanding what to do with a model score.