In search of a data scientist? If you know where to look, finding one might not be such a problem. I have written about the shortage of data scientists before, and what I find most interesting about this problem is that more and more signs suggest that one possible solution is already here.
The first part of this solution can be found in online education tools such as Coursera, founded by computer science professors Andrew Ng and Daphne Koller of Stanford University, which offers big-data-related courses. Another part of the solution can be found in data scientist communities such as Kaggle, which combine crowdsourcing and gamification elements and are turning data science into a sport. In the case of the latter, I came across an interesting way of using the Kaggle community.
EMC’s Greenplum division (which builds data analytics software) is joining forces with Kaggle to produce a Big Data engineer marketplace. Chorus, a Greenplum product, offers a data science tool for internal data workers. However, having the tool does not necessarily mean that a company has the right people to solve specific data problems. That’s where the Kaggle community comes in.
Chorus users will be able to search and examine the profiles of Kaggle users (by rank, expertise and location) who have participated in Kaggle’s online data competitions. Companies can then hire these data scientists by the hour to solve their data problems. Kaggle has registered 55,000 data scientists who are able to tackle problems such as unstructured text data, graph data, and missing values in data sets.
So, instead of employees just working with each other, they can call in a Kaggler with a few clicks to help them solve a problem at any time. It’s like working with software that has thousands of pre-installed data scientists. It brings some flexibility to the data science problem: companies can tap into a pool of talented data scientists when needed, without having to hire anyone extra. It also helps with assessment. Data science requires a range of skills, and from a CV it’s quite difficult to tell how good a data scientist actually is. But anyone who spots a Kaggle data scientist who has risen to the top of the leaderboard knows that they’re getting someone more than capable of doing the job.