Which professionals do I need when I want to launch a Big Data initiative? This is the most common question that arises when the director or CEO of a company plans to set up a department that seeks a competitive advantage through projects involving Big Data management. While the technology itself is still consolidating, the uncertainty is even greater on the exploitation side. Questions such as what team is needed, and what its roles and responsibilities should be, still have to be answered.
The different solutions that make up Big Data architectures have matured and consolidated considerably, from reading the data, through processing it, to storing it.
That has been the first obstacle to the adoption of Big Data. Once this first barrier is overcome, a second one appears, perhaps the most important from the business point of view: how do we exploit this great mass of data?
Big Data has required infrastructure that is relatively inexpensive compared with a traditional data warehouse, but it has demanded a great deal from the profiles who must take charge of this new infrastructure, which is built on very innovative solutions.
This question is where the differentiation of roles appears: the role of data engineering, the “Data Engineer”, and the role of research, or data science, the “Data Scientist”.
Are they the same? Do they do the same work? Do they have equivalent backgrounds? Obviously they have points of intersection, but they are largely distinct profiles that cover different objectives and require different depths of knowledge in their respective domains. In other words, it is very hard to find a single person who can build Big Data solutions and also interpret and model the data.
This search is not only difficult, it is actually harmful. A lot of time is lost searching, and if such a person is found, their focus on both fronts will not be enough to bring the projects to the expected results. There is even a risk that this overlap of roles cannot be fulfilled at all.
To show the characteristics of each role in detail, the table below describes the “Data Engineer” and the “Data Scientist” and lists their skills, languages and/or tools, and background. It shows that the two roles are related but markedly different.
For a Big Data project to meet its objectives and succeed, the project team must include people who can cover both of these roles.
Further reading: new trends in the Business Intelligence world point to the use of Big Data as a tool for learning more about customers. See our blog post: Big Data in Telecoms.