
Development of a Powerful Data Science Team

Data science has become increasingly professionalized and standardized in recent years. The often intrinsically motivated data tinkerer, who fills the "analysis" niche in his business with extensive company-internal data and process know-how, is reaching his limits.

Increasing demands, especially as customer focus grows across all industries, force businesses to professionalize their data science structures. This includes knowledge, the available data sources and their preparation, and the data science products already used in the business.

Professionalization of Data Science.

As a result, in our everyday consultancy work we are asked more and more frequently for advice on developing a powerful data science team. As an organization develops, the question quickly arises of how to institutionalize the one-man show described above and integrate it into the business's network and/or its organization chart.

The question of whether to bundle the data science competence in one team or to distribute it across various departments is decisive in this process. This blog article focuses on that question.

Of course, this is only one of the relevant questions, but it should be clarified early on. Further questions relevant in this context will follow at the end of this article, for example the position on the organization chart: should the potential team be a staff unit or should it be part of a department?

The Applications

Which objective is pursued by the development of a data science team? The answer to this question is critical for its orientation: Will the data science team be centralized in one department, or will individual data scientists be placed in the respective departments?

When defining the objective and/or the data science product, the difference between an ad hoc analysis and an analytical application is decisive:

Ad hoc analysis: The analysis is static. It is not used productively and is not integrated into an application. A classic example is a customer segmentation derived from a snapshot of the customer base at a given point in time. Recommendations for action are derived, implemented and evaluated at a later point. However, this does not happen continually.

At the other end of the continuum is the analytical application: The segmentation is used to categorize customers who visit the website and thus to individualize the user experience. The information required for this purpose can be processed at runtime. However, the customer of a data science department does not always need to be an end customer; interactive applications can also be made available to a small circle only, e.g. on the intranet. An interactive visualization of a scenario calculation is one example.
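The contrast between the two ends of the continuum can be illustrated with a minimal sketch. It assumes a deliberately simple segmentation (revenue terciles) and made-up data; the names `SNAPSHOT`, `fit_segments` and `assign_segment` are hypothetical and only serve the illustration. The same segmentation is first used once for a static report (ad hoc analysis) and then called per visitor at runtime (analytical application).

```python
from statistics import quantiles

# Hypothetical snapshot of yearly revenue per customer (made-up numbers).
SNAPSHOT = [120, 80, 950, 300, 45, 610, 220, 1500, 75, 400]


def fit_segments(revenues):
    """Derive two cut points (terciles) from a one-off snapshot."""
    low_cut, high_cut = quantiles(revenues, n=3)
    return low_cut, high_cut


def assign_segment(revenue, cuts):
    """Classify a single customer using previously derived cut points."""
    low_cut, high_cut = cuts
    if revenue < low_cut:
        return "low"
    if revenue < high_cut:
        return "mid"
    return "high"


cuts = fit_segments(SNAPSHOT)

# Ad hoc analysis: segment sizes are counted once for a static report.
report = {}
for revenue in SNAPSHOT:
    segment = assign_segment(revenue, cuts)
    report[segment] = report.get(segment, 0) + 1

# Analytical application: the same logic is called at request time,
# e.g. to pick a personalized page variant for a website visitor.
visitor_segment = assign_segment(700, cuts)
```

In the first use the result ends up in a presentation and ages with the snapshot. In the second use the function is part of a running system, which is where software development skills and lifecycle questions start to matter.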

(De-)Centralization

Thus, the two extremes are a data science competence center on the one hand and department-internal data scientists on the other, each with its own advantages and disadvantages.

Centralization in a data science competence center

 

[Figure: advantages and disadvantages of centralization]

 

Decentralization of the data science resources in the respective departments

 

[Figure: advantages and disadvantages of decentralization]

 

Conclusion

As soon as the analytical objective of an organization moves towards "application" on the axis "application vs. data products", centralization within a data science competence center is required. It allows the deeper specialization that developing an application demands: In addition to the competence to understand the business model, to prepare data and to create and implement models, software development skills should also exist within the team. These are entirely separate skills: the relevant front-end languages, possibly the technical architecture, and an understanding of software development lifecycles. Such a comprehensive task portfolio, and the skill profile it requires, can hardly be covered by one employee alone.

The biggest criticism of a central unit is its allegedly missing understanding of the business processes in a specific department, e.g. marketing: "How can someone who usually does logistics optimization suddenly handle bid management?" is a frequent objection. In order to provide the department with a competent contact person who can also serve as a sparring partner, specialization within the data science team is required.

Thus, the continuous development of the data scientists is ensured, and the team structure provides the freedom required for modelling. In practice, we frequently see that good data scientists are not able to fulfil their actual tasks and instead primarily handle short-term analysis and reporting tasks.

As these lines show, successfully developing a data science team is not easy, but it is a very critical decision for the business. Choosing the right organizational form, which ideally reflects the organization's objective, is only one of the relevant issues. Sooner or later, questions arise such as "Which processes, methods and/or standards are expedient?" or "How does project management work in the context of working with uncertainty?" More about that in upcoming blog articles.