Data science departments often use the older technologies that were in place when they launched, but the new generation of data scientists works with newer technologies such as R and Python. How can you solve the challenge of old vs. new technology?
Information Solutions Getting Old
Data science departments frequently rely on older technologies for statistical analysis, such as SAS and SPSS. These solutions were in place when established data science departments first launched, and their learning curve is steep, particularly given their age and inherent complexity. Most new graduates, by contrast, are branded as “data scientists” but have skill sets rooted in newer technologies, such as R, Python, Spark, Pig, Apache Hive, etc.
The end result is two sets of “data scientists,” each representing a different generation of statistical analysis methodologies. The challenge of old vs. new technology has intensified in recent years as the data science industry has grown and the need to hire new talent has increased.
How to Design Your Big Data Technology
From an HR standpoint, there are essentially three paths available, each with its own pros and cons:
Abandoning old technologies and switching to new technologies
In this situation, the data science department changes its approach to development by abandoning older technologies (e.g., SAS, SPSS) in favor of newer options. This enables the department to hire new data scientists who can onboard quickly and become productive with little downtime. However, changing the core architecture of a data science lab has ramifications for both existing employees and the development process as a whole. When the department caters to newer technology, existing employees face the challenge of updating their skill sets.
Keeping old technologies and training new hires
With this option, the opposite approach is implemented: older technologies are retained and new hires are trained to use them. As mentioned, these older options are both complex and robust. They’ve simply been around longer and, consequently, require an advanced skill set to master. The immediate benefit of this approach is that, unlike switching to new technologies, it does not disrupt your lab’s productivity. The downsides are the learning curve for new hires and the risk of becoming an antiquated data science laboratory over time.
Keeping old technologies and pursuing new technologies (Hybrid Approach)
A third approach combines the two options above: keeping the old technologies and using the new ones in parallel. In this scenario, established employees are free to continue development using older technologies while new employees develop using the new technologies. In other words, nothing is sacrificed and both paths are pursued at the same time.
How to define your hybrid technology approach?
A data science department needs to hire people in order to grow, so this challenge cannot be avoided. Early in its evolution, a department has to prove the value of its existence. Once it has delivered on its first project, demand (and the need to hire new employees) will increase. The solution is to implement a tool that enables all parties, regardless of skill level and expertise, to work together.
In a competitive market, a data science department can only survive if it can reliably deliver results. This means using a tool that is workflow-centric while supporting meaningful collaboration between all employees.
How do you define this ideal tool? Our ebook, "Building a Successful Datalab", presents a complete methodology to help you design this hybrid tool, which will let you combine old and new data technologies.