Cloudera to Databricks migration

Problem we found

A division of a major MedTech company had no way to increase the capacity of its existing data platform to support enhanced services and new use cases.

Customer Vision

  • All on-prem and legacy data platforms were being reassessed against cloud-first and price/performance criteria. Multi-cloud support was a requirement, so the choice came down to Databricks vs. Snowflake. Databricks was selected.
  • Replace the legacy Cloudera estate, which was on-prem, manual and had a relatively high total cost of ownership (TCO), with a modern data architecture that allows near-real-time processing of data and analysis using SAS and Tableau.

Technical Pain

  • No real-time availability of data and reporting, with the analytics team relying on old technology and incomplete data
  • Data was stuck in different systems which could not communicate with each other
  • Processing data was slow and manual
  • No ability to add new use cases or enhanced services with the existing legacy technology

Solution we implemented

  • End-to-end implementation of the entire migration from Cloudera to Databricks, completed on time and on budget
  • Reviewed the current architecture and Cloudera jobs
  • Pilot migration of 10 Cloudera transformation jobs that served as the reference for the remaining migrations
  • Full migration of 58 Cloudera transformation jobs in total
  • Sanity testing of the converted scripts and data comparison testing
  • Highly collaborative communication, allowing our teams to work transparently and deliver the project in less than 3 months
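
The data comparison testing mentioned above can be illustrated with a minimal sketch. It is not the project's actual test harness: the function names, the key column and the sample rows are all hypothetical, and it assumes that the output of a legacy job and of its migrated counterpart can each be read as a list of rows keyed by a unique identifier. It compares row counts and a per-row fingerprint.

```python
import hashlib

# Hypothetical marker so that a NULL value and the literal empty string
# do not produce the same fingerprint.
NULL_MARKER = "\x00NULL"


def row_fingerprint(row, columns):
    """Hash the selected columns of one row into a stable fingerprint."""
    parts = [NULL_MARKER if row.get(c) is None else str(row[c]) for c in columns]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def compare_tables(legacy_rows, migrated_rows, key, columns):
    """Compare two row sets by key; report missing, extra and changed rows."""
    legacy = {r[key]: row_fingerprint(r, columns) for r in legacy_rows}
    migrated = {r[key]: row_fingerprint(r, columns) for r in migrated_rows}
    shared = legacy.keys() & migrated.keys()
    return {
        "legacy_count": len(legacy),
        "migrated_count": len(migrated),
        "missing_in_migrated": sorted(legacy.keys() - migrated.keys()),
        "unexpected_in_migrated": sorted(migrated.keys() - legacy.keys()),
        "mismatched": sorted(k for k in shared if legacy[k] != migrated[k]),
    }


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    legacy = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
    migrated = [{"id": 1, "v": "a"}, {"id": 2, "v": "B"}, {"id": 4, "v": "d"}]
    print(compare_tables(legacy, migrated, key="id", columns=["id", "v"]))
```

For large tables the same idea would usually be pushed down into the processing engine rather than run row by row in Python. The sketch only shows the shape of the check.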

Positive Outcomes 

  • Removed 100% of data silos
  • Integrated the Vision Data Analytics application with the customer’s Central Data Layer, allowing easy integration with new data sources
  • Enabled the wholesale migration of other apps using Cloudera and Teradata
  • Enabled the launch of new use cases without the need for upfront planning for hardware capacity
  • Near real time availability of data and reporting 
  • Significant performance improvement due to faster processing
  • Significantly lower TCO 

“I am delighted to announce the successful completion of the Cloudera Migration to Databricks. A special thank you goes out to the Team, for their dedication and expertise. Their role was pivotal in ensuring smooth communication and cooperation throughout the project: Sandeep Arabatti, Aaditya Mishra, Jayraj Perumal and Jerry Lee. Thank you all for your hard work, dedication, and contributions to making this project a success. I look forward to more opportunities for collaboration and success in the future.” Snr Director