Qubole to Databricks migration

Problem we found

A subsidiary of a major Fortune 100 pharmaceutical company provides robotic devices to healthcare facilities for the diagnosis and treatment of lung cancer and kidney ailments.

Customer Vision

  • Process robotic telemetry data from lab- and hospital-based deployments to optimize diagnostic capabilities for finding and treating tumors and other chronic ailments
  • Timely processing of telemetry data is critical to finding optimal traversals, identifying faults and complaints, and building optimized learning models to improve the robotic equipment

Technical Pain

  • Complex integration with other subsystems and scalability issues (petabytes of data)
  • Data processing was error-prone and lacked support from the existing vendor
  • Building new ML models and BI dashboards was time-consuming

Solution we implemented

  • Took on end-to-end implementation of the entire project.
  • Migrated all jobs from Qubole to Databricks.
  • Ingested telemetry data and made the data available in silver and gold layers.
  • Strong collaboration between Computomic, the Databricks Professional Services and Account teams, and the customer to generate new use cases for ML and Databricks SQL, in addition to building complex data engineering pipelines
  • Maintained highly collaborative communication, allowing all teams to work transparently and exceed customer expectations
  • Acted as advisor and thought leader for all things Databricks at the customer
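
To illustrate what "silver and gold layers" means in practice, here is a minimal sketch of the layering idea in plain Python. It is not the project's pipeline: the real jobs ran on Databricks, and this example swaps in ordinary Python lists and dictionaries so it runs anywhere. The record fields (device_id, ts, motor_temp_c), the sample values, and the helper names to_silver and to_gold are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw telemetry records; field names and values are illustrative only.
raw = [
    {"device_id": "robot-1", "ts": "2023-01-01T00:00:00", "motor_temp_c": "41.5"},
    {"device_id": "robot-1", "ts": "2023-01-01T00:00:00", "motor_temp_c": "41.5"},  # duplicate
    {"device_id": "robot-2", "ts": "2023-01-01T00:00:05", "motor_temp_c": None},    # missing reading
    {"device_id": "robot-2", "ts": "2023-01-01T00:00:10", "motor_temp_c": "39.0"},
]

def to_silver(rows):
    """Silver layer: drop rows with missing readings, cast types, remove duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        if row["motor_temp_c"] is None:
            continue
        key = (row["device_id"], row["ts"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({
            "device_id": row["device_id"],
            "ts": row["ts"],
            "motor_temp_c": float(row["motor_temp_c"]),
        })
    return cleaned

def to_gold(rows):
    """Gold layer: one summary row per device, ready for BI or ML consumers."""
    by_device = defaultdict(list)
    for row in rows:
        by_device[row["device_id"]].append(row["motor_temp_c"])
    return {
        device: {"readings": len(temps), "avg_motor_temp_c": mean(temps)}
        for device, temps in sorted(by_device.items())
    }

silver = to_silver(raw)
gold = to_gold(silver)
print(gold)
```

The point of the split is that cleaning rules live in one place (silver) and every dashboard or model reads the same aggregated tables (gold), rather than each consumer re-deriving them from raw data.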

Positive Outcomes

  • Reduced spend by 77%, saving $1M in compute costs annually
  • 10x data processing performance improvement, from ~8 hours to <45 minutes
  • Accelerated robotics optimization 
  • Simplified the Development and Deployment workflow
  • Simplified the Airflow integration 
  • Implemented Delta as single source of truth for mix of BI and ML workloads
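
As a quick sanity check on the headline numbers, the arithmetic can be worked through in a few lines of Python. The speedup follows directly from the quoted run times. The implied spend figures rest on an assumption not stated in the case study: that the 77% reduction and the $1M saving both refer to the same annual compute bill.

```python
# Figures quoted in the outcomes list above.
hours_before = 8.0        # "~8 hours"
minutes_after = 45.0      # "<45 mins", treated as an upper bound
savings_usd = 1_000_000   # "$1M in compute costs annually"
reduction = 0.77          # "Reduced spend by 77%"

# Speedup is a lower bound because the new run time is "under" 45 minutes.
speedup = hours_before * 60 / minutes_after

# Assumption: the 77% reduction and the $1M saving describe the same annual spend.
implied_prior_spend = savings_usd / reduction
implied_new_spend = implied_prior_spend - savings_usd

print(f"speedup: at least {speedup:.1f}x")
print(f"implied prior annual compute spend: ${implied_prior_spend:,.0f}")
print(f"implied new annual compute spend: ${implied_new_spend:,.0f}")
```

Under that assumption, the run-time figures work out to a speedup of roughly 10.7x, consistent with the 10x claim, and the prior annual compute spend would have been about $1.3M.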

“It has been great working with this team. They have brought in the expertise needed to help us migrate an extremely complex system involving Qubole, Airflow and custom Kubernetes Serverless functions. Migrating to Databricks has allowed our Data Scientists to get to data quicker for extracting information required to optimize our robots in the field performing medical procedures and better report on faults, complaints and regulatory audit processes.” Senior Data Lead