A global supplier of semiconductor equipment
The company ran several data science platforms on Databricks, each serving different business groups or units. The owner of one platform wanted to track how resources (e.g., Databricks notebooks and clusters) were allocated among teams, and to share that view with stakeholders through a dashboard. The REST APIs provided by Databricks did not expose all the information these requirements called for: the metrics had to be grouped by business unit and be available for daily, weekly, monthly, and quarterly periods. In short, the company wanted to go beyond the Databricks APIs to build a dashboard with the following requirements:
Our IA team studied the problem and recommended the implementation of Overwatch, a tool developed by Databricks, for analyzing log data from Databricks workspaces.
Databricks Overwatch is a monitoring and alerting solution that provides insight into the performance, cost, and usage of Databricks workspaces and clusters. It offers granular details such as pipeline performance, cost, ingress, and egress data, and it captures workspace activity as structured datasets that the company can use to support data-driven decisions.
The team collaborated closely with company stakeholders to ensure Overwatch's successful deployment and integration across all relevant platforms. Additionally, our engineers designed the solution to be extensible, making it suitable for use with multiple workspaces.
The new system logs user activity through Databricks Event Hub integration, and Overwatch extracts the logged data on clusters, notebooks, account logins, and jobs. The extracted data is stored as Delta tables for dashboard creation and further analysis.
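To illustrate the kind of roll-up the dashboard needs, here is a minimal Python sketch of grouping usage by business unit and reporting period. It is not the deployed implementation: the record fields (`business_unit`, `day`, `cluster_hours`), the unit names, and the helper functions are hypothetical stand-ins for rows that would be read from the Overwatch Delta tables.

```python
from collections import defaultdict
from datetime import date


def period_key(day: date, granularity: str) -> str:
    """Map a calendar day to a label for the requested reporting period."""
    if granularity == "daily":
        return day.isoformat()
    if granularity == "weekly":
        # ISO weeks start on Monday; the ISO year can differ from the
        # calendar year for days near January 1.
        iso_year, iso_week, _ = day.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    if granularity == "monthly":
        return f"{day.year}-{day.month:02d}"
    if granularity == "quarterly":
        return f"{day.year}-Q{(day.month - 1) // 3 + 1}"
    raise ValueError(f"unsupported granularity: {granularity}")


def aggregate_usage(records, granularity):
    """Sum cluster hours per (business unit, period) pair."""
    totals = defaultdict(float)
    for rec in records:
        key = (rec["business_unit"], period_key(rec["day"], granularity))
        totals[key] += rec["cluster_hours"]
    return dict(totals)


# Hypothetical sample rows; real rows would come from the extracted tables.
SAMPLE = [
    {"business_unit": "unit_a", "day": date(2023, 1, 2), "cluster_hours": 4.0},
    {"business_unit": "unit_a", "day": date(2023, 1, 3), "cluster_hours": 2.5},
    {"business_unit": "unit_a", "day": date(2023, 4, 10), "cluster_hours": 1.0},
    {"business_unit": "unit_b", "day": date(2023, 1, 2), "cluster_hours": 3.0},
]

if __name__ == "__main__":
    for granularity in ("daily", "weekly", "monthly", "quarterly"):
        print(granularity, aggregate_usage(SAMPLE, granularity))
```

Keeping the period labelling in one small function means the same aggregation serves all four reporting periods; in practice the equivalent grouping would more likely be written as a query over the Delta tables.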
Here are a few highlights of the expanded system: