Use Spark interims to troubleshoot and polish low-code Spark pipelines: Part 1

Let’s take advantage of Spark’s interim metadata to understand our Spark job behavior with low-code tooling.

Anya Bida
Assistant Director of R&D
Texas Rangers Baseball Club
April 22, 2023

The Spark UI shows me some nice metrics: job completion time, number of rows read, number of rows written, and some related details...

...but I want to know how my pipeline behaves over time.
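One low-tech way to see behavior over time is to keep a small record of every run somewhere durable. Here's a minimal sketch in plain Python. It is not Prophecy's or Spark's own mechanism, and `RunRecord`, `append_run`, `load_history`, the JSON Lines file, and the sample numbers are all hypothetical names and values made up for illustration.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class RunRecord:
    # Hypothetical fields, mirroring the kind of metrics the Spark UI shows for one run.
    run_id: str
    spark_version: str
    duration_s: float
    rows_read: int
    rows_written: int


def append_run(history_path: Path, record: RunRecord) -> None:
    """Append one run's metrics to the history file as a single JSON line."""
    with history_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")


def load_history(history_path: Path) -> list:
    """Read every recorded run back, oldest first."""
    with history_path.open(encoding="utf-8") as f:
        return [RunRecord(**json.loads(line)) for line in f if line.strip()]


# Made-up sample runs, written to a throwaway temp directory.
history_file = Path(tempfile.mkdtemp()) / "run_history.jsonl"
append_run(history_file, RunRecord("run-1", "3.3.0", 42.0, 1000, 990))
append_run(history_file, RunRecord("run-2", "3.3.0", 45.5, 1100, 1085))

history = load_history(history_file)
for run in history:
    print(run.run_id, run.duration_s, run.rows_read, run.rows_written)
```

Once runs pile up in a file like this, plotting duration or row counts per run is a short step away.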

OK, but manually checking every run for pipeline success is not viable. I need testing and alerting!
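To make "alerting" a bit more concrete, here's a rough sketch of one possible check: compare the latest run's row count against the median of earlier runs and complain if it drifts too far. The function name `check_row_count`, the 50% tolerance, and the sample counts are my own assumptions for illustration, not a Spark or Prophecy feature.

```python
from statistics import median


def check_row_count(previous_counts, latest_count, tolerance=0.5):
    """Return an alert message if latest_count strays more than `tolerance`
    (as a fraction) from the median of previous runs; otherwise return None."""
    if not previous_counts:
        return None  # No history yet, so there is nothing to compare against.
    baseline = median(previous_counts)
    if baseline == 0:
        # Avoid dividing by zero: any non-zero count is a change worth flagging.
        return None if latest_count == 0 else f"expected 0 rows, got {latest_count}"
    drift = abs(latest_count - baseline) / baseline
    if drift > tolerance:
        return f"row count {latest_count} drifted {drift:.0%} from median {baseline}"
    return None


# Made-up history: three healthy runs, then one normal and one suspicious run.
previous = [1000, 1100, 1050]
print(check_row_count(previous, 1040))
print(check_row_count(previous, 10))
```

A scheduler or CI job could call a check like this after each run and send the returned message wherever alerts normally go.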

Historical metadata gets super handy when I want to compare my pipeline runs across multiple Spark versions. Check out Part 2 of this blog, where we troubleshoot individual DataFrames.
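For the Spark-version comparison, a minimal sketch could group historical run durations by version and average them. Everything here is assumed for illustration: the helper `mean_duration_by_version`, the `(spark_version, duration_s)` tuple shape, and the sample numbers, which are not real benchmark results.

```python
from collections import defaultdict
from statistics import mean


def mean_duration_by_version(runs):
    """Average run duration per Spark version.

    `runs` is an iterable of (spark_version, duration_s) pairs.
    """
    grouped = defaultdict(list)
    for version, duration in runs:
        grouped[version].append(duration)
    return {version: mean(durations) for version, durations in grouped.items()}


# Made-up durations, in seconds, for the same pipeline on two Spark versions.
runs = [("3.3.0", 40.0), ("3.3.0", 44.0), ("3.4.0", 35.0), ("3.4.0", 37.0)]
summary = mean_duration_by_version(runs)
for version, avg in sorted(summary.items()):
    print(version, avg)
```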

How can I try Prophecy?

Prophecy is available as a SaaS product where you can add your Databricks credentials and start using it with Databricks. Or you can use an Enterprise Trial with Prophecy's Databricks account for a couple of weeks to kick the tires with examples. We also support installing Prophecy in your network (VPC or on-prem) on Kubernetes. Sign up for your 14-day free trial account here.
