Hitting data-driven home runs: How the Texas Rangers win by harnessing Prophecy in their data mesh architecture


Learn how the Texas Rangers use Prophecy and Databricks Lakehouse as the foundation of their data mesh architecture to gain a competitive advantage with low-code data engineering.


Alexander Booth
Assistant Director of R&D
Texas Rangers Baseball Club
May 25, 2023


Within the Texas Rangers, we have always recognized the significance of utilizing data to gain a competitive advantage. As advanced analytics and data science became more prevalent in the sports industry, we knew that we needed to elevate our data architecture to stay ahead of the curve. However, this transformation did not occur overnight.

Looking back, it is clear that where we are as an organization today is vastly different from where we were just a few years ago, particularly regarding our data architecture. In this blog post, we will delve into how our team's data mesh architecture, which is powered by Prophecy and Databricks Lakehouse, has transformed our data management practices and allowed us to field the best team possible.

Initially, our baseball operations data environment was a legacy on-premises architecture that lacked flexibility, had limited extensibility, and was incredibly challenging to scale. When we transitioned to a cloud data warehouse, we encountered new issues: maintenance-heavy, costly data management and concerns about data accuracy. We have now transitioned to a modern data mesh architecture that uses Databricks Lakehouse and Prophecy to provide us with greater flexibility, agility, and scalability in our data management, while also enabling our team to self-serve their data needs. So what led us to make this crucial change?

Our need for collaboration

We actually run a pretty lean data operation inside the Rangers. Our entire team is composed of 20 individuals, but they are dispersed across different domains with different responsibilities. For example, we have data analysts who handle analytics and intelligence reporting across a wide range of domains, including the amateur draft, international player pools, the minor leagues, and scouting, and who even handle trade value optimization for any potential roster moves that might be considered. Each of these contributors works with distinct data sources and models. However, collaboration across domains is critical because, again, we are lean, and there is often overlap in responsibilities. Further, the statistics and metrics used as KPIs need to be consistent in intelligence reporting across these different domains. Our new data mesh architecture allows for that independence within a federated governance structure. Prophecy’s team-based workflows allow multiple users to work together on the same data product or pipeline, regardless of their specific domain. These data products are also completely shareable and reusable, so our analysts can leverage each other's work while avoiding duplication of effort.
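To make the point about consistent KPIs concrete, here is a minimal Python sketch of one shared metric definition reused by two domain reports. It is illustrative only and is not our production code: the names `BattingLine`, `draft_report`, and `minor_league_report` are hypothetical, and the numbers are made up. On-base percentage uses the standard formula (H + BB + HBP) / (AB + BB + HBP + SF).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BattingLine:
    """Hypothetical container for one player's counting stats."""
    at_bats: int
    hits: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int


def on_base_percentage(line: BattingLine) -> Optional[float]:
    """Single shared definition: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denominator = (
        line.at_bats + line.walks + line.hit_by_pitch + line.sacrifice_flies
    )
    if denominator == 0:
        return None  # no plate appearances that count toward OBP
    return (line.hits + line.walks + line.hit_by_pitch) / denominator


def draft_report(line: BattingLine) -> dict:
    """Hypothetical amateur draft report that reuses the shared metric."""
    return {"domain": "amateur draft", "obp": on_base_percentage(line)}


def minor_league_report(line: BattingLine) -> dict:
    """Hypothetical minor league report that reuses the same metric."""
    return {"domain": "minor leagues", "obp": on_base_percentage(line)}


# Made-up stat line, for illustration only.
example_line = BattingLine(
    at_bats=400, hits=100, walks=50, hit_by_pitch=5, sacrifice_flies=5
)
draft = draft_report(example_line)
minors = minor_league_report(example_line)
```

Because both reports call the same function, the KPI cannot quietly diverge between domains, which is the property a shared, reusable data product is meant to give us.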

Adaptability and flexibility

The only constant in baseball is change. Baseball is a very fluid sport, and year over year we see massive amounts of change, whether rule changes, bat and ball changes, or new strategies on the field. Not to mention the explosion in available data sources that track everything from player movement to weather to pitch and hit tracking, down to the most minuscule detail. Given the dynamic nature of the game, it was critical to have the right environment in place to help us quickly and easily adapt to these changes, as they can create data drift in our predictive analytics and recommendation reporting that we need to account for quickly. We need the ability to be agile with our analytics, including supporting quick development for prototyping new ideas that could immediately impact our strategy on the field. This is important when the game is changing so rapidly around us. Prophecy and our new mesh architecture allow for that agility.
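As a rough illustration of what watching for data drift can look like, the Python sketch below compares the mean of a current sample against a baseline sample, measured in baseline standard deviations. This is a deliberately simple check and not a description of our actual models: the function names, the 0.5 threshold, and the sample values are all assumptions made up for the example.

```python
from statistics import mean, stdev


def mean_shift_in_sd(baseline, current):
    """Shift of the current mean from the baseline mean,
    expressed in baseline (sample) standard deviations."""
    return (mean(current) - mean(baseline)) / stdev(baseline)


def has_drifted(baseline, current, threshold=0.5):
    """Flag drift when the absolute shift exceeds the threshold.
    The 0.5 default is an arbitrary illustrative choice."""
    return abs(mean_shift_in_sd(baseline, current)) > threshold


# Made-up values standing in for one tracked feature in two seasons.
baseline_values = [92.0, 93.0, 94.0, 95.0, 96.0]
current_values = [94.0, 95.0, 96.0, 97.0, 98.0]

shift = mean_shift_in_sd(baseline_values, current_values)
drifted = has_drifted(baseline_values, current_values)
```

A check like this could run each time a source is refreshed, so that a rule change or equipment change that shifts a feature's distribution is surfaced before it silently degrades a model.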

Self-service analytics are critical

The area where Prophecy has helped us the most is in enabling our data analysts to work independently to create the production-ready ETL pipelines that drive our self-service analytics. The ability for these contributors, who are not comfortable coding, let alone coding with Spark, to use Prophecy’s visual code environment to build high-quality, performant data pipelines on their own is invaluable. This removes a whole host of bottlenecks to our self-serve analytics, as our analysts are empowered to do their own ingestion and transformations instead of waiting on our data engineers. By giving them the ability to do their own ETL within our mesh framework, everyone on our team can work more efficiently on the work that they specifically need to get done. This greatly benefits our data engineers, who can now focus on more difficult tasks and use Prophecy to quickly create the simpler pipelines as needed.

The results are game changing

So what are the quantifiable benefits that we’ve seen with our new data architecture? Prior to our use of Prophecy, the task of developing our big data pipelines was restricted to our lean data engineering team, which was already working at maximum capacity. Further, our knowledge of Spark was limited, so some problems required extensive research and code reviews to address. This resulted in a slowdown in all areas, from ingesting new data sources to fixing bugs, and typically allowed us to resolve only a couple of challenges per sprint cycle. We just were not moving fast enough to give the team the intelligence it needed to ensure a competitive advantage. But now, with Prophecy as a low-code ETL provider, we can increase our velocity for integrating new data sources, in some cases by 7x, going from a week to integrate down to a single day. This has led to a 10x increase in the velocity of delivering KPIs to stakeholders, thanks to Prophecy’s integration with our CI/CD systems. Now deployed pipelines can impact production environments immediately.

Finally, Prophecy has empowered more of our less-technical analysts to create pipelines for their specific data sources, pushing responsibility to the edge. These pipelines can then be integrated by our engineering team into the production ecosystem, which provides monitoring and some governance resources. Empowering other members of our data teams to build quality ETL pipelines allows our dedicated engineers to focus on more pressing and difficult problems.

Prophecy for the win

Every winning strategy on the field is supported by a foundation of data. That data needs its own technical strategy, especially in this new age of big data in baseball. We believe that Prophecy and Databricks are necessary tools as we create a World Series-caliber data architecture that fosters scalability and innovation. Baseball is a team sport, and it requires not only our data team, but a team of technologies working together to ensure the success of our data strategy.

Additionally, I recently participated in a great webinar with the good folks from Prophecy and Databricks, which you can watch on demand here, where we go into some detail about:

  • How we use data to identify and evaluate potential players for the Rangers
  • Why we chose the Prophecy low-code data platform as the data engineering foundation on our Databricks lakehouse
  • Best practices and tips for becoming a high-performance data engineering team

Oh, and if you’re interested in seeing an example of how we might use Prophecy to prepare data for a baseball analytics model, check out this fun demo video.

About the author

Alexander is a data scientist, data engineer, and application developer with extensive experience in using machine learning and artificial intelligence techniques, based on insights from big data, to communicate actionable decisions that help generate a competitive advantage for the Texas Rangers. He specializes in sports analytics with a particular passion for learning how innovation and new technology can shape the game of baseball.

Ready to give Prophecy a try?

You can create a free account and get full access to all features for 21 days. No credit card needed. Want more of a guided experience? Request a demo and we’ll walk you through how Prophecy can empower your entire data team with low-code ETL today.

