Prophecy takes in $47M, to scale up and reimagine data integration with AI
I’m pleased to share that Prophecy today announced a $47M Series B1 round. Smith Point Capital led the round, with HSBC joining as a new investor and participation from existing investors including Berkeley SkyDeck, DallasVC, Insight Partners, JPMorgan Chase and SignalFire. We’ve made progress quickly, doubling the business since the original Series B, and see a tremendous opportunity ahead with increasing demand for data infrastructure for analytics and AI.
It’s a good time to reflect on the major milestones from our startup journey, as we’re filled with optimism and excitement for the year ahead!
Spotting a crucial problem
Starting out, the first thing we needed to do was find a high-value problem to solve. My co-founders and I explored various ideas for new AI and data applications, but every idea we explored had a critical dependency on clean, high-quality data. When we looked at what enterprise customers were doing with their data, we saw large and growing backlogs everywhere: the mismatch between data “demand” from business teams and the “supply” of engineering capacity was stretching requests out for months. These were large enterprises with money, tools, teams, and time, and yet they consistently struggled to reliably get value from their data. Data integration as a category had been around for two decades, but no one had really solved this problem.
We decided to go after this foundational opportunity and re-imagine data integration.
The first test - trial by finance fire
In a critical first step on this journey, we won the opportunity to modernize data transformation at one of the largest payment networks in the world. Their data team was struggling with a legacy ETL tool. The team wanted to take advantage of the scale and speed of Spark but was faced with the high cost and scarcity of data engineers. We secured a seed round and started a project to convert their legacy ETL pipelines into Spark pipelines. We learned from their enterprise data team what it took to support complex pipeline development at a tremendous data scale.
These key requirements emerged:
Visual pipelines
- Developer productivity: Developers said that while they could write code, their productivity significantly improved with a visual design environment. This approach allowed them to view transformation logic, preview data, see production errors, and get performance stats—all within the same visual pipeline.
- Rapid onboarding: Team leaders highlighted challenges with turnover and onboarding new team members. They also faced difficulty editing existing pipelines that were unfamiliar to current team members. Visual pipelines proved invaluable for users of different skill levels to quickly understand and work with these pipelines.
- Collaboration: Code-only environments also created challenges for other roles. Metadata architects lost easy access to lineage, requiring multi-month efforts to meet regulatory compliance. Similarly, production support teams struggled to fix and redeploy quickly when faced with high volumes of inconsistent code written by different developers. Visual pipelines helped bridge these gaps, enabling broader usability and collaboration across teams.
The power of code
However, architects in the data platform teams extolled the virtues of code.
- Flexibility: Code provides unmatched customization and adaptability, meeting the demands of enterprise-grade workflows.
- Reliability: Adopting software best practices—open-format code, version control, tests, and CI/CD—ensures robust and maintainable pipelines.
- Performance and scale: As the engineers at our prospective customer pointed out to us, “nothing scales better at runtime than well-formed code.”
We experienced firsthand one of the defining tradeoffs of the data integration market. Code is fast and flexible, but the engineers who write it are scarce and expensive. Visual design is productive and collaborative, but can limit flexibility, performance, and scale in large enterprise environments.
Visual pipelines + code: The best of both worlds
Our unique approach was to attack this false choice and build a solution that merged the strengths of both.
- Open, standardized Spark code: The visual design environment had to generate Spark code that conformed to objective best practices, enabling the adoption of software engineering standards.
- Customizable visual components: Enterprises needed a way to consistently apply unique and sometimes proprietary transformations and integrations. Rather than relying on us to add dedicated transformations for every situation, customers needed to extend the visual layer themselves by writing code for new components and creating internal standards and connectors that scale across teams.
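To make this concrete, here is a simplified sketch of the kind of open, testable PySpark code a visual transform component can emit. It is illustrative only; the dataset and column names are made up for this example, not actual generated output.

```python
# Illustrative only: a simplified example of the style of code a visual
# transform component can emit as plain, version-controlled PySpark.
# The dataset and column names are made up for this example.
from pyspark.sql import DataFrame
from pyspark.sql import functions as F


def cleanse_orders(orders: DataFrame) -> DataFrame:
    """Standardize columns and drop malformed rows so downstream
    components receive predictable input."""
    return (
        orders
        .withColumnRenamed("ord_amt", "order_amount")
        .filter(F.col("order_amount") > 0)
        .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
    )
```

Because the output is just a function in open-format code, it can be reviewed in version control, unit-tested against a small DataFrame fixture, and promoted through CI/CD like any other software artifact.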
Our holistic approach to solving the data transformation challenge paid off. We were able to migrate thousands of pipelines from legacy ETL to the much more efficient Spark. We had proven our product approach and secured Series A funding.
Era of the modern data stack - following customers on a “contrarian” journey
The second milestone was for us to take those hard-won lessons and apply them to multiple mission-critical projects for large enterprise customers across retail, healthcare, finance, and insurance.
As founders, we were deeply engaged with those early customers, and at times we felt like “contrarians.” There was market buzz around product-led growth, but our enterprise customers weren’t asking us to prioritize delivery of a freemium model or to engineer our product for self-service onboarding, as so many startups were doing.
Instead, we stayed focused on the next round of customer needs:
Enabling more users
- Data engineering teams often have large backlogs, primarily driven by requests from the business teams. These business teams would provide a requirements document, or build data pipelines in legacy desktop data prep tools and ask for them to be rewritten as code and put into production. Customers wanted to empower less technical users to better distribute the work, burn down their backlogs, and get data to users faster.
Modernization and reuse
- To make sure all those users weren’t leaving valuable pipelines behind, our customers told us they needed solutions not only for new development but also for consolidating and modernizing their existing infrastructure. It was also clear that we could accelerate our business and help customers if we could reduce the migration challenge. To address this, we developed transpilers (source-to-source compilers) capable of accelerating a customer’s migration and lowering switching costs. We released our Migration Copilot to speed the migration process and apply automation to business logic conversion from legacy ETL platforms.
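At its heart, a transpiler applies mapping rules from a legacy ETL dialect to Spark. The toy sketch below illustrates the idea only; the function names and rules are simplified stand-ins, not the actual conversion logic inside our Migration Copilot.

```python
# Toy illustration of rule-based expression transpilation. The mappings below
# are simplified stand-ins, not the real Migration Copilot's conversion rules.
import re

# Hypothetical mapping from legacy ETL function names to Spark SQL equivalents.
LEGACY_TO_SPARK_SQL = {
    "IIF": "if",         # IIF(cond, a, b) -> if(cond, a, b)
    "ISNULL": "isnull",  # ISNULL(x)       -> isnull(x)
    "LTRIM": "ltrim",    # LTRIM(x)        -> ltrim(x)
}


def transpile_expression(expr: str) -> str:
    """Rewrite known legacy function calls into Spark SQL syntax."""
    for legacy, spark in LEGACY_TO_SPARK_SQL.items():
        expr = re.sub(rf"\b{legacy}\s*\(", f"{spark}(", expr, flags=re.IGNORECASE)
    return expr


print(transpile_expression("IIF(ISNULL(amount), 0, amount)"))
# -> if(isnull(amount), 0, amount)
```

A production transpiler works on a parsed representation of the source rather than raw strings, but the principle is the same: mechanically convert business logic instead of asking teams to rewrite it by hand.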
As we continued building to customer requirements, the modern data stack was attracting attention, including from the investor community. Point solutions focused on product-led growth seemed to pop up in every niche of the data ecosystem. I was in more than one meeting where a highly successful, application-oriented investor suggested “riding the MDS wave.” But that wasn’t what our customers were asking for.
I was fortunate to find and recruit investors from enterprise backgrounds who understood our customers’ needs and our vision and wanted to back us. Having lived through the Hadoop wave, I expected this trend to fade quickly—and it did. As we deployed our unified solution to unlock data transformation at more and more enterprises, attention on the modern data stack began to fade.
Today, the enterprise trend is clear: unified solutions that fully leverage the cloud and AI are prevailing. Open-format lakes like Iceberg and Delta dominate storage, Databricks leads in data processing, and a similar shift is now happening in tooling and productivity.
Building a startup to meet customer demand in the changing environment of the last few years required us to remain agile and move quickly. We successfully navigated COVID buying freezes, the over-exuberance of 2021 followed by a wave of investor and market hesitation in the aftermath, the modern data stack trend, and the product-led growth wave. Through it all we stuck to our plan and vision, delivering for customers.
But the data space never stays still for long. We’re now navigating the next challenge: the AI-driven change that is set to cause major shakeups in the data integration market.
The new era of generative AI
The next step is to push our re-imagining of the data integration space further with AI, while scaling up our team to help organizations investing in data take advantage of it. A Series B of $35M, followed by a Series B1 of $47M, now enables us to pursue both without compromise.
Re-thinking data integration
As genAI brings renewed focus to data, it is also changing how we deliver solutions and forcing us to ask hard questions.
A recent conversation with a large enterprise made the opportunity concrete. They are not happy with the cost, complexity, or velocity of their current approach:
- Business users write requirement documents in Word, describing in plain English what data they need computed and how each column should be derived from source data.
- Data analysts then take requirements and develop data pipelines using legacy data prep tools that pull data from their Spark data platform and transform it for business needs.
- Data platform teams develop code for production, but due to long backlogs have to keep hiring more and more data engineers, fighting an expensive and losing battle.
The question here is - to solve this, are we building a better horse instead of a car? In this old way of thinking, the same logic would need to be written three times. Do we make it incrementally better, or re-imagine it?
Accelerating to the future
With our expertise in compilers and AI, can we re-think how data integration is done - fundamentally changing how quickly and efficiently enterprises can provide data for analytics and AI?
We have exciting times ahead of us as we scale and re-imagine the space simultaneously. We’re very thankful to our investors and early customers who have trusted us on this journey. Be on the lookout for exciting product updates this year!
Ready to give Prophecy a try?
You can create a free account and get full access to all features for 21 days. No credit card needed. Want more of a guided experience? Request a demo and we’ll walk you through how Prophecy can empower your entire data team with low-code ETL today.