ELT is not the disruption — data engineering is!

The disruption is agile software practices in data engineering made usable for the many. Allow me to explain.

Raj Bains
October 20, 2020
February 14, 2025

Prophecy is focused on Enterprise Data Engineering. We see a wide gap between what enterprise customers need and what the startup and VC ecosystem is excited about. There is irrational exuberance again, this time about bottom-up ELT taking over the world. Having been through the NoSQL and Hadoop waves before, I want to offer a cautionary note and explain what’s actually going on.

For context, I was the product manager of Apache Hive at Hortonworks through its IPO and talked to 100+ Enterprises using SQL for ETL. As Prophecy CEO, I’ve talked to 100+ Enterprises using various ETL products. I’ve seen my share of inscrutable SQL scripts (try getting performance with complex transforms), and equally gnarly data processing code. My technical background is in compilers, programming languages and databases.

Data Engineering!

Data Processing Landscape

These are interesting times in the data space. Spark and Snowflake are well established and are converging on the same feature set, the one that Data Engineering requires today:

  • SQL & Transactions are a must-have: SQL is declarative and productive. Transactions, change data capture and merges are essential.
  • Tabular Data & SQL are not enough: Data Engineering requires a lot more than tabular data. There is data from JSON, documents and images, and one needs non-SQL transforms for data engineering and machine learning.
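As a rough illustration of the second point, here is a minimal Python sketch of a non-SQL transform that turns a nested JSON event into flat, table-ready rows. The event shape and the field names (`user`, `items`, `sku`, `qty`) are hypothetical and chosen only for this example.

```python
import json
from typing import Dict, List

# Hypothetical nested event, as it might arrive from an application log.
RAW_EVENT = """
{"user": {"id": 7, "name": "Ada"},
 "items": [{"sku": "A1", "qty": 2}, {"sku": "B9", "qty": 1}]}
"""


def flatten_event(raw: str) -> List[Dict[str, object]]:
    """Explode one nested event into one flat row per purchased item."""
    event = json.loads(raw)
    user = event["user"]
    return [
        {
            "user_id": user["id"],
            "user_name": user["name"],
            "sku": item["sku"],
            "qty": item["qty"],
        }
        for item in event.get("items", [])
    ]


rows = flatten_event(RAW_EVENT)
for row in rows:
    print(row)
```

Each resulting row is tabular and could then be loaded into a warehouse table; the nesting is handled in code before SQL ever sees the data.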

Define ELT & ETL

  • ELT means that you Load all your data into the Data Warehouse in tabular format and then use a set of SQL queries to Transform it, finally merging the results into your target tables.
  • ETL means a processing engine does the Transforms and then Loads the data into the Data Warehouse as the final step. The processing engine can be very powerful: you can write code and have code versioning, configurations, resolved configs and rules engines. There are SQL operators for productivity as well.

Ab Initio, for example, is an excellent ETL product in large Enterprises; every user I have talked to loves the product.
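The difference between the two definitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the data warehouse, plain Python stands in for the processing engine, and the table names and sample orders are made up for the example.

```python
import sqlite3

# Hypothetical raw input: (order_id, customer, amount as text).
RAW_ORDERS = [
    ("o1", "alice", "12.50"),
    ("o2", "bob", "7.25"),
    ("o3", "alice", "5.00"),
]


def run_elt(conn, rows):
    """ELT: Load the raw rows first, then Transform with SQL in the warehouse."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id TEXT, customer TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)
    conn.execute(
        "CREATE TABLE customer_totals_elt (customer TEXT PRIMARY KEY, total REAL)"
    )
    conn.execute(
        """
        INSERT INTO customer_totals_elt (customer, total)
        SELECT customer, SUM(CAST(amount AS REAL))
        FROM raw_orders
        GROUP BY customer
        """
    )


def run_etl(conn, rows):
    """ETL: Transform in the processing engine, then Load only the result."""
    per_customer = {}
    for _order_id, customer, amount in rows:
        per_customer[customer] = per_customer.get(customer, 0.0) + float(amount)
    conn.execute(
        "CREATE TABLE customer_totals_etl (customer TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT INTO customer_totals_etl VALUES (?, ?)",
        sorted(per_customer.items()),
    )


def read_totals(conn, table):
    """Return {customer: total} for one of the two target tables."""
    return dict(conn.execute(f"SELECT customer, total FROM {table}").fetchall())


conn = sqlite3.connect(":memory:")
run_elt(conn, RAW_ORDERS)
run_etl(conn, RAW_ORDERS)
# Both routes should produce the same target data.
assert read_totals(conn, "customer_totals_elt") == read_totals(conn, "customer_totals_etl")
print(read_totals(conn, "customer_totals_etl"))
```

Both routes reach the same target table. What differs is where the transform runs and, in the ELT case, that the raw data also has to live in the warehouse.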

Omg! Omg! ELT taking over the world - bottoms up disruption!!

We’re cheering for dbt as a fellow startup and a product that adds great value. It provides SQL with agile software development practices. You load all your raw data directly into the data warehouse and do the transforms there. For this to work, the raw data has to be small and there can be no machine learning, complex data or complex transforms, which is not the world we live in.

The reverse is happening

To succeed in data engineering for Enterprises with massive data sets and complex use cases (complex data, complex transforms and machine learning), Snowflake is moving beyond SQL with Snowpark, and the tooling will follow. SQL-only is a losing battle in data engineering outside of simple use cases. Snowflake is now, in effect, building a closed-source Spark.

Data Engineering: The Real Disruption!

Basically, ETL/ELT has lagged software engineering in agile development techniques.

Data Engineering is the move to code-first development, with agile practices - git, tests, continuous integration and continuous deployment. It’s bringing data pipelines out of closed source, boxed software products into mainstream development.
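To make "code-first with tests" concrete, here is a minimal Python sketch: a pipeline step written as an ordinary function, plus a test that could run in continuous integration on every commit. The function name `dedupe_latest` and the record fields `id` and `updated_at` are hypothetical, chosen only for this example.

```python
def dedupe_latest(records):
    """Keep only the most recent record per id.

    Assumes 'updated_at' is an ISO-8601 date string, so plain string
    comparison orders the records chronologically.
    """
    latest = {}
    for rec in records:
        current = latest.get(rec["id"])
        if current is None or rec["updated_at"] > current["updated_at"]:
            latest[rec["id"]] = rec
    return sorted(latest.values(), key=lambda r: r["id"])


def test_dedupe_latest_keeps_newest_record():
    """A plain-assert test; a runner such as pytest collects test_* functions."""
    records = [
        {"id": 1, "updated_at": "2020-10-01", "status": "new"},
        {"id": 2, "updated_at": "2020-10-02", "status": "new"},
        {"id": 1, "updated_at": "2020-10-05", "status": "shipped"},
    ]
    result = dedupe_latest(records)
    assert [r["id"] for r in result] == [1, 2]
    assert result[0]["status"] == "shipped"


test_dedupe_latest_keeps_newest_record()
```

Because the step is plain code, it can be versioned in git, reviewed in a pull request and checked automatically before deployment, which is hard to do with logic locked inside a boxed product.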

Now we have to solve the key Data Engineering problem: Usability. Code is too hard for many users, whereas Visual or SQL development makes it more accessible. Here are the solutions:

  • dbt is getting traction in startups with a SQL editor and agile practices.
  • Prophecy is getting traction in the Enterprise with a unique IDE that has both Visual and Code development, with SQL, Scala and Python support. All users can develop high-quality Spark code with agile practices.
  • Prophecy also provides metadata, lineage, observability, performance debugging and scheduling, features that are critical to the Enterprise.

That’s pretty much it! The disruption is agile software practices in Data Engineering made usable for the many.

Ready to give Prophecy a try?

You can create a free account and get full access to all features for 21 days. No credit card needed. Want more of a guided experience? Request a demo and we’ll walk you through how Prophecy can empower your entire data team with low-code ETL today.
