Introducing the Prophecy Generative AI Platform
Build powerful gen AI applications on private enterprise data in just hours
As the enthusiasm for generative AI intensifies, many enterprise data users are eager to actively participate and develop impactful applications of their own. They recognize the power of ChatGPT, which intelligently responds to queries based on publicly available Internet data.
But can ChatGPT also provide insights into a company's internal HR policy? Can it be used to establish an initial line of customer support, answering queries on behalf of an organization’s support team and serving as a self-service portal for customers?
Until now, the answer to those questions has often been “no.”
That’s because, while generative AI is a powerful new primitive, how to use it within an enterprise is not so obvious. To build impactful enterprise generative AI applications, you must effectively integrate various private information sources with the intelligence of large language models (LLMs)—all while preserving data security. And that’s not an easy task.
That’s why, when we engage with data leaders at organizations, they often have numerous questions and uncertainties about how to begin. Here are the questions we hear most often:
- What’s the potential of enterprise generative AI? (Answering this question completely involves identifying the use cases that can enhance the productivity of the existing team.)
- How do I implement it? (Answering this question completely involves identifying all the tools needed to build internal generative AI applications.)
- What kind of team do I need, and what skills are required to develop internal generative AI applications?
- How long will it take to build my first generative AI application?
- How can we mitigate risks? Specifically, how can we safeguard against biases, security risks, data misuse, data leaks, prompt injection and jailbreaking?
The cumulative outcome of these concerns is often AI paralysis, where enterprises struggle to even begin developing their first application.
Some excellent blogs on this topic have been written, like this one on emerging architectures for LLM applications from a16z. But unfortunately, these blogs cater to investors and engineers in the ecosystem—they don’t offer the necessary simplification and actionable insights data practitioners like you need.
In this blog, we intend to close that gap by answering the real-world questions practitioners ask and presenting a straightforward pathway for you to create your first high-value, secure generative AI application—app development you can start AND finish within just one week.
What’s the potential—what can I even do with generative AI?
Common enterprise generative AI use cases include:
- Support bots: Use generative AI to enable your support team—and, eventually, your customers—to ask questions based on various internal sources (product documentation, Slack messages, support tickets, etc.).
- Personalization engines: Use generative AI to deliver personalized marketing content that’s more specifically tailored to individual customer profiles.
- Knowledge summaries: Use generative AI to create holistic knowledge summaries by quickly retrieving relevant information from a wide variety of private, unstructured documents (e.g., HR policies, legal documents, etc.).
But that’s just the beginning. The number of ways you can use generative AI in your enterprise is virtually limitless. And, once you gain just a little bit of experience with it, you’ll see many more use cases for it quickly surface.
So, to understand the potential of generative AI for your unique enterprise, you need to start by answering two key questions:
- First, in the context of our enterprise, which use cases can we most effectively tackle using generative AI?
- Second, which of those use cases is the most suitable one to tackle first—one where we can demonstrate fast success and begin cultivating the organizational skills and expertise we need in this domain?
How do I implement generative AI?
To successfully implement generative AI applications, you must first decide which gen AI model approach you should use: building custom models, specializing existing models or providing relevant private data as prompts to the models. The approach you select for interacting with gen AI models will determine the architecture and tools you need for your application.
Now, let's explore the process of choosing a model approach in more detail.
Picking which generative AI model you want to use
When considering how to utilize generative AI models, the main challenge lies in effectively combining the intelligence of a generative model like GPT-3.5 with the internal knowledge specific to your enterprise—private data the model is unaware of. But once you’ve decided how you’ll interact with generative AI models, the problem becomes much simpler, allowing you to focus on the specific technologies and steps your chosen approach requires.
When evaluating each generative AI model approach, it’s crucial to consider the:
- Time and effort it requires
- Quality of the results you’ll obtain
- Implications for data safety and security
Keeping these three considerations in mind, let's explore the options available to you for generative AI models:
- Building your own model: We don’t recommend this approach because generative AI models excel at a wide range of problems, which should eliminate the need to build specialized models (as was common in earlier machine learning use cases such as fraud detection). Building your own model is also prohibitively costly and resource-intensive.
- Specializing an existing foundation model: We don’t advise this approach either. Unfortunately, some foundation model startups and tool vendors might suggest creating and managing your own models based on existing models such as GPT-3.5. But this approach involves embedding your enterprise's private data into the model, which raises data security concerns and can potentially lead to data leaks. It’s also a cumbersome and expensive approach, one that should only be considered after exhaustively exploring prompt engineering techniques. Which leads us to our next point…
- Leveraging prompts: What we recommend instead is that you rely on prompts for most of your generative AI applications. That’s because generative AI’s foundation models keep getting better, and their prompt sizes (i.e., the length of text you can enter into ChatGPT) keep growing. By pairing one of these continuously improving foundation models with your relevant private information and asking the question via the prompt itself, you can create simple, cost-effective, fast and secure generative AI apps. It’s a better approach that’s suitable for 90% of applications.
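To make the prompt-based approach concrete, below is a minimal sketch in Python. It assumes the OpenAI Python SDK (the v1-style client, reading OPENAI_API_KEY from the environment) and uses a hypothetical retrieve_relevant_docs helper as a stand-in for the retrieval layer described in the next section; none of these names come from a specific Prophecy API.

```python
# Minimal sketch of the prompt engineering approach: pass relevant private
# data to the model inside the prompt itself. The retrieval helper is a
# hypothetical stand-in for the knowledge warehouse described below.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def retrieve_relevant_docs(question: str) -> list[str]:
    # Hypothetical: in a real app this queries your knowledge warehouse.
    return ["P1 support tickets have a 24-hour response SLA."]


def answer_with_private_context(question: str) -> str:
    context = "\n\n".join(retrieve_relevant_docs(question))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # any capable foundation model can be swapped in
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


print(answer_with_private_context("What is the response SLA for P1 tickets?"))
```

The rest of this post is about doing this well at enterprise scale: keeping the private context fresh, searchable and secure.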
Building a generative AI application based on the prompt engineering approach
The prompt engineering approach for generative AI applications involves three essential components:
- Knowledge warehouse: The knowledge warehouse stores unstructured data sources such as documents, Slack messages and support tickets and serves as the repository for the relevant information utilized by your generative AI model.
- ETL pipeline: You need an Extract, Transform, Load (ETL) pipeline to build and maintain your knowledge warehouse. It extracts data from various sources, performs necessary transformations and loads transformed data into the knowledge warehouse to keep information up-to-date and accurate on a daily basis.
- Application integration: The final component involves building an application that seamlessly combines your generative AI model with the relevant enterprise data stored in the knowledge warehouse in order to answer user queries.
Let’s look at each component in more detail.
1. Building a knowledge warehouse
Vector representations play a crucial role in storing and searching text documents. These representations—known as vector embeddings—are series of numerical values that represent each text document. Vector embeddings make it possible to find similar or relevant documents through a process called “closest vector search.”
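As an illustration of how embeddings and closest vector search work, here's a small, hedged sketch using the open-source sentence-transformers library and an in-memory corpus; any embedding model, including hosted APIs, could be substituted.

```python
# Sketch: embed a few documents and find the closest match to a query.
# Assumes the sentence-transformers package; the model choice is an example.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Employees receive 20 days of paid vacation per year.",
    "Support tickets are triaged within one business day.",
    "The product roadmap is reviewed quarterly.",
]
doc_vectors = model.encode(documents, normalize_embeddings=True)

query_vector = model.encode(
    ["How much vacation do employees get?"], normalize_embeddings=True
)[0]

# With normalized vectors, cosine similarity reduces to a dot product.
similarities = doc_vectors @ query_vector
print(documents[int(np.argmax(similarities))])  # -> the vacation policy document
```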
A knowledge warehouse serves three main purposes:
- Document storage: The knowledge warehouse is responsible for storing documents along with their corresponding vector embeddings to ensure relevant information is readily accessible.
- Document search: The knowledge warehouse leverages vector embeddings to efficiently search for similar or relevant documents based on a given vector. It’s a way to retrieve documents that closely match specific criteria or requirements.
- Indexing: Indexing is a technique employed to enhance search speed within the knowledge warehouse. (It's important to note that indexing for vector-based searches is an immature area that’s still under development.)
Several options exist for implementing a knowledge warehouse, including vector databases (e.g., Pinecone, Weaviate, Milvus) or commonly used, cost-effective, open-source search engines like Elasticsearch.
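For illustration, here is a hedged sketch of document storage and search against Pinecone. It uses the older pinecone-client style (init/Index); the exact API differs across client versions, and the index name, dimension and example texts are placeholders.

```python
# Sketch: store documents with their embeddings and run a similarity query.
# Uses the older pinecone-client API style; newer client versions differ.
import pinecone
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings


def embed(text: str) -> list[float]:
    return embedder.encode(text).tolist()


pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")
if "knowledge-warehouse" not in pinecone.list_indexes():
    pinecone.create_index("knowledge-warehouse", dimension=384)
index = pinecone.Index("knowledge-warehouse")

# Document storage: each record is (id, embedding, metadata with the raw text).
docs = {
    "doc-1": "Employees receive 20 days of paid vacation per year.",
    "doc-2": "Support tickets are triaged within one business day.",
}
index.upsert(
    vectors=[(doc_id, embed(text), {"text": text}) for doc_id, text in docs.items()]
)

# Document search: return the closest vectors to the query embedding.
results = index.query(vector=embed("vacation policy"), top_k=2, include_metadata=True)
for match in results.matches:
    print(match.id, match.score, match.metadata["text"])
```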
2. Populating a knowledge warehouse with ETL on unstructured data
To populate the knowledge warehouse with relevant data, it’s essential to establish an ETL pipeline that operates regularly, such as once a day. This ETL pipeline retrieves data from various sources (e.g., Slack messages, support tickets, documentation) and transforms it into a suitable format for efficient storage and retrieval using a vector database. Because much effort will be spent on data transformation and processing, the quality of this process will have a large impact on the quality of the final product.
Since Apache Spark offers exceptional capabilities in handling unstructured data, we recommend you combine Spark with Prophecy's low-code platform—integrated with Databricks—to achieve an enterprise-grade solution capable of constructing the kind of ETL pipeline you need for generative AI applications.
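As a rough sketch of what such a pipeline does under the hood, here's a hedged PySpark example that extracts raw documents, splits them into chunks, computes an embedding per chunk, and writes the results out for loading into the knowledge warehouse. The paths, chunk size and embedding model are assumptions, and in Prophecy's low-code environment you would typically assemble the equivalent pipeline visually rather than hand-write it.

```python
# Sketch: a daily batch ETL job that prepares unstructured text for the
# knowledge warehouse. Paths and the embedding model are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode, udf
from pyspark.sql.types import ArrayType, FloatType, StringType

spark = SparkSession.builder.appName("knowledge-warehouse-etl").getOrCreate()

# Extract: raw exports (Slack messages, support tickets, docs) with a "text" column.
raw = spark.read.json("/mnt/raw/exports/*.json")

# Transform: split long documents into smaller, retrievable chunks (~200 words).
@udf(ArrayType(StringType()))
def chunk_text(text):
    words = (text or "").split()
    return [" ".join(words[i:i + 200]) for i in range(0, len(words), 200)]

chunks = raw.select(explode(chunk_text(col("text"))).alias("chunk"))

# Transform: compute a vector embedding for each chunk. (Loading the model
# inside the UDF is simple but slow; a pandas UDF would be more efficient.)
@udf(ArrayType(FloatType()))
def embed(text):
    from sentence_transformers import SentenceTransformer  # assumed on workers
    model = SentenceTransformer("all-MiniLM-L6-v2")
    return [float(x) for x in model.encode(text)]

embedded = chunks.withColumn("embedding", embed(col("chunk")))

# Load: persist chunks and embeddings, then upsert them into the vector store
# (e.g., Pinecone) or index them in Elasticsearch.
embedded.write.mode("overwrite").parquet("/mnt/knowledge_warehouse/embeddings")
```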
3. Building a generative AI application that uses the knowledge warehouse
Two primary technical approaches exist for building these applications:
- Application approach based on Python libraries
- Data pipeline approach based on a streaming ETL inference pipeline
a. Application-based approach
The first approach is to utilize custom Python libraries, such as LangChain and LlamaIndex, that have recently emerged for building these applications. These libraries package the functionality required to build a complete application. Unfortunately, they often reinvent functionality where more mature alternatives exist. For instance, they:
- Connect various sources of data to the LLMs. You can perform the series of steps that connects data to the LLMs with a new Python library, but that series of steps is, by definition, ETL—and replacing mature processing engines such as Apache Spark with a new Python library such as LangChain makes no sense.
- Enable language models to interact with their environment using “agents.” You can use “agents” that interact with the environment through a chain of steps, passing data between them. But is there anything new about such a series of steps? It closely resembles a streaming ETL pipeline, which has components for interacting with the environment, components (more critically) for transforming and cleaning the data that comes back, and the ability to chain components in sequence to accomplish complex tasks. A library such as LangChain or a streaming data pipeline will both work here.
- Offer orchestration capabilities. You can use a new library such as LangChain or LlamaIndex for orchestration, but much more mature products already exist in the ETL ecosystem (e.g., Apache Airflow). Just because you’re doing AI does not mean you need a new, flimsy library to replace mature pieces of your stack.
New patterns might eventually surface that make these libraries the best solutions for specific use cases, but that clarity hasn’t yet emerged. In fact, the summaries on LangChain’s and LlamaIndex’s own websites show that their primary capabilities are ones where existing solutions already offer better, more sophisticated alternatives.
b. Streaming pipeline-based approach
An alternative to building a custom Python application is to leverage a streaming pipeline—Prophecy on Databricks—for inference purposes. And it’s a much better approach.
That’s because Prophecy on Databricks already provides components designed for interacting with live applications, environments, data processing, transformations and cleansing. It supports the chaining of tasks—including the chaining of prompts—to facilitate the seamless execution of complex workflows. It also offers orchestration capabilities via Databricks workflows or by using Apache Airflow.
This mature approach will enable you to build production-grade applications your enterprise can rely on. It’s also much simpler and faster to use.
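To give a feel for what such a pipeline reduces to, here's a hedged Spark Structured Streaming sketch: it reads incoming questions from a directory, answers each one with an inference UDF (in a real pipeline the UDF would also search the knowledge warehouse and include the retrieved documents in the prompt), and writes the answers out. The paths and model are assumptions, and in Prophecy the equivalent pipeline would be composed from visual components rather than written by hand.

```python
# Sketch: a streaming inference pipeline that answers questions as they arrive.
# Source/sink paths are placeholders; the retrieval step is omitted for brevity.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("genai-inference").getOrCreate()

@udf(StringType())
def answer_question(question):
    # In a full pipeline: embed the question, fetch the closest documents from
    # the knowledge warehouse, and include them in the prompt below.
    from openai import OpenAI
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Answer this support question: {question}"}],
    )
    return response.choices[0].message.content

questions = spark.readStream.format("text").load("/mnt/incoming/questions")
answers = questions.withColumn("answer", answer_question(col("value")))

(
    answers.writeStream
    .format("parquet")
    .option("path", "/mnt/outgoing/answers")
    .option("checkpointLocation", "/mnt/checkpoints/genai-inference")
    .start()
    .awaitTermination()
)
```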
How do I implement a generative AI app?
Watch the video below to see exactly how we quickly and easily implemented our own support bot at Prophecy that consists of the following components:
- Knowledge warehouse: We utilized Pinecone as our repository to store the relevant documents from various data sources (e.g., Slack messages, support tickets, public product documentation, internal product knowledge base, product roadmap information, etc.).
- Batch ETL pipeline: We developed a batch ETL pipeline to regularly populate and update our knowledge warehouse. This involves transforming text-based data into the appropriate format for storage and retrieval and sensibly compiling related content (such as a thread of Slack messages) in order to deliver high-quality results.
- Streaming ETL inference pipeline: The support bot incorporates a streaming ETL inference pipeline that takes user input, searches for relevant documents in the knowledge warehouse, sends the question (along with the pertinent documents) to the generative AI model for processing and returns the answer.
- Slack channel: Users can access our support bot through a dedicated Slack channel via a convenient interface where they can ask and receive answers to their questions.
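To show how the Slack front end fits in, here is a hedged sketch using the slack_bolt SDK in Socket Mode: it listens for messages in the channel and replies with whatever the inference pipeline produces. The token names are the SDK's conventional environment variables, and answer_question is a hypothetical stand-in for the retrieval-plus-LLM steps sketched above.

```python
# Sketch: a Slack interface for the support bot, using the slack_bolt SDK.
# answer_question is a hypothetical stand-in for the inference pipeline.
import os
from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler

app = App(token=os.environ["SLACK_BOT_TOKEN"])


def answer_question(question: str) -> str:
    # Hypothetical: run the retrieval + LLM steps described earlier.
    return "(answer produced by the inference pipeline)"


@app.event("message")
def handle_message(event, say):
    if event.get("bot_id"):  # ignore the bot's own messages
        return
    say(answer_question(event.get("text", "")))


if __name__ == "__main__":
    SocketModeHandler(app, os.environ["SLACK_APP_TOKEN"]).start()
```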
A support bot implementation like ours typically requires one data engineer to dedicate just one week of effort. It’s a far superior alternative to going down the rabbit hole of Python libraries that essentially just re-invent this functionality.
Conclusion
By employing the prompt engineering approach, establishing a robust knowledge warehouse, maintaining an ETL pipeline that keeps the knowledge warehouse up to date, and developing integrated applications powered by streaming inference pipelines, you can finally give your organizational users the ChatGPT-like functionality for private, enterprise data they want.
So the only question left to ask yourself is this: what kind of generative AI application do YOU want to build today?
Getting started with Prophecy Generative AI
To get started with Prophecy Generative AI and explore more information, you can:
- Visit the solution page
- Attend our webinar with Kevin Petrie from the Eckerson Group on July 20th
Ready to give Prophecy a try?
You can create a free account and get full access to all features for 21 days. No credit card needed. Want more of a guided experience? Request a demo and we’ll walk you through how Prophecy can empower your entire data team with low-code ETL today.