The Canopy RAG Framework is a tool developed by the Pinecone team to simplify building and managing Retrieval-Augmented Generation (RAG) pipelines, which are often complex to assemble by hand. A key feature of Canopy is a built-in terminal chat that lets you compare RAG and non-RAG outputs side by side, making it easy to evaluate the performance of a RAG pipeline.
Before starting, ensure you have Python and pip installed on your system and that you have created API keys for Pinecone and OpenAI.
Set Up a Virtual Environment (Optional):
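On most systems this can be done with Python's built-in venv module (a standard sketch; the environment name canopy-env is arbitrary):

```shell
# Create an isolated environment for the Canopy install
python3 -m venv canopy-env

# Activate it (on Windows use: canopy-env\Scripts\activate)
source canopy-env/bin/activate
```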
Set environment variables for Pinecone and OpenAI API keys:
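In a Unix shell this looks like the following (replace the placeholder values with your own keys; the INDEX_NAME variable naming the Canopy index is also read by the CLI):

```shell
# API keys used by the Canopy CLI and server (placeholder values shown)
export PINECONE_API_KEY="<your-pinecone-api-key>"
export OPENAI_API_KEY="<your-openai-api-key>"

# Name of the Pinecone index Canopy will create and use
export INDEX_NAME="my-canopy-index"
```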
Install the Canopy SDK:
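Install it with pip (inside the virtual environment, if you created one):

```shell
# Install the Canopy SDK and CLI from PyPI
pip install canopy-sdk
```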
Create a New Canopy Index:
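With the environment variables set, a single CLI command creates the index (a sketch; the index name is taken from the INDEX_NAME environment variable):

```shell
# Create a new Pinecone index configured for Canopy
canopy new
```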
Upload Data:
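If your data is already in a file format the CLI supports (such as Parquet or JSONL), you can load it directly; the path below is a placeholder:

```shell
# Upsert documents from a local file into the Canopy index
canopy upsert /path/to/data.parquet
```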
You can use this Notebook to explore one way to upload data to the index
https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/canopy/00-canopy-data-prep.ipynb
Start the Canopy Server:
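Starting the server is a single command (a sketch based on the Canopy CLI; by default it listens on localhost):

```shell
# Launch the Canopy REST server
canopy start
```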
You can now start using the Canopy server with any chat application that supports a /chat.completion endpoint.
Evaluate RAG Responses
To interactively chat with your text data, run canopy chat in a new terminal window, setting the required environment variables beforehand.
To compare responses with and without RAG, use canopy chat --no-rag, which shows RAG and non-RAG responses side by side. Test the RAG pipeline with different queries, including complex ones with multiple search terms, and observe how Canopy handles them with and without RAG.
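The two evaluation modes above look like this in practice (run in a separate terminal while the server is up, with the same environment variables set):

```shell
# Interactive chat against the RAG pipeline
canopy chat

# Side-by-side comparison of RAG and non-RAG responses
canopy chat --no-rag
```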
In conclusion, the Canopy RAG Framework, an innovation by Pinecone, stands as a game-changer in the realm of Generative AI applications. It empowers developers to easily and rapidly build and host robust, production-ready chat assistants of any scale. Key features include:
Ease of Use
Canopy simplifies complex processes like chunking, embedding, and query optimization, allowing developers to focus on building and experimenting with RAG applications.
Cost-Effectiveness
It offers free storage for up to 100K embeddings, sufficient for substantial text data, with scalable paid options for larger needs.
Flexibility and Extensibility
The framework’s modular nature enables integration into existing applications or development of custom solutions, adapting to diverse use cases.
Interactive Evaluation
The built-in CLI chat application facilitates side-by-side comparison of RAG and non-RAG workflows, enhancing iterative development and evaluation.
As Canopy evolves, it promises to support more data formats, LLMs, embedding models, and other advancements, making it an essential tool for developers in the rapidly expanding field of Generative AI.