An Introduction to Retrieval Augmented Generation PyData NYC 2024

An Introduction to Retrieval Augmented Generation

11-06, 13:20–14:50 (US/Eastern), Winter Garden

Large language models (LLMs) are trained on large amounts of public data and may give inaccurate answers about your custom data.

Retrieval Augmented Generation (RAG) lets you use LLMs with your custom data.

In this session you'll be introduced to the components of RAG and build a simple RAG pipeline in Python over some YouTube videos.

Retrieval Augmented Generation (RAG) is a powerful technique to leverage Large Language Models on your custom data.

As a motivating example, we will build a RAG pipeline over some YouTube videos.

Learning Objectives

What are the components of RAG?
How to generate embeddings for videos?
How to store and retrieve content using vector search?
How to prompt LLMs to answer contextual questions?
How to build a RAG pipeline using LlamaIndex?
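
To make the retrieve-then-prompt idea behind these objectives concrete, here is a minimal sketch that uses only the Python standard library. It is not the session's code: the word-count `embed` function is a toy stand-in for a real embedding model, and the function names, sample chunks, and prompt wording are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a real pipeline would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble a prompt that asks the LLM to answer from the retrieved context only."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


# Hypothetical transcript chunks, invented for this example.
chunks = [
    "Vector search finds the chunks whose embeddings are closest to the query.",
    "The speaker thanks the organizers and the audience.",
    "Chunking splits a long transcript into smaller passages.",
]
question = "How does vector search find chunks?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

In a real pipeline the resulting `prompt` would be sent to an LLM, and the toy `embed` would be replaced by model-generated embeddings stored in a vector index.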

Agenda

Problem Statement
Motivation for RAG
Extracting information from the videos
Chunking the information in the videos
Generating embeddings
Embedding retrieval
Multimodality in LLMs
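
The chunking step in the agenda could look roughly like the sketch below. It assumes the video transcript is already available as a list of `(start_seconds, text)` segments, and it groups them into overlapping windows so each chunk keeps a timestamp pointing back into the video. The `Chunk` class, the `chunk_segments` function, and the default `size` and `overlap` values are hypothetical choices for illustration, not the method used in the session.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    start: float  # start time (seconds) of the first segment in the chunk
    text: str     # concatenated text of the segments in the chunk


def chunk_segments(segments, size=3, overlap=1):
    """Group consecutive (start_seconds, text) segments into overlapping chunks."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for i in range(0, len(segments), step):
        window = segments[i:i + size]
        chunks.append(Chunk(start=window[0][0], text=" ".join(t for _, t in window)))
        if i + size >= len(segments):
            break  # the last window already reached the end of the transcript
    return chunks


# Hypothetical transcript segments, invented for this example.
segments = [(0.0, "a"), (5.0, "b"), (10.0, "c"), (15.0, "d"), (20.0, "e")]
chunks = chunk_segments(segments, size=3, overlap=1)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.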

By the end of this session, attendees should feel comfortable building an end-to-end (E2E) RAG pipeline for their own use case.

Tools Used: LlamaIndex, OpenAI

Setup:

OpenAI API key (setup page)
Google Colab

Please make sure you can use Colab and the OpenAI API before joining.
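
One quick way to check the key part of this setup in a Colab cell is sketched below. It assumes the key is exposed through the `OPENAI_API_KEY` environment variable, which the OpenAI Python library reads by default. The helper name `has_openai_key` is hypothetical, and the check only confirms that a key is set, not that it is valid.

```python
import os


def has_openai_key(env=os.environ):
    """Return True if a non-empty OPENAI_API_KEY is present in the given environment mapping."""
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not has_openai_key():
    print("OPENAI_API_KEY is not set; set it before the session.")
```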

Resource

https://github.com/npatta01/pydata_rag_video

Prior Knowledge Expected: Previous knowledge expected

Nidhin Pattaniyil

Machine Learning Engineer working on Search

Ravi Kumar

Data Science @ Walmart, Ex-Bank of America
