LlamaIndex: Train ChatGPT (& other LLMs) on Custom Data


The syllabus covers a wide range of topics, from the basics of LLMs and LlamaIndex to advanced concepts like data agents and integrations.

The course starts by introducing you to LlamaIndex, a powerful tool for connecting data sources with LLMs.

You’ll learn how to set up the environment, work with OpenAI’s API, and understand core concepts like Retrieval-Augmented Generation (RAG) and vector embeddings.
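To make the RAG idea concrete before touching any framework, here is a toy, dependency-free sketch of the retrieval-and-augmentation step. The bag-of-words "embedding", the documents, and the helper names are all invented for illustration; real RAG pipelines (including LlamaIndex) use learned dense embeddings instead.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words vector. Real systems use learned
    # dense embeddings (e.g. an OpenAI embedding model).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, docs: list[str]) -> str:
    # "Augmentation": the retrieved context is stuffed into the prompt
    # that would then be sent to the LLM for generation.
    context = "\n".join(retrieve(question, docs))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

docs = [
    "LlamaIndex connects external data sources with large language models.",
    "Streamlit is a Python library for building simple web frontends.",
]
print(build_prompt("What does LlamaIndex connect?", docs))
```

The key point the course builds on: the LLM never sees your whole corpus, only the few chunks the retriever judged relevant.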

As you progress, you’ll dive deep into indexing techniques, exploring data loaders, node parsing, and vector stores like Chroma.
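Node parsing boils down to splitting loaded documents into overlapping chunks, each carrying metadata about its origin. A minimal framework-free sketch (the chunk size, overlap, and metadata fields are arbitrary choices for illustration; LlamaIndex’s own node parsers are considerably more sophisticated):

```python
def parse_nodes(text: str, source: str,
                chunk_size: int = 100, overlap: int = 20) -> list[dict]:
    """Split text into overlapping chunks ("nodes"), keeping source metadata."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    nodes, start = [], 0
    while start < len(text):
        chunk = text[start : start + chunk_size]
        nodes.append({"text": chunk, "source": source, "start": start})
        start += chunk_size - overlap
    return nodes

doc = "LlamaIndex loads documents, splits them into nodes, and indexes them. " * 3
for node in parse_nodes(doc, source="example.txt"):
    print(node["start"], repr(node["text"][:30]))
```

The overlap matters: it keeps a sentence that straddles a chunk boundary retrievable from either side.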

The course also covers querying methods, including multi-index search, chat engines, and natural language querying of databases like SQLite.
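Natural-language querying of a database reduces to three steps: the LLM sees the schema, writes SQL, and that SQL gets executed. The sketch below uses Python’s built-in sqlite3 module; the "generated" SQL is hand-written here to stand in for what the LLM would produce, and the table and rows are invented for illustration.

```python
import sqlite3

# Tiny in-memory database standing in for your real data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (title TEXT, lessons INTEGER)")
conn.executemany(
    "INSERT INTO courses VALUES (?, ?)",
    [("LlamaIndex Basics", 12), ("Data Agents", 8)],
)

question = "Which course has the most lessons?"
# In a text-to-SQL engine, an LLM would translate `question` plus the
# schema into SQL. The translation below is hand-written for illustration.
generated_sql = "SELECT title FROM courses ORDER BY lessons DESC LIMIT 1"
answer = conn.execute(generated_sql).fetchone()[0]
print(answer)  # LlamaIndex Basics
```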

One of the standout features is the focus on LlamaIndex data agents.

You’ll learn about AI agents, their real-world applications, and how LlamaIndex agents can automate problem-solving and decision-making tasks.

The course includes hands-on examples of creating OpenAI agents with recursive retrievers and multiple indexes.
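The recursive-retrieval idea can be sketched without the library: route the question to the most relevant index first, then retrieve inside it. The index names, documents, and the word-overlap scoring below are all invented stand-ins for summaries and embedding similarity.

```python
# Toy two-stage ("recursive") retrieval over multiple indexes: first pick
# the most relevant index, then search inside that index.

def overlap(a: str, b: str) -> int:
    return len(set(a.lower().split()) & set(b.lower().split()))

indexes = {
    "billing docs": ["Refunds are processed within 5 days.",
                     "Invoices are emailed monthly."],
    "api docs": ["The query endpoint accepts POST requests.",
                 "Authentication uses bearer tokens."],
}

def recursive_retrieve(question: str) -> str:
    # Stage 1: route to the best index using its name/summary.
    best_index = max(indexes, key=lambda name: overlap(question, name))
    # Stage 2: retrieve the best document inside that index.
    return max(indexes[best_index], key=lambda doc: overlap(question, doc))

print(recursive_retrieve("Which requests does the api query endpoint accept?"))
```

An agent adds one more layer on top: the LLM decides *which* retriever or tool to call at each step, rather than following a fixed routing rule like this one.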

Throughout the course, you’ll work on practical projects, such as building a chatbot interface with Streamlit and integrating it with your LlamaIndex chat engine.

The syllabus also covers advanced topics like customizing prompts, managing token usage, and choosing the right embedding models.

The course is designed to be interactive, with live coding sessions, debugging tips, and personal experiences shared by the instructor.

You’ll learn how to handle dynamic data, track changes, and persist your indexes for efficient development.
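Persistence is worth internalizing early: serialize the index to disk once, then reload it on later runs instead of rebuilding (in a real LlamaIndex app, rebuilding means re-embedding every document, which costs time and API tokens). A stdlib-only sketch of the persist/load round trip, with an invented keyword "index":

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

def build_index(docs: list[str]) -> dict:
    # Toy "index": a mapping from keywords to the documents containing them.
    index: dict[str, list[str]] = {}
    for doc in docs:
        for word in doc.lower().split():
            index.setdefault(word, []).append(doc)
    return index

def persist(index: dict, storage_dir: str) -> None:
    Path(storage_dir, "index.json").write_text(json.dumps(index))

def load(storage_dir: str) -> dict:
    return json.loads(Path(storage_dir, "index.json").read_text())

with TemporaryDirectory() as storage:
    index = build_index(["llamaindex persists indexes", "streamlit builds frontends"])
    persist(index, storage)       # run once, at build time
    reloaded = load(storage)      # run on every later start
    print(reloaded == index)      # the reloaded index is identical
```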

LlamaIndex- Develop LLM powered applications with LlamaIndex



You’ll start by setting up your development environment with PyCharm and installing dependencies like the LlamaIndex and LangChain frameworks.

The course covers the core concepts of LlamaIndex, such as creating your first index, understanding Retrieval-Augmented Generation (RAG) theory, and working with vector stores and embeddings.

You’ll learn how to load and chunk data into LlamaIndex nodes, ingest them into a vectorstore like Pinecone, and use the QueryEngine for retrieval and augmentation.

One of the projects you’ll build is a documentation helper application that uses LlamaIndex to search and retrieve information from the LlamaIndex documentation itself.

This hands-on project will teach you how to set up the development environment, load data, create a vectorstore, and build a Streamlit frontend.

The course also dives into the internals of LlamaIndex, explaining how retrieval and augmented generation work under the hood.

You’ll learn about LlamaIndex ReAct Agents, which are LLM-powered knowledge workers that can intelligently perform tasks over your data, call external APIs, and modify data using a reasoning loop and tool abstractions.
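The ReAct loop interleaves Thought, Action (a tool call), and Observation until the agent can answer. In a real LlamaIndex ReAct agent an LLM chooses each action; in the runnable sketch below a tiny rule-based policy stands in for the LLM, and the tools, corpus, and helper names are all invented for illustration.

```python
def search_docs(query: str) -> str:
    corpus = {"llamaindex": "LlamaIndex indexes external data for LLMs."}
    return corpus.get(query.lower(), "no results")

def calculator(expression: str) -> str:
    # Toy only; never eval untrusted input in real code.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"search_docs": search_docs, "calculator": calculator}

def fake_llm_policy(question: str, observations: list[str]) -> tuple[str, str]:
    # Stand-in for the LLM: pick the next (action, tool_input).
    if observations:
        return ("finish", observations[-1])
    if any(ch.isdigit() for ch in question):
        return ("calculator", "2 + 2")
    return ("search_docs", "llamaindex")

def react_agent(question: str, max_steps: int = 3) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        action, arg = fake_llm_policy(question, observations)  # Thought -> Action
        if action == "finish":
            return arg
        observations.append(TOOLS[action](arg))                # Observation
    return observations[-1]

print(react_agent("What is LlamaIndex?"))
```

The tool abstraction is the important part: each tool is just a named function with a text-in, text-out signature, which is what lets the agent call external APIs or modify data through the same loop.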

Additionally, the course covers prompt engineering theory, including zero-shot, few-shot, chain of thought, and ReAct prompting techniques.

You’ll learn how to craft effective prompts to guide language models and improve their reasoning abilities.
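These techniques differ only in how the prompt string is assembled, which is easy to see side by side. The task and the few-shot examples below are invented for illustration:

```python
task = "Classify the sentiment of: 'The course was fantastic.'"

# Zero-shot: just the task, no examples.
zero_shot = f"{task}\nSentiment:"

# Few-shot: prepend worked examples so the model infers the pattern.
few_shot_examples = [
    ("'I loved every lesson.'", "positive"),
    ("'The audio was terrible.'", "negative"),
]
few_shot = "\n".join(
    f"Classify the sentiment of: {text}\nSentiment: {label}"
    for text, label in few_shot_examples
) + f"\n{task}\nSentiment:"

# Chain-of-thought: append an instruction to reason step by step
# before answering.
chain_of_thought = f"{task}\nLet's think step by step."

print(zero_shot, few_shot, chain_of_thought, sep="\n---\n")
```

ReAct prompting, covered in the same section, extends chain-of-thought by interleaving the reasoning with tool calls and their observations.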

The troubleshooting section addresses common issues you might encounter while developing LLM applications, and the course also touches on LangChain, another popular framework for building LLM applications.

Query Your Custom Documents using LlamaIndex


You’ll start with an overview of LLMs and LlamaIndex concepts, learning about in-context learning and how it differs from fine-tuning.

The course will also cover pricing and the internal workings of LlamaIndex applications.

Next, you’ll learn how to set up your environment, including installing LlamaIndex and the OpenAI package, and how to obtain and update your OpenAI API key.
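The setup typically amounts to two steps, sketched below. Package names and the key format are current as of recent releases and may differ in older tutorials; the key value shown is a placeholder, not a real key.

```shell
# Install LlamaIndex and the OpenAI client.
pip install llama-index openai

# Make your OpenAI API key available to the app (create it in the
# OpenAI dashboard; never commit it to source control).
export OPENAI_API_KEY="sk-..."   # placeholder, substitute your own key
```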

The API key is what lets LlamaIndex call OpenAI’s models when indexing and querying your external data sources, such as text files, Word documents, and PDFs.

The course dives deep into building base programs to read and process different types of documents.

You’ll learn about creating Document objects, working with text files, and handling PDFs and Word documents.

Classes like SimpleDirectoryReader and VectorStoreIndex will be covered, along with how to verify the sources of responses and manage indexes effectively.

Indexing and document management are key topics, with hands-on lessons on recursively processing files from directories, persisting indexes, and performing insert, delete, and update operations on indexes.
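The bookkeeping behind those operations can be sketched with a toy class. Real LlamaIndex indexes expose analogous document-level operations (insert, delete, refresh) over nodes; the class name, fields, and refresh logic below are invented for illustration.

```python
class ToyIndex:
    def __init__(self) -> None:
        self.docs: dict[str, str] = {}   # doc_id -> text

    def insert(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = text

    def delete(self, doc_id: str) -> None:
        self.docs.pop(doc_id, None)

    def update(self, doc_id: str, text: str) -> None:
        # Update modelled as delete + insert, a common pattern for
        # document-level changes.
        self.delete(doc_id)
        self.insert(doc_id, text)

    def refresh(self, documents: dict[str, str]) -> list[str]:
        """Re-sync with a source folder: upsert new or changed docs."""
        changed = [i for i, t in documents.items() if self.docs.get(i) != t]
        for doc_id in changed:
            self.update(doc_id, documents[doc_id])
        return changed

index = ToyIndex()
index.insert("a.txt", "first version")
# Only a.txt (changed) and b.txt (new) get re-indexed on refresh.
print(index.refresh({"a.txt": "second version", "b.txt": "new file"}))
```

The payoff of `refresh` is the same as in the course’s multi-folder scenario: untouched documents are skipped instead of being re-processed.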

You’ll also learn how to refresh indexes in real-world scenarios with multiple folders.

Customizing LLM models is a crucial aspect covered in the course.

You’ll learn about changing the underlying LLM model, controlling parameters like max_tokens, and customizing prompts for better results.

The course teaches you how to expose your LlamaIndex application as an API, so other applications can integrate with it over HTTP endpoints.

You’ll also learn how to enable streaming responses, similar to ChatGPT, and how to build a chat interface to converse with your data using the chat engine.
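Streaming simply means the response is yielded piece by piece instead of returned whole, so the UI can render it as it arrives. A stdlib sketch with an invented answer string (real streaming chunks come from the model, not from splitting a finished string):

```python
import time
from typing import Iterator

def stream_response(answer: str, delay: float = 0.0) -> Iterator[str]:
    # Yield the answer one token at a time; `delay` stands in for
    # network latency.
    for token in answer.split(" "):
        time.sleep(delay)
        yield token + " "

chunks = []
for chunk in stream_response("LlamaIndex supports streaming responses."):
    print(chunk, end="", flush=True)   # render incrementally, like ChatGPT
    chunks.append(chunk)
print()
```

A chat interface consumes exactly this kind of iterator, appending each chunk to the visible message as it arrives.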

Storage backends like ChromaDB and MongoDB are introduced, allowing you to build LLM applications with efficient storage solutions.

Token prediction and cost analysis are covered to help you optimize your applications.
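A back-of-the-envelope cost estimator shows the shape of this analysis. The roughly-4-characters-per-token heuristic for English and the per-1k-token prices below are assumptions for illustration; check your provider’s current pricing and use a real tokenizer (such as tiktoken) for accurate counts.

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def estimate_cost(prompt: str, completion: str,
                  usd_per_1k_prompt: float = 0.0005,
                  usd_per_1k_completion: float = 0.0015) -> float:
    # Prompt and completion tokens are usually priced differently.
    return (estimate_tokens(prompt) / 1000 * usd_per_1k_prompt
            + estimate_tokens(completion) / 1000 * usd_per_1k_completion)

prompt = "Context: ...\nQuestion: What is LlamaIndex?\nAnswer:"
completion = "LlamaIndex is a data framework for LLM applications."
print(f"~{estimate_tokens(prompt)} prompt tokens, "
      f"~${estimate_cost(prompt, completion):.6f} per call")
```

Multiplying the per-call estimate by expected traffic is usually enough to decide whether a cheaper model or shorter context is needed.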

Finally, you’ll learn how to build user interfaces for your LLM applications using libraries like Chainlit and Streamlit, including chat applications with and without streaming responses.

Throughout the course, you’ll work with hands-on examples and exercises, ensuring you gain practical experience in building LlamaIndex applications.