AgentCreator Concepts

Agent

An Agent is a system of LLM-based interactions that acts autonomously. An Agent can perform tasks, make decisions, and interact with its environment to accomplish a goal. Agents can interact with external tools, APIs, and data sources to collect information or initiate actions based on user requests. An Agent can also learn from interactions and improve over time, fine-tuning its outputs to align with user preferences and specific contexts.

Tool Calling

Tool calling is a capability that enables Agents to interact with external functions, APIs, and services. As a result, an Agent can perform actions beyond text generation.

  1. An Agent receives a user request requiring external action.

  2. It maps the request to predefined tools or functions.

  3. The Agent selects the most appropriate tool based on the user's intent.

  4. The selected tool executes its function through an API or integrated process.

  5. The Agent processes the returned data and generates a meaningful response.

Tool calls are often structured as JSON objects, specifying the tool name and required parameters. To integrate effectively with external systems, Agent developers can define schemas that an LLM can reference. Advanced implementations allow Agents to call multiple tools in sequence or in parallel, enabling complex automated processes.
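As a minimal sketch, the following shows how a JSON-structured tool call might be defined and dispatched. The `get_weather` tool, its schema, and the JSON shape are illustrative assumptions, not a specific vendor's format:

```python
import json

# Hypothetical tool schema in the JSON style many LLM APIs accept;
# the model references this to decide when and how to call the tool.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> str:
    # Stand-in for a real weather API call.
    return f"Sunny in {city}"

# Registry mapping tool names to the functions that implement them.
TOOLS = {"get_weather": get_weather}

def dispatch(tool_call_json: str) -> str:
    """Parse a tool call emitted by the model and run the matching function."""
    call = json.loads(tool_call_json)
    func = TOOLS[call["name"]]
    return func(**call["arguments"])

result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
print(result)  # Sunny in Paris
```

In a full Agent loop, the string passed to `dispatch` would come from the model's output, and the returned value would be fed back to the model so it can generate the final response.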

Retrieval Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) is a technique in machine learning that combines the power of pretrained language models with the benefits of information retrieval systems to enhance the generation of text.

Hybrid Approach: Merges language models with retrieval systems.
Function: Enhances text generation by first retrieving relevant document excerpts and then conditioning the language model to generate outputs based on this retrieved information.
Application: Used in question answering, chatbots, and anywhere contextually rich and accurate text generation is critical.
Advantage: Provides more informative and contextually relevant text by leveraging a vast corpus of information beyond the language model's pretraining data.

The following diagram illustrates the flow of the user prompt using RAG.

Think of RAG as a smart helper for the LLM to create text. RAG first finds useful bits of information from a large set of documents and then uses them to help write new sentences that make sense and are full of meaningful details. It helps make the text more interesting and full of actual facts because it's using information from predefined sources of truth.
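The retrieve-then-generate flow can be sketched in a few lines. This toy retriever scores documents by word overlap with the query (a stand-in for real vector search), and `retrieve` and `build_prompt` are illustrative names, not a specific library's API:

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Score documents by word overlap with the query (a toy retriever)."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Condition the LLM on retrieved context before it generates."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "The Eiffel Tower is in Paris.",
    "Python is a programming language.",
    "Paris is the capital of France.",
]
prompt = build_prompt("What is the capital of France?", docs)
# The augmented prompt now carries the relevant facts into the LLM call.
```

A production pipeline would replace the word-overlap scorer with an embedding-based similarity search over a vector database, but the shape of the flow stays the same: retrieve, then condition the model on what was retrieved.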

Vector Databases

A vector database is a type of database designed to efficiently store and query data in the form of vectors: arrays of numbers that represent complex data points like images, sounds, and text in a multidimensional space. It works by converting rich, unstructured data into vectors using embedding algorithms. The resulting embeddings capture the essence of the data in a form that can be compared mathematically. The database then indexes these vectors in such a way that it can quickly retrieve items whose vectors are closest to each other, indicating high similarity or relevance.

Vector databases are particularly useful in applications requiring high-speed similarity searches, such as content recommendation systems, image and audio retrieval platforms, and machine learning model queries where the relationships between data points are more significant than the data itself.

The main advantage of a vector database is its ability to perform what's known as nearest neighbor or distance searches. This allows for finding the best matches in large data sets almost instantaneously, which is a challenge for traditional databases when dealing with high-dimensional data.
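The nearest-neighbor idea can be sketched with a brute-force cosine-similarity search. The three-dimensional toy vectors below are made up for illustration; real vector databases hold high-dimensional embeddings and use approximate indexes (such as HNSW) to make the search fast at scale:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query: list[float], index: dict[str, list[float]]) -> str:
    """Brute-force nearest-neighbor search over a tiny in-memory 'index'."""
    return max(index, key=lambda key: cosine_similarity(query, index[key]))

index = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.7, 0.3, 0.2],
    "car": [0.0, 0.1, 0.9],
}
print(nearest([0.85, 0.15, 0.05], index))  # cat
```

Cosine similarity is one common distance measure; Euclidean distance and dot product are also widely used, and the choice usually follows how the embedding model was trained.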

Embedding

Embedding is a technique used in machine learning and data processing to transform complex, high-dimensional data, such as text or images, into a lower-dimensional space in the form of vectors. This transformation simplifies the data while preserving its essential characteristics, making it easier to process and analyze.
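As a toy illustration of the interface (text in, fixed-length vector out), the sketch below uses the "hashing trick": each word increments one of a fixed number of buckets. Real embeddings come from trained neural models, and `toy_embedding` is a made-up name, but the shape of the output is the same:

```python
import hashlib

def toy_embedding(text: str, dims: int = 8) -> list[float]:
    """Map text to a fixed-length, unit-normalized vector via word hashing.
    A trained embedding model would place similar meanings near each other;
    this toy version only demonstrates the text-to-vector interface."""
    vec = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]

v = toy_embedding("vector databases store embeddings")
# v is a list of 8 floats with unit length, ready to index or compare.
```

Whatever produces them, the resulting vectors can then be stored in a vector database and compared with the distance measures described above.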

Chunking

Chunking is the process of dividing large data sets or extensive text inputs into smaller, more manageable segments or chunks. This method is crucial for improving the efficiency, accuracy, and scalability of data processing and machine learning tasks.

RAG leverages chunking to enhance the vector storage/embedding. It divides large texts or data sets into chunks, allowing the model to retrieve relevant information from these chunks to generate more informed and accurate responses.
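A minimal chunking sketch, assuming simple fixed-size character windows (production systems often split on sentence or paragraph boundaries instead). Overlapping windows keep sentences that straddle a boundary retrievable from either chunk:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks with overlapping windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

# Small values make the windowing visible:
chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk would then be embedded and stored in the vector database individually, so retrieval can return just the passage that matters rather than an entire document.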

Multimodal

Multimodal systems can process and generate content across multiple types of data or media, such as text, images, audio, and video. Multimodal LLMs can integrate and understand information from various sources. By processing multiple data types, multimodal LLM-based Agents can build a richer understanding of context than any single modality provides.