Local LLM RAG Apps With Ollama, DeepSeek-R1, and SingleStore

In a previous article, we explored how to use Ollama and DeepSeek-R1 with SingleStore for a simple example. In this article, we’ll build on that example by working with a PDF document from the internet. We’ll store the document and its vector embeddings in SingleStore, then use DeepSeek-R1 to identify blockchain investment opportunities. The notebook …

AI-Driven RAG Systems: Implementation With LangChain

Retrieval-augmented generation (RAG) is revolutionizing artificial intelligence by combining powerful generative AI models with sophisticated information retrieval systems. This comprehensive guide explores foundational concepts essential for understanding RAG, including information retrieval, generative AI models, embeddings, and vector databases, followed by a detailed, practical step-by-step implementation using LangChain. Understanding these fundamentals and their practical application through …

Breaking the Context Barrier of LLMs: InfiniRetri vs RAG

Large language models (LLMs) are reshaping the landscape of artificial intelligence, yet they face an ongoing challenge: retrieving and using information beyond their training data. Two competing methods have emerged as solutions to this problem: InfiniRetri, an approach that exploits the LLM’s own attention mechanism to retrieve relevant context from within long inputs, and …

Combining RAG and AI Agents

Editor’s Note: The following is an article written for and published in DZone’s 2025 Trend Report, Generative AI: The Democratization of Intelligent Systems. Enterprise AI is rapidly evolving and transforming. With the recent hype around large language models (LLMs), which promise intelligent automation and seamless workflows, we are moving beyond mere data synthesis toward a more …

Build Multimodal RAG Apps With Amazon Bedrock and OpenSearch

Scenario: Customer support tickets with screenshots, technical documentation with diagrams, and a mountain of legacy PDFs, all containing valuable information but impossible to query efficiently. “There has to be a better way,” I thought. That’s when I dove headfirst into the world of multimodal retrieval-augmented generation (RAG). The Multimodal Revelation: Like many developers, I …

Building an Agentic RAG System from Scratch

In this post, we’ll explore the concept of Agentic RAG, its architecture, and why this powerful combination is reshaping the future of AI systems. Plus, we’ll walk through implementing a basic version of an Agentic RAG system from scratch! What Is RAG and Agentic RAG? To start, let’s clarify what RAG is. Retrieval-augmented generation (RAG) is …

Financial Data and RAG Usage in LLMs

The importance of integrating artificial intelligence in finance stems from its ability to process vast amounts of data at unprecedented speeds, enabling financial institutions to make more informed decisions and improve operational efficiency. To understand what AI brings to the table, let’s first cover the basics of AI in light of recent trends. Let’s start …

Enhanced Monitoring Pipeline With Advanced RAG Optimizations

Observability Integration: Observability is the cornerstone of reliability and trust in any production-grade retrieval-augmented generation (RAG) pipeline. As these systems become more complex (handling sensitive data, supporting real-time queries, and interfacing with multiple services), being able to trace and measure each step of the data flow and inference process becomes critical. From retrieving …

AI-Powered Professor Rating Assistant With RAG and Pinecone

Artificial intelligence is transforming how people interact with information, and retrieval-augmented generation (RAG) is at the forefront of this innovation. RAG enhances large language models by enabling access to external knowledge bases, providing highly accurate and context-aware answers. In this tutorial, I’ll guide you through building an AI-powered assistant using RAG to create a smarter, …

Building a Simple RAG Application With Java and Quarkus

Introduction to RAG and Quarkus: Retrieval-augmented generation (RAG) is a technique that enhances AI-generated responses by retrieving relevant information from a knowledge source. In this tutorial, we’ll build a simple RAG-powered application using Java and Quarkus (a Kubernetes-native Java framework). Perfect for Java beginners! Why Quarkus? Quarkus provides multiple LangChain4j extensions to simplify AI application …