When my friends kept insisting I should watch Formula One, I found myself completely lost. The sport seemed incredibly complex - with technical regulations, driver rivalries, team strategies, and decades of history that everyone else seemed to know by heart. As someone who learns best by building, I decided to create my own AI-powered Formula One assistant to help me navigate this fascinating world.
What started as a simple learning project became F1GPT - a sophisticated Retrieval-Augmented Generation (RAG) chatbot that can answer any Formula One question, from basic rules to complex technical regulations.
Live Demo: https://f1gpt.ankitml.com
Formula One isn't just a sport - it's an ecosystem of engineering, strategy, history, and human drama spanning over 70 years. When I first tried to understand F1, I was bombarded with technical jargon like DRS and ERS, complex regulations, rich history, and current dynamics that assumed prior knowledge.
Existing solutions fell short because general AI chatbots had knowledge cutoffs, F1 official apps focused on timing rather than education, fan websites used technical language, and search engines returned overwhelming amounts of unstructured information.
I needed a solution that could access comprehensive F1 information, understand beginner questions, provide contextual responses, and learn over time.
Large Language Models like GPT-4 or Gemini have fundamental limitations for specialized domains. They suffer from knowledge cutoffs, can hallucinate information, lack real-time updates, and provide generic responses without domain-specific depth.
Retrieval-Augmented Generation (RAG) solves these problems by combining two systems: a retrieval system that searches through curated F1 knowledge, and a generation system that uses an LLM to create natural responses based on retrieved data.
User Question → Embedding → Vector Search → Relevant Documents → LLM + Context → Response
This ensures responses are grounded in factual data, information stays current, context is domain-specific, and hallucinations are minimized.
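The "vector search" step in the flow above boils down to comparing embeddings. A minimal sketch of the cosine similarity it relies on (the helper name is mine, not from the project):

```typescript
// Cosine similarity between two equal-length embedding vectors.
// Returns 1 for identical directions, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

In practice the vector database performs this comparison server-side, but the principle is the same: the documents whose embeddings point in nearly the same direction as the query embedding are the most relevant.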
As a student developer, cost-effectiveness was crucial. I designed an architecture that maximizes capability while minimizing expenses.
Frontend (Next.js + TypeScript)
↓
API Routes (Next.js API)
↓
RAG Pipeline:
├── Embeddings (Gemini text-embedding-004)
├── Vector DB (Astra DB)
└── LLM (Gemini 2.0 Flash)
↓
Data Sources (Web Scraping)
This architecture provides scalability through serverless functions, cost-effectiveness with minimal ongoing costs, maintainability with clean separation of concerns, and speed through edge deployment.
The success of any RAG system depends on the quality and comprehensiveness of its knowledge base. I identified several categories of F1 information sources.
I built a comprehensive scraping system using Playwright for robust, browser-based data extraction. The system targets main content areas while removing unwanted elements like ads and navigation, extracts clean text while preserving structure, and handles dynamic content loading with proper wait conditions.
The data processing pipeline includes text cleaning to remove JSON objects and normalize formatting, content chunking into 512-character segments with 100-character overlap, and quality filtering based on content length, F1 keyword density, and source reliability.
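The chunking step can be sketched as a sliding window over the cleaned text. This is an illustrative helper (the function name is mine), using the same numbers as the pipeline: 512-character chunks with a 100-character overlap so sentences cut at a boundary still appear intact in the neighbouring chunk.

```typescript
// Split text into fixed-size chunks with overlap between neighbours.
function chunkText(text: string, size = 512, overlap = 100): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // advance 412 chars per chunk
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

The overlap trades a little storage for retrieval quality: a fact straddling a chunk boundary is still retrievable from at least one chunk.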
I chose Gemini's text-embedding-004 model for several compelling reasons. The free tier provides generous limits for development, the model delivers state-of-the-art performance with 768 dimensions that balance quality and efficiency, and the API integration is simple and reliable.
The cost analysis was clear: Gemini embeddings offered zero ongoing costs for embeddings, no infrastructure requirements, consistent performance with reliable API response times, and scalable pricing that only charges when exceeding free tier limits.
Astra DB emerged as the optimal choice for vector storage and similarity search. Built on serverless Cassandra with vector search capability, it supports 768-dimensional embeddings from Gemini, uses cosine similarity for optimal F1 content matching, and provides global distribution for low-latency access.
The business benefits include a generous free tier with 5GB storage and 20M operations, simple setup with minimal configuration, excellent documentation with clear integration guides, and production-ready enterprise-grade reliability.
interface F1Document {
  _id: string;
  $vector: number[];   // 768-dim Gemini embedding
  text: string;        // Original content
  url: string;         // Source URL
  category: string;    // Content type
  timestamp: string;   // Scrape date
}
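Given that schema, a similarity search reduces to a Data API `find` call sorted by `$vector`. A sketch of how the query options might be assembled (the helper name and projection fields are illustrative, not the project's exact code):

```typescript
// Build options for an Astra DB Data API find() call that sorts by
// vector similarity and returns the top-k matching documents.
function buildVectorQuery(queryEmbedding: number[], limit = 5) {
  return {
    sort: { $vector: queryEmbedding },            // nearest-neighbour sort
    limit,                                        // top-k documents
    projection: { text: 1, url: 1, category: 1 }, // skip the raw vector
  };
}
```

With the `@datastax/astra-db-ts` client, an object like this is passed as the options argument to `collection.find({}, options)`; excluding `$vector` from the projection keeps the response payload small.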
Gemini 2.0 Flash offered the best combination of performance and cost for chat generation. The model provides excellent quality comparable to GPT-4, a large context window of 1M tokens, fast response times optimized for speed, and strong instruction following capabilities perfect for RAG prompts.
The RAG prompt engineering involved defining an F1 expert persona, injecting relevant retrieved documents as context, providing response formatting guidelines, and including fallback instructions for handling missing information gracefully.
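In code, that context injection can be as simple as string assembly. A hedged sketch of the idea (the persona wording, separators, and types here are my own, not the project's exact prompt):

```typescript
interface RetrievedDoc {
  text: string;
  url: string;
}

// Assemble a prompt that grounds the LLM in retrieved documents and
// tells it how to behave when the context is insufficient.
function buildPrompt(docs: RetrievedDoc[], question: string): string {
  const context = docs
    .map((d, i) => `[Source ${i + 1}: ${d.url}]\n${d.text}`)
    .join("\n\n");
  return [
    "You are an expert Formula One assistant helping beginners.",
    "Answer using ONLY the context below.",
    "If the context does not contain the answer, say so instead of guessing.",
    "",
    "CONTEXT:",
    context,
    "",
    `QUESTION: ${question}`,
  ].join("\n");
}
```

The explicit fallback instruction ("say so instead of guessing") is what keeps hallucinations in check when retrieval comes back empty or off-topic.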
The complete pipeline transforms user questions into comprehensive F1 answers through a five-step process.
graph LR
A[User Question] --> B[Generate Embedding]
B --> C[Vector Search]
C --> D[Retrieve Documents]
D --> E[Assemble Context]
E --> F[LLM Generation]
F --> G[Stream Response]
Step 1: Query Processing - The system receives and normalizes the user's question, preparing it for embedding generation.
Step 2: Embedding Generation - Gemini text-embedding-004 creates a 768-dimensional vector representation of the query.
Step 3: Vector Search - Astra DB performs cosine similarity search to find the top 5 most relevant F1 documents.
Step 4: Context Assembly - Retrieved documents are combined and truncated to fit within the model's context limits while maintaining relevance order.
Step 5: Response Generation - Gemini 2.0 Flash generates a streaming response using the assembled context and original question.
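Step 4's truncation can be sketched as a greedy fill: keep documents in the relevance order returned by the vector search until a budget is exhausted. The character budget below is an illustrative stand-in for the model's real token limit, and the helper name is mine:

```typescript
// Greedily pack documents into a context window, preserving the
// relevance order produced by the vector search.
function assembleContext(docs: string[], maxChars = 8000): string {
  const kept: string[] = [];
  let used = 0;
  for (const doc of docs) {
    if (used + doc.length > maxChars) break; // budget exhausted
    kept.push(doc);
    used += doc.length;
  }
  return kept.join("\n\n");
}
```

Because the loop stops at the first document that would overflow, the most relevant documents are always the ones that survive truncation.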
Optimization strategies include caching frequent query embeddings, relevance scoring based on F1 keyword density, context diversity from varied source types, and fallback mechanisms for edge cases.
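The embedding cache mentioned above can be as simple as an in-memory map keyed by the normalized query, so repeated questions skip the embedding API entirely. A sketch under that assumption (a production system might use persistent or LRU storage instead):

```typescript
// Cache query embeddings so repeated questions avoid an API round trip.
const embeddingCache = new Map<string, number[]>();

async function getCachedEmbedding(
  query: string,
  embed: (q: string) => Promise<number[]>, // e.g. a Gemini API call
): Promise<number[]> {
  const key = query.trim().toLowerCase(); // normalize for cache hits
  const hit = embeddingCache.get(key);
  if (hit) return hit;
  const vector = await embed(key);
  embeddingCache.set(key, vector);
  return vector;
}
```

Normalizing the key means "What is DRS?" and "what is drs?" resolve to the same cached vector, which matters because beginners tend to ask the same handful of questions.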
The F1GPT interface captures the excitement and precision of Formula One while remaining accessible to beginners. The design system employs racing-inspired colors (racing red, carbon black, telemetry blue), Orbitron typography for tech aesthetics, glass morphism effects for modern surfaces, and Framer Motion for smooth animations.
For deployment, Vercel was chosen for its seamless Next.js integration, serverless scaling, global CDN, and straightforward CI/CD from GitHub, ensuring fast and reliable delivery of the F1GPT application.
Building F1GPT transformed me from someone who knew nothing about Formula One into a genuine enthusiast who can follow races, understand technical discussions, and appreciate the sport's complexity. More importantly, it demonstrated the power of building to learn - sometimes the best way to understand a domain is to create tools that help others learn it too.
The project resulted in a production-ready RAG chatbot with comprehensive F1 knowledge, cost-effective architecture serving thousands of queries, modern responsive UI rivaling commercial applications, and scalable data pipeline processing multiple F1 sources.
F1GPT has helped dozens of newcomers get started with Formula One, providing a judgment-free environment to ask basic questions and learn at their own pace. By open-sourcing the project, I hope to help other developers learn RAG implementation, provide a template for domain-specific chatbots, and inspire others to build learning tools for their interests.
This project taught me that the best way to learn something complex is often to build something that helps others learn it too. Whether you're interested in Formula One, RAG systems, or just building cool projects, the combination of curiosity, modern tools, and persistence can lead to amazing results.
Ready to explore F1GPT yourself?
Try it live: https://f1gpt.ankitml.com
View the code: https://github.com/ankitk75/f1-gpt
Welcome to the exciting world of Formula One! 🏎️