Gen-AI Developer Classroom notes 16/Aug/2025

Use case: Building AI applications that are aware of your private data

RAG (Retrieval-Augmented Generation) augments an LLM with an external knowledge source (databases, documents, or API calls).
Instead of depending only on what is in the model’s parameters, the system retrieves relevant information at query time and feeds it into the LLM’s context.
How it works:
- User query => converted into an embedding
- Embedding search => retrieves relevant chunks from the vector database
- Retrieved data => injected into the LLM prompt (context window)
- LLM generates the response grounded in both its pretraining and the retrieved info.
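The four steps above can be sketched in a few lines of Python. This is a toy illustration, not a production pipeline: `embed` is a stand-in that uses word counts instead of a real embedding model, the `CHUNKS` list stands in for a vector database, and the final LLM call is left out. All names and sample texts are made up for the example.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; in practice these are chunks of your documents.
CHUNKS = [
    "Refunds are issued within 7 days of receiving the returned item.",
    "Standard shipping takes 3 to 5 business days.",
    "Employees are eligible for paternity leave after six months of service.",
]

def embed(text):
    """Toy embedding: a word-count vector. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=1):
    """Steps 1 and 2: embed the query, then return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query, retrieved):
    """Step 3: inject the retrieved chunks into the prompt sent to the LLM."""
    context = "\n".join(f"- {c}" for c in retrieved)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

query = "How long do refunds take?"
top = retrieve(query, CHUNKS, k=1)
prompt = build_prompt(query, top)
print(prompt)  # Step 4 would send this prompt to the LLM.
```

Swapping `embed` for a real embedding model and `CHUNKS` for a vector database changes the quality of retrieval, not the shape of the flow.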
Pros:
- Easy to update (add or remove documents); no training required
- Keeps responses accurate and up to date
- Reduces hallucinations by grounding answers in real data
- Cost-effective compared to fine-tuning
Cons:
- Dependent on retrieval quality (garbage in -> garbage out)
- Limited by context window size
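One common way to live with the context-window limit is to keep only the best-ranked chunks that fit a token budget. A rough sketch, assuming the chunks arrive already sorted by relevance and counting one token per word (real tokenizers count differently); `pack_context` and the sample texts are hypothetical.

```python
def pack_context(ranked_chunks, max_tokens):
    """Greedily keep the highest-ranked chunks that fit within the token budget."""
    selected, used = [], 0
    for chunk in ranked_chunks:
        cost = len(chunk.split())  # crude token estimate: one token per word
        if used + cost > max_tokens:
            continue  # this chunk would overflow; a smaller one may still fit
        selected.append(chunk)
        used += cost
    return selected, used

# Chunks in relevance order (most relevant first).
ranked = [
    "Refunds are issued within 7 days.",
    "Our refund policy covers unopened items returned within 30 days of delivery.",
    "Contact support by email.",
]
selected, used = pack_context(ranked, max_tokens=12)
print(selected, used)
```

Here the second chunk is skipped because it would push the total past the budget, while the shorter third chunk still fits.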
Best Use-cases:
- Dynamic or frequently updated knowledge (e.g. regulations, product catalogs, research papers, defects …)
- Domain-specific Q&A (customer support bots, internal company knowledge assistants)
- Multi-document reasoning (chat with PDFs, textbooks, or code bases)
- Easy prototyping where requirements change often

Fine-tuning means adapting the base LLM’s weights on additional training data so it learns domain-specific patterns, styles, or tasks.
How it works:
- Collect a task-specific dataset (instruction/response pairs, classification labels, reasoning steps, etc.)
- Train the base LLM on this dataset
- The model’s parameters are updated, making the knowledge/patterns native to the model
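To make "the parameters are updated" concrete, here is a deliberately tiny stand-in: a one-weight model `y = w * x` instead of an LLM. The starting weight plays the role of the pretrained model, and the small dataset plays the role of the task-specific data. The numbers and names are invented for illustration; real fine-tuning uses a training framework and far more data.

```python
# "Pretrained" weight: the base model believes y = 1.0 * x.
base_w = 1.0

# Hypothetical task-specific dataset: in this domain y = 3 * x.
dataset = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

def fine_tune(w, data, lr=0.05, steps=50):
    """Gradient descent on mean squared error, starting from the base weight."""
    for _ in range(steps):
        # d/dw of mean((w*x - y)^2) is mean(2 * (w*x - y) * x)
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

tuned_w = fine_tune(base_w, dataset)
print(f"before: {base_w}, after: {tuned_w:.4f}")
```

After training, the new behaviour lives in the weight itself, so nothing has to be retrieved at inference time. The flip side is that changing the knowledge means training again.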
Pros:
- Encodes knowledge directly -> no need to retrieve every time
- Faster inference (no retrieval step, shorter prompts)
- Can capture domain-specific reasoning styles (medical diagnosis, legal logic)
Cons:
- Requires a high-quality, curated dataset
- Expensive (compute + data labelling)
- Knowledge becomes stale unless retrained
- Risk of overfitting
Best Use-cases:
- Stable domain knowledge that doesn’t change frequently (medical terminology, legal contracts)
- Custom tone/style (brand-specific writing style)
- Highly specialized tasks
- When retrieval isn’t feasible
When to use what

Use RAG first (as it is cheaper)
Move to fine-tuning only when you need stable reasoning, style, or low-latency specialization
Combine both for enterprise-grade systems
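The rule of thumb above can be written down as a small helper. This is only one possible reading of the guidance; the function name and the two yes/no inputs are assumptions made for the sketch, and real decisions usually weigh cost, data availability, and latency in more detail.

```python
def choose_approach(knowledge_changes_often, needs_specialization):
    """Pick an approach following the rule of thumb above.

    needs_specialization: True when stable reasoning, a custom style,
    or low-latency specialization is required.
    """
    if needs_specialization and knowledge_changes_often:
        return "Hybrid"       # combine both
    if needs_specialization:
        return "Fine-tuning"  # stable knowledge plus a specialized behaviour
    return "RAG"              # default: start with the cheaper option

print(choose_approach(knowledge_changes_often=True, needs_specialization=False))
print(choose_approach(knowledge_changes_often=False, needs_specialization=True))
print(choose_approach(knowledge_changes_often=True, needs_specialization=True))
```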

Context: You need a chatbot that can answer queries like
- Where is my order?
- What’s your refund policy?
- Show me the latest iPhone accessories available
Brainstorm:
- Knowledge type: FAQs, policies, product catalog, order transaction history -> these change frequently
- Style: needs to be polite
Best fit: RAG

Context: An assistant helps employees and managers understand internal policies, contracts, and HR rules. Typical queries:
- Am I eligible for paternity leave?
- What are the security requirements for handling client type?
- Can I expense this type of travel?
Brainstorm:
- Knowledge type: company policies, contracts, HR manuals -> updated often
Best fit: RAG

Context: We are building an assistant that
- summarizes earnings calls
- Extracts KPIs
- Tracks guidance changes, risks, and catalysts from news
- Produces analyst-style memos with citations
Best fit: Hybrid (RAG + tools + light fine-tuning)