LearnwithVishnu
Basics → Production → Architect
🤖AI Learning
LLMs, MLOps pipelines, RAG, AI Agents, and AI in DevOps: from awareness to production deployment

🤖 What is AI — Beyond the Chatbot

AI is not just a chatbot — it is an entirely new computing paradigm

When most people say "AI" they mean ChatGPT or Gemini, that is, conversational bots. That is one small application. Artificial Intelligence covers machine learning, deep learning, computer vision, NLP, reinforcement learning, and large language models. Each of these had been transforming industries for years before ChatGPT made them visible to everyone.

| AI Branch | What it does | Real application |
|---|---|---|
| Machine Learning | Learns patterns from data to make predictions | Fraud detection, medical diagnosis, recommendations |
| Deep Learning | Multi-layer neural networks for complex patterns | Image recognition, speech-to-text, translation |
| NLP | Understanding and generating human language | Chatbots, document summarisation, sentiment analysis |
| Computer Vision | Understanding images and video | Face recognition, defect detection, self-driving |
| LLMs | Large Language Models that predict the next token in text | ChatGPT, Claude, Gemini, GitHub Copilot |
| Generative AI | Creates new content: text, images, code, audio | Midjourney, DALL-E, Stable Diffusion |
| Reinforcement Learning | Learns by trial and error with rewards | AlphaGo, robotics, RLHF for LLMs |
| MLOps | Running ML models in production reliably | Model serving, monitoring, retraining pipelines |

🧠 How LLMs Actually Work

The mechanics behind ChatGPT, Claude, and Gemini

A Large Language Model is a neural network with billions of parameters trained to predict which token (word fragment) comes next. During training it ingests trillions of tokens from the internet and books, and its parameters are adjusted step by step so that its predictions improve. The result is a model that has compressed a large share of human written knowledge into its weights.
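
To make "predict the next token" concrete, here is a toy sketch that counts which word follows which in a tiny made-up corpus and then predicts the most frequent follower. It is only an illustration of the objective: a real LLM uses a neural network over sub-word tokens, not a lookup table, and the corpus and function names here are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the training text."""
    counts = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        counts[current][nxt] += 1
    return counts


def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text, split on whitespace instead of a real tokenizer.
corpus = "the pod is running the pod is pending the pod is running".split()
model = train_bigram(corpus)
print(predict_next(model, "is"))  # "running" follows "is" twice, "pending" once
```

The same idea scales up: more data and a far more expressive model, but the training signal is still "guess the next token".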

The 3-stage training process

  1. Pre-training — the model reads raw internet text and learns to predict the next token, over trillions of tokens. This takes weeks on thousands of GPUs; reported cost estimates for frontier models are in the $50M-100M range.
  2. Instruction Tuning (SFT) — the model is fine-tuned on human-written Q&A pairs. This teaches it to answer helpfully, not just autocomplete.
  3. RLHF — humans rank model responses, a reward model learns those preferences, and the LLM is then trained with reinforcement learning to score higher on that reward model. This is what makes Claude/GPT behave as assistants.
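
The "reward model learns human preferences" step in stage 3 is commonly trained with a pairwise ranking loss: given a response humans preferred and one they rejected, the loss is small when the preferred one gets the higher reward. The sketch below shows that loss for plain numbers. It assumes the standard `-log(sigmoid(difference))` form and is not taken from any particular vendor's training code.

```python
import math


def pairwise_preference_loss(reward_chosen, reward_rejected):
    """-log(sigmoid(r_chosen - r_rejected)), written as log(1 + exp(-margin)).

    Small when the human-preferred response scores higher, large when it scores lower.
    Intended for small illustrative values; very negative margins would overflow exp().
    """
    margin = reward_chosen - reward_rejected
    return math.log(1.0 + math.exp(-margin))


good_ranking = pairwise_preference_loss(2.0, 0.0)  # reward model agrees with the human
no_opinion = pairwise_preference_loss(0.0, 0.0)    # reward model cannot tell them apart
bad_ranking = pairwise_preference_loss(0.0, 2.0)   # reward model disagrees with the human
print(good_ranking < no_opinion < bad_ranking)
```

Minimising this loss over many ranked pairs pushes the reward model to score responses the way human raters would.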

Why LLMs hallucinate — the core limitation

LLMs do not look things up. They predict plausible-sounding text based on patterns. For topics that appear frequently in the training data they are usually reliable. For obscure facts, recent events, and specific numbers they can generate confident-sounding wrong answers. Mitigation: use RAG (Retrieval-Augmented Generation) to ground answers in real documents. This reduces hallucination but does not remove it entirely.

Key concepts

  • Context window — how much text the model processes at once (roughly 4K to 2M tokens, depending on the model). Larger windows cost more per request but let the model reason over more data.
  • Temperature — controls randomness when sampling the next token. 0 = greedy, near-deterministic output (the same prompt almost always gives the same answer). 1+ = creative, varied. Use 0 for code, around 0.7 for creative writing.
  • Embeddings — convert text to vectors. Similar meaning = similar vectors. Used for semantic search and RAG.
  • Tokens — 1 token ≈ 4 characters of English text. Pricing is per token. "tokenisation" = 2-3 tokens, depending on the tokenizer.
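
Two of the concepts above are easy to see in a few lines of code. The sketch below assumes the usual definitions: temperature divides the model's raw scores (logits) before a softmax, and embedding similarity is measured with cosine similarity. The logits and vectors are made-up numbers, not output from a real model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; treat 0 as 'always pick the top score'")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_similarity(a, b):
    """1.0 for vectors pointing the same way, 0.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


logits = [2.0, 1.0, 0.1]                        # hypothetical scores for three tokens
cold = softmax_with_temperature(logits, 0.2)    # almost all probability on the top token
warm = softmax_with_temperature(logits, 1.5)    # probability spread across tokens
print(round(cold[0], 3), round(warm[0], 3))

same = cosine_similarity([1.0, 0.0], [1.0, 0.0])
unrelated = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same, unrelated)
```

At low temperature the top token dominates, which is why low settings suit code generation; at higher temperature the alternatives get a real chance of being sampled.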

⚙️ MLOps — Running ML in Production

MLOps = DevOps for Machine Learning. Your DevOps skills transfer directly.

| DevOps concept | MLOps equivalent |
|---|---|
| Code version control | Model + dataset versioning (DVC, MLflow) |
| CI/CD pipeline | ML pipeline: data → train → evaluate → register → deploy |
| Container registry | Model registry (MLflow, Hugging Face Hub) |
| Kubernetes deployment | Model serving on K8s (KServe, Triton, vLLM) |
| Application monitoring | Model monitoring: accuracy drift, data drift, bias |
| A/B testing | Champion/challenger model testing |
| Rollback | Model version rollback if performance drops |
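
The "evaluate → register → deploy" and champion/challenger rows can be read as a promotion gate, much like a quality gate in CI. The sketch below is a minimal version under stated assumptions: both models were scored on the same held-out set, accuracy is the only metric, and the model names and the `min_gain` threshold are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class ModelVersion:
    name: str
    accuracy: float  # measured on the same held-out evaluation set


def should_promote(champion, challenger, min_gain=0.01):
    """Promote only if the challenger beats the champion by at least `min_gain`."""
    return challenger.accuracy - champion.accuracy >= min_gain


champion = ModelVersion("fraud-v1", accuracy=0.91)    # currently serving traffic
challenger = ModelVersion("fraud-v2", accuracy=0.93)  # freshly trained candidate

deployed = challenger if should_promote(champion, challenger) else champion
print(deployed.name)
```

Keeping the previous champion registered is what makes rollback cheap: if live metrics drop, the serving layer points back at the older version.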

MLOps tools stack

  • 🔬 MLflow (Experiment Tracking): Track experiments by logging parameters, metrics, and artifacts. Compare runs. Register models. Open-source, runs anywhere.
  • 🏭 Kubeflow (ML Pipelines on K8s): Kubernetes-native ML workflows. Each pipeline step is a container. Build train-evaluate-deploy pipelines.
  • 🤗 Hugging Face (Model Hub): GitHub for ML models. 500K+ pre-trained models. Fine-tune and deploy. Most major open-weight LLMs are available here.
  • vLLM (LLM Serving): High-throughput LLM inference. Serves Llama, Mistral, and Qwen, with throughput often cited as 10-20x that of naive Hugging Face Transformers inference. Essential for production LLM deployment on K8s.
  • 📊 Evidently (Model Monitoring): Data drift detection, model quality metrics, visual dashboards. Open-source.
  • 🔧 Ray (Distributed ML): Scale training and serving across machines with Ray Train and Ray Serve.
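
To show what "data drift detection" means in practice, here is a small sketch of one widely used drift statistic, the Population Stability Index (PSI), written in plain Python. This is not Evidently's API; the function name, bin count, and sample data are assumptions made for illustration. A commonly quoted rule of thumb treats values above roughly 0.2 as worth investigating.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a reference sample and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all reference values are equal

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 100 for i in range(100)]      # feature values seen at training time
shifted = [0.5 + i / 200 for i in range(100)]  # live values squeezed into the upper half

print(psi(reference, reference))  # identical distributions give 0.0
print(psi(reference, shifted) > 0.2)
```

In a real pipeline this kind of check would run per feature on a schedule and trigger an alert or a retraining job when it crosses the chosen threshold.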

🔍 RAG — Give LLMs Your Data

RAG = Retrieval-Augmented Generation

Raw LLMs only know their training data. RAG lets them answer questions about your documents — internal wikis, runbooks, codebase, customer data — by retrieving relevant content at query time.

How RAG works

  1. Index — split documents into chunks → convert each to embedding vector → store in vector database (Pinecone, ChromaDB, pgvector)
  2. Retrieve — user asks question → convert to embedding → find most similar chunks → retrieve top-K
  3. Generate — send retrieved chunks + question to LLM → answer grounded in your documents
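
# Assumes LangChain 0.2/0.3 with langchain-community, langchain-openai, chromadb, and OPENAI_API_KEY set
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

docs = [Document(page_content="...")]  # placeholder: your chunked runbook text goes here
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model name is only an example

# Index: embed each chunk and store it in Chroma
vectordb = Chroma.from_documents(docs, OpenAIEmbeddings())
# Retrieve the top 4 chunks and pass them to the LLM together with the question
qa = RetrievalQA.from_chain_type(llm=llm,
    retriever=vectordb.as_retriever(search_kwargs={"k": 4}))
result = qa.invoke({"query": "What is the TeMIP alarm resync procedure?"})["result"]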
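
The LangChain snippet hides the retrieval step inside the library. To see what Index, Retrieve, and Generate do underneath, here is a dependency-free sketch. It swaps the learned embedding model for a toy bag-of-words vector and stops at building the prompt rather than calling an LLM. The runbook chunks and function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector. Real systems use a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question, context_chunks):
    """Combine retrieved chunks and the question into the text sent to the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical runbook chunks standing in for an indexed document store.
chunks = [
    "To restart the ingest service run the restart job from the deploy pipeline.",
    "Kafka consumer lag spikes are usually handled by scaling the consumer group.",
    "Database backups run nightly and are kept for thirty days.",
]
question = "How do I handle a Kafka consumer lag spike?"
top = retrieve(question, chunks, k=1)
print(build_prompt(question, top))
```

Word overlap only matches exact terms, which is exactly the weakness learned embeddings address: they place "lag spike" and "consumers falling behind" close together even without shared words.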

DevOps RAG use cases

  • Runbook assistant — ask questions, get answers from your actual runbooks
  • Incident assistant — feed recent alerts and logs, get probable root cause
  • Code review — retrieve coding standards, check new code against them
  • Security policy checker — check infrastructure configs against compliance policies

🤝 AI Agents — Autonomous Action

Agents take actions, not just answer questions

A chatbot responds. An AI Agent executes tasks by calling tools, making decisions, and taking multi-step actions autonomously. Components: LLM (brain) + tools (what it can call) + memory (context) + planning.

  • 🔧 Incident Response Agent (Autonomous): Alert fires → queries metrics API → checks recent deployments → searches runbooks → posts analysis to Slack → creates Jira ticket. No human involvement for L1 triage.
  • 💰 Cost Optimisation Agent (Weekly): Scans AWS/Azure → identifies idle resources → calculates savings → creates PR to resize → applies after approval.
  • 📝 PR Review Agent (Assisted): Reviews code → checks standards → runs security scan → summarises for reviewer. Claimed to cut review time by about 40%.
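
The components listed above (LLM, tools, memory, planning) fit together as a loop: decide the next action, call a tool, store the result, repeat. The sketch below shows that loop for the incident triage case. Everything in it is a stand-in: the two tools return canned data instead of calling real APIs, and `plan_next_step` is a hard-coded rule where a real agent would ask an LLM.

```python
def get_error_rate(service):
    """Hypothetical tool: a real version would query a metrics API."""
    return {"checkout": 0.12, "search": 0.01}.get(service, 0.0)


def recent_deployments(service):
    """Hypothetical tool: a real version would query the deployment system."""
    return {"checkout": ["v42 deployed 10 minutes ago"]}.get(service, [])


TOOLS = {"get_error_rate": get_error_rate, "recent_deployments": recent_deployments}


def plan_next_step(alert, memory):
    """Stand-in for the LLM: choose the next tool based on what is already known."""
    if "get_error_rate" not in memory:
        return "get_error_rate"
    if memory["get_error_rate"] > 0.05 and "recent_deployments" not in memory:
        return "recent_deployments"
    return None  # nothing left to look up


def triage(alert, max_steps=5):
    """Run the plan → act → remember loop and return everything the agent gathered."""
    memory = {}
    for _ in range(max_steps):  # hard step limit so the loop always terminates
        tool_name = plan_next_step(alert, memory)
        if tool_name is None:
            break
        memory[tool_name] = TOOLS[tool_name](alert["service"])
    return memory


report = triage({"service": "checkout"})
print(report)
```

The step limit and the fixed tool registry are the important design choices: the agent can only call functions you have explicitly exposed, and it cannot loop forever.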

Agent frameworks

  • LangGraph — stateful multi-agent workflows. Best for complex, multi-step pipelines.
  • CrewAI — multi-agent teams with roles (researcher, coder, reviewer) that collaborate.
  • AutoGen (Microsoft) — conversational agents that talk to each other to solve problems.
  • Semantic Kernel — enterprise-grade agent SDK for Python and .NET.

⚡ AI Tools for DevOps Engineers — Use Today

  • 🐙 GitHub Copilot (Coding): Autocompletes code as you type. Generates tests, writes docstrings, suggests fixes. ₹1,600/month.
  • 🖥️ Cursor (IDE): AI-first code editor. Chat with your entire codebase. Ask it to refactor, debug, or write code from scratch.
  • 📊 Datadog AI / Dynatrace Davis (Monitoring): AI root cause analysis. Correlates metrics, traces, and logs to suggest a probable root cause automatically.
  • 🔒 GitHub Advanced Security (Security): AI-powered SAST. Understands code semantics, so it produces far fewer false positives than pattern-matching scanners.
  • 🚀 AWS CodeWhisperer, now part of Amazon Q Developer (Coding, Free): Free for individuals. Specialised in AWS SDK code. Security scanning included.

Prompt engineering that actually works

  • Be specific about context: "K8s 1.29 on EKS, pod in OOMKilled. Here is pod spec: [spec]. What are likely causes?"
  • Ask for step-by-step: "Think through this step by step before answering" — reduces errors significantly
  • Provide examples: Show one example of the output format you want
  • Iterate: Treat it as a conversation. "Add error handling." "Make it more concise."
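
If you send the same kind of question often, the tips above can be captured in a small template function so every prompt carries context, a step-by-step instruction, and an optional format example. This is just one way to structure it; the function and field names are made up for the example.

```python
def build_prompt(environment, symptom, artifact, example_output=None):
    """Assemble a troubleshooting prompt: specific context, evidence, and output guidance."""
    parts = [
        f"Environment: {environment}",
        f"Symptom: {symptom}",
        f"Relevant config or logs:\n{artifact}",
        "Think through this step by step before answering.",
    ]
    if example_output:
        parts.append(f"Format the answer like this example:\n{example_output}")
    return "\n\n".join(parts)


prompt = build_prompt(
    environment="K8s 1.29 on EKS",
    symptom="Pod is OOMKilled shortly after start",
    artifact="resources:\n  limits:\n    memory: 256Mi",
    example_output="1. Likely cause\n2. How to confirm\n3. Fix",
)
print(prompt)
```

Follow-up messages such as "Add error handling" then refine the answer in the same conversation, as the last tip suggests.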

🗺️ AI/ML Learning Roadmap

| Month | Focus | Project to build |
|---|---|---|
| 1 | Python + NumPy + Pandas | Data analysis scripts, understand arrays and DataFrames |
| 2 | ML basics: scikit-learn | Train a classification model on a real dataset (kaggle.com) |
| 3 | LLM APIs: OpenAI/Anthropic | Build a RAG system on your own documentation |
| 4 | MLOps: MLflow, Docker for ML | Containerise a model training pipeline with experiment tracking |
| 5 | LangChain/LangGraph agents | Build an incident triage agent that calls your monitoring APIs |
| 6 | Production deployment: vLLM on K8s | Deploy an open-source LLM (Llama 3) on Kubernetes |

The DevOps Advantage

You already know Kubernetes, Docker, CI/CD, and cloud. That puts you ahead of the many ML engineers who cannot deploy their own models. MLOps + DevOps is one of the highest-demand intersections in tech right now.

🎯 Interview Questions

AI LEARNING · BEGINNER
What is the difference between AI, Machine Learning, and Deep Learning?
These are nested concepts.
  • AI (Artificial Intelligence) is the broadest term: any technique that enables machines to mimic human intelligence. It includes rule-based systems, expert systems, and modern ML.
  • Machine Learning is a subset of AI: systems that learn patterns from data without being explicitly programmed. You feed in data, the algorithm finds patterns and makes predictions. Examples: fraud detection, spam filtering, recommendation engines.
  • Deep Learning is a subset of ML that uses neural networks with many layers. It is particularly powerful for unstructured data (images, audio, text) and requires large datasets and significant compute. Examples: image recognition, speech-to-text, ChatGPT.
As of 2024, when people say AI they usually mean LLMs or generative AI. When DevOps engineers talk about AI, they increasingly mean MLOps: running ML models in production Kubernetes infrastructure.
AI LEARNING · ENGINEER
What is an LLM and how does it work?
A Large Language Model (LLM) is a neural network with billions of parameters trained to predict the next token in a sequence. Training has three stages:
  1. Pre-training: ingest trillions of tokens from the internet and books and adjust parameters to predict the next token better. Runs for weeks on thousands of GPUs (estimated $50M-100M for frontier models).
  2. Instruction tuning (SFT): fine-tune on human Q&A pairs so the model follows instructions rather than just autocompleting.
  3. RLHF (Reinforcement Learning from Human Feedback): humans rank responses, a reward model learns those preferences, and the LLM is trained to score higher. This makes Claude and GPT respond helpfully.
Why hallucination happens: LLMs predict plausible-sounding text based on patterns. They do not look facts up. For topics that are frequent in the training data they are usually reliable. For obscure facts, recent events, and specific numbers they can be confidently wrong. Mitigation: RAG grounds answers in real documents.
Key concepts: context window (how much text the model processes at once), temperature (0 = near-deterministic, 1+ = creative), embeddings (text as vectors, where similar meaning gives similar vectors).
AI LEARNING · ENGINEER
What is RAG and how would you implement it for a DevOps use case?
RAG (Retrieval-Augmented Generation) addresses a key LLM limitation: models only know their training data. RAG retrieves relevant documents at query time and adds them to the LLM context.
How it works:
  1. Indexing (offline): split documents into chunks, convert each to an embedding vector, store in a vector database (Pinecone, ChromaDB, pgvector).
  2. Retrieval (at query time): convert the user question to an embedding, find the most similar chunks via vector similarity search, retrieve the top-K chunks.
  3. Generation: send the retrieved chunks plus the question to the LLM, so the answer is grounded in your actual documents.
DevOps use cases:
  • Runbook assistant: ask "how do I handle a Kafka consumer lag spike?" and get answers from your actual runbooks.
  • Incident assistant: feed recent alerts and logs, get probable root cause analysis.
  • Code review helper: retrieve coding standards, check new code against them.
  • Architecture assistant: answer design questions using your existing ADRs.
Implementation: LangChain or LlamaIndex for orchestration, any vector DB for storage, OpenAI/Anthropic API for the LLM.
AI LEARNING · ENGINEER
What is MLOps and how does it relate to DevOps?
MLOps applies DevOps principles to machine learning. As a rule of thumb, training a model is about 20% of the work; production deployment, monitoring, and retraining make up the other 80%.
DevOps to MLOps mapping:
  • Code version control → model + dataset versioning (DVC, MLflow)
  • CI/CD pipeline → ML pipeline: data → train → evaluate → register → deploy
  • Container registry → model registry (MLflow, Hugging Face Hub)
  • Kubernetes deployment → model serving (KServe, Triton, vLLM)
  • Application monitoring → model monitoring: accuracy drift, data drift, bias detection
  • A/B testing → champion/challenger model testing
  • Rollback → model version rollback if performance drops
Key MLOps tools: MLflow (experiment tracking: log parameters and metrics, compare runs), Kubeflow (ML pipelines on Kubernetes, where each step is a container), vLLM (high-throughput LLM serving, often cited as 10-20x the throughput of naive Hugging Face Transformers inference), Evidently (data drift detection).
The DevOps advantage: if you already know Kubernetes, Docker, CI/CD, and cloud, you are ahead of the many ML engineers who cannot deploy their own models.
AI LEARNING · ARCHITECT
What are AI Agents and how are they used in DevOps?
AI Agents are LLM-based systems that take actions, not just answer questions. Components: LLM (brain), tools (functions it can call: APIs, code execution, database queries), memory (conversation context), planning loop (decide next action).
DevOps agent examples:
  • Incident Response Agent: receives a PagerDuty alert, queries the Prometheus API, checks recent deployments in ArgoCD, searches the runbook vector database, posts root cause analysis to Slack, creates a Jira ticket. Fully automated L1 triage.
  • Cost Optimisation Agent: scans AWS/Azure weekly, identifies idle resources, calculates savings, creates a Terraform PR to resize, applies after human approval.
  • Pipeline Debug Agent: when CI fails, the agent reads the error logs, identifies the failing test, and suggests a fix in a PR comment.
Frameworks: LangGraph (stateful multi-step workflows), CrewAI (multiple agents with roles that collaborate), AutoGen (Microsoft, conversational multi-agent).
Current reality: agents work well for well-defined tasks with clear tools. They struggle with ambiguous situations and judgment calls. Use them for repetitive structured automation, not for decisions requiring business context.
AI LEARNING · PRODUCTION
How do you monitor and evaluate LLM output quality in production?
Evaluating LLM outputs differs from traditional testing because there is often no single correct answer. Strategies:
  • Automated metrics for RAG: faithfulness (does the answer come from retrieved context?), answer relevancy (does it answer the question?), context precision (are retrieved chunks relevant?). Tools: RAGAS, DeepEval.
  • LLM-as-judge: use a strong LLM (GPT-4, Claude) to evaluate responses from a weaker model. Define a rubric: accuracy, completeness, conciseness, safety. Works for open-ended responses.
  • Human evaluation: for high-stakes applications, sample responses weekly and have domain experts rate them. Create a golden dataset of Q&A pairs and check for regression.
  • A/B testing: route 10% of traffic to the new model or prompt version, compare user satisfaction and task completion rate.
  • Production monitoring: track proxy metrics such as user thumbs-up/thumbs-down, conversation length (a short dead-end suggests an unhelpful answer), and follow-up clarification questions (which indicate an unclear response).
For DevOps runbook assistants: did the on-call engineer follow the suggested steps? Did the incident resolve? Outcome metrics are more valuable than any automated score.
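
The golden-dataset regression check mentioned in this answer can be sketched in a few lines. The version below scores each answer by the fraction of required keywords it contains, which is a deliberately crude proxy; tools such as RAGAS or an LLM judge would replace `keyword_score` in practice. The questions, keywords, threshold, and the `fake_assistant` stub are all assumptions made so the example runs offline.

```python
def keyword_score(answer, required_keywords):
    """Fraction of required keywords that appear in the answer (a crude proxy metric)."""
    text = answer.lower()
    hits = sum(1 for kw in required_keywords if kw.lower() in text)
    return hits / len(required_keywords)


def evaluate(answer_fn, golden_set, threshold=0.8):
    """Run every golden question through `answer_fn` and report the mean score."""
    scores = [keyword_score(answer_fn(q), kws) for q, kws in golden_set]
    mean = sum(scores) / len(scores)
    return {"mean_score": mean, "passed": mean >= threshold}


# Hypothetical golden dataset: (question, keywords a good answer must mention).
golden_set = [
    ("How do I handle consumer lag?", ["scale", "consumer group"]),
    ("How long are backups kept?", ["thirty days"]),
]


def fake_assistant(question):
    """Stand-in for the real LLM call so the sketch runs offline."""
    if "lag" in question:
        return "Scale the consumer group and check partition counts."
    return "Backups are kept for thirty days."


report = evaluate(fake_assistant, golden_set)
print(report)
```

Running this on every prompt or model change, and failing the pipeline when `passed` is false, gives LLM behaviour the same regression safety net that unit tests give code.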