Chapter 24

Interview Question Bank

Learning Objective

Practice architect-level answers across GenAI concepts, RAG, agents, security, DevOps, Docker, and Kubernetes.

Core Interview Questions

Q1: What is GenAI?

A: GenAI creates new content such as text, code, summaries, and structured outputs. In enterprise projects, it is useful for language-heavy tasks such as summarization, extraction, classification, and reasoning over documents.

Q2: What is RAG?

A: RAG retrieves relevant enterprise knowledge and provides it to the LLM so responses are grounded in trusted sources.

Q3: Why LangGraph over LangChain?

A: LangChain is good for chains and RAG components. LangGraph is better for stateful, branching, multi-agent workflows with memory and human review.

Q4: How do you prevent prompt injection?

A: Use defense in depth: input validation, system prompt boundaries, least privilege, grounded RAG, output validation, monitoring, and human review.

Q5: What is Docker?

A: Docker packages applications and dependencies into portable containers.

Q6: What is Kubernetes?

A: Kubernetes orchestrates containers by handling deployment, scaling, networking, self-healing, and rolling updates.

Q7: What is CI/CD?

A: CI/CD automates build, test, validation, packaging, and deployment so releases are consistent and controlled.

Q8: How do you manage context windows?

A: Use token budgeting, chunking, RAG retrieval, summarization, and avoid sending unnecessary history to the model.

Q9: How do you reduce hallucinations?

A: Use RAG, grounding, strict prompts, citations, validation, confidence scoring, and human review.

Q10: How do you estimate GenAI cost?

A: Estimate requests, input tokens, output tokens, model price, embeddings, infrastructure, vector DB, monitoring, and human review cost.

Scenario Questions

Scenario 1: A healthcare assistant provides answers not found in policy documents.

Best answer: Improve RAG retrieval, force answers to cite retrieved sources, add validation, and route uncertain cases to review.

Scenario 2: Model cost suddenly doubles.

Best answer: Check token usage, prompt changes, retrieval chunk count, output length, traffic volume, caching, and model routing.

Scenario 3: A document contains instructions to ignore system prompts.

Best answer: Treat document text as untrusted data, detect injection patterns, isolate instructions from content, restrict tools, and validate outputs.

Architect Takeaway

Interviewers are not looking for definitions only. They want to know if you can design safe, scalable, cost-aware production systems.

Ch 23: Production Monitoring and Observability

Ch 25: Final Quiz