Lesson 13
Prompt engineering, AI agents, and AI-augmented Scrum for modern development
Large Language Models are neural networks trained on vast amounts of text data. They predict the next token (word fragment) in a sequence, but this simple objective gives rise to remarkable capabilities: reasoning, code generation, analysis, and creative problem solving.
Modern LLMs are built on the Transformer architecture (introduced in the 2017 paper "Attention Is All You Need"). The key innovation is the attention mechanism: the model can look at all parts of the input simultaneously, understanding relationships between distant words.
| Concept | What It Means | Why It Matters |
|---|---|---|
| Token | A word fragment (roughly 3/4 of a word) | Models process and generate tokens, not words. "unbreakable" = ["un", "break", "able"] |
| Context Window | Maximum number of tokens the model can see at once | Determines how much code, documentation, or conversation the model can work with |
| Parameters | The learned weights of the neural network | More parameters generally means more capability (GPT-4 class: hundreds of billions) |
| Temperature | Controls randomness of output (0 = deterministic, 1 = creative) | Low for code generation, higher for brainstorming |
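Temperature can be made concrete with a small sketch: the model's raw scores (logits) are divided by the temperature before the softmax, so low values sharpen the distribution toward the top token and high values flatten it. This is an illustrative sketch of the mechanism, not any particular model's sampler:

```typescript
// Sample a next-token index from raw logits at a given temperature.
// temperature = 0 falls back to greedy (deterministic) decoding.
function sampleToken(logits: number[], temperature: number): number {
  if (temperature === 0) {
    return logits.indexOf(Math.max(...logits)); // greedy: pick the highest logit
  }
  const scaled = logits.map((l) => l / temperature); // lower T -> sharper distribution
  const max = Math.max(...scaled);
  const exps = scaled.map((l) => Math.exp(l - max)); // subtract max for numeric stability
  const total = exps.reduce((a, b) => a + b, 0);
  const probs = exps.map((e) => e / total); // softmax
  let r = Math.random();
  for (let i = 0; i < probs.length; i++) {
    r -= probs[i];
    if (r < 0) return i;
  }
  return probs.length - 1;
}
```

This is why the table recommends low temperature for code generation: sharpening the distribution makes the model reliably pick its most likely (usually most correct) tokens.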
Prompt engineering is the art and science of communicating effectively with LLMs. The quality of your output is directly proportional to the quality of your prompt.
Be as specific and explicit as you would be when writing a detailed specification for a junior developer who has never seen your codebase.
Vague prompts get vague answers. Specific prompts get precise, actionable output.
// BAD: Vague
"Write a function to process data"
// GOOD: Specific
"Write a TypeScript function called `parseUserCSV` that:
- Takes a CSV string as input
- Returns an array of User objects { name: string, email: string, role: 'admin' | 'user' }
- Skips the header row
- Throws a ValidationError if any email is invalid
- Uses no external libraries"
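A model given the specification above might produce something like the following sketch. The email regex, the role check, and the error messages are illustrative assumptions; only the behavior named in the prompt is fixed:

```typescript
type Role = 'admin' | 'user';
interface User { name: string; email: string; role: Role; }

class ValidationError extends Error {}

// Parse a CSV string into User objects, per the specification above.
function parseUserCSV(csv: string): User[] {
  const emailRe = /^[^\s@]+@[^\s@]+\.[^\s@]+$/; // deliberately simple (assumption)
  return csv
    .trim()
    .split('\n')
    .slice(1) // skip the header row
    .filter((line) => line.trim() !== '')
    .map((line) => {
      const [name, email, role] = line.split(',').map((f) => f.trim());
      if (!name || !email || !role) {
        throw new ValidationError(`Malformed row: ${line}`);
      }
      if (!emailRe.test(email)) {
        throw new ValidationError(`Invalid email: ${email}`);
      }
      if (role !== 'admin' && role !== 'user') {
        throw new ValidationError(`Invalid role: ${role}`);
      }
      return { name, email, role };
    });
}
```

Notice how every bullet in the prompt maps to a visible decision in the code; that traceability is exactly what a vague prompt forfeits.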
Tell the model what it is working with: the tech stack, existing patterns, and boundaries.
// Context block example
"You are working on a Node.js 20 / TypeScript project using:
- NestJS for the API layer
- TypeORM for database access
- PostgreSQL as the database
- Jest for testing
The project follows Clean Architecture with these layers:
domain/ - entities, value objects (no dependencies)
application/ - use cases (depends on domain)
infrastructure/ - DB, HTTP (depends on application)
Follow the existing patterns in the codebase."
Tell the model exactly what format you want the response in.
// Request structured output
"Analyze this function for issues. Return your findings as:
## Findings
For each issue:
- **Severity**: S0/S1/S2/S3
- **Line**: line number
- **Issue**: description
- **Fix**: suggested code change
## Summary
- Total findings: N
- Blocks merge: yes/no"
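Structured output pays off when a program consumes the response. As an illustrative sketch, a CI step could parse the Summary block above to decide whether to block a merge (the `blocksMerge` helper is hypothetical, not part of any tool):

```typescript
// Extract the "Blocks merge: yes/no" decision from a structured review response.
function blocksMerge(review: string): boolean {
  const match = review.match(/Blocks merge:\s*(yes|no)/i);
  if (!match) {
    throw new Error('Response missing "Blocks merge" line');
  }
  return match[1].toLowerCase() === 'yes';
}
```

If the model had answered in free prose, this check would need fragile heuristics; the fixed format reduces it to one regular expression.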
Ask the model to think step by step. This dramatically improves reasoning accuracy.
// Chain of thought
"Before implementing, think through:
1. What are the edge cases?
2. What could go wrong?
3. What is the simplest correct solution?
4. How would you test this?
Then implement the solution."
// Even simpler:
"Think step by step before answering."
Chain of thought forces the model to show its reasoning, which tends to improve accuracy on multi-step problems: errors in intermediate steps become visible and can be corrected before they propagate into the final answer.
Show the model what good output looks like by providing 1-3 examples.
// Few-shot prompting
"Convert user stories to test cases.
Example input:
'As a user, I can reset my password via email'
Example output:
- test: should send reset email when valid email provided
- test: should return 404 when email not found
- test: should rate-limit to 3 reset requests per hour
- test: should expire reset token after 24 hours
Now convert this story:
'As a user, I can upload a profile avatar'"
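Few-shot prompts like the one above are usually assembled mechanically from example pairs. A minimal sketch, where the `fewShotPrompt` helper and its exact formatting are assumptions rather than a standard API:

```typescript
interface Example { input: string; output: string; }

// Build a few-shot prompt: instruction, then worked examples, then the query.
function fewShotPrompt(instruction: string, examples: Example[], query: string): string {
  const shots = examples
    .map((e) => `Example input:\n${e.input}\nExample output:\n${e.output}`)
    .join('\n\n');
  return `${instruction}\n\n${shots}\n\nNow convert this:\n${query}`;
}
```

Keeping examples as data rather than inline text makes it easy to version them and to swap in better examples as you learn what the model mimics well.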
LLMs distinguish between two types of input:
| Type | Purpose | Persistence | Example |
|---|---|---|---|
| System Prompt | Sets the model's identity, role, and constraints | Persists across the entire conversation | "You are a senior TypeScript developer. Follow SOLID principles. Never use any." |
| User Prompt | The specific task or question for this turn | This message only | "Implement the UserService class with CRUD methods." |
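Most chat APIs express this distinction as a messages array with explicit roles. A sketch assuming an OpenAI-style message shape (`system`, `user`, `assistant`):

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// The system prompt goes first and persists; each user prompt is one turn.
function buildMessages(systemPrompt: string, userPrompt: string): ChatMessage[] {
  return [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: userPrompt },
  ];
}
```

On later turns, prior user and assistant messages are appended after the system message, which is how the system prompt's constraints persist across the whole conversation.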
An AI agent goes beyond simple prompt-response. It is an autonomous system that uses tools, makes decisions, and loops until a task is complete.
Agent = LLM + Tools + Memory + Loop
Without tools, an LLM can only generate text. With tools, it can read files, write code, run commands, search the web, and interact with any system, then decide what to do next based on results.
The power of agents comes from their tools: reading and writing files, running commands, searching the web, and calling external APIs. Agents combine these tools using one of several reasoning patterns:
| Pattern | How It Works | Best For |
|---|---|---|
| ReAct | Reason → Act → Observe → Repeat | General-purpose tasks, debugging, exploration |
| Plan-and-Execute | Create full plan first → Execute steps → Verify | Complex multi-step tasks, feature implementation |
| Reflection | Generate output → Critique own output → Improve → Repeat | Code review, writing, quality improvement |
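The ReAct row above can be sketched as a loop: the model reasons, optionally calls a tool, observes the result, and repeats until it produces a final answer. `ModelStep`, `callModel`, and the tool registry below are hypothetical stand-ins for a real LLM client:

```typescript
interface ToolCall { tool: string; input: string; }
interface ModelStep { thought: string; action?: ToolCall; finalAnswer?: string; }

type Tool = (input: string) => string;
type Model = (transcript: string) => ModelStep;

// Minimal ReAct-style agent loop: Reason -> Act -> Observe -> Repeat.
function runAgent(
  task: string,
  callModel: Model,
  tools: Record<string, Tool>,
  maxSteps = 10,
): string {
  let transcript = `Task: ${task}`;
  for (let step = 0; step < maxSteps; step++) {
    const result = callModel(transcript); // Reason
    if (result.finalAnswer !== undefined) return result.finalAnswer;
    if (!result.action) throw new Error('Model produced neither action nor answer');
    const tool = tools[result.action.tool];
    if (!tool) throw new Error(`Unknown tool: ${result.action.tool}`);
    const observation = tool(result.action.input); // Act
    transcript += `\nThought: ${result.thought}\nObservation: ${observation}`; // Observe
  }
  throw new Error('Step limit reached');
}
```

The step limit is the safety valve every real agent needs: without it, a confused model can loop forever, burning tokens on observations it never converts into an answer.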
The most powerful application of AI agents in software development is augmenting the Scrum process itself. Every Scrum role can be assisted, or even fully simulated, by a specialized AI agent.
The AI Product Owner helps structure requirements and manage the backlog.
// Example: AI writes a user story
@po write story: "Users need to reset their password"
→ **US-042: Password Reset via Email**
As a registered user
I want to reset my password via email
So that I can regain access when I forget my credentials
**Acceptance Criteria:**
- Given a valid email, system sends reset link within 30s
- Given an invalid email, system shows generic message (no info leak)
- Reset token expires after 24 hours
- User cannot reuse last 5 passwords
**Story Points:** 5 | **Priority:** High
The AI Scrum Master facilitates ceremonies and tracks team health metrics.
The AI Developer implements features using disciplined engineering practices.
The AI QA engineer designs test strategies and validates coverage.
The AI Architect (@arch) designs systems; the AI Tech Lead (@lead) enforces code quality. They work at different levels: @arch makes strategic, structural decisions, while @lead makes tactical, code-level ones.
The AI-augmented workflow supports two operating modes:
A single developer works with AI agents that play every role. The developer provides intent and validation; the AI handles planning, implementation, testing, and review.
Human team members use AI agents to accelerate their work. The AI handles repetitive tasks; humans focus on creativity, judgment, and collaboration.
In team mode, for example, a developer:
- uses @po to draft stories, then refines and prioritizes them
- uses @dev for TDD and implementation, then reviews the code
- uses @lead to assist with code review, but makes the final call

The workflow uses structured prompt templates to give each AI agent its role, context, and constraints. These prompts are not written ad-hoc: they are versioned, tested, and optimized.
Prompts are loaded in layers, not all at once. This respects context window limits and keeps the AI focused.
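One way to sketch layered loading: assemble prompt layers in priority order and stop once the token budget is spent. The `assemblePrompt` helper and its token accounting are illustrative assumptions, not a real loader:

```typescript
interface PromptLayer { name: string; text: string; tokens: number; }

// Concatenate layers in priority order until the context budget is exhausted.
function assemblePrompt(layers: PromptLayer[], budget: number): string {
  const chosen: string[] = [];
  let used = 0;
  for (const layer of layers) {
    if (used + layer.tokens > budget) break; // respect the context window
    chosen.push(layer.text);
    used += layer.tokens;
  }
  return chosen.join('\n\n');
}
```

Ordering matters: the base identity loads first and is never dropped, while task-specific context is the first thing sacrificed when the budget runs out.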
Each agent in the system has a specific identity, set of tools, and protocol it follows:
| Agent | Role | Primary Tasks | Key Tools |
|---|---|---|---|
| @po | Product Owner | Stories, backlog, priorities, acceptance criteria | Memory read/write, backlog management |
| @sm | Scrum Master | Sprint planning, velocity, retrospectives | Sprint tracking, metrics |
| @arch | Architect | System structure, components, boundaries, patterns, infrastructure decisions, ADRs | Codebase search, diagram generation |
| @lead | Tech Lead | Code quality (SOLID/Clean Code), PR reviews, tactical decisions, unblocking devs | Code analysis, file reading |
| @dev | Developer | TDD implementation, refactoring, git operations | File write, command execution, git |
| @qa | QA Engineer | Test strategy, coverage analysis, acceptance testing | Test runner, code analysis |
| @sec | Security | Security audit, vulnerability scanning, hardening | SAST tools, dependency audit |
What is an AI Agent?
In AI-augmented Scrum, who makes priority decisions?