Dashboard SWE track
For tech roles: SWE · DevOps · PM

Refactor your tech career
for the AI era.

Personalized AI-fluency curriculum. A mentor that knows your progress. Real capstone projects you can put on your resume. The fastest path from "I should learn AI" to "I ship AI in my role."

10-minute skill check
Free first module
Cancel anytime
Why now

In May 2026, AI fluency is the dividing line.

The engineer who can ship a working agent in two days is being paid more than the one who can't — sometimes 40% more. The platform engineer who built the company's AI gateway is now running the AI platform team. The PM who can credibly spec an eval is the one leading the AI initiative.

The gap between AI-fluent and AI-aware isn't closing. It's compounding.

But the path to that fluency is broken. Generic AI courses teach a little of everything, badly. Tutorials and blog posts are everywhere, but the half-life of AI best practice has dropped to weeks. What was sharp in February is mid by May.

Refactor is built around the only thing that actually matters: becoming the most AI-fluent person at your company in the seat you already hold.

Not a researcher. Not a generalist. The person whose role is exactly what yours is — who can build and ship and review AI work in your stack, in your meetings, in your code reviews.

We pick your role. We meet your skills where they actually are. We update the curriculum monthly so what you learn this week is what shipped this week.

You don't refactor your career once. You refactor it continuously. Refactor is the platform that makes that possible.

Pick your track

Built for the role you're already in

May 2026

Tech work doesn't look like it did 18 months ago.

If your day-to-day still feels familiar, it's because the change is happening around you faster than your habits are catching up. These are the shifts our curriculum is calibrated to.

Code
Reviewed before it's typed.
Cursor, Claude Code, and Windsurf agents now write the first draft of most PRs. Senior engineers spend their day reviewing AI output, not hand-rolling boilerplate.
Integration
MCP is the connective tissue.
Model Context Protocol has become the standard way LLMs talk to tools, databases, and APIs. If you're building an agent in 2026 and not using MCP, you're rewriting infrastructure.
Architecture
Agents replaced point automations.
Where teams used to wire a single API call into a workflow, they now ship multi-step agents with memory, tool use, and supervisors. The new system design interview is agent design.
Reasoning
Models split tasks by depth.
Reasoning models (o-series, Claude reasoning) handle planning and complex problem-solving; fast non-reasoning models do the rest. Knowing when to switch is the new performance tuning.
Quality
Evals are the new tests.
Eval-driven development has replaced "ship it and watch logs." Golden datasets, LLM-as-judge harnesses, and regression evals on every prompt change are now table stakes.
Compliance
EU AI Act is in force.
High-risk AI systems require risk classification, documentation, and audit trails. Whether you ship to the EU or not, the framework is becoming the global default.
What we're betting on for 2026–2028

Browser-using and computer-using agents become mainstream. Voice becomes the primary interface for many apps. Persistent memory and personalization get serious. Small fine-tuned models eat the long tail of inference. AI cost engineering becomes a named discipline. Refactor's curriculum updates monthly so you're learning where the puck is going, not where it was.

What makes Refactor different

An AI mentor that actually knows you.

Knows your skill gaps
Skips what you've mastered. Goes deeper where you struggle.
Reviews your real code
Drop in your repo. Get role-specific feedback against an evals rubric.
Runs your interviews
Mock interviews calibrated to your target role and seniority.
AI
Mentor
Knows your progress · Module 3
You marked tool calling as "shaky" in your assessment. Want to try a tougher example?
Yes — but make it relevant to my work on payments.
Got it. I'll generate a payment-router agent example with three tools and edge cases for refunds.
3.2k
Engineers refactoring
847
Capstones shipped
94%
Finish their first module
14d
Median time to first ship
Stories

From "I should learn AI" to "I'm running the AI initiative."

A small selection of what's happened in the last six months.

"Six weeks of Refactor and I shipped my first agent into production. I'm now the AI lead on the payments team. The mentor caught gaps I didn't know I had — eval coverage was the thing that pushed me from prototype to production."

MO
Marcus O.
Senior Backend Engineer · Fintech

"The mock interviews are uncomfortably accurate. I got feedback that mirrored exactly what I heard from a real interviewer at my next-level job a week later. The platform-engineering track is the only curriculum I've found that takes the infra side of AI seriously."

PR
Priya R.
Platform Engineer · SaaS

"I went from being the PM that asks questions in AI meetings to the one running them. Refactor doesn't teach AI in the abstract — it teaches AI as something you ship through a real product process, with real evals and real risk classification."

DC
David C.
Senior PM · Data infrastructure
Featured in: Hacker News · The Pragmatic Engineer · Lenny's Newsletter · Latent Space
Pricing

Simple, monthly, cancel anytime.

Start free. Pay when the curriculum starts paying you back.

Free
Try the platform, find your gaps.
£0/month
  • Adaptive skill assessment
  • First module of any track
  • Limited AI mentor (10 messages/day)
  • Community access
Start free
Most popular
Pro
Everything you need to ship.
£39/month
  • All tracks (SWE, DevOps, PM)
  • Unlimited AI mentor
  • Mock interviews (text + voice)
  • Capstone projects
  • Public portfolio
  • Monthly curriculum updates
  • Priority support
Start 14-day free trial
Team
For companies upskilling teams.
£149/learner/mo
  • Everything in Pro
  • Admin dashboard
  • Curriculum calibrated to your JD
  • Cohort tooling
  • Skills reporting & exports
  • SSO + SCIM
  • Dedicated CSM
Talk to sales

Annual plans save 20%. Education and non-profit discounts available.

FAQ

Common questions

Who is Refactor for?

Mid- to senior-level tech professionals — engineers, platform/DevOps, PMs — who already know their craft and need to layer AI fluency on top. If you're brand new to tech entirely, this isn't the right starting point. If you've shipped real software and feel like AI is slipping past you, you're exactly who we built it for.

How is this different from a Coursera or Udemy course?

Three things. First, it's role-specific — engineers, platform people, and PMs all see different curricula because the AI fluency they need is different. Second, the AI mentor knows your code, your skill gaps, and your career goal — it's not a chatbot. Third, the curriculum is updated monthly because in AI, anything older than six months is outdated.

How current is the curriculum?

Updated monthly. The current version covers MCP, reasoning models, agent orchestration, the EU AI Act, eval-driven development, prompt caching, and AI cost engineering. We publish a public changelog so you can see what changed and why.

What if I'm a complete beginner with AI?

Module 1 of every track assumes zero AI background — it covers how modern LLMs actually work, capabilities, limits, and where the field is. The skill assessment will route you correctly: if you've already mastered the foundations, you skip them. If not, you start there.

Can my company sponsor or buy this for me?

Yes. The Team plan includes admin dashboards, skills reporting, and the option to calibrate the curriculum to your specific job descriptions. We have a one-pager you can forward to L&D — request it here.

Will this actually help me get hired?

The capstone is a real, shippable project that becomes the centerpiece of your portfolio. Mock interviews are calibrated against current role expectations at top companies. We don't make hiring guarantees — that's not a thing — but the structure is built so the work you do here is the work you can show in interviews.

Do you cover [specific topic, like fine-tuning, voice agents, EU AI Act]?

Probably yes. Fine-tuning, distillation, voice agents, computer use, MCP, EU AI Act, NIST AI RMF, prompt injection defense, and AI cost engineering are all covered in role-appropriate depth. See the full curriculum changelog for what shipped this month.

Refactor takes 10 minutes to start.

Find your gaps. Build something real. Ship it on your resume.

AI fluency check-in
Question 7 of 12 · ~4 min left
RAG basics · Concept · Multiple choice

You're building a RAG system over 50,000 internal docs. Retrieval keeps surfacing irrelevant chunks even when queries are clearly worded.

Which is most likely the first thing to investigate?

AI
Why we're asking this: your earlier answers showed strong intuition for embeddings but uncertainty about retrieval debugging. This is one of the highest-leverage skills for production RAG.
Saturday, May 9

Welcome back, Neha.

14-day streak
Interview readiness
62
+8 this week
Knowledge · 71
Hands-on · 58
Communication · 56
Curriculum progress
38%
3 of 8 modules
On pace for capstone in 5.5 weeks
This week
4.2h
of 5h goal
MTWTFSS
Pick up where you left off
Tool calling with structured outputs
Module 3 · Lesson 4 of 8
You're 60% through this module. Two short lessons and one practice ahead — about 45 minutes.
Now
Tool calling
In progress
Next
Building a router agent
35 min
After
Module wrap + practice
20 min
AI
Mentor
Ready
Want to refresh tool calling before the next lesson? I can run a quick 3-question recap.
Skill map
Updated after assessment
LLM fundamentals · Strong
AI-assisted coding · Strong
Tool calling & agents · Building
RAG & retrieval · Building
Evals & observability · Gap
Production deployment · Gap
Recent activity
Completed Lesson 3.3 — Streaming responses
2 hours ago
Earned badge LLM API Builder
Yesterday
Mock interview Technical · 68/100
3 days ago
Practice Build a /chat endpoint · all tests passed
4 days ago

Curriculum

Updated monthly · Calibrated to May 2026 reality + the bets we're making on 2027–2028.

~50 hours · 8 modules + capstone
Software Engineer track

From AI-assisted coding to building, evaluating, and shipping agentic systems in production. Calibrated for senior+ engineers who already ship code and need to layer AI fluency on top. Updated for May 2026 — covers MCP, reasoning models, agent orchestration, eval-driven development, and the production patterns the top AI-native companies are using right now.

Module 1 · Complete
AI foundations for engineers
3.5h · 6 lessons
LLMs, transformers, tokens, context windows, embeddings — what models can and can't do.
LLMs · Tokens · Embeddings
Module 2 · Complete
AI-assisted coding
4h · 7 lessons
Copilot, Cursor, Claude Code. Prompt patterns for code gen, review, refactor. When to trust output.
Copilot · Code review · Refactor
3
Module 3 · In progress
Building with LLM APIs
5.5h · 8 lessons · 60% done
OpenAI/Anthropic APIs, structured outputs, tool/function calling, streaming, error handling.
Lesson 1 — Calling the API
Lesson 2 — Structured outputs
Lesson 3 — Streaming
Lesson 4 — Tool calling
Lesson 5 — Router agents
Lesson 6 — Error handling
4
Module 4 · Up next
RAG & vector search
6h · 9 lessons
Embeddings, chunking strategies, vector DBs, hybrid search, retrieval evaluation.
Embeddings · pgvector · Hybrid search · Re-rankers
Module 5 · Locked
Agentic systems
7h · 10 lessons
Single-agent and multi-agent patterns. Model Context Protocol (MCP) deep dive. Tool design and ergonomics. State and memory management. Human-in-the-loop checkpoints. Computer use, browser agents, and voice agents.
MCP · Agent loops · Tool design · Computer use · Voice agents
Module 6 · Locked
Evals & reliability
5.5h · 8 lessons
Eval-driven development. Building golden datasets. LLM-as-judge patterns. Regression testing prompts. Production observability (Langfuse, Braintrust, Helicone). Hallucination detection. Drift monitoring. A/B testing AI features.
Evals · LLM-as-judge · Langfuse · Drift
Module 7 · Locked
Production AI
6h · 8 lessons
Cost engineering & token economics. Caching strategies (prompt caching, semantic caching). Latency optimization. Prompt injection defense. Output validation. Fine-tuning vs RAG vs prompting tradeoffs. Distillation & small model deployment. Model gateways.
Cost · Caching · Prompt injection · Distillation
Module 8 · Locked
AI-native software architecture
5h · 7 lessons
Designing systems AI-first vs bolt-on. Persistent memory and personalization. Conversational UX vs traditional UX. Failure modes & graceful degradation. Human-AI collaboration patterns. Reading the next 3 years — what to bet on, what to avoid.
Architecture · Memory · AI UX · Future bets
Final capstone
Ship an AI feature into a real codebase
10–14 days
Multi-week project, AI mentor pair-programs with you, employer-style rubric, public artifact for your portfolio.
DevOps / Platform Engineer track

Build the infrastructure that lets your company ship AI safely, cheaply, and reliably. Covers the full LLMOps stack — gateways, observability, FinOps for AI, prompt injection defense, EU AI Act technical compliance, and AI for operations itself. Calibrated for senior platform engineers, SREs, and infra leads who need to own AI infrastructure.

~52 hours · 8 modules + capstone
1
Module 1
AI infrastructure fundamentals
5h · 7 lessons

Inference vs training economics. GPU landscape today (H100/H200/B200/MI300). Model serving primitives. Inference engines (vLLM, TGI, TensorRT-LLM, SGLang). Throughput vs latency tradeoffs. Quantization (FP8, INT4, ternary).

GPUs · vLLM · Quantization · Inference
2
Module 2
Deploying models at scale
6h · 8 lessons

Cloud platforms compared (Bedrock, Vertex, Together, Fireworks, Replicate). Self-hosting tradeoffs. Multi-region deployment. Auto-scaling for spiky AI traffic. Cold-start mitigation. Edge deployment for latency-critical apps.

Bedrock · Multi-region · Auto-scaling · Edge
3
Module 3
The LLMOps stack
6.5h · 9 lessons

Model gateways (LiteLLM, Portkey, OpenRouter). Prompt management & versioning. Vector DB ops at scale (pgvector, Turbopuffer, Pinecone). Observability platforms (Langfuse, Helicone, Braintrust). Eval pipelines as CI. Prompt CI/CD. Secrets management for AI.

Gateways · Langfuse · Prompt versioning · CI/CD
4
Module 4
Cost engineering for AI
5h · 7 lessons

Token economics in 2026. Prompt caching strategies. Semantic caching. Model routing (cheap → expensive). Batch processing & async pipelines. Cost monitoring & alerts. Budget enforcement. FinOps for AI is becoming a named discipline — be the person who runs it.

Token economics · Caching · FinOps · Routing
5
Module 5
Observability & reliability
6h · 8 lessons

OpenTelemetry GenAI & OpenLLMetry. Tracing AI calls end-to-end. Eval-as-monitor patterns. Drift detection in production. Incident response for AI systems. SLOs for non-deterministic systems. Postmortems with LLM-specific factors. Chaos engineering for agents.

OTel GenAI · SLOs · Drift · Incident response
6
Module 6
AI security & compliance
6h · 8 lessons

OWASP LLM Top 10 (2026 update). Prompt injection defense at the gateway. PII redaction & DLP for AI traffic. Secrets in prompts. EU AI Act technical requirements (now in force). ISO 42001 / NIST AI RMF implementation. Audit trails for AI decisions. Red-teaming AI systems.

OWASP LLM · EU AI Act · DLP · Red team
7
Module 7
AI for operations
5h · 7 lessons

LLM-powered log analysis. Incident summarization with AI. Runbook agents. AI-assisted on-call rotations. ChatOps with AI. Auto-remediation patterns (and when not to). Code review agents for ops PRs. The platform team itself becomes AI-augmented.

LogAI · Runbook agents · Auto-remediation · ChatOps
8
Module 8
Building AI platforms
6h · 8 lessons

Internal AI gateway architecture. Self-service AI for product teams. Governance & quotas. Multi-tenant isolation. Cost attribution by team/product. Platform metrics & adoption. Evangelizing internal platforms — being the person who builds the company's AI infrastructure is a career-defining move.

Internal platforms · Self-service · Governance · Adoption
Final capstone
Build an internal AI gateway
10–14 days
Auth, rate limits, cost controls, prompt injection defense at the edge, full observability, and a self-service onboarding flow for product teams. Real infrastructure your company could deploy.
Product Manager track

Become the PM who can credibly scope, ship, and measure AI features. Covers AI literacy without hand-waving, designing AI experiences that earn user trust, eval-driven product development, EU AI Act compliance, and AI-native product strategy. Calibrated for product managers at growth-stage and enterprise companies who need to be the AI lead in the room.

~45 hours · 8 modules + capstone
1
Module 1
AI literacy for PMs
4h · 7 lessons

How LLMs actually work — no hand-waving. Capability map for May 2026. Reasoning vs non-reasoning models (when to use which). Cost-latency-quality triangle. Open vs closed model decisions. Multimodal capabilities & UX. Where AI is still bad (and getting better fast).

LLMs · Capabilities · Tradeoffs · Reasoning
2
Module 2
AI product discovery
5h · 8 lessons

Identifying real AI-fit problems vs AI feature theater. Opportunity sizing for AI features. User research with AI in the loop. Build vs buy vs orchestrate. Distinguishing AI-native problems from AI-augmented ones. Telling demos apart from products.

Discovery · Sizing · Build/buy · User research
3
Module 3
Designing AI experiences
5.5h · 8 lessons

UX patterns that work in 2026: copilots, autocomplete, agents, ambient AI. Trust & transparency. Designing for failure modes (hallucinations, edge cases). Confidence indicators. Human-AI handoffs. Voice and multimodal UX. Personalization without creepiness.

AI UX · Trust · Voice · Failure modes
4
Module 4
Working with AI teams
5h · 8 lessons

Writing AI feature specs that engineers can actually build. Eval-driven product development. Dataset curation & labeling as a PM responsibility. A/B testing AI features. Shipping iteratively (canary, staged rollout). Working with ML engineers vs AI engineers.

Specs · Datasets · A/B testing · Rollout
5
Module 5
AI metrics & evaluation
5.5h · 8 lessons

Offline evals vs online metrics. Building golden datasets as a PM. LLM-as-judge for product metrics. Quality gates for shipping AI. Detecting drift in production. North-star metrics for AI products. The cost-quality-latency triangle in product decisions.

Evals · Golden sets · Metrics · Quality gates
6
Module 6
AI strategy & build vs buy
5h · 7 lessons

Moats in the AI era (data, distribution, workflow). Choosing model providers. Open vs closed source decisions. Fine-tuning decisions for PMs. Vertical vs horizontal AI products. Pricing AI features. Defensibility analysis. Competitive intelligence with AI.

Strategy · Pricing · Moats · Build/buy
7
Module 7
Responsible AI & compliance
5h · 7 lessons

EU AI Act for PMs (in force as of 2026). Risk classification for AI features. Bias & fairness audits. Privacy by design. AI red-teaming as PM responsibility. Transparency requirements. Customer trust & disclosure. The PM owns the risk register, not the legal team.

EU AI Act · Risk · Transparency · Red-teaming
8
Module 8
AI-native product strategy
4.5h · 7 lessons

2026–2030 product landscape. Agentic vs feature-based products. Persistent memory & context. Voice as primary interface. AI organizations: structuring teams. Career strategy for PMs in the AI era. Reading the next 3 years and betting accordingly.

Strategy · Agents · Career · Future
Final capstone
Spec a real AI feature end-to-end
10–14 days
Full PRD + eval plan + GTM brief for an AI feature in a real product, with risk classification under EU AI Act, success metrics, eval methodology, and rollout plan. Reviewed against an employer-style rubric.
Module 3 · Building with LLM APIs
Lesson 4 — Tool calling with structured outputs
Lesson 4 of 8
Concept

When to use tool calling vs. structured outputs

Both let you constrain what the model returns, but they solve different problems. Structured outputs force a JSON shape. Tool calling lets the model decide when to invoke a function and which one.

Use tool calling when the model needs to take action (call a payment API, query a database, hit a search index). Use structured outputs when you just need clean data extraction.

router_agent.ts
// Define tools the model can call
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools = [{
  name: "refund_payment",
  description: "Issue a refund for a charge",
  input_schema: {
    type: "object",
    properties: {
      charge_id: { type: "string" },
      amount_cents: { type: "integer" }
    },
    required: ["charge_id"]
  }
}];

// max_tokens is required by the Messages API; `query` is the user's message
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  tools,
  messages: [{ role: "user", content: query }]
});
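For contrast, here is the extraction side of the same idea, as a sketch rather than the lesson's canonical code: forcing a single tool turns tool calling into structured output, because the tool's schema-shaped arguments *are* your JSON. The `record_invoice` tool, its fields, and the sample email are hypothetical.

```typescript
// Structured extraction via a forced tool call (sketch).
// The "record_invoice" tool and its fields are illustrative.
const invoiceTool = {
  name: "record_invoice",
  description: "Record the fields extracted from an invoice email",
  input_schema: {
    type: "object",
    properties: {
      vendor: { type: "string" },
      total_cents: { type: "integer" },
      due_date: { type: "string" }
    },
    required: ["vendor", "total_cents"]
  }
};

// Request body in the Anthropic Messages API shape: tool_choice of
// { type: "tool", name } forces the model to answer via this one tool,
// so the reply is schema-shaped arguments, not prose.
const request = {
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  tools: [invoiceTool],
  tool_choice: { type: "tool", name: "record_invoice" },
  messages: [{ role: "user", content: "Invoice from Acme: $120, due June 1" }]
};

// Pull the structured result out of a response: find the tool_use
// content block and return its input, or null if none exists.
function extractToolInput(response: { content: Array<{ type: string; input?: any }> }) {
  const call = response.content.find((b) => b.type === "tool_use");
  return call ? call.input : null;
}
```

The same `extractToolInput` helper works for the routing case above too; the only difference is whether `tool_choice` leaves the decision to the model.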
Common pitfall. Always validate tool inputs server-side. Models can produce schema-valid arguments that still violate your business rules.
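One way to act on that pitfall, as a minimal sketch: re-check every argument the model sent against your own records before executing. The 30-day window, the `Charge` shape, and the in-memory lookup are illustrative assumptions, not the lesson's spec.

```typescript
// Server-side guard for a refund tool (sketch). Schema-valid is not
// business-valid: the model can send a real-looking charge_id that
// doesn't exist, or a refund outside your policy window.
type Charge = { id: string; created_at: string };

const REFUND_WINDOW_DAYS = 30; // illustrative policy, not the lesson's spec

function validateRefundInput(
  input: { charge_id?: unknown; amount_cents?: unknown },
  charges: Map<string, Charge>,
  now: Date = new Date()
): { ok: true } | { ok: false; reason: string } {
  // Re-validate types even though the schema "guaranteed" them.
  if (typeof input.charge_id !== "string") {
    return { ok: false, reason: "missing charge_id" };
  }
  const charge = charges.get(input.charge_id);
  if (!charge) {
    return { ok: false, reason: "unknown charge" };
  }
  // Business rule the schema cannot express: refunds only within the window.
  const ageDays = (now.getTime() - new Date(charge.created_at).getTime()) / 86_400_000;
  if (ageDays > REFUND_WINDOW_DAYS) {
    return { ok: false, reason: "outside refund window" };
  }
  if (
    input.amount_cents !== undefined &&
    (!Number.isInteger(input.amount_cents) || (input.amount_cents as number) <= 0)
  ) {
    return { ok: false, reason: "invalid amount" };
  }
  return { ok: true };
}
```

Run this between receiving the tool call and executing it; reject with the `reason` so the model (or a human) can recover.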
AI
Mentor
Reading lesson 4
Quick check before you continue: when would you set tool_choice to "any" instead of "auto"?
When I always want a tool called — like a routing layer that has to pick something.
Right. Watch the cost — the model can't shortcut a "no tool needed" response. Want me to generate a payment-router example using your stack (TypeScript, Stripe)?
Yes please — include refund edge cases.
On it. Generating with three tools: issue_refund, flag_for_review, request_more_info. I'll add a test case where the user is requesting a refund 90 days post-charge.
Practice · Module 3
Build a refund-routing agent
Medium ~25 minutes
Problem

Refund-routing agent

Build a function that takes a customer message and routes it to one of three actions: issue_refund, flag_for_review, or request_more_info.

Requirements
  • Use Anthropic SDK with tool calling
  • Handle refunds within 30 days only
  • Flag suspicious patterns to human review
  • Pass all 5 test cases
Hint: set tool_choice: { type: "any" } to force a routing decision.
refund_router.ts TypeScript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools = [
  {
    name: "issue_refund",
    description: "Issue a full or partial refund",
    input_schema: {
      type: "object",
      properties: {
        charge_id: { type: "string" },
        reason: { type: "string" }
      },
      required: ["charge_id", "reason"]
    }
  },
  // TODO: add flag_for_review and request_more_info
];

export async function routeRefund(message: string) {
  // TODO: call client.messages.create with tools
}
Test cases
Refund under 30 days
Flag stale refund (90d)
Request info on missing charge_id
Multi-charge refund
Adversarial prompt
2 of 5 passing
AI
Mentor review
Your tool schema for issue_refund looks good. The failing test is because you're missing request_more_info entirely. Add it as a third tool and pass tool_choice: { type: "any" }.

Mock interview

Calibrated to your target role: Senior SWE · AI-fluent

Recording · Question 2 of 4 · 12:34
AI
Interviewer
You're designing a customer-support agent that has access to a knowledge base, an order-history database, and a refund tool. The team is worried about it making mistakes on refunds.

Walk me through how you'd architect this — and how you'd evaluate whether it's safe enough to ship.
NK
You · live transcript
"I'd start by separating the read-only tools — knowledge base and order history — from the write tool, the refund. The refund needs guardrails. I'd add a confirmation step where the model has to explain its reasoning before the refund executes, and I'd cap any single refund amount at, say, $200 without human approval. For evals, I'd build a golden set of 100 edge cases — late refunds, fraud-pattern triggers, multi-charge scenarios — and run them..."
02:14
Take your time. The interviewer waits up to 60s for your answer to land.
Live signals
Structure · Strong
Technical depth · Good
Trade-off awareness · Mid
Communication · Good
Coverage so far
Tool separation (read/write)
Human-in-the-loop
Eval golden set
Cost & latency budget
Prompt injection defense
Monitoring & rollback
Hint available
Strong start — but you haven't talked about prompt injection yet.
Final capstone · 9 days remaining

Ship a customer-support agent

Real codebase · Real evals · Real artifact for your portfolio

Milestones
1. Define use case & success criteria
Day 1–2
PRD + 5 example tickets + acceptance criteria
2. Build retrieval over knowledge base
Day 3–4
Embeddings, chunking strategy, basic eval
3. Add agent loop with tools
In progress · Day 5
Tool calling, refund routing, guardrails
60%
4. Build eval harness
Day 7–8
Golden set of 100 cases, regression tests
5. Deploy with observability
Day 9–10
Cost tracking, latency, prompt logging, alerts
6. Submit + AI review against rubric
Day 11–12
Final review, scorecard, portfolio publication
Files committed
main · 14 commits
src/agent.ts
+148 / -12 · 2h ago
src/tools/refund.ts
+87 / -3 · 2h ago
src/retrieval/index.ts
+212 / -0 · 1d ago
evals/golden_set.json
+1,420 / -0 · 2d ago
AI
Co-pilot
I noticed your refund.ts doesn't handle the 30-day window yet. Want me to draft the validation logic?
Rubric preview
Correctness · 22 / 25
Eval rigor · 12 / 25
Production readiness · — / 25
Communication · — / 25
Current score 34 / 100
Reviewer note. Your eval harness needs work before submission. Cover edge cases for prompt injection and adversarial refund requests.
refactor.dev/n/neha-k
NK

Neha K.

Senior Software Engineer · Refactoring for AI

SWE track · 3 modules complete · Capstone in progress
Interview readiness
62/100

Backend engineer with 7 years building payments infra. Currently leveling up on production AI — RAG systems, agent loops, and eval rigor. Open to AI-engineering roles at growth-stage companies.

Skills verified
LLM fundamentals · 84
AI-assisted coding · 78
Tool calling · 72
Streaming & APIs · 76
RAG & retrieval · 52
Badges
3 of 6 earned
Mock interview history
Technical · Agents
3 days ago
68
AI fluency
6 days ago
71
Behavioral
2 weeks ago
54
Capstone projects
1 in progress · 0 published
Customer-support agent with RAG and refund routing
In progress · 60% complete · ETA 9 days
Capstone

A production-grade support agent over a knowledge base of 2,400 docs. Handles refund routing with guardrails, includes eval harness over 100 golden cases.

TypeScript · Anthropic SDK · pgvector · Stripe
Future capstone slot
Unlocks after first capstone is shipped