Dashboard SWE track
For tech roles: SWE · DevOps · PM

Refactor your tech career
for the AI era.

Personalized AI-fluency curriculum. A mentor that knows your progress. Real capstone projects you can put on your resume. The fastest path from "I should learn AI" to "I ship AI in my role."

10-minute skill check
Free first module
Cancel anytime
Why now

In May 2026, AI fluency is the dividing line.

The engineer who can ship a working agent in two days is being paid more than the one who can't — sometimes 40% more. The platform engineer who built the company's AI gateway is now running the AI platform team. The PM who can credibly spec an eval is the one leading the AI initiative.

The gap between AI-fluent and AI-aware isn't closing. It's compounding.

But the path to that fluency is broken. Generic AI courses teach a little of everything, badly. Tutorials and blog posts are everywhere, but the half-life of AI best practice has dropped to weeks. What was sharp in February is mid by May.

Refactor is built around the only thing that actually matters: becoming the most AI-fluent person at your company in the seat you already hold.

Not a researcher. Not a generalist. The person whose role is exactly what yours is — who can build and ship and review AI work in your stack, in your meetings, in your code reviews.

We pick your role. We meet your skills where they actually are. We update the curriculum monthly so what you learn this week is what shipped this week.

You don't refactor your career once. You refactor it continuously. Refactor is the platform that makes that possible.

Pick your track

Built for the role you're already in

May 2026

Tech work doesn't look like it did 18 months ago.

If your day-to-day still feels familiar, it's because the change is happening around you faster than your habits are catching up. These are the shifts our curriculum is calibrated to.

Code
Reviewed before it's typed.
Cursor, Claude Code, and Windsurf agents now write the first draft of most PRs. Senior engineers spend their day reviewing AI output, not hand-rolling boilerplate.
Integration
MCP is the connective tissue.
Model Context Protocol has become the standard way LLMs talk to tools, databases, and APIs. If you're building an agent in 2026 and not using MCP, you're rewriting infrastructure.
Architecture
Agents replaced point automations.
Where teams used to wire a single API call into a workflow, they now ship multi-step agents with memory, tool use, and supervisors. The new system design interview is agent design.
Reasoning
Models split tasks by depth.
Reasoning models (o-series, Claude reasoning) handle planning and complex problem-solving; fast non-reasoning models do the rest. Knowing when to switch is the new performance tuning.
Quality
Evals are the new tests.
Eval-driven development has replaced "ship it and watch logs." Golden datasets, LLM-as-judge harnesses, and regression evals on every prompt change are now table stakes.
Compliance
EU AI Act is in force.
High-risk AI systems require risk classification, documentation, and audit trails. Whether you ship to the EU or not, the framework is becoming the global default.
What we're betting on for 2026–2028

Browser-using and computer-using agents become mainstream. Voice becomes the primary interface for many apps. Persistent memory and personalization get serious. Small fine-tuned models eat the long tail of inference. AI cost engineering becomes a named discipline. Refactor's curriculum updates monthly so you're learning where the puck is going, not where it was.

What makes Refactor different

An AI mentor that actually knows you.

Knows your skill gaps
Skips what you've mastered. Goes deeper where you struggle.
Reviews your real code
Drop in your repo. Get role-specific feedback against an evals rubric.
Runs your interviews
Mock interviews calibrated to your target role and seniority.
AI
Mentor
Knows your progress · Module 3
You marked tool calling as "shaky" in your assessment. Want to try a tougher example?
Yes — but make it relevant to my work on payments.
Got it. I'll generate a payment-router agent example with three tools and edge cases for refunds.
3.2k
Engineers refactoring
847
Capstones shipped
94%
Finish their first module
14d
Median time to first ship
Stories

From "I should learn AI" to "I'm running the AI initiative."

A small selection of what's happened in the last six months.

"Six weeks of Refactor and I shipped my first agent into production. I'm now the AI lead on the payments team. The mentor caught gaps I didn't know I had — eval coverage was the thing that pushed me from prototype to production."

MO
Marcus O.
Senior Backend Engineer · Fintech

"The mock interviews are uncomfortably accurate. I got feedback that mirrored exactly what I heard from a real interviewer at my next-level job a week later. The platform-engineering track is the only curriculum I've found that takes the infra side of AI seriously."

PR
Priya R.
Platform Engineer · SaaS

"I went from being the PM that asks questions in AI meetings to the one running them. Refactor doesn't teach AI in the abstract — it teaches AI as something you ship through a real product process, with real evals and real risk classification."

DC
David C.
Senior PM · Data infrastructure
Featured in: Hacker News · The Pragmatic Engineer · Lenny's Newsletter · Latent Space
Pricing

Simple, monthly, cancel anytime.

Start free. Pay when the curriculum starts paying you back.

Free
Try the platform, find your gaps.
£0/month
  • Adaptive skill assessment
  • First module of any track
  • Limited AI mentor (10 messages/day)
  • Community access
Start free
Most popular
Pro
Everything you need to ship.
£39/month
  • All tracks (SWE, DevOps, PM)
  • Unlimited AI mentor
  • Mock interviews (text + voice)
  • Capstone projects
  • Public portfolio
  • Monthly curriculum updates
  • Priority support
Start 14-day free trial
Team
For companies upskilling teams.
£149/learner/mo
  • Everything in Pro
  • Admin dashboard
  • Curriculum calibrated to your JD
  • Cohort tooling
  • Skills reporting & exports
  • SSO + SCIM
  • Dedicated CSM
Talk to sales

Annual plans save 20%. Education and non-profit discounts available.

FAQ

Common questions

Who is Refactor for?

Mid- to senior-level tech professionals — engineers, platform/DevOps, PMs — who already know their craft and need to layer AI fluency on top. If you're brand new to tech entirely, this isn't the right starting point. If you've shipped real software and feel like AI is slipping past you, you're exactly who we built it for.

How is this different from a Coursera or Udemy course?

Three things. First, it's role-specific — engineers, platform people, and PMs all see different curricula because the AI fluency they need is different. Second, the AI mentor knows your code, your skill gaps, and your career goal — it's not a chatbot. Third, the curriculum is updated monthly because in AI, anything older than six months is outdated.

How current is the curriculum?

Updated monthly. The current version covers MCP, reasoning models, agent orchestration, the EU AI Act, eval-driven development, prompt caching, and AI cost engineering. We publish a public changelog so you can see what changed and why.

What if I'm a complete beginner with AI?

Module 1 of every track assumes zero AI background — it covers how modern LLMs actually work, capabilities, limits, and where the field is. The skill assessment will route you correctly: if you've already mastered the foundations, you skip them. If not, you start there.

Can my company sponsor or buy this for me?

Yes. The Team plan includes admin dashboards, skills reporting, and the option to calibrate the curriculum to your specific job descriptions. We have a one-pager you can forward to L&D — request it here.

Will this actually help me get hired?

The capstone is a real, shippable project that becomes the centerpiece of your portfolio. Mock interviews are calibrated against current role expectations at top companies. We don't make hiring guarantees — that's not a thing — but the structure is built so the work you do here is the work you can show in interviews.

Do you cover [specific topic, like fine-tuning, voice agents, EU AI Act]?

Probably yes. Fine-tuning, distillation, voice agents, computer use, MCP, EU AI Act, NIST AI RMF, prompt injection defense, and AI cost engineering are all covered in role-appropriate depth. See the full curriculum changelog for what shipped this month.

Refactor takes 10 minutes to start.

Find your gaps. Build something real. Ship it on your resume.

AI fluency check-in
Question 7 of 12 · ~4 min left
RAG basics · Concept · Multiple choice

You're building a RAG system over 50,000 internal docs. Retrieval keeps surfacing irrelevant chunks even when queries are clearly worded.

Which is most likely the first thing to investigate?

AI
Why we're asking this: your earlier answers showed strong intuition for embeddings but uncertainty about retrieval debugging. This is one of the highest-leverage skills for production RAG.
Saturday, May 9

Welcome back, Neha.

14-day streak
Interview readiness
62
+8 this week
Knowledge · 71
Hands-on · 58
Communication · 56
Curriculum progress
38%
3 of 8 modules
On pace for capstone in 5.5 weeks
This week
4.2h
of 5h goal
MTWTFSS
Pick up where you left off
Tool calling with structured outputs
Module 3 · Lesson 4 of 8
You're 60% through this module. Two short lessons and one practice ahead — about 45 minutes.
Now
Tool calling
In progress
Next
Building a router agent
35 min
After
Module wrap + practice
20 min
AI
Mentor
Ready
Want to refresh tool calling before the next lesson? I can run a quick 3-question recap.
Skill map
Updated after assessment
LLM fundamentals · Strong
AI-assisted coding · Strong
Tool calling & agents · Building
RAG & retrieval · Building
Evals & observability · Gap
Production deployment · Gap
Recent activity
Completed Lesson 3.3 — Streaming responses
2 hours ago
Earned badge LLM API Builder
Yesterday
Mock interview Technical · 68/100
3 days ago
Practice Build a /chat endpoint · all tests passed
4 days ago

Curriculum

Updated monthly · Calibrated to May 2026 reality + the bets we're making on 2027–2028.

~50 hours · 8 modules + capstone
Software Engineer track

From AI-assisted coding to building, evaluating, and shipping agentic systems in production. Calibrated for senior+ engineers who already ship code and need to layer AI fluency on top. Updated for May 2026 — covers MCP, reasoning models, agent orchestration, eval-driven development, and the production patterns the top AI-native companies are using right now.

Module 1 · Complete
AI foundations for engineers
3.5h · 6 lessons
LLMs, transformers, tokens, context windows, embeddings — what models can and can't do.
LLMs · Tokens · Embeddings
Module 2 · Complete
AI-assisted coding
4h · 7 lessons
Copilot, Cursor, Claude Code. Prompt patterns for code gen, review, refactor. When to trust output.
Copilot · Code review · Refactor
3
Module 3 · In progress
Building with LLM APIs
5.5h · 8 lessons · 60% done
OpenAI/Anthropic APIs, structured outputs, tool/function calling, streaming, error handling.
Lesson 1 — Calling the API
Lesson 2 — Structured outputs
Lesson 3 — Streaming
Lesson 4 — Tool calling
Lesson 5 — Router agents
Lesson 6 — Error handling
4
Module 4 · Up next
RAG & vector search
6h · 9 lessons
Embeddings, chunking strategies, vector DBs, hybrid search, retrieval evaluation.
Embeddings · pgvector · Hybrid search · Re-rankers
Module 5 · Locked
Agentic systems
7h · 10 lessons
Single-agent and multi-agent patterns. Model Context Protocol (MCP) deep dive. Tool design and ergonomics. State and memory management. Human-in-the-loop checkpoints. Computer use, browser agents, and voice agents.
MCP · Agent loops · Tool design · Computer use · Voice agents
Module 6 · Locked
Evals & reliability
5.5h · 8 lessons
Eval-driven development. Building golden datasets. LLM-as-judge patterns. Regression testing prompts. Production observability (Langfuse, Braintrust, Helicone). Hallucination detection. Drift monitoring. A/B testing AI features.
Evals · LLM-as-judge · Langfuse · Drift
Module 7 · Locked
Production AI
6h · 8 lessons
Cost engineering & token economics. Caching strategies (prompt caching, semantic caching). Latency optimization. Prompt injection defense. Output validation. Fine-tuning vs RAG vs prompting tradeoffs. Distillation & small model deployment. Model gateways.
Cost · Caching · Prompt injection · Distillation
Module 8 · Locked
AI-native software architecture
5h · 7 lessons
Designing systems AI-first vs bolt-on. Persistent memory and personalization. Conversational UX vs traditional UX. Failure modes & graceful degradation. Human-AI collaboration patterns. Reading the next 3 years — what to bet on, what to avoid.
Architecture · Memory · AI UX · Future bets
Final capstone
Ship an AI feature into a real codebase
10–14 days
Multi-week project, AI mentor pair-programs with you, employer-style rubric, public artifact for your portfolio.
DevOps / Platform Engineer track

Build the infrastructure that lets your company ship AI safely, cheaply, and reliably. Covers the full LLMOps stack — gateways, observability, FinOps for AI, prompt injection defense, EU AI Act technical compliance, and AI for operations itself. Calibrated for senior platform engineers, SREs, and infra leads who need to own AI infrastructure.

~52 hours · 8 modules + capstone
1
Module 1
AI infrastructure fundamentals
5h · 7 lessons

Inference vs training economics. GPU landscape today (H100/H200/B200/MI300). Model serving primitives. Inference engines (vLLM, TGI, TensorRT-LLM, SGLang). Throughput vs latency tradeoffs. Quantization (FP8, INT4, ternary).

GPUs · vLLM · Quantization · Inference
2
Module 2
Deploying models at scale
6h · 8 lessons

Cloud platforms compared (Bedrock, Vertex, Together, Fireworks, Replicate). Self-hosting tradeoffs. Multi-region deployment. Auto-scaling for spiky AI traffic. Cold-start mitigation. Edge deployment for latency-critical apps.

Bedrock · Multi-region · Auto-scaling · Edge
3
Module 3
The LLMOps stack
6.5h · 9 lessons

Model gateways (LiteLLM, Portkey, OpenRouter). Prompt management & versioning. Vector DB ops at scale (pgvector, Turbopuffer, Pinecone). Observability platforms (Langfuse, Helicone, Braintrust). Eval pipelines as CI. Prompt CI/CD. Secrets management for AI.

Gateways · Langfuse · Prompt versioning · CI/CD
4
Module 4
Cost engineering for AI
5h · 7 lessons

Token economics in 2026. Prompt caching strategies. Semantic caching. Model routing (cheap → expensive). Batch processing & async pipelines. Cost monitoring & alerts. Budget enforcement. FinOps for AI is becoming a named discipline — be the person who runs it.

Token economics · Caching · FinOps · Routing
5
Module 5
Observability & reliability
6h · 8 lessons

OpenTelemetry GenAI & OpenLLMetry. Tracing AI calls end-to-end. Eval-as-monitor patterns. Drift detection in production. Incident response for AI systems. SLOs for non-deterministic systems. Postmortems with LLM-specific factors. Chaos engineering for agents.

OTel GenAI · SLOs · Drift · Incident response
6
Module 6
AI security & compliance
6h · 8 lessons

OWASP LLM Top 10 (2026 update). Prompt injection defense at the gateway. PII redaction & DLP for AI traffic. Secrets in prompts. EU AI Act technical requirements (now in force). ISO 42001 / NIST AI RMF implementation. Audit trails for AI decisions. Red-teaming AI systems.

OWASP LLM · EU AI Act · DLP · Red team
7
Module 7
AI for operations
5h · 7 lessons

LLM-powered log analysis. Incident summarization with AI. Runbook agents. AI-assisted on-call rotations. ChatOps with AI. Auto-remediation patterns (and when not to). Code review agents for ops PRs. The platform team itself becomes AI-augmented.

LogAI · Runbook agents · Auto-remediation · ChatOps
8
Module 8
Building AI platforms
6h · 8 lessons

Internal AI gateway architecture. Self-service AI for product teams. Governance & quotas. Multi-tenant isolation. Cost attribution by team/product. Platform metrics & adoption. Evangelizing internal platforms — being the person who builds the company's AI infrastructure is a career-defining move.

Internal platforms · Self-service · Governance · Adoption
Final capstone
Build an internal AI gateway
10–14 days
Auth, rate limits, cost controls, prompt injection defense at the edge, full observability, and a self-service onboarding flow for product teams. Real infrastructure your company could deploy.
Product Manager track

Become the PM who can credibly scope, ship, and measure AI features. Covers AI literacy without hand-waving, designing AI experiences that earn user trust, eval-driven product development, EU AI Act compliance, and AI-native product strategy. Calibrated for product managers at growth-stage and enterprise companies who need to be the AI lead in the room.

~45 hours · 8 modules + capstone
1
Module 1
AI literacy for PMs
4h · 7 lessons

How LLMs actually work — no hand-waving. Capability map for May 2026. Reasoning vs non-reasoning models (when to use which). Cost-latency-quality triangle. Open vs closed model decisions. Multimodal capabilities & UX. Where AI is still bad (and getting better fast).

LLMs · Capabilities · Tradeoffs · Reasoning
2
Module 2
AI product discovery
5h · 8 lessons

Identifying real AI-fit problems vs AI feature theater. Opportunity sizing for AI features. User research with AI in the loop. Build vs buy vs orchestrate. Distinguishing AI-native problems from AI-augmented ones. Telling demos apart from products.

Discovery · Sizing · Build/buy · User research
3
Module 3
Designing AI experiences
5.5h · 8 lessons

UX patterns that work in 2026: copilots, autocomplete, agents, ambient AI. Trust & transparency. Designing for failure modes (hallucinations, edge cases). Confidence indicators. Human-AI handoffs. Voice and multimodal UX. Personalization without creepiness.

AI UX · Trust · Voice · Failure modes
4
Module 4
Working with AI teams
5h · 8 lessons

Writing AI feature specs that engineers can actually build. Eval-driven product development. Dataset curation & labeling as a PM responsibility. A/B testing AI features. Shipping iteratively (canary, staged rollout). Working with ML engineers vs AI engineers.

Specs · Datasets · A/B testing · Rollout
5
Module 5
AI metrics & evaluation
5.5h · 8 lessons

Offline evals vs online metrics. Building golden datasets as a PM. LLM-as-judge for product metrics. Quality gates for shipping AI. Detecting drift in production. North-star metrics for AI products. The cost-quality-latency triangle in product decisions.

Evals · Golden sets · Metrics · Quality gates
6
Module 6
AI strategy & build vs buy
5h · 7 lessons

Moats in the AI era (data, distribution, workflow). Choosing model providers. Open vs closed source decisions. Fine-tuning decisions for PMs. Vertical vs horizontal AI products. Pricing AI features. Defensibility analysis. Competitive intelligence with AI.

Strategy · Pricing · Moats · Build/buy
7
Module 7
Responsible AI & compliance
5h · 7 lessons

EU AI Act for PMs (in force as of 2026). Risk classification for AI features. Bias & fairness audits. Privacy by design. AI red-teaming as PM responsibility. Transparency requirements. Customer trust & disclosure. The PM owns the risk register, not the legal team.

EU AI Act · Risk · Transparency · Red-teaming
8
Module 8
AI-native product strategy
4.5h · 7 lessons

2026–2030 product landscape. Agentic vs feature-based products. Persistent memory & context. Voice as primary interface. AI organizations: structuring teams. Career strategy for PMs in the AI era. Reading the next 3 years and betting accordingly.

Strategy · Agents · Career · Future
Final capstone
Spec a real AI feature end-to-end
10–14 days
Full PRD + eval plan + GTM brief for an AI feature in a real product, with risk classification under EU AI Act, success metrics, eval methodology, and rollout plan. Reviewed against an employer-style rubric.
Module 3 · Building with LLM APIs
Lesson 4 — Tool calling with structured outputs
Lesson 4 of 8
Concept

When to use tool calling vs. structured outputs

Both let you constrain what the model returns, but they solve different problems. Structured outputs force a JSON shape. Tool calling lets the model decide when to invoke a function and which one.

Use tool calling when the model needs to take action (call a payment API, query a database, hit a search index). Use structured outputs when you just need clean data extraction.

router_agent.ts
// Define tools the model can call
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools = [{
  name: "refund_payment",
  description: "Issue a refund for a charge",
  input_schema: {
    type: "object",
    properties: {
      charge_id: { type: "string" },
      amount_cents: { type: "integer" }
    },
    required: ["charge_id"]
  }
}];

// max_tokens is required by the Messages API; `query` is the user's message
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  tools,
  messages: [{ role: "user", content: query }]
});
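For contrast, here is the extraction side of the same idea, as a sketch rather than the lesson's canonical code: forcing a single tool turns tool calling into structured output, because the tool's schema-shaped arguments *are* your JSON. The `record_invoice` tool, its fields, and the sample email are hypothetical.

```typescript
// Structured extraction via a forced tool call (sketch).
// The "record_invoice" tool and its fields are illustrative.
const invoiceTool = {
  name: "record_invoice",
  description: "Record the fields extracted from an invoice email",
  input_schema: {
    type: "object",
    properties: {
      vendor: { type: "string" },
      total_cents: { type: "integer" },
      due_date: { type: "string" }
    },
    required: ["vendor", "total_cents"]
  }
};

// Request body in the Anthropic Messages API shape: tool_choice of
// { type: "tool", name } forces the model to answer via this one tool,
// so the reply is schema-shaped arguments, not prose.
const request = {
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  tools: [invoiceTool],
  tool_choice: { type: "tool", name: "record_invoice" },
  messages: [{ role: "user", content: "Invoice from Acme: $120, due June 1" }]
};

// Pull the structured result out of a response: find the tool_use
// content block and return its input, or null if none exists.
function extractToolInput(response: { content: Array<{ type: string; input?: any }> }) {
  const call = response.content.find((b) => b.type === "tool_use");
  return call ? call.input : null;
}
```

The same `extractToolInput` helper works for the routing case above too; the only difference is whether `tool_choice` leaves the decision to the model.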
Common pitfall. Always validate tool inputs server-side. Models can produce schema-valid arguments that still violate your business rules.
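One way to act on that pitfall, as a minimal sketch: re-check every argument the model sent against your own records before executing. The 30-day window, the `Charge` shape, and the in-memory lookup are illustrative assumptions, not the lesson's spec.

```typescript
// Server-side guard for a refund tool (sketch). Schema-valid is not
// business-valid: the model can send a real-looking charge_id that
// doesn't exist, or a refund outside your policy window.
type Charge = { id: string; created_at: string };

const REFUND_WINDOW_DAYS = 30; // illustrative policy, not the lesson's spec

function validateRefundInput(
  input: { charge_id?: unknown; amount_cents?: unknown },
  charges: Map<string, Charge>,
  now: Date = new Date()
): { ok: true } | { ok: false; reason: string } {
  // Re-validate types even though the schema "guaranteed" them.
  if (typeof input.charge_id !== "string") {
    return { ok: false, reason: "missing charge_id" };
  }
  const charge = charges.get(input.charge_id);
  if (!charge) {
    return { ok: false, reason: "unknown charge" };
  }
  // Business rule the schema cannot express: refunds only within the window.
  const ageDays = (now.getTime() - new Date(charge.created_at).getTime()) / 86_400_000;
  if (ageDays > REFUND_WINDOW_DAYS) {
    return { ok: false, reason: "outside refund window" };
  }
  if (
    input.amount_cents !== undefined &&
    (!Number.isInteger(input.amount_cents) || (input.amount_cents as number) <= 0)
  ) {
    return { ok: false, reason: "invalid amount" };
  }
  return { ok: true };
}
```

Run this between receiving the tool call and executing it; reject with the `reason` so the model (or a human) can recover.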
AI
Mentor
Reading lesson 4
Quick check before you continue: when would you set tool_choice to "any" instead of "auto"?
When I always want a tool called — like a routing layer that has to pick something.
Right. Watch the cost — the model can't shortcut a "no tool needed" response. Want me to generate a payment-router example using your stack (TypeScript, Stripe)?
Yes please — include refund edge cases.
On it. Generating with three tools: issue_refund, flag_for_review, request_more_info. I'll add a test case where the user is requesting a refund 90 days post-charge.
Practice · Module 3
Build a refund-routing agent
Medium ~25 minutes
Problem

Refund-routing agent

Build a function that takes a customer message and routes it to one of three actions: issue_refund, flag_for_review, or request_more_info.

Requirements
  • Use Anthropic SDK with tool calling
  • Handle refunds within 30 days only
  • Flag suspicious patterns to human review
  • Pass all 5 test cases
Hint: set tool_choice: { type: "any" } to force a routing decision.
refund_router.ts TypeScript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools = [
  {
    name: "issue_refund",
    description: "Issue a full or partial refund",
    input_schema: {
      type: "object",
      properties: {
        charge_id: { type: "string" },
        reason: { type: "string" }
      },
      required: ["charge_id", "reason"]
    }
  },
  // TODO: add flag_for_review and request_more_info
];

export async function routeRefund(message: string) {
  // TODO: call client.messages.create with tools
}
Test cases
Refund under 30 days
Flag stale refund (90d)
Request info on missing charge_id
Multi-charge refund
Adversarial prompt
2 of 5 passing
AI
Mentor review
Your tool schema for issue_refund looks good. The failing test is because you're missing request_more_info entirely. Add it as a third tool and pass tool_choice: { type: "any" }.

Mock interview

Calibrated to your target role: Senior SWE · AI-fluent

Recording · Question 2 of 4 · 12:34
AI
Interviewer
You're designing a customer-support agent that has access to a knowledge base, an order-history database, and a refund tool. The team is worried about it making mistakes on refunds.

Walk me through how you'd architect this — and how you'd evaluate whether it's safe enough to ship.
NK
You · live transcript
"I'd start by separating the read-only tools — knowledge base and order history — from the write tool, the refund. The refund needs guardrails. I'd add a confirmation step where the model has to explain its reasoning before the refund executes, and I'd cap any single refund amount at, say, $200 without human approval. For evals, I'd build a golden set of 100 edge cases — late refunds, fraud-pattern triggers, multi-charge scenarios — and run them..."
02:14
Take your time. The interviewer waits up to 60s for your answer to land.
Live signals
Structure · Strong
Technical depth · Good
Trade-off awareness · Mid
Communication · Good
Coverage so far
Tool separation (read/write)
Human-in-the-loop
Eval golden set
Cost & latency budget
Prompt injection defense
Monitoring & rollback
Hint available
Strong start — but you haven't talked about prompt injection yet.
Final capstone · 9 days remaining

Ship a customer-support agent

Real codebase · Real evals · Real artifact for your portfolio

Milestones
1. Define use case & success criteria
Day 1–2
PRD + 5 example tickets + acceptance criteria
2. Build retrieval over knowledge base
Day 3–4
Embeddings, chunking strategy, basic eval
3. Add agent loop with tools
In progress · Day 5
Tool calling, refund routing, guardrails
60%
4. Build eval harness
Day 7–8
Golden set of 100 cases, regression tests
5. Deploy with observability
Day 9–10
Cost tracking, latency, prompt logging, alerts
6. Submit + AI review against rubric
Day 11–12
Final review, scorecard, portfolio publication
Files committed
main · 14 commits
src/agent.ts
+148 / -12 · 2h ago
src/tools/refund.ts
+87 / -3 · 2h ago
src/retrieval/index.ts
+212 / -0 · 1d ago
evals/golden_set.json
+1,420 / -0 · 2d ago
AI
Co-pilot
I noticed your refund.ts doesn't handle the 30-day window yet. Want me to draft the validation logic?
Rubric preview
Correctness · 22 / 25
Eval rigor · 12 / 25
Production readiness · — / 25
Communication · — / 25
Current score 34 / 100
Reviewer note. Your eval harness needs work before submission. Cover edge cases for prompt injection and adversarial refund requests.
refactor.dev/n/neha-k
NK

Neha K.

Senior Software Engineer · Refactoring for AI

SWE track · 3 modules complete · Capstone in progress
Interview readiness
62/100

Backend engineer with 7 years building payments infra. Currently leveling up on production AI — RAG systems, agent loops, and eval rigor. Open to AI-engineering roles at growth-stage companies.

Skills verified
LLM fundamentals · 84
AI-assisted coding · 78
Tool calling · 72
Streaming & APIs · 76
RAG & retrieval · 52
Badges
3 of 6 earned
Mock interview history
Technical · Agents
3 days ago
68
AI fluency
6 days ago
71
Behavioral
2 weeks ago
54
Capstone projects
1 in progress · 0 published
Customer-support agent with RAG and refund routing
In progress · 60% complete · ETA 9 days
Capstone

A production-grade support agent over a knowledge base of 2,400 docs. Handles refund routing with guardrails, includes eval harness over 100 golden cases.

TypeScript · Anthropic SDK · pgvector · Stripe
Future capstone slot
Unlocks after first capstone is shipped