Greema Joy — Lead Product Designer

Selected Case Studies

WDCF — Weighted Design Capacity Framework

Role: Sole Author · Context: Enterprise design org, vendor-client model · Output: Operational framework + measurement system

Most organisations track design capacity by hours. This model fails because cognitive work is not uniform in weight, a large portion of design work never appears in any project tool, and coordination overhead in distributed teams is invisible to every burndown chart.

Leadership made resourcing decisions on visible deliverables. Designers experienced workload as cognitive, emotional, and coordination weight. The two never matched.

I identified this as a translation problem, not a productivity problem. I built WDCF: a five-dimension weighted scoring system that gives every design task a Weighted Capacity Score, a language leadership can read and designers can use without it sounding like a complaint.

The same problem exists in AI product design. When a system's behaviour is invisible to the people depending on it, trust breaks. Making invisible behaviour legible is the design problem in both cases.

The instinct when capacity feels broken is to ask for more headcount or a better process. I went one step back and asked why every existing measurement method was failing.

Story points are engineering-native — they don't transfer to design work. Hours logged capture input time but nothing about quality of attention or context-switching cost. Deliverable count rewards volume without accounting for revision cycles or upstream ambiguity. Headcount ratios ignore skill specialisation and cross-team reach. Self-reported load suffers from anchoring bias and systematic under-reporting in low-trust environments.

The reframe: this was not a resourcing problem. It was a translation problem. The cognitive cost of design work existed — generating burnout, missed commitments, and failed headcount cases. It had no number attached to it. And without a number, it had no voice in an enterprise decision room.

Methods audit — what each approach misses

Method	What it measures	What it misses
Story Points	Relative effort (engineering-native)	Cognitive load, invisible work, coordination cost
Hours Logged	Time spent on tracked tasks	Quality of attention, context-switching, async work
Deliverable Count	Quantity of outputs	Revision cycles, upstream ambiguity, strategic weight
Headcount Ratio	Designer-to-engineer ratio	Skill specialisation, project complexity, cross-team reach
Self-Reported Load	Subjective sense of busyness	Anchoring bias, imposter syndrome under-reporting

Cognitive Load (30%): Sweller's cognitive load theory distinguishes intrinsic load from extraneous load added by coordination and process. Cowan's revision of Miller's working-memory research puts the capacity limit at about four chunks. Tasks exceeding this threshold degrade output quality regardless of reported effort.

Collaboration Complexity (25%): Wegner's Transactive Memory Systems research shows coordination cost scales non-linearly with team size in distributed environments. This cost never appears in any task tracker.

Iteration Depth (20%): Each revision cycle involves re-engagement with prior decisions, stakeholder re-explanation, and sunk-effort re-anchoring. The psychological cost compounds across cycles.

Invisible Work (15%): Under Hochschild's emotional labour theory, stakeholder management, conflict absorption, and ambiguity facilitation are psychological work with a measurable cost. They do not appear in Jira.

Strategic Weight (10%): This dimension carries the lowest share because its psychological pressure is partially captured in Cognitive Load through Prospect Theory's loss-aversion effect.

Framework dimensions — weight distribution

Dimension	Weight	Source
Cognitive Load	30%	Sweller, 1988
Collaboration Complexity	25%	Wegner, 1987
Iteration Depth	20%	Reinforcement theory
Invisible Work	15%	Hochschild, 1983
Strategic Weight	10%	Kahneman & Tversky

The formula: WCS = (CL × 0.30) + (CC × 0.25) + (ID × 0.20) + (IW × 0.15) + (SW × 0.10). Scale 1.0–5.0. Each dimension scored 1–5 against explicit behavioural anchors to prevent inter-rater drift and allow longitudinal comparison.
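As a minimal sketch, the formula could be expressed in Python as follows. This is an illustration, not the Google Sheets implementation: the function name `weighted_capacity_score` and the dictionary layout are assumptions, and only the weights and the 1–5 scale come from the formula itself.

```python
# Weights taken from the WCS formula; the key names mirror its abbreviations.
WEIGHTS = {"CL": 0.30, "CC": 0.25, "ID": 0.20, "IW": 0.15, "SW": 0.10}


def weighted_capacity_score(scores: dict[str, int]) -> float:
    """Return the WCS for one task from its five 1-5 dimension scores."""
    if set(scores) != set(WEIGHTS):
        raise ValueError(f"expected exactly these dimensions: {sorted(WEIGHTS)}")
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be scored 1-5, got {value}")
    # The weights sum to 1.0, so the result stays on the same 1.0-5.0 scale.
    return round(sum(WEIGHTS[name] * scores[name] for name in WEIGHTS), 2)


# Hypothetical task: heavy cognitive and invisible work, low strategic weight.
example = {"CL": 4, "CC": 3, "ID": 2, "IW": 5, "SW": 1}
print(weighted_capacity_score(example))  # 1.20 + 0.75 + 0.40 + 0.75 + 0.10
```

For the hypothetical task shown, the five weighted terms add up to 3.2, which sits inside the Optimal Zone.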

Four operational states. The Optimal Zone (2.5–3.4) is grounded in the Yerkes-Dodson performance curve: too low is disengagement, too high is degraded output quality.

The deliberate constraint: the entire system runs in Google Sheets. Zero procurement. Zero onboarding friction. Any manager opens the live dashboard in a browser today. The constraint is a feature. A tool that requires an approval process to adopt would die in that process.

WCS operational states

WCS range	State
1.0–2.4	Underutilised
2.5–3.4	Optimal Zone
3.5–3.9	Caution
4.0–5.0	Overload Risk
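The four bands can be turned into a lookup with a short Python sketch. One assumption is worth flagging: the published ranges leave small gaps (for example between 3.9 and 4.0), so this version treats each band's lower bound as a threshold. The function name `operational_state` is hypothetical.

```python
def operational_state(wcs: float) -> str:
    """Map a Weighted Capacity Score to one of the four operational states."""
    if not 1.0 <= wcs <= 5.0:
        raise ValueError(f"WCS must be between 1.0 and 5.0, got {wcs}")
    # Assumption: lower bounds act as cut-offs, so 3.95 counts as Caution.
    if wcs < 2.5:
        return "Underutilised"
    if wcs < 3.5:
        return "Optimal Zone"
    if wcs < 4.0:
        return "Caution"
    return "Overload Risk"


for score in (2.02, 2.66, 3.86, 4.22):
    print(score, operational_state(score))
```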

Weekly capacity dashboard — mock view

Capacity Dashboard · Design Team · Week 23 · May 2026

Designer	Mon	Tue	Wed	Thu	Fri	Avg	State
Designer A	3.2	3.8	4.1	4.3	3.9	3.86	Caution
Designer B	2.1	2.4	2.8	3.1	2.9	2.66	Optimal
Designer C	4.2	4.5	4.4	3.9	4.1	4.22	Overload
Designer D	1.8	2.1	1.9	2.3	2.0	2.02	Underutil.
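The Avg column is a plain arithmetic mean of the five daily scores. A small Python check, using the mock values from the table, shows the calculation; the helper name `weekly_average` is an assumption, and the real dashboard computes this in Google Sheets.

```python
from statistics import mean

# Daily WCS values copied from the mock dashboard (Mon-Fri).
week = {
    "Designer A": [3.2, 3.8, 4.1, 4.3, 3.9],
    "Designer B": [2.1, 2.4, 2.8, 3.1, 2.9],
    "Designer C": [4.2, 4.5, 4.4, 3.9, 4.1],
    "Designer D": [1.8, 2.1, 1.9, 2.3, 2.0],
}


def weekly_average(daily_scores: list[float]) -> float:
    """Average the daily scores and round to two decimals, as in the Avg column."""
    return round(mean(daily_scores), 2)


averages = {name: weekly_average(days) for name, days in week.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```

The computed values match the Avg column: 3.86, 2.66, 4.22, and 2.02.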

WDCF is a trust calibration system. It exists because there was a structural mismatch between the true state of a system — actual designer capacity — and the signals available to the people making decisions about it.

That mismatch is the central design problem in human-AI products. When an AI system takes an action without making its reasoning visible, when its confidence is uncalibrated, when its failure modes are undisclosed — the person depending on it is making decisions based on incomplete signals.

The design question is identical in both cases: what needs to be made legible, to whom, in what form, so that trust is earned through transparency rather than assumed through absence of failure.

"WDCF is not an AI product. It is a proof of concept that I think in this structure."

Structural parallel — same problem, different layer

Aspect	WDCF · Design Org	AI Product Design
Invisible system	Actual cognitive load of design work	Model reasoning, confidence, failure modes
Incomplete signal	Hours logged, deliverable count	Output with no context or uncertainty shown
Decision gap	Leadership resourcing on wrong data	User acts on opaque model behaviour
Design response	Weighted scoring → legible dashboard	Interface surfaces reasoning → legible signals
Outcome	Trust through calibrated, shared data	Trust earned through transparency, not assumed

Agentic AI for TPMs — Drawing the Autonomy Boundary

Role: Lead Product Designer · Context: GlobalLogic DevShop — agentic assistant (Workspace · Jira · Vertex AI) · Output: Trust-boundary model + product scope definition

The brief described a capability: an AI assistant that could run technical programs — generate status, update Jira, draft stakeholder comms, pull from Workspace. It read as a feature list. Underneath it there was no user model, and no agreement on the one word the whole product turned on: agentic.

Three groups used that word to mean three different things. To engineering it meant autonomous execution. To the program managers it meant a faster first draft they still controlled. To leadership it meant headcount leverage. All three would have approved the same screen and expected a different product.

This was not an interface problem wearing the word "AI". It was a trust-and-scope problem. The design question was never what does the assistant look like — it was what the agent is allowed to do without asking, and what it must never do without a human in the loop. I did not design a screen until that line was drawn.

An agent acting inside someone's program is making decisions under their name. The moment its authority is ambiguous, the PM either rubber-stamps it or stops trusting it — automation bias in one direction, abandonment in the other. The boundary is the product; the interface is only where it becomes visible. Making invisible authority legible is the problem WDCF solved for invisible capacity — one layer up.

The instinct in the room was to build the assistant and react to feedback. That ships the disagreement — everyone projects their own definition onto the same demo, and the conflict resurfaces in production, where it is expensive. I ran a definition exercise before any UI existed.

I enumerated every action the tool could theoretically take and asked one question of each, across all three groups: should the agent do this on its own, propose it for approval, or never do it. The disagreements were not random. They clustered exactly where the stakes were highest — anything that touched a commitment, a deadline, or a written message carrying the PM's name.

The reframe: "agentic" is not a capability level to dial up. It is a permission structure to negotiate. I killed the framing of "how autonomous should it be" — that question assumes a single axis, and there isn't one. Autonomy is per-action, and it is governed by consequence, not by how capable the model is.

What "agentic" meant — and the risk each definition ignored

Stakeholder	Their definition	The risk it ignored
Engineering	Autonomous execution, end to end	Who is accountable when the agent is wrong under the PM's name
Program Managers	A faster draft they still approve	Silent scope creep — the agent committing more than intended
Leadership	Headcount leverage, throughput	Erosion of the human judgment the role exists to provide

I sorted every agent action into three tiers. Autonomous: the agent acts and writes it to a log. Proposed: the agent drafts, the human commits. Prohibited: the agent never acts — it escalates to a person.

The placement rule was behavioural, not technical. An action moves to Proposed or stricter when the cost of being wrong is borne by someone who can no longer inspect it — the PM absorbs the relational damage when a wrong message goes out under their name. I anchored each tier to consequence and reversibility, never to model confidence — because confidence is precisely the signal users over-trust. Tie the boundary to confidence and you have automated the bias instead of designing against it.

Automation bias is the mechanism: under load, people accept system output without verifying it. The boundary does not ask the PM to be more vigilant — vigilance is not a design. It moves verification to the few points where being wrong is expensive and hard to undo, and removes it everywhere else.

Autonomy tiers — sorted by consequence, not capability

Tier	Rule	Example actions
Prohibited	Agent never acts · escalates to a human	Send external commitments · Move a deadline · Close a stakeholder risk
Proposed	Agent drafts · the human commits	Status updates · Jira changes · Any comms in the PM's name
Autonomous	Agent acts · logs for review	Collate data · Format reports · Surface anomalies
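One way to picture the tiers as a permission spec is a static lookup from action to tier. The Python below is a sketch under stated assumptions, not the spec engineering received: the action identifiers are hypothetical names for the examples in the tiers, and the default for unlisted actions is an assumption.

```python
from enum import Enum


class Tier(Enum):
    AUTONOMOUS = "act and log for review"
    PROPOSED = "draft; the human commits"
    PROHIBITED = "never act; escalate to a human"


# Hypothetical identifiers for the example actions listed under each tier.
ACTION_TIERS = {
    "collate_data": Tier.AUTONOMOUS,
    "format_report": Tier.AUTONOMOUS,
    "surface_anomaly": Tier.AUTONOMOUS,
    "post_status_update": Tier.PROPOSED,
    "change_jira_issue": Tier.PROPOSED,
    "send_comms_as_pm": Tier.PROPOSED,
    "send_external_commitment": Tier.PROHIBITED,
    "move_deadline": Tier.PROHIBITED,
    "close_stakeholder_risk": Tier.PROHIBITED,
}


def tier_for(action: str) -> Tier:
    """Look up an action's tier; model confidence is deliberately not an input."""
    # Assumption: anything not explicitly listed falls to the strictest tier.
    return ACTION_TIERS.get(action, Tier.PROHIBITED)


print(tier_for("move_deadline").value)
```

Keeping the mapping static, with no confidence parameter anywhere in the signature, mirrors the rule that placement follows consequence and reversibility.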

Once the boundary was agreed, the interface followed almost mechanically. Autonomous actions surface as a reviewable log — present, never interruptive. Proposed actions stop at a commit step the PM owns: the draft is the agent's, the decision is the human's. Prohibited actions have no button. The absence is the design.

I killed the concept engineering most wanted — the ambient assistant that quietly acts in the background. It tested as anxiety, not assistance. A program manager cannot be accountable for a program whose state is being changed by something they can't watch. Background autonomy over accountable work is a trust leak, not a feature.

The governing principle is legibility: at any moment the PM must be able to reconstruct what the agent did and why. An agent you cannot audit is an agent you cannot trust — and an assistant nobody trusts gets switched off, regardless of how capable it is.
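A rough sketch of what "reconstruct what the agent did and why" could require from a log entry is shown below in Python. The `AuditEntry` and `AuditLog` names and their fields are assumptions made for illustration; the case study does not specify a log schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    action: str                # what the agent did
    reason: str                # why it did it
    sources: tuple[str, ...]   # where its inputs came from
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class AuditLog:
    """Append-only record the PM can replay at any moment."""

    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []

    def record(self, action: str, reason: str, sources: tuple[str, ...]) -> AuditEntry:
        entry = AuditEntry(action, reason, sources)
        self._entries.append(entry)
        return entry

    def replay(self) -> list[str]:
        """Return one human-readable line per action, oldest first."""
        return [
            f"{e.at:%Y-%m-%d %H:%M} UTC | {e.action} | because: {e.reason} | from: {', '.join(e.sources)}"
            for e in self._entries
        ]


log = AuditLog()
log.record("Collated 14 Jira items", "input for weekly status", ("Jira",))
print("\n".join(log.replay()))
```

Each entry pairs the action with its reason and sources, so the replay answers both "what" and "why" without the PM having to ask the agent.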

Proposed action — the commit step the human owns

Program Assistant · Proposed Action · Awaiting PM approval

Action	Post weekly status to 3 stakeholders
Drafted by	Agent · from Jira + Workspace activity
Sends as	G. Joy (PM)

Approve & send · Edit draft · Dismiss

Autonomous log · agent collated 14 Jira items, flagged 1 slipping milestone — no approval required
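The commit step can be sketched as a small state model in Python. This is an illustration under assumptions, not the shipped design: the `ProposedAction` class, its status strings, and the rule that only the person named in "Sends as" may approve are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProposedAction:
    description: str
    draft: str
    sends_as: str                      # the human whose name the action carries
    status: str = "awaiting_approval"

    def _require_pending(self) -> None:
        if self.status != "awaiting_approval":
            raise RuntimeError(f"action already {self.status}")

    def edit(self, new_draft: str) -> None:
        """Edit draft: the text changes, the decision stays open."""
        self._require_pending()
        self.draft = new_draft

    def approve(self, approver: str) -> None:
        """Approve & send: only the named human can commit."""
        self._require_pending()
        if approver != self.sends_as:
            raise PermissionError("only the person it sends as may approve")
        self.status = "sent"

    def dismiss(self) -> None:
        """Dismiss: nothing is sent."""
        self._require_pending()
        self.status = "dismissed"


proposal = ProposedAction(
    description="Post weekly status to 3 stakeholders",
    draft="Agent-drafted status text",
    sends_as="G. Joy (PM)",
)
proposal.approve("G. Joy (PM)")
print(proposal.status)
```

The agent can create and redraft a proposal, but the transition to "sent" sits behind a check on who is committing, which keeps the decision with the human.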

The deliverable was a boundary, not a layout. Engineering received a permission spec. Leadership received a scope they could not quietly widen. The PMs received an assistant whose authority they could name — which is the precondition for trusting it at all.

This is the same structure as WDCF. There, I made invisible human capacity legible to leadership so resourcing decisions matched reality. Here, I made invisible agent authority legible to its user so delegation decisions matched risk. Both are trust-calibration problems: a system acting on signals its decision-maker cannot fully see.

Get the boundary too loose and the PM rubber-stamps — the agent's mistakes become theirs. Too tight and they abandon it — you have shipped expensive autocomplete. The calibrated middle is not a confidence threshold. It is a named, auditable line between what the agent owns and what the human owns.

"An agent's intelligence is not the design problem. Its authority is. Get the boundary wrong and capability becomes liability."

Delegation calibration — two failure modes, one workable middle

Too loose	Calibrated authority	Too tight
Rubber-stamping · automation bias	Named, legible, auditable boundary	Abandoned · "just autocomplete"

Lead Product Designer

Started as an engineer.
Stayed because of what engineering couldn't solve.

WDCF — Weighted Design Capacity Framework

Agentic AI for TPMs — Drawing the Autonomy Boundary

The right role is one where
the problem isn't defined yet.
