The 270 Fit Models Problem: What Activewear Grading Looks Like When You Take It Seriously

Published 14 May 2026 • Last updated 15 May 2026 • 8 min read

By Nabla Labs Research
Upstream fit validation and population-aware grading for fashion brands and manufacturers.

Performance apparel brands today are running into the same grading problem at the edges of their size run. The brands solving it well are paying for it — Evelyn & Bobbie uses 270 fit models per style instead of grading from one. Form spent two years building inclusive sizing with community fit testing. Bloom Bras hired a corset specialist and a NASA engineer to engineer a sports bra for sizes 28D–50K.

Each of those workarounds is solving the same underlying issue: linear grade rules, calibrated against a single representative body, break at the extended sizes where performance fabrics matter most. This piece looks at the operational cost of that workaround, what current 3D simulation tools can and can't do about it, and what's beginning to replace it.

What is the 270 fit models problem in activewear?

The 270 fit models problem is the operational cost successful performance brands pay because linear grade rules don't produce a consistent fit across an extended size range. Evelyn & Bobbie's founder Bree McKeen builds her bras with 270 fit models across seven sizes, grading each style individually, instead of scaling from a single base size.

In March 2026, Fortune ran a profile of Bree McKeen, the founder of Evelyn & Bobbie, the fastest-growing bra brand at Nordstrom [1]. The story turns on a number that's worth sitting with.

McKeen's stated reason for the 270 fit models, in the same piece, is direct: “Most bra companies have like one or two fit models.” Scale a 34B to a 34DDD by rule, and the larger sizes don't fit: not because the rule is wrong, but because no single body contains the answer.

“Most bra companies have like one or two fit models.”
— Bree McKeen, founder, Evelyn & Bobbie · Fortune, March 2026

Two hundred and seventy fit models is not a marketing flourish. The number is the operational cost of solving a real grading problem the way the industry currently knows how to solve it: with bodies in a room.

270 fit models per style is, by any reasonable standard, an enormous expense. The conclusion most performance brand founders draw from McKeen's process isn't “we should hire 270 fit models.” The conclusion is that something about how grading currently works isn't working.

This piece is about what that something is, and what's beginning to replace it.

Why does grading fail more often in activewear than in other categories?

Activewear depends on engineered fabric mechanics — negative ease, four-way stretch, and modulus — that don't scale linearly with a size chart. A grade rule that adds width proportionally from M to 3XL preserves the chart but quietly violates the fabric mechanics, causing waistband collapse, gusset failure, and edge-size returns.

Performance categories — compression leggings, sports bras, cycling kit, training tights — are unusually punishing to grade. The reason is physical, not stylistic.

The waistband that grips a medium can collapse on a larger body because the panel underneath has stretched too far, losing the gripping power that came from the fabric modulus, not the seam. Our own breakdown of this dynamic lives in our notes on fabric-aware grading.
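The mechanism is easier to see with numbers. The sketch below is a minimal illustration, not a fabric model: it assumes a hypothetical waistband cut with negative ease, a grade rule that adds a fixed increment per size, and bodies whose waist grows by a larger increment per size. Every measurement and both function names are assumptions chosen for illustration, not data from any brand or from the audits cited in this piece.

```python
# Illustrative only: every number below is an assumption chosen to show the
# mechanism, not data from any brand, fabric test, or audit.

def garment_strain(body_circ_cm: float, garment_circ_cm: float) -> float:
    """Fractional stretch the fabric needs to reach the body circumference."""
    return (body_circ_cm - garment_circ_cm) / garment_circ_cm


def linear_grade(base_cm: float, step_cm: float, steps: int) -> float:
    """Linear grade rule: add a fixed increment per size step from the base."""
    return base_cm + step_cm * steps


SIZES = ["M", "L", "XL", "2XL", "3XL"]
BASE_GARMENT_WAIST_CM = 64.0  # hypothetical pattern measure at size M
GRADE_STEP_CM = 4.0           # hypothetical grade increment per size
BASE_BODY_WAIST_CM = 76.0     # hypothetical body waist at size M
BODY_STEP_CM = 6.0            # hypothetical body growth per size

strain_by_size = {}
for i, size in enumerate(SIZES):
    garment = linear_grade(BASE_GARMENT_WAIST_CM, GRADE_STEP_CM, i)
    body = BASE_BODY_WAIST_CM + BODY_STEP_CM * i
    strain_by_size[size] = garment_strain(body, garment)

for size, strain in strain_by_size.items():
    print(f"{size}: fabric stretched {strain:.1%}")
```

Under these assumed numbers the size chart grades cleanly, yet the stretch the fabric has to deliver climbs with every size step. The waistband at 3XL is working at a different point on the fabric's stretch range than the waistband at M, which is the kind of drift a chart-only check cannot see.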

The linear-grading failure in performance fabrics is not a theoretical problem. Gris Chen, a manufacturing specialist with fourteen years in athletic apparel at LeelineSports, audited more than 500 plus-size returns for a single client earlier this year. The pattern Chen documented was consistent: auto-scaling 80/20 nylon-spandex past size XL creates a 12% error rate in crotch depth [3]. A 12% error in crotch depth is not a tolerance issue; it is a fit failure.

The same LeelineSports audit found that for a single style with a 12% return rate, the root cause was not fabric quality and not sewing execution — the root cause was geometry: factories rely on linear grading instead of testing material behavior against established apparel sizing standards [3]. The grading rule was applied. The result still failed.

Sourcify, a sourcing partner that works with apparel brands at small and mid scale, names the same pattern from the other side of the table. In Sourcify's own grading guide for brands, the first listed founder mistake is the most common: only testing one size [4].

In categories where fit is decorative, single-size validation is wasteful. In categories where fit is functional, single-size validation produces returns.

Three ways the industry handles body variation today
From one body to a distribution

1 fit model

The default. One body sets the size chart.

270 fit models

Evelyn & Bobbie's approach. Bodies in a room, paid by the hour.

A computational body distribution

A population matched to your real customer base, sampled in software.

How are leading performance brands actually solving inclusive sizing?

Category-leading performance brands solve inclusive sizing by running a more expensive process across multiple bodies — physically — rather than trusting linear grade rules. Evelyn & Bobbie uses 270 fit models per style; Form ran multi-year community fit testing; Bloom Bras combined corset construction and engineered support for sizes 28D–50K; Active Truth was founded specifically because activewear above size 16 was failing in predictable ways.

Once you look for the workaround, the same workaround appears in every performance brand that has actually solved the fit problem at extended sizes.

Form, the $43M activewear brand co-founded by Sami Spalter and Sami Clarke, launched its Every Body Collection in 2026 after years of build-up. In an Inside Retail interview at the time of the launch, Spalter described the development cost in plain language: “We didn't want just to grade up existing patterns and call it inclusive.” [2] Form's method was a community-driven physical fit programme across body types: slower than the standard grading workflow, and more honest about what the work would take.

Active Truth, the Australian size-inclusive brand spanning AU 6–26 [6], was founded explicitly because the activewear sold to women above a size 16 was failing in predictable ways — sagging, rolling, and going translucent under stretch. The Active Truth founders' diagnosis was structural, not aesthetic.

Bloom Bras, the sports bra brand serving 28D–50K, was founded by Elyse Kaye after Kaye failed to find a high-impact sports bra that worked at her size. Kaye's framing of the problem is the cleanest summary of the pattern: “It's not a design flaw, it's an engineering challenge.” [5]

What unites Evelyn & Bobbie, Form, Active Truth, and Bloom Bras is not a marketing position. The common thread is a method. Each brand, at some point, stopped trusting linear grade rules and started running a more expensive process across multiple bodies — physically. The reason these brands' products fit better than their competitors' is, at the root, that each one paid a tax their competitors didn't.

The fit-model tax is what the 270 fit models problem actually refers to.

What can CLO3D, Browzwear, and Style3D do for activewear grading — and what can't they?

CLO3D, Browzwear's VStitcher, and Style3D's Fit Maker simulate a garment on a single 3D avatar and produce a strain map; the simulators do not, by themselves, compute the corrected grade rules across a body distribution. The pattern correction work — interpreting the heatmap and editing the grade rules size by size — still falls to the human looking at the simulator.

The fashion-tech response to the fit problem has converged on 3D garment simulation: CLO3D, Browzwear's VStitcher, and more recently Style3D's Fit Maker, explicitly positioned for the compression and activewear market [8].

These tools genuinely move the work forward. The tools replace the cost of physical samples with the cost of digital ones. For brands that can afford to operate the tools, the workflow is a meaningful upgrade on the workflow of ten years ago.

The tools also have well-documented limits that matter to smaller performance brands specifically. In a recent breakdown of 3D design in sportswear, the same manufacturing audit team that documented the 12% crotch-depth error wrote bluntly about who the tools are actually accessible to: licensing and hardware requirements overwhelm midsize brands [7]. The training cost — roughly three months to onboard a team to CLO3D — sits on top of the licence and the GPU workstation that runs the software. For a brand with one technical designer and a freelance grader, that stack is not realistic.

The more important limitation is structural. Existing simulators show a garment on an avatar. The simulators do not, by themselves, change the pattern. The designer still has to interpret the simulation, decide what the heatmap means for each size in the run, and hand-edit the grade rules. The work that the 270 fit models were doing — telling McKeen what to actually correct, size by size — is not what the simulators do. The interpretation work is what the human looking at the simulator does.

Side-by-side, the trade-offs land roughly as follows.

Tool	What it does well for activewear	What it does not do	Cost barrier for small/mid brands
CLO3D	Realistic fabric drape and visualization on a single avatar.	Does not compute corrected grade rules across a body distribution.	~3 months training, GPU workstation required.
Browzwear VStitcher	High-fidelity simulation, strong technical-design integration.	Diagnostic only — pattern correction is manual.	Enterprise licence pricing.
Style3D Fit Maker	Activewear-targeted fabric physics, cloud workflow.	Single-avatar fit testing by default.	Mid-tier SaaS pricing, learning curve.
Manual fit-model programs	Real-world ground truth across multiple bodies.	Slow, expensive, brand-locked.	Fit-model fees per size, time-bound.

The cost equation has stayed roughly where it is because simulation reduces the number of physical samples but does not yet reduce the number of fit decisions a person has to make per size.

What is population-aware grading?

Population-aware grading is the practice of validating a graded pattern set against a statistical population of bodies that represents a brand's customer base, rather than against a single representative body per demographic. The technique tests every size in the range across the variance of the customer base and identifies where grade rules fail at each size.

The fit standards that small and mid-sized performance brands inherit — from Alvanon, from internal blocks, from factory libraries — are almost all defined as a body. One body per demographic. A representative size 8 woman. A representative men's medium.

The single-body standard is a useful abstraction. The single-body standard is also the abstraction that breaks at the edges of a size run, because no single body represents the variance inside its own demographic.

The change underway is to define the fit standard not as a body but as a distribution. Same starting point — your existing size charts, your existing fit standards, your existing brand identity around what “medium” means. Different unit of analysis: a population of bodies that statistically represents your real customer base, used as the test bed for every size in your range. We document the methodology in our population-aware grading and fit validation methodology pages.

Population-aware grading is what Nabla Labs runs. The technique tests a graded DXF set across hundreds of bodies sampled to match a target customer — region, age band, athletic or standard build, or a custom distribution — and the simulation identifies where grading breaks at each size, on each body type. Population-aware grading is the computational equivalent of what McKeen does with 270 fit models, run against the actual statistical shape of a brand's customer base rather than the bodies the brand could find and pay.
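As a rough illustration of the idea, and not a description of the Nabla Labs engine, the sketch below reduces the problem to a single waist measurement. It assumes body waists within each size follow a normal distribution, that the fabric has an acceptable stretch window, and that a size is flagged when too many sampled bodies fall outside that window. The distributions, the window, the threshold, and all function names are hypothetical.

```python
import random

# Hypothetical inputs: none of these numbers come from Nabla Labs or any brand.
STRAIN_WINDOW = (0.10, 0.22)  # assumed acceptable stretch range for the fabric
FAIL_THRESHOLD = 0.20         # flag a size if over 20% of bodies fall outside

# size -> (graded garment waist cm, assumed body mean cm, assumed body SD cm)
GRADED_SET = {
    "M": (64.0, 74.0, 2.0),
    "XL": (72.0, 86.0, 3.5),
    "3XL": (80.0, 100.0, 5.0),
}


def sample_population(mean_cm, sd_cm, n, rng):
    """Draw n body waist measurements from an assumed normal distribution."""
    return [rng.gauss(mean_cm, sd_cm) for _ in range(n)]


def failure_rate(garment_cm, bodies_cm, window):
    """Share of bodies whose required fabric stretch falls outside the window."""
    low, high = window
    strains = [(body - garment_cm) / garment_cm for body in bodies_cm]
    outside = sum(1 for s in strains if not (low <= s <= high))
    return outside / len(strains)


def validate(graded_set, n_bodies=500, seed=0):
    """Test every size in the graded set against its own sampled population."""
    rng = random.Random(seed)  # seeded so the report is repeatable
    report = {}
    for size, (garment_cm, mean_cm, sd_cm) in graded_set.items():
        bodies = sample_population(mean_cm, sd_cm, n_bodies, rng)
        report[size] = failure_rate(garment_cm, bodies, STRAIN_WINDOW)
    return report


report = validate(GRADED_SET)
flagged = [size for size, rate in report.items() if rate > FAIL_THRESHOLD]
print(report)
print("sizes needing attention:", flagged)
```

The point of the sketch is the unit of analysis: each size is judged against a spread of bodies rather than one representative measurement, so a size can pass on its "average" body and still be flagged. A real validation would work from the graded pattern pieces, measured fabric behaviour, and 3D body shapes rather than one circumference.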

The output is not a different design tool. The output is a diagnostic, sized to the brand's market, produced from the brand's own pattern files.

What does closed-loop correction add over standard 3D simulation?

Closed-loop correction returns a proposed corrected DXF — not just a strain diagnostic — by computing the pattern geometry that resolves the strain imbalance at each size, on each body distribution, for the specified fabric. The pattern maker reviews a candidate fix rather than reverse-engineering one from a strain map.

A diagnostic report that identifies where grading fails is useful. The question pattern makers and technical designers actually ask next is: what should the corrected geometry look like?

Existing 3D simulators stop at the diagnostic. The human runs the simulation, reads the strain map, then iterates the pattern by hand — usually several rounds, sometimes across days, sometimes across countries and time zones.

Closed-loop simulation closes that gap. Instead of stopping at where the grading breaks, the engine computes what the corrected DXF should look like. The pattern maker receives a proposed correction alongside the diagnostic, reviews the correction, refines it, approves it. We walk through the full loop on the fit validation methodology page.

Closed-loop correction does not replace the pattern maker's judgment. The technique removes the part of the work that was always going to be computational — the geometric back-solve from “this is wrong” to “this is what would be right” — and gives that part back to the human as a starting point rather than a blank file.
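A one-dimensional toy version of that back-solve might look like the following. It is a sketch under stated assumptions, not the method used in production: it treats strain as (body − garment) / garment, targets the median of a small sampled population, and returns a proposed measurement plus the delta for review. The function names, the sample waists, and the target stretch are all hypothetical.

```python
import statistics


def back_solve_circumference(bodies_cm, target_strain):
    """Garment measure that puts the median sampled body at the target stretch.

    From strain = (body - garment) / garment, solving for garment gives
    garment = body / (1 + strain).
    """
    median_body = statistics.median(bodies_cm)
    return median_body / (1.0 + target_strain)


def propose_correction(current_cm, bodies_cm, target_strain):
    """Return the proposed measure and delta for the pattern maker to review."""
    proposed = back_solve_circumference(bodies_cm, target_strain)
    return {
        "current_cm": current_cm,
        "proposed_cm": round(proposed, 1),
        "delta_cm": round(proposed - current_cm, 1),
    }


# Hypothetical 3XL waistband: five sampled body waists and an assumed target.
bodies_3xl = [94.0, 97.5, 100.0, 102.5, 106.0]
proposal = propose_correction(
    current_cm=80.0, bodies_cm=bodies_3xl, target_strain=0.16
)
print(proposal)
```

The output is a candidate, not a decision. The pattern maker still judges whether the proposed change is acceptable for the style, which is why the sketch returns the current value and the delta alongside the proposal rather than overwriting anything.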

Operationally, what closed-loop correction changes for a performance brand is the shape of the fit session. The two or three sizes that get flagged as high-risk get physical fit-model time. The rest of the size run gets validated computationally. The fit model budget — for most brands, the single most expensive line item in development after fabric — gets spent where the budget actually does work.
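The triage step can be sketched in a few lines. This assumes each size already has a risk score from computational validation (here, a hypothetical share of sampled bodies that failed) and that the brand has a fixed number of physical fit-session slots. The scores and the function name are illustrative assumptions.

```python
def plan_fit_sessions(risk_by_size, physical_slots):
    """Send the highest-risk sizes to physical fit sessions, the rest to simulation."""
    ranked = sorted(risk_by_size, key=risk_by_size.get, reverse=True)
    return {
        "physical": ranked[:physical_slots],
        "computational": ranked[physical_slots:],
    }


# Hypothetical risk scores per size; not data from any brand.
risk_by_size = {
    "XS": 0.04, "S": 0.03, "M": 0.02, "L": 0.05,
    "XL": 0.18, "2XL": 0.31, "3XL": 0.44,
}

plan = plan_fit_sessions(risk_by_size, physical_slots=3)
print(plan)
```

With these assumed scores, the three physical slots go to the extended sizes and the remaining four sizes are validated computationally.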

What does population-aware grading look like operationally for a performance brand?

Population-aware grading runs upstream of the existing CAD workflow, ingests DXF/CAD files in industry-standard ASTM/AAMA formats, and returns factory-ready pattern files in the same formats — no 3D design seat, GPU workstation, or change to the pattern maker's workflow required.

If you are running grading and fit decisions inside a performance apparel brand right now, the practical shape of the change is this.

A performance brand does not need to pay for the 270 fit models. A performance brand does need to know that what the 270 fit models were doing — testing the grade rules against the actual variance of the customer base, not the one body the size chart describes — is a real problem and a real cost, currently paid by the brands that take fit most seriously.

A performance brand does not need a 3D design seat or a GPU workstation. The simulation runs upstream of the existing CAD workflow and returns files in the formats the factory already accepts.

A performance brand does not need to replace the pattern maker, the grader, or the factory. A performance brand needs a way to give the pattern maker a documented, repeatable, population-validated answer to the question the size chart cannot answer on its own: does our pattern actually behave the way we say it does, across every customer we say the pattern fits?

That answer is what upstream fit validation for performance brands returns.

Key Takeaways

270 fit models is a method, not a flourish. Evelyn & Bobbie's number is the operational cost of solving a real grading problem the way the industry currently knows how to solve it: with bodies in a room.
Linear grading breaks where fabric mechanics matter. Auto-scaling 80/20 nylon-spandex past XL produces measurable geometric failures — a 12% crotch-depth error is not a tolerance issue, it's a fit failure.
Existing 3D simulators move the work forward but stop at the diagnostic. The simulators do not, by themselves, change the pattern; a human still hand-edits each grade rule.
The unit of analysis is shifting from a body to a distribution. Population-aware grading tests a graded DXF set across a customer-matched body population and identifies where the grade fails, size by size.
Closed-loop correction returns proposed DXF geometry, not just a heatmap — so the pattern maker reviews a candidate fix rather than reverse-engineering one from a strain map.

References & Further Reading

[1] Mickle, Phoebe. “She left a Silicon Valley VC to solve a problem left untouched for 88 years.” Fortune, 29 March 2026.
[2] Inside Retail Asia. “Form co-founders reveal how they built a US$43 million activewear brand.” 4 May 2026.
[3] Chen, Gris. “8 Plus-Size Activewear Fit Issues And How to Fix Them.” LeelineSports, 6 May 2026.
[4] Sourcify. “Pattern Grading Explained for Apparel Brands.” 8 April 2026.
[5] Wall, Alix. “Reinventing the sports bra a learning curve for new entrepreneur.” J. The Jewish News of Northern California, 9 May 2018.
[6] Active Truth company profile, Lusha.
[7] LeelineSports. “3D Design in Sportswear: A Guide to Virtual Prototyping.” 2026.
[8] Style3D. “How Can Fit Maker Technology Revolutionize Activewear Design?” 2026.

This post was last reviewed in May 2026. We update it as the underlying data — fit-model practices, return rates, and grading-tool capabilities — evolves.

Frequently Asked Questions

What is population-aware grading?

Population-aware grading is the practice of validating a graded pattern set against a statistical population of bodies that represents a brand's actual customer base, rather than against a single representative body per demographic. The technique tests every size in the range across the variance of the customer base and identifies where grade rules fail at each size.

Why does linear grading fail in activewear specifically?

Activewear depends on engineered fabric mechanics — negative ease, four-way stretch, and modulus — that don't scale linearly with a size chart. A grade rule that adds width proportionally from size M to size 3XL preserves the chart but quietly violates the fabric mechanics, causing waistband collapse, gusset failure, and edge-size returns. One manufacturing audit documented a 12% crotch-depth error rate when 80/20 nylon-spandex is auto-scaled past size XL.

How is Nabla Labs different from CLO3D, Browzwear, or Style3D?

CLO3D, Browzwear, and Style3D are 3D simulators: each shows how a garment behaves on a single avatar and requires a human to interpret the result and adjust the pattern. Nabla Labs operates upstream: Nabla Labs validates a graded pattern set against a body distribution that matches the customer base, identifies where grading breaks at each size, and proposes the corrected DXF geometry as part of the report. Nabla Labs complements existing 3D tools rather than replacing them.

Does population-aware grading require new pattern files or a different CAD workflow?

No. Nabla Labs ingests existing DXF/CAD pattern files in industry-standard ASTM/AAMA formats and returns factory-ready pattern files in the same formats. No 3D design seat, GPU workstation, or change to the pattern maker's workflow is required.

How many fit models does a typical activewear brand use?

Most activewear brands use one or two fit models, typically at the base size. Brands that take inclusive sizing seriously use significantly more — Evelyn & Bobbie, the fastest-growing bra brand at Nordstrom, uses 270 fit models across seven sizes. Form, a $43M activewear brand, ran a multi-year community fit testing program across body types before launching its XS–XXXL Every Body Collection in 2026.
