What AI Video Ad Creative Actually Delivers: Case Study Results from Wyndham, Realtor.com, Goop, and Tuckernuck

A grounded assessment of AI-assisted video ad creative performance using named brand cases from 2025–2026, covering what these campaigns actually measured, where AI creative consistently delivers, and where human judgment remains essential — for performance marketers evaluating AI video creative platforms.

90+ creative variants across 15 audience segments; 23% imagery performance lift (Wyndham, brand-reported); creative team capacity recovery (Goop); overnight reactive ad production (Realtor.com)

cited public source | advertising | mid-market, enterprise
Industry: hospitality, real estate, apparel, beauty
Tools Used: Adora AI, BrandComms.AI, Airpost, Keynes Kortex
advertising | AI creative | real results | workflow | ROI | B2C
[Figure: split-composition image. Left, a grid of video ad thumbnail frames in varied color treatments; right, a human hand with a stylus selecting a single video frame on a screen. Caption: AI-assisted video creative: scale on the left, human editorial judgment on the right.]

The Credibility Problem with AI Video Ad Claims

Search for "AI video ad results" and you will find a dense layer of vendor case studies, platform blog posts, and trade press roundups, all featuring impressive-sounding percentage lifts. What you will not easily find is independently audited performance data from named brands with disclosed methodology.

This article is not that either — and it is worth saying so plainly upfront. The four brand cases examined here (Wyndham, Realtor.com, Goop, and Tuckernuck) are sourced from AdExchanger trade press coverage published between February and June 2026. The metrics and workflow descriptions reflect brand-provided statements to reporters, not independently audited results. No third-party verification from IAB, Nielsen, or WARC was available for any of the specific figures cited.

That caveat matters, but it does not make the cases useless. Taken together, they reveal a consistent pattern about where AI video creative reliably delivers and where it does not — a pattern that holds across different platforms, different brand categories, and different campaign objectives. The goal here is to lay out that pattern honestly, with the sourcing visible, so you can make your own judgment about what applies to your situation.

What 'AI Video Creative' Actually Means: A Spectrum, Not a Category

One reason AI video ad claims are so hard to evaluate is that "AI video creative" is not a single thing. The term currently covers at least three meaningfully different production models, and conflating them produces misleading comparisons.

[Figure: horizontal spectrum diagram with three zones from left to right: human-led AI-assisted, human-in-the-loop collaboration, and AI-generated with minimal human oversight. Caption: The AI creative involvement spectrum. Where a campaign sits on this spectrum determines what outcomes are realistic, and what risks apply.]
  • AI-assisted (human-led, AI-accelerated): Human strategists and creatives drive all core decisions — audience insight, concept direction, brand voice. AI handles iteration, variant generation, and asset resizing. The brand submits existing creative assets and guidelines; the platform multiplies them. Wyndham and Goop operate primarily in this zone.
  • Human-in-the-loop: AI generates initial concepts, copy drafts, or creative frameworks, which human editors then review, revise, and approve before anything goes live. Realtor.com and Airpost both operate here — AI produces the first draft, humans control what ships.
  • AI-generated with minimal human oversight: AI autonomously generates, selects, and deploys creative with limited human review. This model is technically possible with some platforms today, but none of the named brand cases in this article operate this way — and Airpost's CEO explicitly argues it does not yet produce high-performing results.

The distinction matters for interpreting results. When Wyndham reports a 23% imagery performance lift, that figure comes from an AI-assisted model where human teams still monitor performance and make manual targeting adjustments. It is not evidence that autonomous AI creative outperforms human creative. Keeping this spectrum in view is the foundation for reading the cases that follow.

Brand Cases: What Four Named Campaigns Reported

Wyndham: Variant Scale and Real-Time Audience Optimization

Wyndham's case is the most data-specific of the four. Using the Adora AI platform, the hospitality brand ran more than 90 creative variants targeting 15 distinct audience segments within a single campaign. The platform surfaced that beach imagery outperformed food and beverage imagery by 23% for a specific audience segment, then automatically adjusted ad frequency so higher-performing variants ran more often.

What the Wyndham case does not show is a fully autonomous system. Wyndham's performance marketing and account management teams continued to monitor results in real time and made manual adjustments to targeting and distribution. The brand's creative team trained the AI on its preferred aesthetic and submitted existing property photos and assets as the raw material. The AI's role was optimization and scaling, not origination.

Key metrics tracked: cost per acquisition and click-through rate. The 23% imagery performance figure is brand-reported and reflects a single segment comparison within one campaign — not a general benchmark for hospitality advertising.
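To make the arithmetic behind a figure like "23%" concrete, here is a minimal sketch of a relative-lift calculation between two imagery variants within one segment. The source does not say which metric Wyndham's 23% was computed on, so click-through rate is an assumption here, and the counts are hypothetical numbers chosen for illustration, not Wyndham data.

```python
def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate as a fraction of impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def relative_lift(variant_rate: float, baseline_rate: float) -> float:
    """Relative lift of variant over baseline (0.23 means 23% higher)."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (variant_rate - baseline_rate) / baseline_rate


# Hypothetical counts for a single audience segment (illustrative only).
beach_ctr = ctr(clicks=246, impressions=10_000)
food_ctr = ctr(clicks=200, impressions=10_000)

lift = relative_lift(beach_ctr, food_ctr)
print(f"Relative CTR lift, beach vs. food and beverage: {lift:.1%}")
```

A relative lift is sensitive to the baseline: a 23% lift on a 2% click-through rate is a change of less than half a percentage point, which is one more reason to read a single-segment figure as directional.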

Realtor.com: Reactive Production Speed as the Primary Win

Realtor.com's case has no hard CTR or ROAS figures available from the AdExchanger source article — and presenting it as a performance case would misrepresent what the brand actually reported. The value Realtor.com describes is production agility: the ability to respond to overnight mortgage rate changes with relevant ads before the market window closes.

"If mortgage interest rates suddenly drop, Realtor.com needs to respond and generate relevant ads quickly because, by the following week, the rates might be back to normal." — Neil Golson, VP of Brand and Creative, Realtor.com

The BrandComms.AI platform enabled production and channel expansion that Golson described as "big marketing at low cost." Critically, the core audience insight that drove the campaign — sellers' deep fear of making a costly mistake on their home sale — was discovered by human brand strategists through qualitative and quantitative research. AI generated ad concepts from that human-led input. Realtor.com's marketing team approved the final creative before any ad went live.

The Realtor.com case is best understood as evidence for production velocity and channel access, not creative performance optimization. That is a meaningful benefit — but a different one.

Goop: Creative Team Capacity as the Measured Outcome

Goop's CMO described the problem the brand faced in operational terms: the volume of iterations required for paid media had become "operationally heavy," pulling the creative team away from higher-level strategic work. The Adora AI platform now handles the "manual, repetitive work" of asset iteration, freeing the team to focus on more complex creative projects.

No ROAS or CTR figures were disclosed for Goop's campaigns. The value the brand reported is capacity recovery — which is a real and measurable benefit, but one that requires a different evaluation lens than paid media performance metrics. This is worth noting for any performance marketer building a business case: team capacity freed is a legitimate ROI input, but it needs to be quantified internally (hours recovered, cost per creative asset, time-to-launch) rather than assumed from a brand quote.
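One way to turn "capacity freed" into an ROI input is sketched below. The field names, the weekly time unit, and every number are hypothetical; Goop disclosed no hours or cost data, so this only shows the shape of the calculation a team could run on its own time-tracking figures.

```python
from dataclasses import dataclass


@dataclass
class CapacityInputs:
    baseline_hours_per_week: float  # iteration hours before the AI platform
    current_hours_per_week: float   # iteration hours after the AI platform
    loaded_cost_per_hour: float     # fully loaded cost of one creative hour
    platform_cost_per_week: float   # platform spend in the same time unit


def weekly_capacity_value(inputs: CapacityInputs) -> dict:
    """Hours recovered per week and their gross and net monetary value."""
    hours_recovered = inputs.baseline_hours_per_week - inputs.current_hours_per_week
    gross_value = hours_recovered * inputs.loaded_cost_per_hour
    return {
        "hours_recovered": hours_recovered,
        "gross_value": gross_value,
        "net_value": gross_value - inputs.platform_cost_per_week,
    }


# Hypothetical inputs (illustrative only, not Goop data).
result = weekly_capacity_value(
    CapacityInputs(
        baseline_hours_per_week=30,
        current_hours_per_week=12,
        loaded_cost_per_hour=85,
        platform_cost_per_week=900,
    )
)
print(result)
```

The net figure only means something if the recovered hours are actually redeployed to higher-value work, which is a management question rather than a calculation.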

Tuckernuck: Regional Creative Intelligence for CTV

Tuckernuck's use of the Keynes Kortex platform is the most analytically interesting of the four cases because it focuses on creative intelligence rather than creative production. Senior paid media manager Jordan Light described the platform surfacing meaningful regional performance differences: audiences in sunny coastal markets engaged differently with creative that mirrored their local environment than audiences in markets with prolonged overcast weather.

That insight drove audience segmentation strategy rather than one-size-fits-all creative deployment. Tuckernuck also used creative performance signals — specifically, assets generating high site traffic but low conversion — as a trigger to mix and match with media across other platforms, building a more complete funnel picture.
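The "high site traffic but low conversion" signal can be approximated with a simple filter. The sketch below is not how Keynes Kortex works (the source does not describe its method); it assumes median cutoffs for both traffic and conversion rate, and the asset IDs and counts are hypothetical.

```python
from statistics import median

# Hypothetical per-asset totals (illustrative only).
assets = [
    {"asset_id": "ctv_coastal_a", "site_visits": 4200, "conversions": 21},
    {"asset_id": "ctv_coastal_b", "site_visits": 3900, "conversions": 117},
    {"asset_id": "ctv_overcast_a", "site_visits": 1200, "conversions": 30},
    {"asset_id": "ctv_overcast_b", "site_visits": 900, "conversions": 27},
    {"asset_id": "ctv_generic", "site_visits": 2000, "conversions": 40},
]


def flag_high_traffic_low_conversion(rows: list[dict]) -> list[str]:
    """Return IDs of assets above median traffic but below median conversion rate."""
    visit_cutoff = median(r["site_visits"] for r in rows)
    rates = {
        r["asset_id"]: r["conversions"] / r["site_visits"]
        for r in rows
        if r["site_visits"] > 0
    }
    rate_cutoff = median(rates.values())
    return [
        r["asset_id"]
        for r in rows
        if r["site_visits"] > visit_cutoff
        and rates.get(r["asset_id"], 0.0) < rate_cutoff
    ]


flagged = flag_high_traffic_low_conversion(assets)
print(flagged)
```

Flagged assets are candidates for follow-up, not failures: an asset that drives visits without converting may still be doing upper-funnel work that shows up on another channel.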

Keynes CEO Dan Larkman's framing captures the underlying principle: marketers need to understand "not just what drove the last click but everything that influenced someone up until that point." The Tuckernuck case is less about AI generating creative and more about AI surfacing the performance signals that inform smarter human creative decisions.

The Human-in-the-Loop Finding: What Airpost's Workflow Reveals

Airpost CEO John Gargiulo offers the most direct practitioner statement in the sourced material about where the human-AI boundary currently sits in video ad production.

"Anyone who says AI can be creative and get high performing results is deluding themselves." — John Gargiulo, CEO, Airpost

Gargiulo's position is not anti-AI — it reflects how Airpost's production workflow actually operates. Brands upload assets and guidelines; AI generates a creative brief; a human creative team assembles and refines the video ads; each client receives a minimum of 10 unique ads per week plus unlimited versioning. Compliance checks are AI-assisted, but humans subsequently double-check the AI's work.

The clearest illustration of why human review matters came from a specific deodorant campaign. AI-generated copy read: "not all women want to use women's deodorant." A human editor flagged it as sounding "a little AI-ish" and rewrote it as: "I was tired of smelling like a scented candle." Gargiulo reports the human-written line drove better performance.

The Airpost case matters because it frames the human-in-the-loop requirement not as a limitation to work around but as a deliberate design choice made by a platform that has thought carefully about where AI adds value and where it does not. That is a different posture than most vendor marketing takes.

What the Data Consistently Shows — and What Remains Unproven

Across all four brand cases and the Airpost practitioner account, a consistent pattern emerges. The gains that AI video creative reliably delivers are in the production and optimization layer — not in autonomous creative performance.

What the sourced brand cases support and what they do not. All performance data reflects brand-provided statements to trade press, not third-party-verified results.

| Capability | Evidence Status | Source Basis |
| --- | --- | --- |
| Creative production velocity (faster time-to-launch) | Consistently reported across all four cases | Wyndham, Realtor.com, Goop, Tuckernuck (brand statements to AdExchanger) |
| Variant scale (90+ variants from a single asset set) | Specifically reported by Wyndham | Wyndham via AdExchanger, brand-reported |
| Audience-segment agility (real-time frequency optimization) | Reported by Wyndham; regional intelligence reported by Tuckernuck | Brand-reported; not independently audited |
| Creative team capacity recovery | Reported by Goop as primary benefit | Brand-reported; no hours or cost data disclosed |
| Reactive production for time-sensitive market conditions | Reported by Realtor.com as primary driver | Brand-reported; no CTR or ROAS figures disclosed |
| Autonomous ROAS uplift without human oversight | Not demonstrated in any sourced case | No independent benchmark data available from IAB, Nielsen, or WARC |
| AI-generated copy outperforming human copy without revision | Contradicted by Airpost case | Practitioner account; single example, not a controlled study |

The absence of independent industry-wide benchmarks is a genuine gap in the available evidence. No IAB, Nielsen, or WARC studies on AI video creative ROAS uplift were accessible during the research for this article. Performance marketers building internal business cases should treat brand-reported figures as directional signals, not category benchmarks.

Honest Limitations: Brand Voice Drift, Cultural Nuance, and FTC Exposure

The sourced cases and practitioner accounts surface three categories of risk that do not appear prominently in vendor marketing but deserve direct attention before scaling AI video creative.

  • Promotional conditioning and brand recall erosion: Tower 28 Co-Founder and CEO Erin Emmerson, writing in an AdExchanger op-ed, raises a specific risk for CTR-optimized AI creative: ads featuring aggressive discount language may outperform in the short term while training audiences to ignore anything that is not a promotional offer. This is a practitioner perspective from a brand CEO, not independent research — but the underlying mechanism (audience conditioning from repeated promotional exposure) is well-established in advertising literature.
  • Cultural nuance gaps at scale: Emmerson also notes that AI-generated creative running across hundreds of variants may miss market-specific imagery associations. Her example: the imagery conventions around pet advertising in Western markets (golden retrievers, domestic settings) do not translate to markets like Malaysia, where street dogs carry entirely different cultural associations. At scale, these mismatches can damage brand perception in specific markets without surfacing as a problem in aggregate performance data.
  • FTC compliance exposure: AI-generated ad creative carries disclosure obligations under current and evolving FTC guidance. This is not a hypothetical risk — it is an active enforcement area. For a detailed account of the specific disclosure requirements and risk areas that apply to AI-generated advertising, see Signal & Craft's coverage of FTC AI disclosure requirements for advertising and marketing. The compliance implications are significant enough that they warrant dedicated review before any AI creative goes live in paid media.

A Practical Evaluation Framework for Performance Marketers

Last-click ROAS is the default metric for paid video evaluation. It is also systematically misleading when applied to AI video creative — not because AI creative does not affect ROAS, but because the most consistent gains AI delivers (production velocity, variant scale, audience-segment agility) are upstream of the last click and do not appear in last-click attribution at all.

Keynes CEO Dan Larkman's framing from the Tuckernuck case captures the problem directly: you need to understand "not just what drove the last click but everything that influenced someone up until that point." Emmerson makes the same point from a brand perspective: "Last-click attribution has always been reductive, but in an AI-driven environment, it's actively misleading."

The following framework extends beyond last-click ROAS to capture the full range of value — and risk — that AI video creative actually affects.

A measurement framework for AI video creative that goes beyond last-click ROAS. Each metric captures a dimension of value or risk that last-click attribution misses.

| Metric | What It Measures | Why It Matters for AI Creative Evaluation | How to Capture It |
| --- | --- | --- | --- |
| Time-to-creative | Calendar days from brief to approved, live-ready asset | AI's most consistent gain is production velocity, but only if you have a baseline to compare against. Without this metric, you cannot quantify the speed benefit. | Track brief date and live date per asset in your project management or ad platform; compare pre- and post-AI implementation averages |
| Variant coverage rate | Number of audience segments with distinct creative vs. total segments targeted | Wyndham's 90+ variants across 15 segments is only meaningful if you know what your previous variant-to-segment ratio was. This metric reveals whether AI is actually expanding creative reach or just multiplying similar assets. | Audit creative libraries per campaign; count unique creative concepts vs. resizes of the same concept |
| Brand search lift | Change in branded search volume during and after campaign periods | CTR-optimized AI creative can drive clicks without building brand memory. Brand search lift is a proxy for whether the creative is creating recall, not just response. Emmerson specifically advocates this metric. | Google Search Console brand query trends; compare campaign-on vs. campaign-off periods or year-over-year |
| Repeat engagement rate | Percentage of users who interact with the brand again within 30–90 days post-click | Promotional AI creative can drive first-click conversion while training audiences to wait for discounts. Repeat engagement rate reveals whether you are building customers or deal-seekers. | CRM or CDP cohort analysis; segment first-touch AI creative converters and track 30/60/90-day return behavior |
| Creative team capacity freed | Hours per week recovered from iterative asset production | Goop's primary reported benefit was capacity recovery. Without measuring baseline hours spent on iteration, this benefit is anecdotal. With measurement, it becomes a quantifiable input to ROI calculations. | Time-tracking data pre- and post-AI implementation; estimate cost per creative hour and multiply by hours recovered |
| Sentiment from creative testing | Qualitative and quantitative audience response to AI vs. human creative in controlled tests | Airpost's deodorant example shows that AI copy can be technically coherent but tonally off. Sentiment testing surfaces brand voice drift before it reaches live audiences at scale. | Concept testing panels, social listening tools, or platform creative testing features; compare AI-first vs. human-revised creative in controlled A/B environments |
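Two of the metrics above reduce to simple ratios once the inputs are collected. The sketch below is one possible reading, under stated assumptions: a segment counts as "covered" if at least one tailored concept (not a resize) is mapped to it, and a user counts as a repeat engager if any later interaction falls within the window after the first click. Function names, data shapes, and sample values are all hypothetical.

```python
from datetime import date, timedelta


def variant_coverage_rate(segment_concepts: dict[str, set[str]]) -> float:
    """Share of targeted segments with at least one tailored creative concept."""
    if not segment_concepts:
        raise ValueError("no segments supplied")
    covered = sum(1 for concepts in segment_concepts.values() if concepts)
    return covered / len(segment_concepts)


def repeat_engagement_rate(
    first_click: dict[str, date],
    later_events: dict[str, list[date]],
    window_days: int = 90,
) -> float:
    """Share of converters with another interaction within the window."""
    if not first_click:
        raise ValueError("no converters supplied")
    returned = 0
    for user, clicked_on in first_click.items():
        deadline = clicked_on + timedelta(days=window_days)
        if any(clicked_on < d <= deadline for d in later_events.get(user, [])):
            returned += 1
    return returned / len(first_click)


# Hypothetical audit: five targeted segments, three with tailored concepts.
coverage = variant_coverage_rate({
    "families": {"beach_v1", "pool_v2"},
    "business": {"lobby_v1"},
    "couples": {"sunset_v1"},
    "solo": set(),
    "groups": set(),
})

# Hypothetical cohort of four first-touch converters.
start = date(2025, 3, 1)
first = {"u1": start, "u2": start, "u3": start, "u4": start}
later = {
    "u1": [start + timedelta(days=20)],
    "u2": [start + timedelta(days=120)],
    "u4": [start + timedelta(days=45)],
}
rate_90 = repeat_engagement_rate(first, later, window_days=90)
rate_30 = repeat_engagement_rate(first, later, window_days=30)
print(coverage, rate_90, rate_30)
```

Running the repeat-engagement function at 30, 60, and 90 days gives the cohort view described in the table; the definitions of "tailored concept" and "interaction" are the parts that need agreeing on before the numbers are comparable across campaigns.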

Not every metric in this framework will be immediately trackable in your current stack. The practical starting point is to identify which two or three you can measure now and establish baselines before deploying AI creative at scale. Retrospective evaluation — measuring AI creative performance against a period when you had no baseline — is significantly less useful than prospective measurement with a defined before-state.
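As an illustration of a defined before-state, the following sketch compares average time-to-creative before and after an AI rollout. The dates are hypothetical, and a simple mean over (brief date, live date) pairs is an assumption; a team with skewed turnaround times might prefer the median.

```python
from datetime import date
from statistics import mean


def days_to_live(brief: date, live: date) -> int:
    """Calendar days from brief to approved, live-ready asset."""
    return (live - brief).days


def mean_time_to_creative(assets: list[tuple[date, date]]) -> float:
    """Average calendar days from brief to live across (brief, live) pairs."""
    return mean(days_to_live(brief, live) for brief, live in assets)


# Hypothetical (brief date, live date) pairs, illustrative only.
pre_ai = [
    (date(2025, 1, 6), date(2025, 1, 20)),
    (date(2025, 1, 13), date(2025, 1, 23)),
    (date(2025, 2, 3), date(2025, 2, 15)),
]
post_ai = [
    (date(2025, 6, 2), date(2025, 6, 5)),
    (date(2025, 6, 9), date(2025, 6, 14)),
    (date(2025, 6, 16), date(2025, 6, 20)),
]

baseline = mean_time_to_creative(pre_ai)
current = mean_time_to_creative(post_ai)
change = (current - baseline) / baseline
print(f"Baseline {baseline} days, current {current} days, change {change:.0%}")
```

The comparison is only as good as the baseline window: if the pre-AI period was never logged, the same function has nothing to compare against, which is the practical argument for starting measurement before deployment.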

The four brand cases here point toward a consistent conclusion: AI-assisted video creative is a production infrastructure upgrade before it is a performance optimization tool. The brands that reported the clearest benefits — Wyndham on variant scale, Realtor.com on reactive speed, Goop on capacity, Tuckernuck on creative intelligence — all used AI to expand what was operationally possible, with human judgment governing what actually ran. That is a meaningful result. It is also a more modest and more honest claim than most AI creative vendor marketing makes.

