Apple executives have detailed the architecture of the company's new Apple Foundation Models (AFM) and clarified exactly how Google's technology factored into their development.

Craig Federighi, Apple's SVP of Software Engineering, held a post-keynote tech talk (via 9to5Mac) with press on Monday alongside AI VP Amar Subramanya, Siri lead Mike Rockwell, and software VP Sebastien Marineau-Mes to walk through how the third-generation AFM family was built and how it powers Apple Intelligence.
"The amount of the Google Assistant we use is none," Federighi said, explaining that Apple uses none of the Gemini models deployed to Google's customers, none of Google's client-side code, and no Google Search infrastructure as the knowledge backbone.
"Of course, we don't have the Gemini app as our app. In fact, none of that client code is part of how we run on iOS. For these models, we use none of the models that Google deploys to their customers, nor do we use the infrastructure and means by which they deploy models to their customers. And then, when it comes to the knowledge base, we of course don't use Google Search or anything like that as the foundation of our system."
Subramanya outlined the new AFM family, which spans two on-device models and three server-side models. The on-device tier consists of AFM Core, a next-generation dense architecture model, and AFM Core Advanced, which uses a sparse architecture and is natively multimodal.
Subramanya said AFM Core Advanced is "unlike any on-device model we've run before," enabling new features including invitation and expressive voices without any cloud requests. On the server side, AFM Cloud handles latency-optimized Private Cloud Compute requests, while AFM Cloud Image powers image generation and editing features including spatial reframing.
The key detail on the Google collaboration came in Subramanya's description of how these four models were trained. "All of these are custom built for Apple Silicon, trained using proprietary data with reinforcement learning and refined using outputs from Gemini frontier models," he said, making clear that Google's contribution was distillation-based, not a wholesale adoption of Gemini.
The fifth and most capable model, AFM Cloud Pro, is designed for agentic tool use and complex reasoning tasks, with quality that Subramanya said is "similar to Gemini frontier models." This model marks a departure from Apple's standard Private Cloud Compute setup.
To run it, Apple worked with both Google and Nvidia to extend its private cloud infrastructure to Nvidia GPUs hosted in Google's cloud. Marineau-Mes said Apple wanted to use Nvidia's latest chips but required them to be configured so they couldn't read the contents of Apple's servers. A recent Nvidia technology called "ambiguous confidential compute" provided the solution.
"We wanted to avail ourselves of the latest technology from Nvidia, and so we set out to extend private cloud compute to third-party cloud," Marineau-Mes said.
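For readers keeping track of the lineup, the five models described in the talk can be summarized as a small lookup table. This is only an illustrative sketch: the `Model` and `Tier` types, the field names, and the `names_in_tier` helper are hypothetical, and the role strings paraphrase the article rather than quote any Apple API or documentation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Tier(Enum):
    ON_DEVICE = "on-device"
    SERVER = "server"


@dataclass(frozen=True)
class Model:
    name: str   # model name as reported in the article
    tier: Tier  # where the model runs
    role: str   # paraphrase of the role described in the talk


# The five-model family as described by Subramanya (roles paraphrased).
AFM_FAMILY: List[Model] = [
    Model("AFM Core", Tier.ON_DEVICE, "dense architecture"),
    Model("AFM Core Advanced", Tier.ON_DEVICE, "sparse architecture, natively multimodal"),
    Model("AFM Cloud", Tier.SERVER, "latency-optimized Private Cloud Compute requests"),
    Model("AFM Cloud Image", Tier.SERVER, "image generation and editing"),
    Model("AFM Cloud Pro", Tier.SERVER, "agentic tool use and complex reasoning"),
]


def names_in_tier(tier: Tier) -> List[str]:
    """Return the names of all models that run in the given tier."""
    return [m.name for m in AFM_FAMILY if m.tier is tier]


if __name__ == "__main__":
    for tier in Tier:
        print(tier.value, "->", ", ".join(names_in_tier(tier)))
```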
Federighi described the broader system architecture as being organized around a System Orchestrator, a piece of software he called "key to the privacy architecture of our entire system." The orchestrator routes any given query to the appropriate model, on-device or cloud, based on the complexity of the request and the personal context required.
The orchestrator draws on an App Toolbox for in-app actions, a Spotlight Semantic Index for personal content, and on-screen context for real-time awareness. For queries involving current events, responses are sourced from Apple's own World Knowledge Service, which Federighi said the company has been building for several years.
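To make the routing idea concrete, here is a minimal sketch of how an orchestrator of this kind might choose a target and gather context. Apple has not published the actual logic, so everything here is an assumption for illustration: the `Query` fields, the numeric complexity score, the `COMPLEXITY_THRESHOLD` value, and the rule that a single threshold decides between on-device and cloud are all hypothetical. Only the names of the context sources come from the talk.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Target(Enum):
    ON_DEVICE = "on-device"
    CLOUD = "cloud"


@dataclass
class Query:
    text: str
    complexity: float                  # hypothetical score in [0, 1]
    needs_personal_context: bool = False
    needs_app_action: bool = False
    needs_current_events: bool = False


@dataclass
class Plan:
    target: Target
    sources: List[str] = field(default_factory=list)


# Hypothetical cutoff; the real routing criteria are not public.
COMPLEXITY_THRESHOLD = 0.6


def route(query: Query) -> Plan:
    """Pick a target and list the context sources a request would draw on."""
    # Assumed rule: simple requests stay on device, complex ones go to the cloud.
    target = Target.CLOUD if query.complexity > COMPLEXITY_THRESHOLD else Target.ON_DEVICE

    # On-screen context is treated as always available in this sketch.
    sources = ["on-screen context"]
    if query.needs_app_action:
        sources.append("App Toolbox")
    if query.needs_personal_context:
        sources.append("Spotlight Semantic Index")
    if query.needs_current_events:
        sources.append("World Knowledge Service")
    return Plan(target=target, sources=sources)


if __name__ == "__main__":
    plan = route(Query("Summarize my notes from yesterday", 0.2, needs_personal_context=True))
    print(plan.target.value, plan.sources)
```

In a real system the complexity signal would presumably come from a classifier or the model itself rather than a hand-set number; the sketch only shows the shape of the decision the article describes.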
Apple also maintains that all Private Cloud Compute infrastructure, including the extended Nvidia GPU capacity in Google's cloud, can be independently verified by third-party researchers to confirm that user data is never stored or accessed.