Guidelines for Applying the EU GPAI Training Transparency Template

03/11/2025

The development and deployment of general-purpose AI systems (GPAI) is advancing rapidly. As a result, regulators worldwide are introducing stricter requirements for providers concerning transparency, accountability, and governance of these systems. In this context, the European Commission has introduced the standardized GPAI Training Transparency Template as part of the operational implementation of the EU Artificial Intelligence Act (EU AI Act). The template is not a mere administrative formality. Instead, its purpose is to change companies’ approach to compliance by requiring those who develop or integrate GPAI to have a systematic understanding of the data, processes, and decisions shaping their models.

In practice, a provider must know precisely and comprehensively how an AI model was trained and be able to document that process clearly for authorities and other relevant stakeholders.

What is the GPAI Training Transparency Template?

The template standardizes the reporting system and is intended to increase transparency in the development and training of GPAI systems. Its role is to provide regulators and users with a clear view of the origin, limitations, and risks of a given model.

Based on the template, providers are required to disclose details on:

Training data – types and sources of datasets, methods of selection and filtering, and whether the material is copyrighted.
Data management – quality control mechanisms, risk mitigation measures, and documentation of iterations during development.

By formalizing the process, the template introduces a uniform approach, encourages responsible development, and facilitates compliance with regulatory requirements.

When does this obligation take effect?

From August 2, 2025, the public release of a training summary becomes mandatory for all GPAI models placed on the EU market from that date onward. For models introduced earlier, a transitional period runs until August 2, 2027, during which providers must retroactively publish training summaries for previously developed models.
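The date rule above can be expressed as a small check. This is a minimal sketch, not legal advice: the function name `summary_due_date` is hypothetical, and the assumption that a new model's summary is due on the day of market placement follows the publication rule described later in this article.

```python
from datetime import date

# Both dates are taken from the paragraph above.
OBLIGATION_START = date(2025, 8, 2)
LEGACY_DEADLINE = date(2027, 8, 2)


def summary_due_date(placed_on_market: date) -> date:
    """Return the latest date by which the training summary must be public.

    Hypothetical helper for illustration only.
    """
    if placed_on_market >= OBLIGATION_START:
        # Models placed on the market on or after August 2, 2025:
        # the summary is due no later than the placement date itself.
        return placed_on_market
    # Models placed earlier fall under the transitional period.
    return LEGACY_DEADLINE
```

For example, a model placed on the market on January 15, 2024 would map to the August 2, 2027 deadline, while a model placed on October 1, 2025 would need its summary on that same day.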

If certain information cannot be obtained due to technical limitations or would require disproportionate effort, providers are allowed to omit it from publication, but only with a mandatory and transparent explanation. Even then, the training summary must be as complete, relevant, and informative as possible.

Who does this obligation apply to, and why is it important?

The obligation is prescribed by Article 53(1)(d) of the EU AI Act and applies to every provider of GPAI models placed on the EU market.

It covers:

GPAI model developers based in the EU and outside the EU.
Companies integrating GPAI into services, especially when the integrated model becomes part of a high-risk AI system.
AI-as-a-Service providers, who make GPAI models available via APIs and platforms, thereby placing them on the EU market.

Non-compliance with these obligations can lead to regulatory sanctions, reputational damage, and restricted market access. The extraterritorial scope of the EU AI Act means obligations also apply to entities outside the EU that make GPAI models available to EU users.

What if the model is modified or fine-tuned?

If the scope and nature of a modification are significant enough to make the modifier a provider under the EU AI Act, the modifier must publish a training summary. It is not necessary to repeat the entire training history of the base model: only the additional training or fine-tuning must be disclosed, with clear identification of the model name and version.

A single summary may cover multiple variants that share the same additional training data, provided all versions are explicitly listed. If different datasets are used, separate summaries are required, each referencing the original model and its previously published summary.
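One way to decide how many summaries are needed is to group variants by the additional training data they share. The sketch below assumes each variant is described by a set of dataset names; the function name, the dictionary keys, and all model and dataset names in the usage example are hypothetical.

```python
from collections import defaultdict


def group_variants_by_training_data(
    base_model: str, variants: dict[str, set[str]]
) -> list[dict]:
    """Group variants that share exactly the same additional datasets.

    Each returned entry corresponds to one summary that lists all of its
    variants and references the base model.
    """
    groups = defaultdict(list)
    for variant_name, datasets in variants.items():
        # frozenset makes the dataset combination usable as a dict key.
        groups[frozenset(datasets)].append(variant_name)
    return [
        {
            "base_model": base_model,
            "variants": sorted(names),
            "datasets": sorted(datasets),
        }
        for datasets, names in groups.items()
    ]


# Hypothetical usage: two variants share one dataset, a third uses another.
summaries = group_variants_by_training_data(
    "example-base-model",
    {
        "example-ft-v1": {"support-tickets"},
        "example-ft-v1-quantized": {"support-tickets"},
        "example-ft-legal": {"contracts-corpus"},
    },
)
```

Here the first two variants would be covered by one summary and the third by a separate one, each pointing back to the base model.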

Key obligations for companies

The template prescribes three categories of mandatory public information:

1. General information – provider and model identity; description of data types (text, images, audio, video); and estimates of data volume per type.

2. List of data sources – overview of the origins of training content: public and private databases, web data (web scraping), synthetic data, and data derived from user interactions. For web data, additional details are required: collection tools, training timeframes, types of content, and most frequently targeted domains. SMEs are subject to less stringent requirements but must still provide adequate transparency about their practices.

3. Relevant aspects of data processing – identification of copyrighted material and justification of text and data mining practices under EU law; confirmation of whether user interaction data was used (excluding personal data); description of measures to detect and remove illegal content; and an overview of technical and organizational risk mitigation measures related to training.

These obligations align with broader AI governance principles: accountability, explainability, and fairness.
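For teams that track this information internally, the three categories above can be modelled as a simple data structure. The following is a minimal sketch, not the official template schema: every class and field name is an assumption chosen to mirror the wording of this article.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GeneralInformation:
    provider: str
    model_name: str
    model_version: str
    # Estimated volume per data type (text, images, audio, video).
    data_volume_by_type: dict[str, str] = field(default_factory=dict)


@dataclass
class WebDataDetails:
    collection_tools: list[str]
    timeframe: str
    content_types: list[str]
    top_domains: list[str]


@dataclass
class DataSources:
    public_datasets: list[str] = field(default_factory=list)
    private_datasets: list[str] = field(default_factory=list)
    uses_synthetic_data: bool = False
    uses_user_interaction_data: bool = False
    web_data: Optional[WebDataDetails] = None


@dataclass
class DataProcessing:
    copyright_and_tdm_measures: str
    illegal_content_measures: str
    risk_mitigation_measures: str


@dataclass
class TrainingSummary:
    general: GeneralInformation
    sources: DataSources
    processing: DataProcessing


def missing_web_details(sources: DataSources) -> list[str]:
    """List the web-data fields that are still empty, in declaration order."""
    if sources.web_data is None:
        return []
    return [name for name, value in vars(sources.web_data).items() if not value]
```

A helper such as `missing_web_details` can flag web-scraping disclosures that are still empty before a summary goes out, which is one of the more common gaps in practice.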

Where and when must the summary be published?

The summary must be publicly available no later than the moment the GPAI model is placed on the EU market, because publication is a prerequisite for market access.

It must be published on the provider’s official website in a prominent and easily accessible location, with clear identification of the model and version.

In addition, the summary must be available at distribution points for the model: open-source repositories, developer hubs, and digital marketplaces.

Do the summaries need to be updated?

Yes. Summaries are “living” documents and must be updated regularly based on two factors:

Time-based factor – at least once every six months, the provider must review and, if necessary, update the summary.
Material factor – when new datasets are introduced or significant changes are made, the summary must be updated immediately, without waiting for the next six-month review.

Each new version must include the date of change and a description of new data or modifications. It must be published alongside the updated model version, on both the official website and relevant distribution channels (repositories, hubs, marketplaces).
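The two triggers can be combined into a single check. This is a sketch under stated assumptions: the function names are hypothetical, "six months" is interpreted as six calendar months from the last review, and deciding what counts as a material change is left to the caller.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months later.

    The day is clamped to the last day of the target month,
    so August 31 plus six months gives the end of February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def summary_review_due(last_review: date, today: date, material_change: bool) -> bool:
    """Return True if the summary must be reviewed or updated now."""
    if material_change:
        # Material factor: new datasets or significant changes.
        return True
    # Time-based factor: at least once every six months.
    return today >= add_months(last_review, 6)
```

A scheduled job could call `summary_review_due` daily, while the model release process would call it with `material_change=True` whenever new training data is introduced.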

Common mistakes and how to avoid them

Vague or incomplete disclosures – statements like “we use publicly available data” are insufficient. Providers must specify concrete datasets, legal bases, tools used for collection, timeframes, and the most frequently targeted data sources.

Excluding legal and oversight teams – compliance is a multi-layered process. Late involvement of legal, DPO, and risk/governance teams creates gaps, especially regarding IP rights and personal data.

Treating documentation as a one-off task – summaries require continuous updates and new versions, either periodically (at least every six months) or immediately when material changes occur.

This approach enables full compliance, reduces regulatory and reputational risks, and demonstrates a responsible attitude toward the development and deployment of GPAI.