Specifically, the consultation asks for opinions on how to achieve the UK Government's objectives to:
- support right holders' control over the use of their content to train AI models;
- support the development of world-leading AI models in the UK by ensuring developers have access to high-quality data; and
- improve trust and transparency between the AI and creative industry sectors.
The UK Government acknowledges that there is on-going litigation on the extent to which the use of copyright works to train AI models infringes copyright. Recognising the harm that on-going uncertainty is causing the AI and creative industries, the UK Government wishes to explore what it describes as "a more direct intervention through legislation" to clarify the law and establish a fair balance between both sectors. At the same time, it is recognised that the outputs generated by AI may also infringe copyright, and the UK Government is also reviewing the extent to which copyright and other IP rights should apply to outputs generated by AI.
In this e-update, we outline the key proposals and provide guidance for AI developers and those using generative AI tools on how to avoid copyright infringement.
Copyright Exception - Text and Data Mining
One of the key - and most controversial - proposals put forward in the consultation is the expansion of the Text and Data Mining (TDM) exception to copyright infringement. TDM is an automated technique which involves copying works in order to extract and analyse information. UK copyright law currently permits TDM for the purposes of non-commercial research only. However, the UK Government's proposals would see this exception expanded to permit TDM for any purpose, except where the right holder has reserved their rights in relation to the use of their work. In other words, the default position would permit TDM for any purpose (including commercial purposes), unless the right holder has opted out of their work being used in this way.
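The consultation does not prescribe how a rights reservation would be expressed, but machine-readable signals such as robots.txt directives are often discussed in this context. Purely as an illustrative sketch on that assumption (the crawler name below is hypothetical), an AI developer's pipeline might check for such a reservation before copying a work:

```python
# Illustrative sketch only: assumes rights reservations are expressed via a
# robots.txt-style, machine-readable signal. The consultation does not mandate
# any particular opt-out mechanism, and the user agent name is hypothetical.
from urllib import robotparser
from urllib.parse import urlsplit

USER_AGENT = "ExampleTDMBot"  # hypothetical crawler identifier

def rights_reserved(page_url: str) -> bool:
    """Treat a robots.txt disallow for this crawler as a reservation of rights."""
    parts = urlsplit(page_url)
    rp = robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()  # a missing or unreadable file defaults to "allow", so other checks still matter
    return not rp.can_fetch(USER_AGENT, page_url)

if __name__ == "__main__":
    url = "https://example.com/articles/some-work.html"
    if rights_reserved(url):
        print("Opt-out detected - do not copy this work for TDM.")
    else:
        print("No machine-readable reservation found - licensing checks still apply.")
```

Any statutory scheme would likely require more robust, standardised signals than a simple check of this kind, which is one of the efficacy concerns raised by the creative industries.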
Representatives of the creative industries have expressed reservations about the introduction of an opt-out system, particularly regarding the efficacy of any mechanism introduced to facilitate right holders opting out. To be effective, such a mechanism would need to capture every copy of a given work to which AI systems may have access, wherever it appears. The proposal also appears to place the administrative burden on creators to ensure their works are protected, rather than on AI developers to ensure their systems do not infringe copyright in the first place.
Transparency
The UK Government proposes that the extension of the TDM exception would be underpinned by greater transparency obligations for AI developers. The UK Government intends that any transparency provisions would be proportionate and balance the interests of AI developers and creators as far as possible. Although the consultation does not contain any specific proposals in this regard, it does mention the possibility of requiring AI developers to disclose information about the specific works used to develop their AI models, how that content was acquired, and the content generated by their models.
The UK Government does address some of the practical issues around greater transparency. For example, there may be legitimate reasons to restrict the disclosure of some sources, such as where content is provided pursuant to a commercial agreement or where disclosure would compromise trade secrets. Additionally, it is recognised that training AI models is likely to involve a very large number of works, and it may not be practical to comply with transparency obligations in respect of every work used. In recognition of these issues, the UK Government may be willing to adopt a provision similar to that which applies in the EU, requiring AI developers to report and make publicly available a non-exhaustive but "sufficiently detailed summary" of the training content used, and has invited comment on this approach.
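To illustrate the difference between a work-by-work disclosure and a source-level summary, the following sketch aggregates hypothetical training records into the kind of high-level overview a "sufficiently detailed summary" might resemble; the record fields and example values are assumptions, not a prescribed format:

```python
# Sketch only: aggregates per-item training records into a source-level
# summary, rather than an exhaustive work-by-work disclosure. The field
# names and example values are illustrative assumptions.
import json
from collections import Counter

def summarise_training_content(records: list[dict]) -> dict:
    """Group training-data records by source collection and licensing basis."""
    by_source = Counter(r["source"] for r in records)
    by_basis = Counter(r.get("licence_basis", "unknown") for r in records)
    return {
        "total_items": len(records),
        "items_by_source": dict(by_source),
        "items_by_licence_basis": dict(by_basis),
    }

if __name__ == "__main__":
    records = [
        {"source": "licensed news archive", "licence_basis": "commercial licence"},
        {"source": "public domain texts", "licence_basis": "public domain"},
        {"source": "licensed news archive", "licence_basis": "commercial licence"},
    ]
    print(json.dumps(summarise_training_content(records), indent=2))
```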
AI Output Labelling
The UK Government also seeks views on the current provisions which protect computer-generated works, and on proposals to clarify other areas of the law in relation to works created by generative AI. It also invites comment on a proposed requirement for AI-generated works to be labelled as such. The rationale is that labelling would maintain the integrity and provenance of the original underlying works while also enabling more informed consumer choice.
Tips for Businesses
It is clear that the law in this area is likely to change in the near future. Many of the proposals put forward in the consultation would provide much needed certainty for AI developers and users of AI systems if introduced. However, until then, AI developers must remain cautious about the materials they use in training AI systems in order to avoid infringing copyright, as should anyone using generative AI tools.
Below are our tips for those developing or using generative AI to avoid falling foul of copyright law:
- Vet input data - ensure the datasets used in AI development do not include copyright protected works for which permission to use them has not been obtained.
- Obtain licences for specific works - where developers seek to use a resource which is copyright protected, contact the IP owner to request permission to use that resource.
- Use public domain content - favour works that are out of copyright or otherwise in the public domain, bearing in mind that content which is merely publicly available online may still be protected by copyright.
- Use commercial datasets - many companies license datasets to developers for AI training purposes, though developers should ensure they are familiar with the source and that any licensing terms meet their requirements.
- Contract carefully - users of generative AI tools should ensure that the developers of those tools provide sufficient assurances that use of those tools will not infringe copyright in third-party works.
- Monitor output - consider what is being generated and whether it is likely to have been drawn from any copyright protected works.
- Educate staff on prompts - ensure staff are aware of the possibility of copyright infringement and avoid prompts which are likely to produce infringing content.
- Document training materials - although not yet a legal requirement, it is advisable for AI developers to keep records of the materials used in training and the wider development process in case such provisions are introduced (a simple record-keeping sketch follows this list).
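By way of example only, record-keeping of this kind need not be elaborate. The sketch below appends one provenance record per item used in training; the file path and field names are illustrative assumptions rather than any prescribed format:

```python
# Sketch only: a minimal provenance log for training materials, written as
# newline-delimited JSON. The file path and field names are illustrative;
# no particular format is currently required by UK law.
import json
from datetime import datetime, timezone
from pathlib import Path

LOG_PATH = Path("training_provenance.jsonl")  # hypothetical location

def record_training_item(item_id: str, source: str, licence_basis: str) -> None:
    """Append one provenance record per item used in training."""
    entry = {
        "item_id": item_id,
        "source": source,
        "licence_basis": licence_basis,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    with LOG_PATH.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")

if __name__ == "__main__":
    record_training_item(
        item_id="dataset-042/article-0001",
        source="licensed news archive",
        licence_basis="commercial licence dated 2024",
    )
```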
The consultation is open until 25 February 2025.
Should you require advice on the development or use of generative AI models, please contact David Gourlay or another member of our Data Protection and Cyber Security Team.
This article was co-written by George Munro, a trainee Solicitor in our Commercial team.