What generative AI means for copyright

Creatives are worried about being thrown out of a job by generative AI, while artificial intelligence developers leave themselves open to copyright infringement claims by training their AI on unlicensed material. Rachel Alexander explains

Copyright has always been inextricably linked to technological innovation. The introduction of the printing press to England in the 15th century led to a series of acts aimed at regulating the publishing industry. By 1710, England would pass the Statute of Anne, widely considered to be the first copyright law, and upon which the United States would model its early copyright legislation 80 years later. Often characterised as “utilitarian” in philosophical bent, the purpose of these laws was to encourage the production of further works by remunerating authors.

From the 18th century onwards, the scope of copyright expanded to encompass not just published books, but works of visual art, drama, and music — and, as time went on, sound recordings, films, and broadcasts. By the 1990s, the UK would wedge computer programs into the Copyright, Designs and Patents Act (CDPA) under the category of literary work.

First principles

We are now apparently on the brink of yet another technological revolution, this time propelled by generative artificial intelligence: computer systems capable of producing content historically valued (and protected by law) for its human “touch”.

Recently, there has been a flurry of press about the potential of, and risks associated with, generative AI replacing humans across traditional creative industries. To list a few recent examples: AI has become a key issue in the current Writers Guild of America strike in Hollywood; The New York Times Book Review featured a murder mystery aptly titled Death of an Author written using three different AI tools; a song featuring vocals produced using AI trained on the voices of Drake and The Weeknd went viral before being removed due to copyright infringement claims.

Amidst all the noise, it’s helpful to return to first principles of copyright to understand the current legal position and where the law may be headed.

This article is written primarily from a UK perspective but also touches on the approach in the US and the EU.

Authorship and ownership

Copyright is a type of legal protection for, among other things, literary, dramatic, musical, and artistic works.  Copyright subsists automatically where those works are original and fixed.

The owner of copyright in a work has exclusive rights in the work, including the right to prevent others from reproducing the work without permission. Initial ownership of copyright typically vests with the author (with the exception of certain employment contexts), and authorship has historically been thought to be the purview of human authors only.

Copyright may subsist in the software code behind a generative AI programme, which would be classified as a “literary work”. This may provide protection to AI developers against substantial copying and other uses of their code. However, things become more complicated when we look to AI-generated content, where questions of authorship and ownership have yet to be conclusively answered.

For example, OpenAI (the company responsible for the popular AI chatbot, ChatGPT) purports to grant users “all its right, title and interest in and to Output”, but it is unclear whether copyright subsists at all in the generated content and — if it does — with whom it would vest in the first place.

Arguments as to why copyright does not subsist in AI-generated content centre on the lack of human authorship and creativity and, in tandem, the lack of requisite originality.

As noted in the Berne Convention, an international treaty on copyright, copyright protection operates “for the benefit of the author and his successors in title” – the assumption being that there is a human creator. This has been affirmed in the US in the now famous “monkey selfie” dispute, during which both the Copyright Office and the courts found that animals could not hold copyright. More recently, the US Copyright Office rejected registration of a visual artwork in the name of an AI algorithm. The absence of a human creator in respect of AI-generated content therefore presents obstacles to the subsistence of copyright in the output that is generated.

So too does the need for originality, which has been traditionally defined by reference to human authorship.  “Originality” for purposes of copyright requires, under the UK domestic test, some labour, skill, and judgement on the part of the author. Under EU standards, it is the author’s “own intellectual creation” – in other words, creative choices – reflecting the personality of the author. The most recent guidance from the US Copyright Office indicates that when “AI technology determines the expressive elements of its output, the generated material is not the product of human authorship” and therefore not protected by copyright.

However, where a work containing AI-generated material also contains “sufficient human authorship”, for example where a human selects or arranges AI-generated material in a creative way to create an original work, the resultant work could be entitled to copyright protection.

Taking those tests at face value, one way of attributing copyright protection to AI-generated works would be to look for some underlying human contribution. 

Section 9(3) of the CDPA already provides some guidance on this point: “In the case of a literary, dramatic, musical or artistic work which is computer-generated, the author shall be taken to be the person by whom the arrangements necessary for the creation of the work are undertaken.” However, this approach raises several new questions. Is the “person” making the arrangements the AI developer or the user? Would the human creation of an algorithm for AI-generated works be sufficient?  Is there a distinction between truly AI-generated outputs and AI-assisted content?

In short, whether copyright subsists in AI-generated content is likely to be a highly fact-specific enquiry, but any such analysis will (at least as things currently stand) be grounded in existing principles of authorship, originality, and ownership.

Copyright infringement

Notwithstanding the interesting debates about authorship and originality raised by the above, key concerns for businesses will be the infringing use of existing copyright works as the input for AI-generated works and competition between human-created and AI-created works. 

As in the case of authorship, copyright law tends to envisage infringement as being done by natural persons, not computers. Indeed, section 16(2) of the CDPA states: “Copyright in a work is infringed by a person who without the licence of the copyright owner does, or authorises another to do, any of the acts restricted by the copyright”. However, this does not provide a loophole for AI developers or users. Existing case law indicates the courts are likely to look to the person or entity most closely associated with the infringing software.

Forms of artificial intelligence have been around for decades, but recent advances in generative AI are due in part to the vast data sets now available on which algorithms can be trained. However, training is often conducted using copyright materials without a licence. The point has yet to be tested in the courts, but we believe the infringement position is straightforward in the UK, although there may be challenges in identifying and evidencing infringement. In this context, it is helpful to consider the two prongs of (1) input (i.e., what the AI is trained on) and (2) output (i.e., what the AI generates).

Generative AI vs copyright

Under UK law, there is only a narrow set of exceptions or defences to copyright infringement under the umbrella of “fair dealing” (this is distinct from the broader concept of “fair use” under US law).

Broadly, section 29A of the CDPA permits the making of copies of copyright works for the purpose of text and data analysis only in the context of non-commercial research.

Last year, the UK government proposed broadening this exception to encompass AI development, but following significant pushback from rightsholder groups, it announced in February 2023 that it would not be moving forward with that plan.

On the assumption that data training involves reproduction of copyright works or substantial parts thereof without a licence, this could constitute infringement under current UK law. (In practice, it might be difficult to establish what data was used in the training.) The “infringer” in this context is arguably the developer of the AI, as the person or entity responsible for arranging for copies of copyright works to be made. This issue could be addressed by the courts in the next year or so. Notably, Getty Images recently sued Stability AI (known for its art generator, Stable Diffusion) on both sides of the pond, alleging, among other things, that Stability AI had unlawfully copied millions of copyright-protected images in training Stable Diffusion. Getty does, however, license its library to other technology companies for AI purposes.

Licensing itself raises new and difficult questions for rightsholders, as the scope of licensed use becomes murkier — even if a work was licensed to train an AI platform, the generated results may be unpredictable. 

In addition to the risk of infringement via training, there is also a risk of infringement where AI-generated content is substantially similar to or reproduces pre-existing copyright works which were used to train the model.

Whether works were used to train a model will be a question of fact, although in some cases it may be more obvious than in others: for example, Getty Images watermarks have been shown to appear on images generated by Stable Diffusion. In this scenario, the developer of the AI model, an end user utilising the generated works, or both could be liable for infringement.

What does generative AI mean for copyright?

And so, what does generative AI mean for copyright? Generative AI presents a new scenario to which to apply questions that copyright has been grappling with for centuries, such as: Who can be an author? What is original? What constitutes “fair dealing” with a protected work? These questions continue to be important for creators, businesses, and the general public alike.

Both developers and users of the technology should understand the risks of infringement involved with generative AI, and also understand possible new scopes of use in any potential licensing arrangements.

The broader legal and regulatory landscape must and will continue developing to meet the issues raised by AI that continue to emerge daily, such as biases in systems; intellectual property, and copyright specifically, is just one facet of those issues. Although much ink is being spilt at the moment, including by governments, neither the UK’s White Paper, A pro-innovation approach to AI regulation, nor the proposed EU AI Act appears to go into detail on issues of generative AI and copyright. As such, the courts (perhaps those handling the Getty cases in the US and UK, if no settlement is achieved) are likely to be the first arbiters of some of these issues.

Rachel Alexander is a partner at Wiggin specialising in copyright and intellectual property.

Co-author Sinclaire Marber Schäfer is a dual-qualified intellectual property litigator at Wiggin who has practised in both the UK and US and holds a Juris Doctor from Columbia University, a Master of Laws (intellectual property) from the London School of Economics and a postgraduate diploma in intellectual property law and practice from Oxford University.
