Rights to trained AI

While the EU’s new Artificial Intelligence Act primarily poses regulatory challenges, it also raises a number of IP issues relating to artificial intelligence, machine learning and neural networks. Manually programmed software can undeniably enjoy copyright protection. There is also consensus that AI-generated output (e.g. DALL-E images and ChatGPT text) generally cannot. One question remains unanswered, however: who owns the rights to the AI itself?

Training AI

When it comes to copyright, a distinction must be made between untrained and trained software. Artificial neural networks do not simply execute the algorithms contained in the original software. Instead, that software provides instructions on how the individual neurons of the network are to analyse and weight the training data fed to them. It is this weighting that makes the network “intelligent”.

Broadly speaking, there are three training techniques: In “supervised learning”, the AI is fed data with a known output. Thus, the error rate can be determined by considering the input/output pairs. In “reinforcement learning”, the AI receives positive or negative feedback on which of its actions are desirable or undesirable. In “unsupervised learning”, the AI receives no feedback but instead organises its output according to its own criteria. This method is therefore particularly suitable for cluster analysis to discover unknown patterns.
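To make the idea of input/output pairs and an error rate more concrete, the following is a minimal sketch of supervised learning in Python. It is an illustration only: the single-neuron model (a perceptron), the logical AND task and all names such as `TRAINING_PAIRS` and `error_rate` are assumptions chosen for brevity and do not come from any particular AI system.

```python
# Toy supervised learning (illustrative assumption, not a real AI system):
# a single artificial neuron learns the logical AND function from
# labelled input/output pairs.

def predict(weights, bias, inputs):
    """Return 1 if the weighted sum of the inputs plus the bias is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def error_rate(weights, bias, pairs):
    """Share of input/output pairs for which the neuron gives the wrong output."""
    wrong = sum(1 for inputs, expected in pairs
                if predict(weights, bias, inputs) != expected)
    return wrong / len(pairs)

def train(pairs, epochs=20, learning_rate=1.0):
    """Perceptron rule: adjust the weights after every wrong answer."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, expected in pairs:
            delta = expected - predict(weights, bias, inputs)
            weights = [w + learning_rate * delta * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * delta
    return weights, bias

# Each entry pairs an input with its known (desired) output.
TRAINING_PAIRS = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

untrained_error = error_rate([0.0, 0.0], 0.0, TRAINING_PAIRS)
trained_weights, trained_bias = train(TRAINING_PAIRS)
trained_error = error_rate(trained_weights, trained_bias, TRAINING_PAIRS)

print("error rate before training:", untrained_error)
print("error rate after training:", trained_error)
```

Because the desired output is known for every input, the error rate can be measured at any time simply by comparing the neuron's answers with the labels. That comparison is what drives the weight adjustments in this sketch.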

Regardless of the approach, training always takes place at the level of the object code, not the source code. The source code itself is not altered, and the training data cannot be reconstructed, even by analysing the weighted neural network (known as the “black box effect”).
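The point that training leaves the written program untouched can be illustrated with a small, hypothetical Python sketch. Here one and the same piece of code is trained on two different sets of examples (logical AND and logical OR, both chosen purely for illustration). The only thing that differs afterwards is the list of learned numbers, which is a simplified stand-in for the weighting of a real neural network.

```python
# Hypothetical toy model: the "program" is the fixed code below; the
# "training result" is only the list of numbers that train() returns.

def predict(params, inputs):
    """Fire (1) if the weighted sum of the inputs plus the bias is positive."""
    *weights, bias = params
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train(pairs, epochs=20):
    """Perceptron rule: nudge the parameters after every wrong answer."""
    params = [0.0, 0.0, 0.0]  # two weights and one bias, all untrained
    for _ in range(epochs):
        for inputs, expected in pairs:
            delta = expected - predict(params, inputs)
            # The constant 1 appended to the inputs updates the bias.
            params = [p + delta * x for p, x in zip(params, (*inputs, 1))]
    return params

AND_PAIRS = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
OR_PAIRS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

# Same code, different training data, different learned parameters.
and_params = train(AND_PAIRS)
or_params = train(OR_PAIRS)

print("parameters after AND training:", and_params)
print("parameters after OR training:", or_params)
```

Neither `predict` nor `train` changes as a result of training. The behaviour of the two trained models differs only because of the stored parameters, and those few numbers do not by themselves reveal which examples were used to produce them.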

Untrained software

Legally, the rights to the source code typically belong to the developer or, under section 69b (1) of the German Copyright Act (Urheberrechtsgesetz, “UrhG”), to his or her employer. Training the AI does not change this, nor do the rights simply cease to exist once the AI has been trained. Using the trained AI without the consent of the owner of the rights to the original software therefore usually constitutes an infringement. The same typically applies to the training itself, provided that the original software is used for this purpose.

Trained software

Two main questions arise with respect to the trained AI: (1) whether the trained software is even eligible for copyright protection and (2) who owns the copyright in it. These questions are closely related due to the principle of authorship (Schöpferprinzip) under German copyright law and have not yet been conclusively resolved.

German copyright law protects “programs of any form” (section 69a (1) UrhG) and the “expression in any form of a computer program” (section 69a (2), sentence 1 UrhG). This means that copyright protection exists for both the source code and the (untrained) object code; the latter is merely the compiled, machine-readable expression. The programmer of the source code therefore also owns the copyrights to the resulting object code. However, when neural networks are trained, the source code and the trained object code no longer match since the object code – due to the weighting of the neurons – contains additional information not present in the source code. This is the information that makes the software “intelligent”. According to what is likely the prevailing opinion in the literature, the trained software and the original software are therefore not two expressions of the same software. As a result, the programmer’s copyright to the original software does not extend to the trained part of the software. Consequently, if the programmer has no influence over the further training, his or her copyright is limited to the original software.

This raises the additional question of who owns the rights to the trained software, assuming it is actually protectable by copyright. For the software to enjoy protection, the result must constitute an individual intellectual creation (section 69a (3), sentence 1 UrhG). This has to be assessed based on the development process, i.e. the training itself: if this is purely automated and does not require any further human intellectual input, there is no apparent basis for copyright protection under currently applicable law. This is particularly evident in the case of the aforementioned “unsupervised learning”, at least if the human contribution is limited to the provision of random training data.

The issue becomes more complex if the training process itself can be regarded as an intellectual, creative achievement. Therefore, it cannot be dismissed from the outset that copyright protection applies in the case of “supervised learning” or “reinforcement learning”. In this scenario, only the person who initiated and controlled the data collection or training can be considered the creator, i.e. the copyright owner. However, this depends on the extent to which the training was actually under the control of the “creator”, i.e. still qualifies as an individual intellectual creation within the meaning of German copyright law. It is doubtful whether the targeted selection of data is sufficient for this, as at least some form of additional control is likely required (e.g. checking output). There are similarities here between machine learning and computer-aided software development. However, in the case of the latter, the question of individuality (individuelle Schöpfung) tends to play a more central role. In the case of machine learning, the focus will usually be on whether the result constitutes an intellectual creation (geistige Schöpfung).

Conclusion

There are still many unresolved questions about the ownership rights to AI-based software. Where contractual relationships are concerned, one solution would be to include appropriate provisions in the relevant agreements (such as restrictions on use as well as confidentiality obligations). When dealing with third parties, de facto restricting access may often be the best solution, at least if no action can be taken based on copyrights to the untrained software.

08.04.2024

