- A. Wikipedia
- B. Wikimedia Commons
- C. Wikidata and CC0
- D. Copyright restrictions and public domain in wiki projects
- E. Artificial intelligence as a controversial topic
1 Wikipedia is the largest community project on the internet with over 62 million users worldwide.
2 In addition to the Wikipedia project, there are other projects in the so-called "Wikiverse", such as Wikimedia Commons and Wikidata, which are also presented here. The following section describes how CC licences are used on Wikipedia, Wikimedia Commons and Wikidata, how the content made available in this way is reused, and what other legal aspects play a role.
A. Wikipedia
3 Wikipedia is a free, multilingual online encyclopaedia created by volunteers. The project was launched in January 2001 by Jimmy Wales and Larry Sanger. The content of Wikipedia is based on contributions from the Wikipedia community that supports the project. The community consists of many thousands of volunteer contributors who are organised into numerous subgroups based on the languages they speak and edit, as well as topics.
4 The Wikimedia Foundation, which operates the servers for Wikipedia and other wiki projects and owns the corresponding trademarks, legally fulfils the necessary role of the responsible organisation.
5 As a rule, articles are written by several authors and revised iteratively, a process known as collaborative writing. There are now around 60 million articles in over 330 languages and dialects.
6 "Wikipedia is an encyclopaedia.
Articles should be written in such a way that they comply with the principle of neutrality.
Content is free; it must be under a free licence.
Other users must be respected and wikiquette observed (a derivative of the portmanteau word netiquette, which in turn comes from the English 'net' and the French 'etiquette' for rules of etiquette)."
(emphasis not in the original)
I. CC BY-SA 4.0 in Wikipedia
7 The Wikimedia Foundation's terms of use state that contributions must grant the general public extensive rights to redistribute and reuse them. When a Wikipedia article is created, it is automatically licensed under CC BY-SA 4.0 and the GNU Free Documentation License (GFDL, without version specification).
8 In the early years of the project, the content of Wikipedia was only licensed under the GFDL, a licence from the Free Software Foundation that was developed as a release tool for software documentation. However, there was criticism that the GFDL was too complicated, and its official text is only available in English. With the introduction of CC licences, the Wikipedia community demanded that Wikipedia articles be placed under the CC BY-SA licence. This promised greater user-friendliness and a better fit with the needs of collaborative writing.
9 From a legal perspective, however, such a subsequent change to the licence would have required the individual consent of each and every person who had already created content. It was clear to everyone that such a massive individual request for consent was practically impossible to implement, especially since a blanket relicensing would have required a consent rate of 100%.
10 Therefore, a different approach was taken: the Wikimedia Foundation, together with Creative Commons and the Free Software Foundation, agreed to further develop the then current version 1.2 of the GFDL into version 1.3, which included an additional clause 11 tailored to the relicensing problem of Wikipedia. Under strict conditions, this clause allows the operator of a collaborative authoring project to make a relicensing decision on behalf of all contributors without requiring their individual consent.
11 The Wikipedia content existing at that time was licensed under the GFDL without version specification, so the latest GFDL version always applied automatically (which is still the case today). The content therefore became subject to GFDL version 1.3, including its new clause 11, without any further steps. This made it possible in 2009 to place the entire existing content of Wikipedia and other Wikimedia projects under the CC BY-SA 3.0 licence by simple decision of the Wikimedia Foundation as operator.
12 The automatic licensing of Wikipedia articles under CC BY-SA through the Wikimedia terms of use, as described above, explicitly presupposes that the content in question is protected by copyright law. This excludes content that, by its very nature, is not eligible for copyright protection or whose protection has since expired. The terms of use thus address an aspect that is also enshrined in the CC licences.
II. Inserting content from other sources into Wikipedia
13 When incorporating content from other sources into Wikipedia, users must ensure that the incorporation does not infringe the rights of uninvolved parties. For example, an article from a published encyclopaedia may not be inserted into Wikipedia without the publisher's permission, even if its content is a good fit. This would infringe the rights of the publisher and its authors. Furthermore, due to the strict protective function of copyright law, the above-mentioned automatic licensing under CC BY-SA and GFDL would not apply, and this would not be apparent to users of Wikipedia. Trusting in the project's promise of free use of all Wikipedia content, they might reuse the content elsewhere without actually being entitled to do so. The strictness of copyright law means that in such a situation, not even private individuals can claim that they were unaware of their lack of authorisation and could not have known about it. Not least because of such scenarios, the Wikipedia community is known for being very alert and precise when it comes to copyright issues.
14 If, on the other hand, the third-party content is CC-licensed or subject to another free licence, it is possible to incorporate this content into a Wikipedia text, provided that there is licence compatibility between the content found elsewhere and the licence used in Wikipedia, CC BY-SA 4.0. This is also expressly stated in the Wikimedia Foundation's terms of use.
Content under the following licences is compatible:
CC BY-SA (version 4.0 or later)
Licences expressly declared by Creative Commons to be compatible with the Share-Alike clause in CC BY-SA 4.0; these are currently only:
o Free Art License 1.3
o GPLv3 (however, this compatibility runs in one direction only: material under CC BY-SA 4.0 can be licensed under GPLv3 after editing, but GPLv3 material cannot subsequently be licensed under CC BY-SA 4.0)
15 This applies both to the integration of other copyrighted works that are licensed under CC BY-SA 4.0 and incorporated into a Wikipedia article, and to further use.
16 There are no restrictions under the CC licence or the Wikipedia participation rules for uses that are already permitted under copyright law.
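The compatibility rules set out in margin number 14 can be pictured as a simple lookup with two directions. The following Python sketch is purely illustrative: the licence identifiers, set names and function names are assumptions made for this example, and the sets only reflect the licences named above, not an official or complete list maintained by Creative Commons or the Wikimedia Foundation.

```python
# Illustrative sketch only. The identifiers below are hypothetical labels,
# and the sets mirror the list in margin number 14, nothing more.

# Licences whose material may be incorporated into CC BY-SA 4.0 text.
# A later CC BY-SA version would be added here once it exists.
INBOUND_COMPATIBLE = frozenset({
    "CC-BY-SA-4.0",  # the same licence
    "FAL-1.3",       # Free Art License 1.3, declared compatible by Creative Commons
})

# Licences under which an edited version of CC BY-SA 4.0 material may be released.
OUTBOUND_COMPATIBLE = frozenset({
    "CC-BY-SA-4.0",
    "FAL-1.3",
    "GPL-3.0",       # one-way only: CC BY-SA 4.0 -> GPLv3, never the reverse
})


def can_incorporate(source_licence: str) -> bool:
    """Return True if material under source_licence may flow into CC BY-SA 4.0 text."""
    return source_licence in INBOUND_COMPATIBLE


def can_release_adaptation_under(target_licence: str) -> bool:
    """Return True if an adaptation of CC BY-SA 4.0 material may use target_licence."""
    return target_licence in OUTBOUND_COMPATIBLE
```

Keeping two separate sets makes the one-way nature of the GPLv3 compatibility explicit: `can_release_adaptation_under("GPL-3.0")` is true, while `can_incorporate("GPL-3.0")` is false.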
III. Further use of Wikipedia articles
17 Once a Wikipedia article has been created, it is automatically licensed under CC BY-SA 4.0. This means that the authors must be credited if the article is used for purposes other than simply reading it. The Wikimedia Foundation's terms of use specify how the authors must be credited:
"By providing a hyperlink to the article (if possible) or the URL of the article to which you contributed (since every article has a version history listing all contributors, authors and editors);
By hyperlink (if possible) or URL to another stable online copy that is freely accessible, complies with the relevant licence and ensures attribution of the authors in a manner equivalent to that on the project websites; or
By a list of all authors (please note, however, that any list of authors can be filtered to remove very small or irrelevant contributions)."
18 This only describes in more detail the options for Wikipedia reuse that are already included in CC licences. However, the CC BY-SA 4.0 licence variant used in Wikipedia requires not only attribution for reuse, but also that anyone who modifies Wikipedia texts in any way must place the modified version under the same licence (i.e. CC BY-SA 4.0 again). The aim of this is to ensure that the free knowledge of the online encyclopaedia remains free in the long term and cannot be "locked in" legally by someone who makes their own revisions.
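The first attribution option, a hyperlink or URL to the article, can be illustrated with a short Python sketch. It is an assumption-laden example rather than an official format: the function name `attribution_line`, the credit wording "Wikipedia contributors" and the layout of the line are chosen here for illustration only. The terms of use require the link and the licence, not this particular wording.

```python
from urllib.parse import quote


def attribution_line(title: str, lang: str = "en") -> str:
    """Build a one-line credit for a reused Wikipedia article (illustrative format)."""
    # Wikipedia article URLs use underscores in place of spaces.
    slug = quote(title.replace(" ", "_"))
    article_url = f"https://{lang}.wikipedia.org/wiki/{slug}"
    licence_url = "https://creativecommons.org/licenses/by-sa/4.0/"
    # The article URL gives access to the version history listing all contributors.
    return (
        f'"{title}", Wikipedia contributors, {article_url}, '
        f"licensed under CC BY-SA 4.0 ({licence_url})"
    )


print(attribution_line("Creative Commons"))
```

If the reused text has been modified, the line should additionally say so, and the modified version must itself be released under CC BY-SA 4.0.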
B. Wikimedia Commons
19 The Wikimedia Commons project provides a freely accessible media archive of photos, videos and audio clips, which is filled and maintained by volunteers (community). Wikimedia Commons has been in existence since 2004 and currently contains 107,466,094 files.
20 Unlike Wikipedia, when uploading content (i.e. an image, video or audio clip or other media file) to Wikimedia Commons, the contributor can choose between several licences officially supported by the project and can also use other licences – as long as the uploaded content is ultimately available under at least one licence that is considered free, possibly alongside other licences (which is then referred to as multiple or parallel licensing).
21 The officially supported licences listed on Wikimedia Commons are:
The CC licence variants BY and BY-SA
The GNU licences GPL and LGPL
The Free Art Licence/Licence Art Libre
The Open Data Commons licences (for data)
22 As mentioned above, other free licences can also be used, but they must then be entered "manually" using the appropriate wiki text. The only alternative to releasing material under a licence is to specify that the material is free of rights (in the public domain). In this case, no licence may be specified at all, as there is no licensable right. In order to reduce the number of cases in which uploaders mistakenly assume that content is in the public domain, a few important indicators of this legal status must be confirmed via a checkbox during upload.
23 Uploading without selecting a licence (or indicating public domain status) is not provided for; the corresponding step is therefore an integral part of the upload process. After uploading, the respective content is then displayed with the appropriate licence notice and the files are classified into the respective licence category.
24 The following cannot be selected in Wikimedia Commons:
Licences (including those from Creative Commons) that only allow non-commercial use and/or do not permit editing of the content
Usability based on statutory licences (copyright exceptions) that only apply to specific contexts (e.g. in accordance with the US fair use principle or based on the right to quote in German copyright law)
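The selection rules in margin numbers 20 to 24 amount to a simple check: either the file is marked as public domain and carries no licence, or at least one of its licences is free. The Python sketch below is a hedged illustration. The licence labels and the function name `upload_acceptable` are hypothetical, and the set of free licences is deliberately limited to the officially supported ones listed above, although other free licences may also be entered manually.

```python
from typing import Iterable

# Hypothetical labels for the officially supported free licences listed above.
# Other free licences are possible on Wikimedia Commons but are not modelled here.
FREE_LICENCES = frozenset({"CC-BY", "CC-BY-SA", "GPL", "LGPL", "FAL", "ODC"})


def upload_acceptable(licences: Iterable[str], public_domain: bool = False) -> bool:
    """Check the licence choice for an upload against the rules described above."""
    chosen = set(licences)
    if public_domain:
        # Public domain material has no licensable right, so no licence may be given.
        return not chosen
    # With multiple licensing, one free licence among the chosen ones is sufficient.
    return bool(chosen & FREE_LICENCES)
```

Under this sketch, a file offered under both a non-commercial licence and CC BY-SA passes, because the free licence is available in parallel; a file offered only under a non-commercial or no-derivatives licence does not.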
C. Wikidata and CC0
25 Wikidata is a free, collaborative database that was created to collect structured data to support Wikipedia, Wikimedia Commons and other wiki projects.
26 The special feature of the database is that the data is published under CC0. CC0 releases the database, including the collected data, to the general public without any restrictions; even attribution is not required when using it (for details on CC0, see the commentary section there).
D. Copyright restrictions and public domain in wiki projects
27 Since wiki projects only contain content that is either in the public domain or under a Wikipedia-compatible licence, or where the creator is permitted to grant a CC BY-SA licence, it would be inadmissible to invoke a licence that only applies in certain contexts due to copyright restrictions. The Wikimedia Foundation also makes this clear in its terms of use.
28 If use is permitted only by a copyright restriction (that is, a statutory exception), while "all rights reserved" otherwise applies, this is not sufficient for the free use sought by the Wikimedia projects, because each restriction applies only to a specific context of use or a specific purpose of use. However, the content of Wikimedia projects should be usable in any context and for any purpose. The Wikimedia communities are also considered to be particularly conscientious when it comes to copyright.
29 The situation is similar with regard to the limited term of copyright. Copyright expires at some point; under German law (§ 64 UrhG), this is 70 years after the death of the author. Once this period has elapsed and copyright no longer exists, there is nothing left to license. Such works are in the public domain and may be used completely freely. Attempting to place them under a CC licence would not only be legally ineffective, but could even have legal consequences as a so-called unfounded assertion of intellectual property rights (see also section 8, margin note 7). Accordingly, works in the public domain may be contributed to Wikimedia projects, but no licence may be "granted" for them; instead, they must be marked as public domain during the upload process.
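The term calculation in margin number 29 can be made concrete with a small Python sketch. It rests on one point that goes beyond the text above and is labelled as an assumption in the code: under § 69 UrhG the term is counted from the end of the calendar year in which the author died. The function names are hypothetical, and the sketch ignores special cases such as joint authorship, anonymous works or foreign law.

```python
import datetime
from typing import Optional

TERM_YEARS = 70  # term of protection after the author's death, § 64 UrhG


def public_domain_from(death_year: int) -> datetime.date:
    """First day on which a work is in the public domain under German law."""
    # Assumption (§ 69 UrhG): the term starts at the end of the calendar year
    # of death, so protection ends on 31 December of death_year + 70.
    return datetime.date(death_year + TERM_YEARS + 1, 1, 1)


def is_public_domain(death_year: int, on: Optional[datetime.date] = None) -> bool:
    """Return True if the term of protection has expired on the given date."""
    check_date = on or datetime.date.today()
    return check_date >= public_domain_from(death_year)
```

For an author who died in 1950, the sketch yields 1 January 2021 as the first day of public domain status; from that date on, the work must be marked as public domain on upload rather than placed under a licence.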
E. Artificial intelligence as a controversial topic
30 Many innovative digital technologies only unfold their added value in relation to large amounts of digital content, such as the Semantic Web and geodata services. However, the most influential innovation in recent years has been generative AI, i.e. IT systems such as ChatGPT and Midjourney, which use stochastic methods to generate content that, with reasonable effort, can no longer be distinguished from human-created texts, images, videos, music, etc. The upheavals that this "AI revolution" has caused since the end of 2022, both in the world of professional creatives and for creative amateurs, as well as for society as a whole, cannot be overlooked. And of course, Wikimedia projects are also affected by this in two ways: as collections of material for AI technologies and through AI-generated content that in turn flows back into them.
I. Wikimedia projects as collections of material for AI training
31 The content of Wikipedia, Wikimedia Commons, Wikidata and other smaller Wikimedia projects is human-curated, standardised and quality-assured digital material that can be accessed without technical barriers and, thanks to free licences, can be used extensively. It is therefore not surprising that the total text inventory of each Wikipedia language version is part of virtually every data set used to train language AIs in that language. The same applies to Wikimedia Commons and image AIs, and Wikidata was even created specifically as a kind of "machine-readable knowledge" to make it easier for technical systems to "understand" contexts of meaning.
32 Ever since the advent of well-known assistant products such as Amazon's Alexa, Apple's Siri and others, it has been clear that this mission of Wikidata has been successful. These assistants draw their "knowledge" from a mixture of various sources and, thanks to AI, are becoming increasingly better at evaluating uncurated, rather chaotic sources. However, when this technology was still in its infancy, it was largely thanks to specially prepared networked data collections such as Wikidata that functional products could be created.
33 This has always sparked debate within projects such as Wikipedia. This is because the open paradigm (cf. Introduction, margin note 6), which aims to enable technical innovations such as those mentioned above, almost automatically comes under criticism when the downsides of these innovations become apparent. Decades ago, there was already controversy over the fact that open-source software can also be used for weapons systems and is being used for this purpose. Something similar is now happening with AI.
34 On the one hand, this concerns the immediate effects of innovative technologies, i.e. currently the impact of the general availability of generative AI on professionals in the field of so-called applied arts and similar fields (graphic designers, copywriters, composers of background music, etc.). On the other hand, some are critical of the fact that new technologies typically generate a great deal of money, thanks in part to the voluntary work of the Wikimedia communities, without there being any direct return, such as donations.
35 However, it is precisely the idea behind open content that there should be no exchange of content for money. Nevertheless, at least in the CC approach and various other standard licensing systems, there should be some form of "consideration", namely attribution as a publicly visible acknowledgement. And this is precisely where innovative technologies often fall short. The first major debate on this issue arose around the map data of the OpenStreetMap (OSM) project, which was read and used by map service operators without the contributions of the OSM project and its contributors being visible. It was only the persistence of the OSM community that led to the inclusion of corresponding references in map services today.
36 The Wikimedia communities have similar difficulties. Only Wikidata, with its unconditional release under CC0, is not designed for attribution or "copyleft" from the outset. Wikipedia and Wikimedia Commons are, but so far no satisfactory way has been found to implement attribution in a meaningful way in reuse systems, neither in assistant products nor in the newer generative AI systems. Copyleft, i.e. the release of derivative content under the same licence, is also a problem with such systems, because it is not even legally clear whether or when they produce protectable content to which copyleft could apply. Interesting developments and discussions are therefore to be expected in this area.
II. AI content as material for Wikimedia projects
37 Conversely, much remains open, namely the question of whether and, if so, how AI-generated content may play a role as material that is incorporated into Wikimedia projects. In the background of Wikipedia maintenance, AI has been used very successfully for years to detect vandalism, i.e. minor edits to Wikipedia text made for fun or with destructive intent. But what about text creation, which has been exclusively manual for two decades: should AI also be allowed to be used for this? If so, to what extent? And who should be able to control this, and how?
38 The situation is similar at Wikimedia Commons. There, the strict rule has always been to use image material that depicts physical reality, because this is precisely in line with the encyclopaedic function of Wikipedia, for which the media archive was created. Nowadays, however, it is very easy to simply generate photorealistic images. Should such images be permitted, for example to depict an idealised historical architectural style for which no suitably idealised photograph exists? And in any case, should the various legal obstacles that the Wikimedia Commons community has to deal with on an ongoing basis (copyright, personal rights of those depicted, trademark law, etc.) be circumvented by means of AI-generated material?
39 The communities of the Wikimedia projects, OSM and other projects are already addressing these issues in great detail, mostly publicly visible on the internet.
Open Access Kommentar, Commentary on F. Wikipedia and Wikimedia is licensed under a Creative Commons Attribution 4.0 International License.