AIAgreement: clarify Forgejo project's current understanding wrt/ LLM/AI copyright #381

Merged
mfenniak merged 1 commit from mfenniak/forgejo-governance:aiagreement-copyright into main 2026-03-13 05:41:43 +01:00
Member

When the AI Agreement was authored, it was not clear whether there would be a way to satisfy point 2 of the agreement in the near future. This has led to some confusion over how a contributor can satisfy this point. This PR amends the point to represent the current state of affairs, as currently understood.

Adapted from @aahlenst's proposal in Forgejo Chat, but tweaked to indicate all generated content (not just code), and I've removed "We will revisit that restriction once the legal situation changes." as that feels irrelevant to communicating the policy clearly to contributors today.


Including @Gusted's comments from Forgejo Chat about this point for context:

> It was not written that a reviewer should make this effort, the contributor has to do this. The burden should not fall on the reviewer.
>
> When I wrote the wording, the legal situation was leaning more towards "this can't be copyrighted under any license because it's clearly trained on all sorts of copyrighted materials that you don't have knowledge of" and no major AI providers had a clear-cut answer on it (except for, this is okay for personal use).
>
> Today we have the EU doing something that has an AI framework that in terms of copyright is mostly concerned with training data and not with the output. And the US upholding(?) a decision that at least AI images can't be copyrighted. I was hopeful that https://osai-index.eu/ and Apertus (Swiss "promising" Open AI model) would provide clarity on it and that we would see some court ruling or a solid proof legal scholar tackle this, but that simply hasn't happened. A lot is happening, while almost nothing can be extracted to answer the simple question of whether you can license it such that it's compatible with GPL-3.0-or-later.
>
> The wording has aged quite badly. The gut feeling says: no, it cannot be licensed. But then if you have Anthropic say in their FAQ we claim no copyright over it! Does that mean you can license it? Are they simply moving the problem of figuring that out to you? Does it mean that they signed this AI Code of Practices by the EU, that the copyright was not violated under EU law (which supposedly is one of the strongest copyright laws)?
>
> It's a can of worms which I hoped would no longer be a can of worms by around now. It might even be more of a can of worms now, given there's been progress that may or may not be applicable to that simple question, and arguing about this (just like I do with this comment) makes me engage in lawyer larping.


I'm proposing this as a PR on the existing agreement as I believe this edit is consistent with the original intent of the community-accepted agreement but clarifying that we don't believe it's possible to satisfy this requirement today. If the community indicates disagreement and believes that this is a rewrite, not a clarification, then this change should be brought into the formal decision-making process rather than be a simple PR. I'm open to receiving that feedback.

viceice approved these changes 2026-03-07 21:56:27 +01:00
Dismissed
AIAgreement.md Outdated
@ -26,1 +25,4 @@
6. It is not allowed to use AI in an autonomous-looking way to contribute in Forgejo. This also applies when someone engages in 'vibe coding' or uses so-called 'agent mode'.
[^1]: Under EU law, it is unclear whether code generated by LLMs is copyrightable. Furthermore, it is almost impossible to ascertain whether code in a PR does not violate somebody else's copyright.
Member

The footnote is too narrow. Proposal: "Under EU law, it is unclear whether output of LLMs is copyrightable. Furthermore, it is almost impossible to ascertain whether output of an LLM does not violate somebody else's copyright."

aahlenst marked this conversation as resolved
mfenniak force-pushed aiagreement-copyright from bca060a1b1 to 6a3fa3cfae 2026-03-07 23:06:49 +01:00 Compare
mfenniak dismissed viceice's review 2026-03-07 23:06:49 +01:00
Reason: New commits pushed, approval review dismissed automatically according to repository settings

viceice approved these changes 2026-03-07 23:15:09 +01:00
Dismissed
Gusted approved these changes 2026-03-08 02:51:51 +01:00
Dismissed
0ko approved these changes 2026-03-08 10:47:24 +01:00
Dismissed
0xllx0 approved these changes 2026-03-08 15:56:41 +01:00
Dismissed
Beowulf approved these changes 2026-03-08 17:15:40 +01:00
Dismissed
Beowulf added the due date 2026-03-21 2026-03-08 17:16:08 +01:00
Beowulf modified the due date from 2026-03-21 to 2026-03-14 2026-03-08 17:17:43 +01:00
AIAgreement.md Outdated
@ -20,3 +20,3 @@
1. If content was made with the help of AI, you must convey that this is the case. This includes content that you authored but was motivated by a suggestion of AI.
2. If at any point you used AI's work in your contribution you should make an effort to verify that you can submit this under the license of the repository.
2. Forgejo does not accept works of authorship (code, documentation, etc.) generated by LLMs or so-called "AI" due to legal uncertainties. [^1]
Member

I don't have any remarks regarding the meaning of the new text (which I think is clear), but I do have some remarks regarding consistency:

  • The Terminology section above states that "Software and services that heavily rely on large language model technology to generate their outcomes are referred to as Artificial Intelligence (AI)" (also providing some examples). Afterward, only the term AI was used.
    With the new text (including the footnote, where LLM is used twice), it feels to me that a shift is being made towards preferring the term LLM over AI.
  • In point 6, single quotes (primary UK style, according to Wikipedia) are used, whereas here, in the new version of point 2, double quotes are used (primary US/CA style).

I don't have a preference for one term or the other, but I think consistency would be nice. At the same time (especially taking into account that the changes were already approved by 4 mergers and 1 contributor), maybe my remarks should be ignored at this point.

Member

@floss4good wrote in #381/files (comment):

> In point 6, single quotes (primary UK style, according to Wikipedia) are used, whereas here, in the new version of point 2, double quotes are used (primary US/CA style).

I think in general, US English is more often used in the Forgejo space. So I think we should stick to the double quotes 🤔

@floss4good wrote in #381/files (comment):

> but I think consistency would be nice

Agreed

@floss4good wrote in #381/files (comment):

> At the same time (especially taking into account that the changes were already approved by 4 mergers and 1 contributor), maybe my remarks should be ignored at this point.

Constructive voices are always good and welcome - I just hadn't noticed it when I read it. The input is valuable in my opinion 👍

Member
Some off-topic remarks regarding UK vs. US English (expand if you are interested)

@Beowulf wrote in #381 (comment):

> I think in general, US English is more often used in the Forgejo space. So I think we should stick to the double quotes 🤔

Since this is not the first time I have had remarks or questions regarding American English vs. British English, please check the following excerpts (if there is interest in such a topic, a separate discussion about a convention/standard should probably be started):

  • @caesar wrote in https://codeberg.org/forgejo/website/pulls/482#issuecomment-2261747:

    > > Do you have any agreement / convention concerning the English dialect (i.e. American English or British English) to be used for these news posts?
    >
    > British English was originally used everywhere (mostly because there are more European than American contributors), but lately there's been more of a mix. I don't think there's an agreement.

  • When I wrote the following remarks concerning behavior vs behaviour:

    > Although it's a little bit off-topic, I am also wondering what variant should be used – the American or the British one?
    > If I'm not mistaken, at some point I asked a similar question (however, I couldn't find that discussion) and I was told that since most of the involved people are from Europe, British English is the first option. However, searching through the blog posts and the docs I can find both variants but the US one is more frequent than the UK one (behaviour). And within the company where I used to work (from EU with customers mainly from EU) American English was preferred (being considered the business English).

@earl-warren wrote in forgejo/discussions#337 (comment):

> Not being a native English speaker, I do not have an opinion on the American vs British matter

Author
Member

I've updated the revision to just use the term AI, consistent with the rest of the bullet-points. 👍 6a3fa3cfae..968845daf6

floss4good marked this conversation as resolved
mfenniak force-pushed aiagreement-copyright from 6a3fa3cfae to 968845daf6 2026-03-09 20:21:39 +01:00 Compare
mfenniak dismissed viceice's review 2026-03-09 20:21:40 +01:00
Reason: New commits pushed, approval review dismissed automatically according to repository settings

mfenniak dismissed Gusted's review 2026-03-09 20:21:42 +01:00
Reason: New commits pushed, approval review dismissed automatically according to repository settings

mfenniak dismissed 0ko's review 2026-03-09 20:21:44 +01:00
Reason: New commits pushed, approval review dismissed automatically according to repository settings

mfenniak dismissed 0xllx0's review 2026-03-09 20:21:47 +01:00
Reason: New commits pushed, approval review dismissed automatically according to repository settings

mfenniak dismissed Beowulf's review 2026-03-09 20:21:49 +01:00
Reason: New commits pushed, approval review dismissed automatically according to repository settings

AIAgreement.md Outdated
@ -20,3 +20,3 @@
1. If content was made with the help of AI, you must convey that this is the case. This includes content that you authored but was motivated by a suggestion of AI.
2. If at any point you used AI's work in your contribution you should make an effort to verify that you can submit this under the license of the repository.
2. Forgejo does not accept works of authorship (code, documentation, etc.) generated AI due to legal uncertainties. [^1]
Member

generated by AI?

Although, that could be interpreted as "It's fine as long as parts are made by a human." What about: "Forgejo does not accept works of authorship (code, documentation, etc.) that are either partially or completely generated by AI due to legal uncertainties."?

Author
Member

Whoops, yes, lost the word "by" in that edit. Re-added it.

I'm not sure what interpretation would lead one to think "It's fine as long as parts are made by a human", but I don't mind this being explicit. 👍

aahlenst marked this conversation as resolved
mfenniak force-pushed aiagreement-copyright from 968845daf6 to a6ecb70bb7 2026-03-09 20:39:14 +01:00 Compare
mfenniak force-pushed aiagreement-copyright from a6ecb70bb7 to 57bf0779be 2026-03-09 20:41:38 +01:00 Compare
Beowulf approved these changes 2026-03-09 20:45:00 +01:00
viceice approved these changes 2026-03-09 20:57:55 +01:00
floss4good approved these changes 2026-03-09 21:43:56 +01:00
Gusted approved these changes 2026-03-09 22:18:04 +01:00
Owner

So does it also mean I can't use AI for basic auto complete? It usually suggests the code I would write anyway, and I only use it if it matches what I was currently writing anyway. I've fully disabled AI for the Forgejo repo 🤔

Author
Member

@viceice wrote in #381 (comment):

> So does it also mean I can't use AI for basic auto complete? It usually suggests the code I would write anyway, and I only use it if it matches what I was currently writing anyway. I've fully disabled AI for the Forgejo repo 🤔

As written, I would interpret it as being prohibited if the auto complete is coming from an LLM, as that's the definition of AI given. I'd say that's where the copyright risk comes in, in terms of "what was this trained on" and "how much plagiarizing is it doing?"

Things that I might call "basic auto complete" that would fall outside of this definition and therefore be completely fine would be:

  • Context-aware code completion -- `os.Open<Tab>` -> `os.OpenFile`, coming from a prefix-search or fuzzy-search with a language server or IDE integration.
  • Writing `for ` and having the editor pop up a programmed snippet of a for loop in a Go program.
  • Anything that might be a fuzzy-match on code being edited in the repo -- I don't have a clear idea of whether IDEs still do this, but I recall some of them would pattern match what you're doing now against other files in the same repo, and try to autocomplete.

Probably a reasonable approximation is: is it happening locally on your machine? (... and you didn't download a multi-GB LLM from Hugging Face trained on copyrighted work? 🤣)
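To make the first bullet concrete, here is a minimal sketch of prefix-based completion that involves no LLM at all. It is only an illustration: `completePrefix` and the hard-coded symbol list are hypothetical, and real language servers and IDEs build and query their own symbol indexes in ways not described in this thread.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// completePrefix is a hypothetical helper: it returns every known symbol
// that starts with the typed prefix, in sorted order. The result depends
// only on the symbol list passed in, not on any trained model.
func completePrefix(symbols []string, prefix string) []string {
	var matches []string
	for _, s := range symbols {
		if strings.HasPrefix(s, prefix) {
			matches = append(matches, s)
		}
	}
	sort.Strings(matches)
	return matches
}

func main() {
	// Assumed symbol list; an editor would collect this from the
	// packages and files that are actually in scope.
	symbols := []string{"os.OpenFile", "os.ReadFile", "os.Open", "os.Create"}

	// Typing "os.Open" and pressing <Tab> would offer these candidates.
	for _, m := range completePrefix(symbols, "os.Open") {
		fmt.Println(m)
	}
}
```

The point of the sketch is that every suggestion can be traced back to a symbol that already exists in the indexed code, which is what separates this kind of completion from output generated by an LLM.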

mfenniak deleted branch aiagreement-copyright 2026-03-13 05:41:44 +01:00
8 participants
Reference
forgejo/governance!381