PeerTube and generative AI

I tried to send this via the support email but the form kept throwing an error whenever I tried to hit send, so if this reads more like a direct email rather than a general forum post, that’s why. I hope the people at Framasoft will see it here.

Hi, I am emailing you today to ask about your company’s relationship with generative AI.
Due to PeerTube’s usage of Whisper for automatic subtitles, and the lack of mention of generative AI on your website, I am concerned about where you stand on this issue. Especially since I just recently started making monthly donations to your company in an attempt to further your efforts for a truly free internet based on open-source software.
It’s very well known at this point that generative AI (which doesn’t actually generate anything new) is horrifically harmful to the environment, as well as the careers of countless artists, writers, and general creatives. Assuming this is the main demographic of people that would make use of a service like PeerTube, you can understand my concern over Whisper’s use in it. I will fully admit, I don’t know how the software works, as I am not a programmer. It could be entirely ethical, but considering it’s made by the same people that run ChatGPT, I’m not very convinced of this.

I left Google and by extension YouTube in large part due to their use of generative AI. It would be monumentally disappointing to find that the best alternative to big tech’s monopolistic and outright fascistic practices was still making use of those same tools that threaten all of our well-being, as well as the planet we live on.
I am requesting confirmation that Framasoft will never use generative AI internally, or to publicly market their services, and will take a firm stance against theft of other people’s creations.

I look forward to hearing back from someone on the team soon. Also, if I have clearly missed something, am uninformed, or Framasoft’s stance on generative AI is very clear and out in the open, please correct and inform me.

Hello, thank you for following up. I can’t explain what might have happened with your message on our support system, as we continue to receive messages regularly via our support tool.

Please note that Framasoft is not a company, but an association (a really small non-profit organisation). But that’s a bit off topic here.

Yes, we are well aware of all the issues surrounding generative AI, particularly AI chatbots.

We have several projects aimed at raising awareness of these issues (the following list is not even exhaustive):

In short, no, we do not like AI. At all.

So let’s keep things in perspective: Whisper is not ChatGPT, even if it’s built by the same toxic company. Putting them on the same level is like comparing a kilo of carrots to a kilo of meat, just because they’re both food.

I’m not saying that Whisper is an ‘ethical’ tool (I mention this in the conference above, NO generative AI can be ethical, whether free or not). As Whisper’s training data is not available in its entirety, there is no doubt that this AI reproduces biases (gender, ethnicity, social class, etc.).

However, the energy required both for training the model and, even more so, for inference is not comparable to what a conversational generative AI such as ChatGPT requires.

In short, yes, transcription requires energy, and no, these models are not ‘ethical’. Even models such as VOSK Offline Speech Recognition API - which is virtually no longer actively maintained and significantly less effective than Whisper - consume resources. Note that transcription models are evolving, and evolving quickly. Mistral just announced its model, Voxtral [Voxtral | Mistral AI], a few hours ago. Nothing (except time/money) would prevent PeerTube from using other models in the future.

You will not receive this confirmation, as we (Framasoft) do not prohibit the use of specific AI tools, for example for transcription (as here) or for translation.

It is up to you to trust us (or not, which is perfectly OK) to know what we are doing and to be well informed before making our choices, whether political or technical.

We have said it, written it, printed it, and repeated it: we would prefer a world without generative AI. We consider ourselves informed about the various types of problems posed by this technology. From our privileged technical position, we could easily boycott any use of these tools.

But we are a community organisation dedicated to educating people about digital issues. Our mission is not to judge or condemn people who use this or that technology. Typically, we are ‘anti-Google’ both personally and collectively, but that does not mean we criticise people who use Google as a search engine or Facebook as a social network. Our job is to present and propose alternatives, not to mount advocacy campaigns.

Today, we do not encourage anyone to use transcription tools. I understand perfectly well that by giving the option to people with the skills and means to enable transcription with Whisper in PeerTube, you may think that we are facilitating the development of the use of these technologies. But in reality, it is a balancing act between not imposing how users should use the software (we are not God) and evaluating the impact of each new PeerTube feature (or any other software, for that matter).

There is no doubt that this position will disappoint you. I am (sincerely) sorry about that.

Once again: you can choose to trust us (in the same way that we ask you to trust us when we tell you that we do not monetise your data), or not.

If you do not or no longer trust us, you can of course suspend your donation to Framasoft and stop using our services.

Our position on generative AI, like all our positions, is not definitive. However, I can attempt an imperfect summary (what follows is my attempt to summarise a complex position; I know it would be simpler to have a black-and-white position, but we choose to add a little nuance to this binary position).

We fully recognise the threats posed by generative AI, particularly conversational AI. We have been documenting, informing and actively working on this subject for several months/years (see bullet point list above), and, without ambiguity, we would have preferred a world without GAI.
However, we do not work for ourselves as individuals. We are at the service of the wider public, including those who do not always share our ideas, convictions or practices.
The projects we produce do not seek to make us heralds or heroes of free software or privacy protection, but to offer tools (software or intellectual) that enable people to ‘better’ appropriate technologies while emancipating themselves as much as possible, and therefore often imperfectly, from the digital giants.

However, we would like to emphasise that we do not wish to use conversational AI (AI chatbots) in any context other than that described in our mission statement, namely to provide popular education on digital issues. Framasoft’s goal is to enable a world that we believe to be fairer, not to make the current situation worse.

(For information and transparency, this message has been translated using DeepL, a generative AI system that translates text for people who, like me, are not fluent in languages other than their native language.)


Thank you so much for the detailed response; it means a great deal to me. I don’t have much else to add, since you answered pretty much all of my questions.

I appreciate how well thought through your positions are, and apologize for not being aware of the AI-opposed projects you’re involved in. It gives me great peace of mind.
I also appreciate that these tools can have genuine uses for transcription and translation, as you explained. While I personally struggle to find where I stand on it, I absolutely see the nuance and difference between that and art theft, and even if some adjacent tech can be used in your services, I take comfort in your stances on AI as a whole.

The only clarification I could possibly need is in regards to my request for confirmation that generative AI wouldn’t be used. While I intended to and did ask the question generally, I should have been more specific as well.
My main worry was that Framasoft would, at some point in the future, use AI “generated” audio/visual material to market itself or its services. That would be where I’d take the most issue, not to mention that the artwork you have on your website is absolutely beautiful, and the artists you commission incredibly talented.

Other than that, do you think it would be worth reiterating the points you’ve made in a section of joinpeertube.org’s FAQ at some point? It would be nice for me to be able to point it out to friends if I find myself recommending your services.

Thanks once again for the response, and thank you for being a positive force of change in a capitalistic world.

Hello,

The only clarification I could possibly need is in regards to my request for confirmation that generative AI wouldn’t be used. While I intended to and did ask the question generally, I should have been more specific as well.
My main worry was that Framasoft would, at some point in the future, use AI “generated” audio/visual material to market itself or its services. That would be where I’d take the most issue, not to mention that the artwork you have on your website is absolutely beautiful, and the artists you commission incredibly talented.

We’re really happy to work with David Revoy for our illustrations, and we see no point in using generative AI for our visual identity!

If we ever had to use generative AI, it would only be for educational purposes (in line with, to say it again, our mission to raise awareness about digital issues).

Other than that, do you think it would be worth reiterating the points you’ve made in a section of joinpeertube.org’s FAQ at some point? It would be nice for me to be able to point it out to friends if I find myself recommending your services.

We’ll discuss this point internally! Thanks for your suggestion!


I feel the need to clarify for anyone stumbling upon this in the future: I don’t really agree with my original reply anymore. I’ve since become more sensitive to the broader harms of AI, outside of “just” being an insult to any and all artists. As much as I loved using PeerTube, I can’t continue to unless Framasoft outright bans LLMs from being used during the development of their software, as well as removing any and all AI features such as Whisper. Since they don’t seem keen on doing that, I guess I’ll just have to wait for a fork. I suggest you do the same or take up the task yourself. I’m no developer, but I’ll be first in line to donate to an AI-free PeerTube.

Sorry to those at Framasoft. You’ve made amazing things and I want to support them, but this is a big elephant in the room that can’t be ignored.

I understand the general skepticism toward LLMs.
The world is complex and is becoming increasingly so. I’m not claiming to have all the answers and I’m not saying you’re wrong.

So much content is available only as video. How can people with hearing impairments watch videos?

Subtitles can be part of the solution. So how do videos get subtitles?

Not every video is produced based on a script that’s written beforehand and then turned into a video.
I’m thinking of panel discussions, interviews, conversations and other ad hoc formats where the video comes first. Does someone sit down later to transcribe the video?

I recommend giving it a try; it would be an interesting experience.

A rejection of LLMs would, in effect, intentionally exclude people.

LLMs can also be helpful to people. With them, people who rely on subtitles can watch videos outside of YouTube, because those videos have subtitles too.
Integrating an LLM into PeerTube can help generate subtitles. However, due to the high number of errors, manual review and editing are highly recommended anyway.

Subtitles also make it possible to offer videos in languages other than the one spoken in the video. This allows videos to serve as a bridge between people in different countries and languages. Isn’t that great?

So there are some positive aspects as well.

And then there’s the matter of compliance with the law. In my country, there’s a law requiring that digital content and videos on websites be accessible to everyone. In this way, automatic (and manually corrected) captions can help ensure compliance with legal obligations.

That’s why I think Framasoft is doing the right thing. The features are available in PeerTube, and it’s up to each admin to decide.