[Federated Learning] Privacy-preserving recommender system for Peertube?

Hi folks!
I am Marc, a PhD student working on privacy-preserving machine learning (with a slight focus on federated learning). Being a Fediverse enthusiast, I am looking for applications of my research in the Fediverse. My goal is to bring ML to the Fediverse. Don’t get me wrong: I don’t want to force ML features onto the Fediverse, only to cover features that are actually needed (if any). In this context, Peertube might be a good fit if you want to build a privacy-preserving recommender system: such a system would be trained collaboratively by all instances (i.e., all instances that accept the recommendation extension). No private data would be revealed (each instance keeps its data locally), and each instance could even have a slightly personalized model (depending on the particular interests of its community).

This might already be too much detail, since my first question for you is: « Would the Peertube project/developers be interested in such an automated recommender system? » I saw a relatively old open issue about recommendations on GitHub, but I would like to know whether informal discussions have happened since then. Maybe you even have some developers working on it?

NB: I first posted this message on Matrix and later discovered this forum, which is clearly a better place to discuss such a complex feature.


Hi!

What kind of privacy-preserving recommendations would this feature give? Is it meant for users or for admins?

Hi,
This can be defined depending on your needs. The first step would be private training: the recommendation model is trained collaboratively by the instances, each keeping its data private. They only exchange machine-learning model updates in order to converge towards a common optimal model.
Once we have a good model trained over the whole federation’s data, we can do « secure inference », where the user sends an encrypted query and receives encrypted recommendations (decrypted locally). However, this raises even more technical issues (still very interesting ones).
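To make the training step more concrete, here is a minimal sketch of federated averaging (the classic FedAvg scheme) across a few instances. The toy linear model, the `local_update` helper and the data shapes are illustrative assumptions, not an actual Peertube API; the point is only that raw interaction data stays on each instance and only model weights are exchanged.

```python
# Minimal FedAvg sketch: hypothetical helpers, not a Peertube API.
import numpy as np

def local_update(weights, interactions, lr=0.1):
    """One local training pass on an instance's private (features, rating) data.
    The raw interactions never leave the instance."""
    w = weights.copy()
    for x, y in interactions:
        pred = w @ x                      # toy linear recommender score
        w -= lr * (pred - y) * x          # gradient step on squared error
    return w

def federated_round(global_weights, instances):
    """Each instance trains locally; only the resulting weights are shared and averaged."""
    updates = [local_update(global_weights, data) for data in instances]
    return np.mean(updates, axis=0)       # the aggregator never sees raw data

# Toy example: three instances, each holding private (feature vector, rating) pairs.
rng = np.random.default_rng(0)
instances = [
    [(rng.normal(size=4), rng.uniform(0, 5)) for _ in range(20)]
    for _ in range(3)
]
weights = np.zeros(4)
for _ in range(10):                       # a few federation rounds
    weights = federated_round(weights, instances)
print(weights)
```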

I would suggest starting with the following assumption: « Each user trusts their Peertube instance (i.e., its admins) to store and process their preferences. » However, the user has no trust in the rest of the instances, or in any other user in the world. In this case, we would do the private training, but the recommendation itself would be done in plaintext (between the user and their instance). Each instance would know the preferences of its own users.
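Under that trust assumption, the last step is straightforward: after the federated rounds, each instance can fine-tune the shared model on its own locally stored preferences and score recommendations in plaintext for its users. A tiny continuation of the sketch above, reusing the hypothetical `local_update` helper:

```python
# Continuation of the previous sketch (hypothetical names, not a Peertube API):
# the instance fine-tunes the shared model on its local data, then scores a user
# profile in plaintext. Neither the profile nor the scores leave the instance.
personalized = local_update(weights, instances[0])   # instance-level personalization
user_profile = np.array([1.0, 0.0, 0.5, 0.2])        # preferences stored by the user's own instance
score = personalized @ user_profile                  # plaintext recommendation score
print(score)
```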

I hope this clarifies my idea a bit; otherwise, I can give more details. BTW, this is just one possible setup. My goal is to discuss your expectations in terms of recommendations and privacy so I can design a system that satisfies exactly your needs. We can then discuss: what data is used? Where is it stored? Who is allowed to access it? Who agrees to contribute to the training computations? Etc.