Proposal for searching videos by captions in Sepia Search

chlin501 · November 17, 2025, 8:47

After searching this category, I could not find any contribution guide other than the PeerTube Contributing Guide, and that document mostly explains what is required for things like translation and development. So here is a high-level proposal for a new Sepia Search feature. If I should follow some formal steps or a guide instead, please let me know. Please also let me know if there are any issues with the proposal. Many thanks.

# Proposal

This proposal aims to give Sepia Search the ability to search videos by their captions, in addition to the current options of searching videos, playlists, and channels.

## The problem

The same video uploaded to different PeerTube instances may have its title renamed or some of its content edited. This makes it harder for a Sepia Search user to find the video they are after when all they remember is part of the conversation in it.

## A solution

This can be somewhat mitigated by allowing the user to search with keywords or partial sentences from the conversation in the videos.
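As a rough illustration of what matching a partial sentence against captions could look like, here is a minimal Python sketch. It assumes the captions are available as WebVTT text. The names `Cue`, `parse_vtt`, and `search_cues` are hypothetical and not part of PeerTube or Sepia Search, and a real deployment would rely on the search engine's own matching rather than this simple word check.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    start: str  # cue start timestamp, kept as the raw WebVTT string
    end: str    # cue end timestamp
    text: str   # caption text shown during the cue


# A WebVTT timing line looks like "00:00:01.000 --> 00:00:04.000".
TIMING_LINE = re.compile(r"^(\S+)\s+-->\s+(\S+)")


def parse_vtt(vtt_text: str) -> list[Cue]:
    """Extract cues from WebVTT text; blocks without a timing line are skipped."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.strip().splitlines()
        for index, line in enumerate(lines):
            match = TIMING_LINE.match(line)
            if match:
                text = " ".join(part.strip() for part in lines[index + 1:])
                if text:
                    cues.append(Cue(match.group(1), match.group(2), text))
                break
    return cues


def normalize(text: str) -> list[str]:
    """Lowercase the text, drop punctuation, and split it into words."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()


def search_cues(cues: list[Cue], query: str) -> list[Cue]:
    """Return the cues that contain every word of the query."""
    words = normalize(query)
    if not words:
        return []
    return [cue for cue in cues if all(word in normalize(cue.text) for word in words)]


SAMPLE_VTT = """WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to the federation, everyone.

00:00:04.500 --> 00:00:08.000
Today we talk about decentralized video hosting.
"""

cues = parse_vtt(SAMPLE_VTT)
hits = search_cues(cues, "Decentralized video")
```

With the sample above, the partial phrase matches only the second cue, so the user could be pointed to the video even if its title was changed on another instance.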

## PROs

* Sepia Search usability

* Attracting more users

## CONs

* Architecture complexity

* Integration complexity

* Search results that are not 100% accurate

## Issues

Letting Sepia Search users find videos through captions requires machine learning models to extract information from the videos. An additional component or process, apart from the search-index, is therefore necessary, which adds complexity. Sepia Search should provide an option to enable or disable this feature, so that administrators can control the service as they see fit.

Additionally, the component or process that uses a machine learning model to extract video information may be written in a programming language other than JavaScript, which introduces some integration complexity. This includes:

1. management integration

2. search integration

3. security integration

For the first problem, RESTful APIs can provide a uniform access interface to the client side. For the second, Sepia Search keeps interacting only with Meilisearch, which avoids the overhead of additional communication links. The third problem may need to be discussed with the PeerTube developer team in order to minimize the overhead of maintaining multiple security mechanisms, though several options, such as JWT, are available.
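To make the second point a bit more concrete, here is a minimal Python sketch of how the extraction component could prepare caption text as a JSON document for the search index, so that Sepia Search itself only ever queries Meilisearch. The function name `build_caption_document` and the fields `id`, `captionLanguage`, and `captionText` are assumptions made for this sketch and are not existing Sepia Search fields. How the payload is actually sent to the index is left open.

```python
import json


def build_caption_document(video_id: str, language: str, cue_texts: list[str]) -> dict:
    """Flatten caption cues into one searchable text field for a video document.

    The field names used here are hypothetical, chosen only for illustration.
    """
    # Collapse repeated whitespace inside each cue, then drop empty cues.
    cleaned = [" ".join(text.split()) for text in cue_texts]
    caption_text = " ".join(text for text in cleaned if text)
    return {
        "id": video_id,
        "captionLanguage": language,
        "captionText": caption_text,
    }


document = build_caption_document(
    "video-123",
    "en",
    ["Welcome to the federation,  everyone.", "", "Today we talk about video hosting."],
)

# Serialized as a JSON array of documents, ready to hand to the index.
payload = json.dumps([document])
```

The design choice in this sketch is that the extraction component only writes to the index, while Sepia Search only reads from it, so no direct link between the two would be needed at search time.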

Although machine learning models let Sepia Search search videos by their captions, an obvious problem is that a model may not recognize the conversation 100% correctly, which makes the extracted information imprecise. Fortunately, amending the extracted information used by Sepia Search can alleviate this problem. The amending process may require human intervention initially, but that could also be improved in the future.
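The amending step could be sketched as follows, again in Python and under stated assumptions: extracted captions are keyed by their cue start time, and human corrections are stored separately and applied on top before indexing. The function `apply_corrections` and the data layout are hypothetical and only meant to show the idea.

```python
def apply_corrections(extracted: dict[str, str], corrections: dict[str, str]) -> dict[str, str]:
    """Return the extracted captions with human corrections applied on top.

    Both dicts map a cue start time to its text. A correction for a cue that
    was never extracted is treated as an error rather than silently added.
    """
    unknown = sorted(set(corrections) - set(extracted))
    if unknown:
        raise KeyError(f"corrections for unknown cues: {unknown}")
    # Corrections win over the machine output; the inputs are left unchanged.
    return {**extracted, **corrections}


# Example machine output with one misrecognized phrase.
extracted = {
    "00:00:01.000": "Welcome to the federation, everyone.",
    "00:00:04.500": "Today we talk about the centralized video hosting.",
}
# Example human correction for the second cue.
corrections = {
    "00:00:04.500": "Today we talk about decentralized video hosting.",
}

amended = apply_corrections(extracted, corrections)
```

Keeping corrections separate from the machine output means the model could be rerun later without losing the human edits.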