The problem of the long tail

henke37 · April 5, 2018, 10:55

I’ve been thinking about the PeerTube architecture. It scales well when one video gets very popular. But there is a flip side.

What about when you have many videos that aren’t very popular, yet people still watch them? Maybe not a lot of people for any one video, but a lot of people spread across a lot of different videos.

This scenario is what I call the “long tail”: lots of data points that are small on their own, but that all add up to a lot.

The question then is: what can be done about this?

Chocobozzz · April 6, 2018, 8:16

Hi,

Yes, you’re right. https://github.com/Chocobozzz/PeerTube/issues/123 should resolve this issue (or at least help).

henke37 · April 6, 2018, 3:12

I think that’s a good start, but it doesn’t scale. Let me be generous and say that the network currently has 200 instances. Yet, there are easily a few thousand videos. That’s what, 50 videos per instance on average? Seems reasonable. Thing is, the video count will grow way faster than the instance count.
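
To make the scaling worry concrete, here is a rough back-of-the-envelope sketch. Only the 200 instances and 50 videos per instance come from the estimate above; the yearly growth multipliers are made-up assumptions for illustration, not measurements of the network.

```python
def videos_per_instance(instances, videos, instance_growth, video_growth, years):
    """Project the average number of videos per instance after some years.

    The growth arguments are yearly multipliers (hypothetical values).
    """
    for _ in range(years):
        instances *= instance_growth
        videos *= video_growth
    return videos / instances


# Starting point from the estimate above: 200 instances, 50 videos each.
start_instances = 200
start_videos = start_instances * 50

# Assumed for illustration only: instances grow 1.5x a year, videos 3x a year.
for year in range(4):
    load = videos_per_instance(start_instances, start_videos, 1.5, 3.0, year)
    print(f"year {year}: ~{load:.0f} videos per instance")
```

Whatever the real rates are, as long as videos grow faster than instances, the average load per instance keeps climbing.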

No, what’s needed is for normal users to easily assist with the task, even when they aren’t actively watching a video.

If you look at traditional torrent systems, there is an answer to the question of “why”: offer something in return to users who help host the content. Pride and status are always cheap. But increased upload sizes for users, at the discretion of each instance, is an obvious answer. Tit for tat, so to speak.

Of course, this isn’t very compatible with web browsers and the volatile world of a browser sandbox, and running in the browser was a USP of the PeerTube system.

rigelk · April 7, 2018, 9:27

What you are looking for is a ratio system (or at least what I’ve experienced of it in many private and semi-private torrent trackers/communities). They provide an elegant solution in the torrent realm, where you trade the ability to download against your capacity to share enough, or for long enough.
However, we can’t restrict the ability to download without losing a fair share of users, so this tit for tat effectively needs to rely more on social status. That is a weaker incentive, though, and whether it is enough to appeal to enough users is arguable. Ideally, we should find more reward mechanisms.
In the meantime, we could store the volume of content uploaded and how long a given user has seeded a given piece of content.
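
A minimal sketch of what that bookkeeping could look like. The names `SeedingRecord`, `SeedingLedger`, `report` and `totals_for_user` are hypothetical, not part of PeerTube, and how the numbers would be collected or verified is left open.

```python
from dataclasses import dataclass


@dataclass
class SeedingRecord:
    """Seeding stats for one (user, video) pair (hypothetical schema)."""
    bytes_uploaded: int = 0
    seconds_seeded: float = 0.0


class SeedingLedger:
    """In-memory ledger; a real instance would persist this in its database."""

    def __init__(self):
        self._records = {}  # (user_id, video_id) -> SeedingRecord

    def report(self, user_id, video_id, bytes_uploaded, seconds_seeded):
        """Add one seeding session to the running totals."""
        record = self._records.setdefault((user_id, video_id), SeedingRecord())
        record.bytes_uploaded += bytes_uploaded
        record.seconds_seeded += seconds_seeded

    def totals_for_user(self, user_id):
        """Return (total bytes uploaded, total seconds seeded) across videos."""
        mine = [r for (uid, _), r in self._records.items() if uid == user_id]
        return (
            sum(r.bytes_uploaded for r in mine),
            sum(r.seconds_seeded for r in mine),
        )


ledger = SeedingLedger()
ledger.report("alice", "video-1", bytes_uploaded=5_000_000, seconds_seeded=600)
ledger.report("alice", "video-2", bytes_uploaded=1_000_000, seconds_seeded=120)
print(ledger.totals_for_user("alice"))
```

Totals like these could feed a status badge or a per-instance upload quota, whichever reward an instance decides to offer.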

Web browsers are still second-class citizens when it comes to seeding. Persistent seeding across views could help, but it will never come close to what a headless seedbox can achieve.