Object storage migration fails if a single DNS timeout occurs

I’ve been migrating from local storage to Backblaze B2. A couple of videos have ended up in the "failed move" state so far, which requires some manual PostgreSQL surgery to recover from.

The logs showed a transient DNS error each time, and the job did not retry. One of the errors is pasted below. In one case, recovery required restoring files and database rows from backup. In another, it was enough to change the video's "state" column back to 1 and try again. I haven’t seen the problem since adding the object storage hostname to /etc/hosts.

(removed TLD from URLs in log to avoid forum limit)

May 27 18:55:33 tuber peertube[4168586]: [diode:443] 2024-05-27 18:55:33.810 error: Cannot execute job 6 in queue move-to-object-storage. {
May 27 18:55:33 tuber peertube[4168586]:   "payload": {
May 27 18:55:33 tuber peertube[4168586]:     "videoUUID": "8c47ed3b-757e-42ad-95a8-d83fde664d0f",
May 27 18:55:33 tuber peertube[4168586]:     "isNewVideo": false,
May 27 18:55:33 tuber peertube[4168586]:     "previousVideoState": 1
May 27 18:55:33 tuber peertube[4168586]:   },
May 27 18:55:33 tuber peertube[4168586]:   "err": {
May 27 18:55:33 tuber peertube[4168586]:     "stack": "Error: getaddrinfo EAI_AGAIN video-diode-zone.s3.us-west-001.backblazeb2\n at GetAddrInfoReqWrap.onlookupall [as oncomplete] (node:dns:120:26)",
May 27 18:55:33 tuber peertube[4168586]:     "message": "getaddrinfo EAI_AGAIN video-diode-zone.s3.us-west-001.backblazeb2",
May 27 18:55:33 tuber peertube[4168586]:     "errno": -3001,
May 27 18:55:33 tuber peertube[4168586]:     "code": "EAI_AGAIN",
May 27 18:55:33 tuber peertube[4168586]:     "syscall": "getaddrinfo",
May 27 18:55:33 tuber peertube[4168586]:     "hostname": "video-diode-zone.s3.us-west-001.backblazeb2.com",
May 27 18:55:33 tuber peertube[4168586]:     "$metadata": {
May 27 18:55:33 tuber peertube[4168586]:       "attempts": 1,
May 27 18:55:33 tuber peertube[4168586]:       "totalRetryDelay": 0
May 27 18:55:33 tuber peertube[4168586]:     }
May 27 18:55:33 tuber peertube[4168586]:   }
May 27 18:55:33 tuber peertube[4168586]: }

Hi,

The pull request "feat: config option object_storage.max_attempts" by kontrollanten (Pull Request #6418, Chocobozzz/PeerTube, on GitHub) may fix this issue, though I’m not sure I understand why there is a DNS issue in the first place.

That pull request does look like a big improvement!

Re the DNS error: I haven’t conclusively identified the root cause, but I think it’s just a UDP timeout lining up with the local cache TTL expiring. The resolver is a systemd-resolved instance backed by Google DNS. The return code (EAI_AGAIN) indicates the caller should try again, but it’s being treated as a fatal error.
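
To illustrate what "try again" could look like on the caller's side, here is a minimal sketch of a retry wrapper that treats EAI_AGAIN as transient. This is not PeerTube's code or the approach taken in the pull request; `withRetry`, `isTransient`, `maxAttempts` and `baseDelayMs` are hypothetical names chosen for the example, and the backoff values are arbitrary.

```typescript
// Hypothetical helper, for illustration only: retry an async operation when it
// fails with a transient DNS error (EAI_AGAIN), using exponential backoff.
const TRANSIENT_CODES = new Set(["EAI_AGAIN"]);

function isTransient(err: unknown): boolean {
  // Node.js system errors carry a string `code` property such as "EAI_AGAIN".
  const code = (err as { code?: unknown } | null | undefined)?.code;
  return typeof code === "string" && TRANSIENT_CODES.has(code);
}

async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Rethrow non-transient errors, or give up once attempts are exhausted.
      if (!isTransient(err) || attempt >= maxAttempts) throw err;
      // Wait 200 ms, 400 ms, 800 ms, ... (with the defaults) before retrying.
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

With something like this around the upload call, a single failed lookup would cost a short delay instead of leaving the video in the "failed move" state, while any other error would still fail immediately.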
