Conversation:
Notices
-
@takeshitakenji Because the type of exception it's raising is a ClientError, which defaults to 400 if it isn't passed any information (and it doesn't look like it was, given the complete lack of a message or anything else).
You'd have to strace it or something and see what's raising it. Once I know, I can try to take a look.
-
@takeshitakenji There are a few things, in ServerException subtypes, that will give a stock 500 because it's the most appropriate error, but to my recollection 400 is only the default for ClientError.
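To make the defaulting behaviour concrete, here is a rough Python sketch. It is not the actual pA/GS code; the class names simply mirror the ones mentioned above, and the constructor signature is an assumption made for illustration.

```python
# Illustrative stand-ins only: the names mirror the exception types
# discussed in the thread, but the signatures here are assumed.

class ServerException(Exception):
    default_code = 500  # stock "internal server error"

    def __init__(self, message="", code=None):
        super().__init__(message)
        # Fall back to the class default when the caller gives no code.
        self.code = self.default_code if code is None else code


class ClientError(Exception):
    default_code = 400  # stock "bad request"

    def __init__(self, message="", code=None):
        super().__init__(message)
        self.code = self.default_code if code is None else code


def describe(exc):
    """Render an exception the way a bare error page might show it."""
    return f"{exc.code} {exc}".strip()
```

Raised with no arguments, `ClientError()` carries code 400 and an empty message, which would match a bare 400 with no explanation attached.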
-
@takeshitakenji If I had to make an (educated) guess it's probably expecting to have the records for federated notices that didn't get through properly.
-
@takeshitakenji The client error, if that's the case, is symptomatic. It's probably going "they're asking for records that seem to be broken."
-
@takeshitakenji I looked at the code in question and I'd refine this to: "The server knows it was supposed to receive a federated message directed at you, however the message itself was dropped because of packet loss in transmission, which causes the QvitterNotification to fail because the notice doesn't exist."
-
@takeshitakenji If the source server has retries enabled, you're probably getting it eventually, so in that case it can be ignored as just the network being naff. If the source server doesn't, though, then you've lost that message.
-
@takeshitakenji Unfortunately, short of asking the server admins, there's no easy way to know if a server has retries enabled. (Mine does.)
-
@takeshitakenji It's probably the root cause.
-
@takeshitakenji Might consider making the duplicate detection stronger. The retries code should look for duplicates, but it doesn't, because retries were never enabled by default before and people just accepted transmission loss.
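As a rough idea of what a duplicate check in front of the retry path could look like, here is a minimal Python sketch. Everything in it is hypothetical (the `DuplicateFilter` name, the 60-second default window, keying on sender plus content); it is not how pA/GS does it.

```python
import hashlib


class DuplicateFilter:
    """Hypothetical helper: reject identical messages seen within a window."""

    def __init__(self, window_seconds=60.0):
        self.window = window_seconds
        self._seen = {}  # digest -> timestamp of the last accepted copy

    def is_duplicate(self, sender, content, now):
        # Key on sender and content together, so two different people
        # saying the same thing are not treated as duplicates.
        raw = f"{sender}\x00{content}".encode("utf-8")
        key = hashlib.sha256(raw).hexdigest()
        last = self._seen.get(key)
        if last is not None and now - last < self.window:
            return True
        # Record only accepted copies, so repeated retries of a rejected
        # duplicate do not keep extending the window.
        self._seen[key] = now
        return False
```

A retried delivery that arrives inside the window would be dropped here instead of being stored twice.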
-
@takeshitakenji I mean the internal option in pA/GS. There's a setting for how long you have to wait between identical messages.
As for the actual code side, I suspect the Redis code is operating so quickly that all the retries are essentially happening in parallel, which is less than desirable if we have temporary network outages. I should probably introduce a pause to it.
-
@takeshitakenji I agree. Ideally we should have a "last retry" field on the queue item which we can then use to put a (configurable) pause between them.
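A minimal Python sketch of that idea, under stated assumptions: `QueueItem`, `last_retry`, `pause_seconds` and `max_attempts` are illustrative names, and `send` stands in for whatever actually delivers the notice. None of this is the existing queue code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class QueueItem:
    payload: str
    attempts: int = 0
    last_retry: Optional[float] = None  # time of the most recent attempt


def ready_for_retry(item: QueueItem, now: float, pause_seconds: float) -> bool:
    """True if the item was never tried or the pause has elapsed."""
    if item.last_retry is None:
        return True
    return now - item.last_retry >= pause_seconds


def attempt_delivery(
    item: QueueItem,
    send: Callable[[str], bool],
    now: float,
    pause_seconds: float = 30.0,  # assumed default, meant to be configurable
    max_attempts: int = 5,        # assumed default
) -> str:
    """Try one delivery, honouring the pause between attempts."""
    if not ready_for_retry(item, now, pause_seconds):
        return "deferred"  # too soon: leave the item on the queue untouched
    item.attempts += 1
    item.last_retry = now
    if send(item.payload):
        return "delivered"
    return "dropped" if item.attempts >= max_attempts else "failed"
```

With a pause in place, a short network outage would use up one attempt instead of burning through all of them at once.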