A backup job does have a higher priority than an offload job, but in copy mode an offload being stopped by a backup job is not an error and should be displayed as a green message.
The general stopping logic looks like this: if a backup job sees that the storage lock it requires is held by an offload job, it sends a stop request to the offload task.
On receiving that signal, the offload task stops, suppresses the error if it is running in copy mode, and retries, trying again to acquire the lock on the storage (and thereby waiting for the backup activity to finish).
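To make the stop/suppress/retry flow concrete, here is a minimal Python sketch of it. This is not the product code: every name in it (`StopReason`, `OffloadStopped`, `Storage`, `run_offload`, `make_work`) and the `max_attempts` limit are assumptions made up for illustration, and a real job would block on the lock rather than poll it.

```python
from dataclasses import dataclass
from enum import Enum, auto


class StopReason(Enum):
    STOPPED_BY_BACKUP = auto()  # hypothetical name: stop requested by a backup job
    STOPPED_BY_USER = auto()    # hypothetical name: any other kind of stop


class OffloadStopped(Exception):
    def __init__(self, reason: StopReason):
        super().__init__(reason.name)
        self.reason = reason


@dataclass
class Storage:
    # Number of upcoming lock attempts that fail because a backup holds the lock.
    busy_rounds: int = 0

    def try_lock(self) -> bool:
        if self.busy_rounds > 0:
            self.busy_rounds -= 1
            return False
        return True


def run_offload(storage, copy_mode, work, max_attempts=10):
    """Run `work` under the storage lock; return (status, log)."""
    log = []
    for _ in range(max_attempts):
        if not storage.try_lock():
            log.append("waiting for storage lock")
            continue
        try:
            work()
        except OffloadStopped as stop:
            # Suppress only in copy mode and only for a stop caused by a backup.
            if copy_mode and stop.reason is StopReason.STOPPED_BY_BACKUP:
                log.append("info: stopped by backup job, retrying")
                continue
            log.append("error: " + stop.reason.name)
            return "failed", log
        log.append("offload finished")
        return "success", log
    log.append("error: retry limit reached")
    return "failed", log


def make_work(storage, stops):
    """Build a work function that a backup job interrupts `stops` times."""
    remaining = [stops]

    def work():
        if remaining[0] > 0:
            remaining[0] -= 1
            storage.busy_rounds = 2  # the backup now holds the lock for two rounds
            raise OffloadStopped(StopReason.STOPPED_BY_BACKUP)

    return work
```

In this sketch a copy-mode offload that gets interrupted once logs an informational message, waits while the backup holds the lock, and then finishes; the same interruption outside copy mode ends as a failure.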
There were 2 bugs in this logic (yes, it’s sad, 2 bugs in one place). First, the error suppression scope was too small: if a stop occurred during resource scheduling, we did not suppress the error.
Second, if a stop occurred while the agent was executing a command (the stop signal was sent to the agent), the wrong type of stop reason was used and the error was not suppressed. The private fix should fix both bugs.
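The two failure patterns can be illustrated with a small Python sketch. It only models the shape of the bugs as described here, not the actual code: the names `StopReason`, `JobStopped`, `schedule_resources`, `run_agent_command`, `attempt_narrow`, `attempt_wide`, and `outcome` are all hypothetical.

```python
from enum import Enum, auto


class StopReason(Enum):
    STOPPED_BY_BACKUP = auto()  # hypothetical: the reason the suppression check looks for
    GENERIC_STOP = auto()       # hypothetical: any other stop reason


class JobStopped(Exception):
    def __init__(self, reason):
        super().__init__(reason.name)
        self.reason = reason


def schedule_resources(stop_at):
    if stop_at == "scheduling":
        raise JobStopped(StopReason.STOPPED_BY_BACKUP)


def run_agent_command(stop_at, agent_reason):
    # agent_reason is the stop reason reported when the agent is interrupted.
    if stop_at == "agent":
        raise JobStopped(agent_reason)


def handle_stop(stop, copy_mode):
    if copy_mode and stop.reason is StopReason.STOPPED_BY_BACKUP:
        return "retry"
    return "error"


def attempt_narrow(stop_at, agent_reason, copy_mode=True):
    # Bug pattern 1: scheduling runs outside the suppression scope.
    schedule_resources(stop_at)
    try:
        run_agent_command(stop_at, agent_reason)
    except JobStopped as stop:
        return handle_stop(stop, copy_mode)
    return "done"


def attempt_wide(stop_at, agent_reason, copy_mode=True):
    # Widened scope: scheduling and the agent command are both covered.
    try:
        schedule_resources(stop_at)
        run_agent_command(stop_at, agent_reason)
    except JobStopped as stop:
        return handle_stop(stop, copy_mode)
    return "done"


def outcome(attempt, stop_at, agent_reason):
    # Anything that escapes the attempt is reported as a job error.
    try:
        return attempt(stop_at, agent_reason)
    except JobStopped:
        return "error"
```

With the narrow scope, a stop during scheduling escapes as an error; with the wide scope it becomes a retry. Bug pattern 2 shows up even with the wide scope: if the agent path reports `GENERIC_STOP` instead of `STOPPED_BY_BACKUP`, the suppression check does not match and the stop is still treated as an error.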
I hope there are no more errors in this logic, but if offload jobs still report errors when they are stopped by backup activity, this is definitely not normal behavior and should be escalated to R&D: with the offload logs it’s relatively easy to understand exactly why the suppress/retry logic in the offload job doesn’t work.
Statistics: Posted by Ivan239 — Mar 02, 2024 4:28 pm






