How Laravel fails and retries jobs
This post has some examples of how failing and retrying jobs work in Laravel.
All examples in this post use queue:work to run the queue worker.
For reference, here are the default arguments for queue:work.
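As a rough guide, and assuming a recent Laravel version, running the worker without any options should behave like the command below. These values are what I understand the defaults to be, so confirm them with `php artisan queue:work --help` for your version.

```shell
# Assumed defaults for a recent Laravel version, verify with --help
php artisan queue:work --tries=1 --backoff=0 --timeout=60 --sleep=3 --memory=128
```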
If you define tries or backoff in a job class, they take precedence over the arguments of your worker.
The arguments you pass to the queue worker don't matter that much.
The only important thing is that you don't set tries to 0: that gets your failing jobs stuck in infinite loops, and you won't have a good time (this was actually the default up until Laravel 6).
Key takeaways from this post:
- If you define a Queue::failing() callback, every failed queued job calls that exactly once
- If a queued job fails and has a failed method, it will call that method exactly once
- Jobs that fail always end up in the failed_jobs table
- The backoff delay is only used when a job fails due to an uncaught exception
- You have to manually stop your code after calling release or fail in a job
- Always make sure your retry_after is higher than your highest timeout
Events for failed jobs
For all examples below, I've registered this event listener in my AppServiceProvider:
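namespace App\Providers;

use Illuminate\Queue\Events\JobFailed;
use Illuminate\Support\Facades\Queue;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot()
    {
        // Called once for every queued job that is marked as failed
        Queue::failing(function (JobFailed $event) {
            info('Queue::failing()');
        });
    }
}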
Releasing jobs
You can use the release method to tell Laravel a job should be retried.
Calling this method marks the job to be "released" back onto the queue, which just means that you want to retry the job.
class TestJob extends BaseJob implements ShouldQueue
{
    public $backoff = 999; // this value isn't used because we use `release()`
    public $tries = 3;

    public function handle()
    {
        info('handle(): '.$this->attempts().' / '.$this->tries);
        $this->release(delay: now()->addSeconds(3));
        info('After release');
    }

    public function failed(Throwable $exception): void
    {
        info('failed(): '.get_class($exception).': '.$exception->getMessage());
    }
}
This job logs the following:
[2023-04-16 13:19:48] local.INFO: handle(): 1 / 3
[2023-04-16 13:19:48] local.INFO: After release
[2023-04-16 13:19:51] local.INFO: handle(): 2 / 3
[2023-04-16 13:19:51] local.INFO: After release
[2023-04-16 13:19:54] local.INFO: handle(): 3 / 3
[2023-04-16 13:19:54] local.INFO: After release
[2023-04-16 13:19:57] local.INFO: failed(): Illuminate\Queue\MaxAttemptsExceededException: App\Jobs\TestJob has been attempted too many times or run too long. The job may have previously timed out.
[2023-04-16 13:19:57] local.INFO: Queue::failing()
[2023-04-16 13:19:57] local.ERROR: App\Jobs\TestJob has been attempted too many times or run too long. The job may have previously timed out. {"exception":"[object] (Illuminate\\Queue\\MaxAttemptsExceededException(code: 0): App\\Jobs\\TestJob has been attempted too many times or run too long. The job may have previously timed out. at /Users/sjorso/code/watchtower/vendor/laravel/framework/src/Illuminate/Queue/Worker.php:746)
[stacktrace]
(long stacktrace)
The first thing to notice is that the release() method does not stop the job; you have to stop the code manually.
Even if it reaches the end of the handle method, the job will still be retried if you called release() somewhere.
Another thing to note is that backoff does not apply.
The backoff delay is only used when your job fails due to an uncaught exception.
When you retry a job by releasing it, it always uses the delay passed into the release method.
If you call release() without an argument, the job will be retried instantly.
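To make the "stop the code manually" part concrete, here is a minimal sketch of a job that returns right after releasing itself. The class name ImportFeedJob and the feedIsAvailable() check are hypothetical and only there for illustration; the sketch uses Laravel's standard queue traits instead of my BaseJob so it stands on its own.

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Queue\InteractsWithQueue;

// Hypothetical job, for illustration only
class ImportFeedJob implements ShouldQueue
{
    use InteractsWithQueue, Queueable;

    public $tries = 3;

    public function handle()
    {
        if (! $this->feedIsAvailable()) {
            // Put the job back on the queue and retry it in 30 seconds
            $this->release(30);

            // release() does not stop execution, so return explicitly
            return;
        }

        info('Feed is available, importing it');
    }

    // Placeholder check (an assumption), replace it with your own condition
    private function feedIsAvailable(): bool
    {
        return false;
    }
}
```

Without the return, the info() call at the bottom would still run on every attempt, just like the "After release" line in the log above.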
Uncaught exceptions
Let's take a look at uncaught exceptions:
class TestJob extends BaseJob implements ShouldQueue
{
    public $backoff = 5;
    public $tries = 3;

    public function handle()
    {
        info('handle(): '.$this->attempts().' / '.$this->tries);
        throw new RuntimeException($this->attempts().' / '.$this->tries);
    }

    public function failed(Throwable $exception): void
    {
        info('failed(): '.get_class($exception).': '.$exception->getMessage());
    }
}
This job logs the following:
[2023-04-16 13:09:33] local.INFO: handle(): 1 / 3
[2023-04-16 13:09:33] local.ERROR: 1 / 3 {"exception":"[object] (RuntimeException(code: 0): 1 / 3 at /Users/sjorso/code/watchtower/app/Jobs/TestJob.php:24)
[stacktrace]
(long stacktrace)
[2023-04-16 13:09:38] local.INFO: handle(): 2 / 3
[2023-04-16 13:09:38] local.ERROR: 2 / 3 {"exception":"[object] (RuntimeException(code: 0): 2 / 3 at /Users/sjorso/code/watchtower/app/Jobs/TestJob.php:24)
[stacktrace]
(long stacktrace)
[2023-04-16 13:09:43] local.INFO: handle(): 3 / 3
[2023-04-16 13:09:43] local.INFO: failed(): RuntimeException: 3 / 3
[2023-04-16 13:09:43] local.INFO: Queue::failing()
[2023-04-16 13:09:43] local.ERROR: 3 / 3 {"exception":"[object] (RuntimeException(code: 0): 3 / 3 at /Users/sjorso/code/watchtower/app/Jobs/TestJob.php:24)
[stacktrace]
(long stacktrace)
Each attempt logs the exception, and then waits backoff seconds to retry.
The failed method is only called after the last failure.
Manually failing your job
You can also manually fail a job by calling the fail method:
class TestJob extends BaseJob implements ShouldQueue
{
    public $backoff = 5;
    public $tries = 3;

    public function handle()
    {
        info('handle(): '.$this->attempts().' / '.$this->tries);
        $this->fail('The message');
        info('After fail');
    }

    public function failed(Throwable $exception): void
    {
        info('failed(): '.get_class($exception).': '.$exception->getMessage());
    }
}
This job logs the following:
[2023-04-16 13:10:45] local.INFO: handle(): 1 / 3
[2023-04-16 13:10:45] local.INFO: failed(): Illuminate\Queue\ManuallyFailedException: The message
[2023-04-16 13:10:45] local.INFO: Queue::failing()
[2023-04-16 13:10:45] local.INFO: After fail
Calling fail tells Laravel that this job has failed and that it shouldn't be retried.
Laravel doesn't write anything to the log.
Just like with the release method, the fail method doesn't throw an exception, so you have to manually return and stop your code.
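A minimal sketch of what that looks like in practice, assuming a job that gives up when its input is unusable. The class name ProcessUploadJob and the uploadIsValid() check are hypothetical, and the sketch uses Laravel's standard queue traits instead of my BaseJob.

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Queue\InteractsWithQueue;
use Throwable;

// Hypothetical job, for illustration only
class ProcessUploadJob implements ShouldQueue
{
    use InteractsWithQueue, Queueable;

    public $tries = 3;

    public function handle()
    {
        if (! $this->uploadIsValid()) {
            // Mark the job as failed; it will not be retried
            $this->fail('The upload is not valid');

            // fail() does not throw, so stop the method explicitly
            return;
        }

        info('Upload is valid, processing it');
    }

    public function failed(Throwable $exception): void
    {
        // Laravel logs nothing for manually failed jobs, so log it here
        info('failed(): '.get_class($exception).': '.$exception->getMessage());
    }

    // Placeholder check (an assumption), replace it with your own validation
    private function uploadIsValid(): bool
    {
        return false;
    }
}
```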
Job timeouts
Last up, timeouts:
class TestJob extends BaseJob implements ShouldQueue
{
    public $timeout = 3;
    public $backoff = 999;
    public $tries = 3;

    public function handle()
    {
        info('handle(): '.$this->attempts().' / '.$this->tries);
        sleep($this->timeout + 1);
        info('After sleep');
    }

    public function failed(Throwable $exception): void
    {
        info('failed(): '.get_class($exception).': '.$exception->getMessage());
    }
}
Timeouts are interesting. This is what gets logged:
[2023-04-16 13:12:23] local.INFO: handle(): 1 / 3
(the queue:work process gets killed here)
(supervisor restarts the queue:work process)
(nothing happens for 90 seconds, this is the "retry_after" value in "config/queue.php")
[2023-04-16 13:13:53] local.INFO: handle(): 2 / 3
(same as above, process killed and restarted, waiting to retry)
[2023-04-16 13:15:23] local.INFO: handle(): 3 / 3
(same as above, process killed and restarted, no waiting this time)
[2023-04-16 13:15:26] local.INFO: failed(): Illuminate\Queue\MaxAttemptsExceededException: App\Jobs\TestJob has been attempted too many times or run too long. The job may have previously timed out.
[2023-04-16 13:15:26] local.INFO: Queue::failing()
Laravel doesn't write anything to the log when a job fails due to a timeout.
The only way you'd notice that this job has failed is by either monitoring your failed_jobs table, or by logging something manually.
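One way to do the manual logging in a single place is to write an error from the Queue::failing() callback, so timed-out jobs show up in the log like any other failure. This is a sketch rather than something I ran for the logs in this post; the log message and the array keys are my own choice.

```php
<?php

namespace App\Providers;

use Illuminate\Queue\Events\JobFailed;
use Illuminate\Support\Facades\Log;
use Illuminate\Support\Facades\Queue;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot()
    {
        Queue::failing(function (JobFailed $event) {
            // Runs for every failed job, including ones that timed out
            Log::error('Queued job failed', [
                'job' => $event->job->resolveName(),
                'connection' => $event->connectionName,
                'exception' => get_class($event->exception),
                'message' => $event->exception->getMessage(),
            ]);
        });
    }
}
```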
It is important to understand how the retry_after delay works.
The operating system kills the queue worker process if it exceeds the timeout limit. Laravel doesn't know that this has happened, so your job ends up stuck in limbo.
Laravel solves this by making any job that was picked up but never finished available again after retry_after seconds.
It is very important that your retry_after value is higher than your highest timeout, otherwise long-running jobs will get retried while they are still running.
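As an illustration, this is roughly what that looks like in config/queue.php. The fragment assumes the database queue driver and a hypothetical slowest job with $timeout = 120; your connection name and numbers will differ.

```php
// config/queue.php (fragment, assuming the database driver)
'connections' => [
    'database' => [
        'driver' => 'database',
        'table' => 'jobs',
        'queue' => 'default',
        // Hypothetical: the slowest job has $timeout = 120,
        // so retry_after leaves some headroom above that
        'retry_after' => 150,
    ],
],
```

With the default of 90 seconds, a job that is allowed to run for 120 seconds could be picked up a second time while the first attempt is still running.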
Don't retry a job on timeout
If you don't want to retry a job on timeout, you can define a failOnTimeout property on your job.
If you take the same job as above, but add a public $failOnTimeout = true; property, it logs the following:
[2023-04-16 13:21:57] local.INFO: handle(): 1 / 3
[2023-04-16 13:22:00] local.INFO: failed(): Illuminate\Queue\MaxAttemptsExceededException: App\Jobs\TestJob has been attempted too many times or run too long. The job may have previously timed out.
[2023-04-16 13:22:00] local.INFO: Queue::failing()