gpu: host1x: Add no-recovery mode

Add a new property for jobs to enable or disable recovery i.e.
CPU increments of syncpoints to max value on job timeout. This
allows for a more solid model for hanged jobs, where userspace
doesn't need to guess if a syncpoint increment happened because
the job completed, or because job timeout was triggered.

On job timeout, we stop the channel, NOP all future jobs on the
channel using the same syncpoint, mark the syncpoint as locked
and resume the channel from the next job, if any.

The future jobs are NOPed, since because we don't do the CPU
increments, the value of the syncpoint is no longer synchronized,
and any waiters would become confused if a future job incremented
the syncpoint. The syncpoint is marked locked to ensure that any
future jobs cannot increment the syncpoint either, until the
application has recognized the situation and reallocated the
syncpoint.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
This commit is contained in:
Mikko Perttunen
2021-06-10 14:04:43 +03:00
committed by Thierry Reding
parent 687db2207b
commit c78f837ae3
7 changed files with 81 additions and 7 deletions

View File

@@ -236,9 +236,15 @@ struct host1x_job {
u32 syncpt_incrs;
u32 syncpt_end;
/* Completion waiter ref */
void *waiter;
/* Maximum time to wait for this job */
unsigned int timeout;
/* Job has timed out and should be released */
bool cancelled;
/* Index and number of slots used in the push buffer */
unsigned int first_get;
unsigned int num_slots;
@@ -259,6 +265,9 @@ struct host1x_job {
/* Add a channel wait for previous ops to complete */
bool serialize;
/* Fast-forward syncpoint increments on job timeout */
bool syncpt_recovery;
};
struct host1x_job *host1x_job_alloc(struct host1x_channel *ch,