[Prev][Next][Index][Thread]

Re: CPU inheritance issue



> Hi all,
> 
> In the OSDI CPU Inheritance paper, the authors claim that CPU inheritance
> is a general-purpose replacement for priority inheritance, and results in
> correct real-time behavior.  There is an important case where this is not
> true.
> 
> In a traditional priority-based real-time system implementing priority
> inheritance, when a thread releases a mutex the system unblocks the
> waiting thread with highest priority and lets it enter the critical
> section.  This is important: if mutex wait queues are FIFO, high-priority
> tasks can wait behind a large convoy of low-priority tasks.  This will
> have a very bad effect on the worst-case schedulability of the system.


Please do not take this as a definitive answer: it has been a long time
since I dealt with CPU-inheritance scheduling, and I have not even
looked at the implementation you alluded to in the OSKit.


Yes, you are absolutly correct: the use of FIFO mutexes with an
inheritance-based framework will not work properly.  This is true
for CPUI as well as for a priority-based priority inheritance scheme.

See section 3.4 of the paper, voluntary donation.  Yes, the mutex
implementation needs to know about the priority inheritance aspects of
CPU inheritance (as lock needs to schedule the correct thread).  Yes,
a bad implementation of mutexes can cause problems.  But I do not
believe these are due to problems with CPU inheritance scheduling,
but rather point out how various parts of the system need to interact
properly in CPUI.

CPU inheritance allows a thread to donate its time to another thread.
This allows priority inheritance to be implemented by having a thread
donate its time to the owner of a lock.  In this case, the thread only
wants to donate its time until the mutex is released.  In other words,
an event of interest to that thread is when its donated thread releases
the mutex, at which point it wants control back.

Thus, a ``good'' implementation of mutexes in the CPUI world will be
able to notify/wakeup/schedule the correct thread (if not itself) when
the mutex is released.

> The problem is that when deciding what waiting thread to wake up, we need
> to reason about the scheduling desirability of a group of blocked threads.
> Of course this is trivial in a priority-based system; as far as I can see,
> it's not trivial under the CPU inheritance model.

Yes, when a mutex is released, the thread that is currently donating the
CPU time to the thread that released the lock should be the one to execute.
In other words, the highest-priority thread should resume execution.

But it is not necessary to track the priorities of all the threads:
it is only necessary to indicate which thread would be scheduled if
the mutex were to be released.  Since that is the thread that explicitly
told the lock() operation to donate its time to the lock's owner, the
mutex knows which thread to execute -- the last thread to donate time
to this thread, if it is waiting on the mutex.

It gets more interesting when other schedulers get involved and the
thread can be running for (for example) 3 different reasons: ``itself,''
and two doner threads [if the thread takes multiple time quantums while
owning the lock, multiple other threads could block on the mutex].  In
this case, it is necessary to track who dontated time *last* to the
thread.  If that donating thread wants notification on mutex-release,
the mutex code needs to do that.

I think the problem you describe should not happen with a proper mutex
implemenation.  If a thread (A) does donate time to another thread (Z),
which releases the mutex after A's quantum is expired, A will not get to
run that quantum.  If another thread B, becomes scheduled and acquires
the lock, and then A gets scheduled again, A will donate it's quantum
to B.  If A is truly higher priority, B should not have been scheduled.

Another case: A blocks on the mutex, Z runs but does not release;
quantum expires and B runs and blocks on same mutex, Z runs, releases,
and B should now be scheduled for the rest of the quantum.

Yet another case: A blocks on M, Z runs but does not release; quantum
expires, B is scheduled and blocks on M; Z runs but does not release;
quantum expires, and Z gets scheduled as the highest-priority thread.
Z releases mutex and continues execution.

If A or B would have been scheduled before Z released M, Z would have
been given their CPU time since the `inherit' flag was set on the mutex,
and Z was runnable.

An interesting case would be where Z blocks W/O donating time to the
lock holder: in this case, A,B, and Z would all have to wait until
the mutex was released and one of them was scheduled again.
This case shows the downsides of using partial priority inheritance,
where CPU is only sometimes donated.

> Maybe there is some way to atomically wake up all waiting threads and see
> which one the scheduling hierarchy decides to run, and then re-block the
> others?

No, that is the ``thundering herd'' problem.  Only one thread should be
woken up, unless all of them are now able to run.

In practice, on a uniprocessor, waking all threads isn't as bad as it
could be, because each thread will execute one-at-a-time, and in all
likelihood the resource will be available again by the time a thread
gets to execute, so it isn't necessary to pay the cost (in practice)
of putting all the threads to sleep again.

Kevin Van Maren
University of Utah

Follow-Ups: