Linux: Discussing the Completely Fair Scheduler

Submitted by Jeremy
on May 15, 2007 - 6:10am

"As I understand, fair_clock is a monotonously increasing clock which advances at a pace inversely proportional to the load on the runqueue," Srivatsa Vaddagiri explained in a review of Ingo Molnar [interview]'s CFS CPU scheduler [story], "if load = 1 (task), it will advance at same pace as wall clock, as load increases it advances slower than wall clock." He continued on to ask some questions about the choices made in CFS as compared to the EEVDF CPU scheduler [story]. In the resulting discussion, Ingo offered some insight into the design of the CFS. He began:

"80% of CFS's design can be summed up in a single sentence: CFS basically models an 'ideal, precise multi-tasking CPU' on real hardware. 'Ideal multi-tasking CPU' is a (non-existent :-) CPU that has 100% physical power and which can run each task at precise equal speed, in parallel, each at 1/nr_running speed. For example: if there are 2 tasks running then it runs each at 50% physical power - totally in parallel.

"On real hardware, we can run only a single task at once, so while that one task runs the other tasks that are waiting for the CPU are at a disadvantage - the current task gets an unfair amount of CPU time. In CFS this fairness imbalance is expressed and tracked via the per-task p->wait_runtime (nanosec-unit) value. 'wait_runtime' is the amount of time the task should now run on the CPU for it become completely fair and balanced."


From: Srivatsa Vaddagiri [email blocked]
To: Ingo Molnar [email blocked]
Subject: fair clock use in CFS
Date:	Mon, 14 May 2007 14:03:58 +0530

Hi,
	I have been brooding over how fair clock is computed/used in
CFS and thought I would ask the experts to avoid wrong guesses!

As I understand, fair_clock is a monotonously increasing clock which
advances at a pace inversely proportional to the load on the runqueue.
If load = 1 (task), it will advance at same pace as wall clock, as 
load increases it advances slower than wall clock.

In addition, following calculations depend on fair clock: task's wait
time on runqueue and sleep time outside the runqueue (both reflected in
p->wait_run_time).

Few questions that come up are:

1. Why can't fair clock be same as wall clock at all times? i.e fair
   clock progresses at same pace as wall clock independent of the load on
   the runqueue.

   It would still give the ability to measure time spent waiting on runqueue 
   or sleeping and use that calculated time to give latency/bandwidth
   credit? 

   In case of EEVDF, the use of virtual clock seems more
   understandable, if we consider the fact that each client gets 'wi' real
   time units in 1 virtual time unit. That doesnt seem to be the case in
   CFS as Ting Yang explained +/- lags here 
   http://lkml.org/lkml/2007/5/2/612 ..


2. Preemption granularity - sysctl_sched_granularity

	This seems to be measured in the fair clock scale rather than
	wall clock scale. As a consequence of this, the time taken
	for a task to relinquish to competetion is dependent on number N
	of tasks? For ex: if there a million cpu hungry tasks, then the
	time taken to switch between two tasks is more compared to the
	case where just two cpu hungry tasks are running. Is there
	any advantage of using fair clock scale to detect preemtion points?


-- 
Regards,
vatsa


From: Ingo Molnar [email blocked] Subject: Re: fair clock use in CFS Date: Mon, 14 May 2007 13:10:51 +0200 * Srivatsa Vaddagiri [email blocked] wrote: > I have been brooding over how fair clock is computed/used in > CFS and thought I would ask the experts to avoid wrong guesses! hey, thanks for the interest :-) > As I understand, fair_clock is a monotonously increasing clock which > advances at a pace inversely proportional to the load on the runqueue. > If load = 1 (task), it will advance at same pace as wall clock, as > load increases it advances slower than wall clock. correct. > In addition, following calculations depend on fair clock: task's wait > time on runqueue and sleep time outside the runqueue (both reflected > in p->wait_run_time). (note that in -v12 only the on-runqueue wait time is used.) > Few questions that come up are: > > 1. Why can't fair clock be same as wall clock at all times? i.e fair > clock progresses at same pace as wall clock independent of the load > on the runqueue. > > It would still give the ability to measure time spent waiting on > runqueue or sleeping and use that calculated time to give > latency/bandwidth credit? 155 milliseconds spent waiting for CPU time while there are another 10 tasks contending for the CPU is a different scenario from when there is just one task running on the CPU. In the 10-task case a single task would only have a 'fair expectation' to run for 15.5 milliseconds, in the 1-task case a single task has a 'fair expectation' to run the full 155 milliseconds. The 'fair clock' measures this capacity of 'effective CPU power' in essence. but let me give you some more CFS design background: 80% of CFS's design can be summed up in a single sentence: CFS basically models an "ideal, precise multi-tasking CPU" on real hardware. "Ideal multi-tasking CPU" is a (non-existent :-) CPU that has 100% physical power and which can run each task at precise equal speed, in parallel, each at 1/nr_running speed. For example: if there are 2 tasks running then it runs each at 50% physical power - totally in parallel. On real hardware, we can run only a single task at once, so while that one task runs the other tasks that are waiting for the CPU are at a disadvantage - the current task gets an unfair amount of CPU time. In CFS this fairness imbalance is expressed and tracked via the per-task p->wait_runtime (nanosec-unit) value. "wait_runtime" is the amount of time the task should now run on the CPU for it become completely fair and balanced. ( small detail: on 'ideal' hardware, the p->wait_runtime value would always be zero - no task would ever get 'out of balance' from the 'ideal' share of CPU time. ) CFS's task picking logic is based on this p->wait_runtime value and it is thus very simple: it always tries to run the task with the largest p->wait_runtime value. In other words, CFS tries to run the task with the 'gravest need' for more CPU time. So CFS always tries to split up CPU time between runnable tasks as close to 'ideal multitasking hardware' as possible. Most of the rest of CFS's design just falls out of this really simple concept, with a few add-on embellishments like nice levels, multiprocessing and various algorithm variants to recognize sleepers. In practice it works like this: the system runs a task a bit, and when the task schedules (or a scheduler tick happens) the task's CPU usage is 'accounted for': the (small) time it just spent using the physical CPU is deducted from p->wait_runtime. [minus the 'fair share' it would have gotten anyway]. Once p->wait_runtime gets low enough so that another task becomes the 'leftmost task' (plus a small amount of 'granularity' distance relative to the leftmost task so that we do not over-schedule tasks and trash the cache) then the new leftmost task is picked and the current task is preempted. The rq->fair_clock value tracks the 'CPU time a runnable task would have fairly gotten, had it been runnable during that time'. So by using rq->fair_clock values we can accurately timestamp and measure the 'expected CPU time' a task should have gotten. All runnable tasks are sorted in the rbtree by the "rq->fair_clock - p->wait_runtime" key, and CFS picks the 'leftmost' task and sticks to it. As the system progresses forwards, newly woken tasks are put into the tree more and more to the right - slowly but surely giving a chance for every task to become the 'leftmost task' and thus get on the CPU within a deterministic amount of time. > 2. Preemption granularity - sysctl_sched_granularity > > This seems to be measured in the fair clock scale rather than > wall clock scale. As a consequence of this, the time taken for a > task to relinquish to competetion is dependent on number N of > tasks? For ex: if there a million cpu hungry tasks, then the > time taken to switch between two tasks is more compared to the > case where just two cpu hungry tasks are running. Is there any > advantage of using fair clock scale to detect preemtion points? yes, there is an advantage. The granularity is really mostly a property of the physical CPU and not part of the "precise scheduling" model. Granularity expresses the fact that a task has caching affinity to the CPU and there is a cost to scheduling in a too finegrained way. This granularity value does not depend on the number of tasks running. For example the granularity value could be made really small if CPUs had no task-switching costs. ( small detail: the granularity value is currently dependent on the nice level, making it easier for higher-prio tasks to preempt lower-prio tasks. ) ( i'd agree in theory with the proposition that the granularity could be made a function of load too, but only to model cache/affinity effects, _not_ due to the basic model of precise scheduling. ) Ingo
From: Srivatsa Vaddagiri [email blocked] Subject: Re: fair clock use in CFS Date: Mon, 14 May 2007 18:34:12 +0530 On Mon, May 14, 2007 at 01:10:51PM +0200, Ingo Molnar wrote: > but let me give you some more CFS design background: Thanks for this excellent explanation. Things are much clearer now to me. I just want to clarify one thing below: > > 2. Preemption granularity - sysctl_sched_granularity [snip] > This granularity value does not depend on the number of tasks running. Hmm ..so does sysctl_sched_granularity represents granularity in real/wall-clock time scale then? AFAICS that doesnt seem to be the case. __check_preempt_curr_fair() compares for the distance between the two task's (current and next-to-be-run task) fair_key values for deciding preemption point. Let's say that to begin with, at real time t0, both current task Tc and next task Tn's fair_key values are same, at value K. Tc will keep running until its fair_key value reaches atleast K + 2000000. The *real/wall-clock* time taken for Tc's fair_key value to reach K + 2000000 - is surely dependent on N, the number of tasks on the queue (more the load, more slowly the fair clock advances)? This is what I meant by my earlier remark: "If there a million cpu hungry tasks, then the (real/wall-clock) time taken to switch between two tasks is more compared to the case where just two cpu hungry tasks are running". -- Regards, vatsa
From: Ingo Molnar [email blocked] Subject: Re: fair clock use in CFS Date: Mon, 14 May 2007 15:15:52 +0200 * Srivatsa Vaddagiri [email blocked] wrote: > On Mon, May 14, 2007 at 01:10:51PM +0200, Ingo Molnar wrote: > > but let me give you some more CFS design background: > > Thanks for this excellent explanation. Things are much clearer now to > me. I just want to clarify one thing below: > > > > 2. Preemption granularity - sysctl_sched_granularity > > [snip] > > > This granularity value does not depend on the number of tasks running. > > Hmm ..so does sysctl_sched_granularity represents granularity in > real/wall-clock time scale then? AFAICS that doesnt seem to be the > case. there's only this small detail i mentioned: > > ( small detail: the granularity value is currently dependent on the > > nice level, making it easier for higher-prio tasks to preempt > > lower-prio tasks. ) > __check_preempt_curr_fair() compares for the distance between the two > task's (current and next-to-be-run task) fair_key values for deciding > preemption point. > > Let's say that to begin with, at real time t0, both current task Tc > and next task Tn's fair_key values are same, at value K. Tc will keep > running until its fair_key value reaches atleast K + 2000000. The > *real/wall-clock* time taken for Tc's fair_key value to reach K + > 2000000 - is surely dependent on N, the number of tasks on the queue > (more the load, more slowly the fair clock advances)? well, it's somewhere in the [ granularity .. granularity*2 ] wall-clock scale. Basically the slowest way it can reach it is 'half speed' (two tasks running), the slowest way is 'near full speed' (lots of tasks running). > This is what I meant by my earlier remark: "If there a million cpu > hungry tasks, then the (real/wall-clock) time taken to switch between > two tasks is more compared to the case where just two cpu hungry tasks > are running". the current task is recalculated at scheduler tick time and put into the tree at its new position. At a million tasks the fair-clock will advance little (or not at all - which at these load levels is our smallest problem anyway) so during a scheduling tick in kernel/sched_fair.c update_curr() we will have a 'delta_mine' and 'delta_fair' of near zero and a 'delta_exec' of ~1 million, so curr->wait_runtime will be decreased at 'full speed': delta_exec-delta_mine, by almost a full tick. So preemption will occur every sched_granularity (rounded up to the next tick) points in time, in wall-clock time. with 2 tasks running delta_exec-delta_mine is 0.5 million, so preemption will occur in 2*sched_granularity (rounded up to the next timer tick) wall-clock time. Ingo
From: William Lee Irwin III [email blocked] Subject: Re: fair clock use in CFS Date: Mon, 14 May 2007 03:29:29 -0700 On Mon, May 14, 2007 at 02:03:58PM +0530, Srivatsa Vaddagiri wrote: > I have been brooding over how fair clock is computed/used in > CFS and thought I would ask the experts to avoid wrong guesses! > As I understand, fair_clock is a monotonously increasing clock which > advances at a pace inversely proportional to the load on the runqueue. > If load = 1 (task), it will advance at same pace as wall clock, as > load increases it advances slower than wall clock. > In addition, following calculations depend on fair clock: task's wait > time on runqueue and sleep time outside the runqueue (both reflected in > p->wait_run_time). It's not hard to see that that's a mistake. The great thing about virtual deadline schedulers, though, is that all it takes to completely rewrite the policy is changing the numbers calculated for these things. On Mon, May 14, 2007 at 02:03:58PM +0530, Srivatsa Vaddagiri wrote: > Few questions that come up are: > 1. Why can't fair clock be same as wall clock at all times? i.e fair > clock progresses at same pace as wall clock independent of the load on > the runqueue. > It would still give the ability to measure time spent waiting on runqueue > or sleeping and use that calculated time to give latency/bandwidth > credit? > In case of EEVDF, the use of virtual clock seems more > understandable, if we consider the fact that each client gets 'wi' real > time units in 1 virtual time unit. That doesnt seem to be the case in > CFS as Ting Yang explained +/- lags here > http://lkml.org/lkml/2007/5/2/612 .. It's not just more understandable, it doesn't break down with increasing numbers of tasks. On Mon, May 14, 2007 at 02:03:58PM +0530, Srivatsa Vaddagiri wrote: > 2. Preemption granularity - sysctl_sched_granularity > This seems to be measured in the fair clock scale rather than > wall clock scale. As a consequence of this, the time taken > for a task to relinquish to competetion is dependent on number N > of tasks? For ex: if there a million cpu hungry tasks, then the > time taken to switch between two tasks is more compared to the > case where just two cpu hungry tasks are running. Is there > any advantage of using fair clock scale to detect preemtion points? I'm not convinced this is a great way to mitigate context switch overhead. I'd recommend heuristically adjusting the latency parameters (l_i) to try to mitigate context switch overhead or otherwise express the tradeoff between latency and throughput instead of introducing an arbitrary delay like sched_granularity_ns. I suspect it'll have to restart from a point much closer to EEVDF for that to be effective. -- wli
From: Ingo Molnar [email blocked] Subject: Re: fair clock use in CFS Date: Mon, 14 May 2007 12:31:20 +0200 * William Lee Irwin III [email blocked] wrote: > On Mon, May 14, 2007 at 02:03:58PM +0530, Srivatsa Vaddagiri wrote: > > I have been brooding over how fair clock is computed/used in > > CFS and thought I would ask the experts to avoid wrong guesses! > > As I understand, fair_clock is a monotonously increasing clock which > > advances at a pace inversely proportional to the load on the runqueue. > > If load = 1 (task), it will advance at same pace as wall clock, as > > load increases it advances slower than wall clock. > > In addition, following calculations depend on fair clock: task's wait > > time on runqueue and sleep time outside the runqueue (both reflected in > > p->wait_run_time). > > It's not hard to see that that's a mistake. [...] please clarify - exactly what is a mistake? Thanks, Ingo
From: William Lee Irwin III [email blocked] Subject: Re: fair clock use in CFS Date: Mon, 14 May 2007 04:05:00 -0700 On Mon, May 14, 2007 at 02:03:58PM +0530, Srivatsa Vaddagiri wrote: >>> I have been brooding over how fair clock is computed/used in >>> CFS and thought I would ask the experts to avoid wrong guesses! >>> As I understand, fair_clock is a monotonously increasing clock which >>> advances at a pace inversely proportional to the load on the runqueue. >>> If load = 1 (task), it will advance at same pace as wall clock, as >>> load increases it advances slower than wall clock. >>> In addition, following calculations depend on fair clock: task's wait >>> time on runqueue and sleep time outside the runqueue (both reflected in >>> p->wait_run_time). * William Lee Irwin III [email blocked] wrote: >> It's not hard to see that that's a mistake. [...] On Mon, May 14, 2007 at 12:31:20PM +0200, Ingo Molnar wrote: > please clarify - exactly what is a mistake? Thanks, The variability in ->fair_clock advancement rate was the mistake, at least according to my way of thinking. The queue's virtual time clock effectively stops under sufficiently high load, possibly literally in the event of fixpoint underflow. The waiting time is impossible to scale properly in the presence of a variable time rate (at least under the memory allocation constraints of a scheduler), and the deadlines computed as offsets from ->fair_clock come out clustered. Basically it needs to move closer to EEVDF in these respects. This detail sailed right past me when it went in; sorry about that. -- wli
From: Ingo Molnar [email blocked] Subject: Re: fair clock use in CFS Date: Mon, 14 May 2007 13:50:49 +0200 * William Lee Irwin III [email blocked] wrote: > On Mon, May 14, 2007 at 12:31:20PM +0200, Ingo Molnar wrote: > > please clarify - exactly what is a mistake? Thanks, > > The variability in ->fair_clock advancement rate was the mistake, at > least according to my way of thinking. [...] you are quite wrong. Lets consider the following example: we have 10 tasks running (all at nice 0). The current task spends 20 msecs on the CPU and a new task is picked. How much CPU time did that waiting task get entitled to during its 20 msecs wait? If fair_clock was constant as you suggest then we'd give it 20 msecs - but its true 'fair expectation' of CPU time was only 20/10 == 2 msecs! So a 'constant' fair_clock would turn the whole equilibrium upside down (it would inflate p->wait_runtime values and the global sum would not be roughly constant anymore but would run up very fast), especially during fluctuating loads. the fair_clock is the fundamental expression of "fair CPU timeline", and task's expected runtime is always measured by that, not by the real clock. The only time when we measure the true time is when a _single_ task runs on the CPU - but in that case the task truly spent a small amount of time on the CPU, exclusively. See the exec_time calculations in kernel/sched_fair.c. Ingo
From: Daniel Hazelton [email blocked] Subject: Re: fair clock use in CFS Date: Mon, 14 May 2007 10:31:13 -0400 On Monday 14 May 2007 07:50:49 Ingo Molnar wrote: > * William Lee Irwin III [email blocked] wrote: > > On Mon, May 14, 2007 at 12:31:20PM +0200, Ingo Molnar wrote: > > > please clarify - exactly what is a mistake? Thanks, > > > > The variability in ->fair_clock advancement rate was the mistake, at > > least according to my way of thinking. [...] > > you are quite wrong. Lets consider the following example: > > we have 10 tasks running (all at nice 0). The current task spends 20 > msecs on the CPU and a new task is picked. How much CPU time did that > waiting task get entitled to during its 20 msecs wait? If fair_clock was > constant as you suggest then we'd give it 20 msecs - but its true 'fair > expectation' of CPU time was only 20/10 == 2 msecs! Either you have a strange definition of fairness or you chose an extremely poor example, Ingo. In a fair scheduler I'd expect all tasks to get the exact same amount of time on the processor. So if there are 10 tasks running at nice 0 and the current task has run for 20msecs before a new task is swapped onto the CPU, the new task and *all* other tasks waiting to get onto the CPU should get the same 20msecs. What you've described above is fundamentally unfair - one process running for 20msecs while the 10 processes that are waiting for their chance each get a period that increases from a short period at a predictable rate. Some numbers based on your above description: Process 1 runs for 20msecs Process 2 runs for 2msecs (20/10 == 2msecs) Process 3 runs for 2.2msecs (has waited 22msecs, 22/10 == 2.2) Process 4 runs for 2.4msecs (has waited 24.2msecs - rounded for brevity) Process 5 runs for 2.7msecs (has waited 26.6msecs - rounded for brevity) process 6 runs for 3msecs (has waited 30.3msecs) process 7 runs for 3.3msecs (has waited approx. 33msecs) process 8 runs for 3.6msecs (has waited approx. 36msecs) process 9 runs for 3.9msecs (has waited approx. 39msecs) process 10 runs for 4.2msecs (has waited approx. 42msecs) Now if the "process time" isn't scaled to match the length of time that the process has spent waiting to get on the CPU you get some measure of fairness back, but even then the description of CFS you've given shows a fundamental unfairness. However, if you meant that "the new process has spent 20msecs waiting to get on the CPU", then the rest of your description does show what I'd expect from a fair scheduler. If not, then I guess that CFS is only "Completely Fair" for significantly large values of "fair". (I will not, however, argue that CFS is'nt a damned good scheduler that has improved interactivity on the systems of those people that have tested it) > So a 'constant' fair_clock would turn the whole equilibrium upside down > (it would inflate p->wait_runtime values and the global sum would not be > roughly constant anymore but would run up very fast), especially during > fluctuating loads. Hrm... Okay, so you're saying that "fair_clock" runs slower the more process there are running to keep the above run-up in "Time Spent on CPU" I noticed based solely on your initial example? If that is the case, then I can see the fairness - its just not visible from a really quick look at the code and the simplified description you gave earlier. DRH
From: Ingo Molnar [email blocked] Subject: Re: fair clock use in CFS Date: Mon, 14 May 2007 17:08:00 +0200 * Daniel Hazelton [email blocked] wrote: > [...] In a fair scheduler I'd expect all tasks to get the exact same > amount of time on the processor. So if there are 10 tasks running at > nice 0 and the current task has run for 20msecs before a new task is > swapped onto the CPU, the new task and *all* other tasks waiting to > get onto the CPU should get the same 20msecs. [...] What happens in CFS is that in exchange for this task's 20 msecs the other tasks get 2 msecs each. (and not only the one that gets on the CPU next) So each task is handled equal. What i described was the first step - for each task the same step happens (whenever it gets on the CPU, and accounted/weighted for the precise period they spent waiting - so the second task would get +4 msecs credited, the third task +6 msecs, etc., etc.). but really - nothing beats first-hand experience: please just boot into a CFS kernel and test its precision a bit. You can pick it up from the usual place: http://people.redhat.com/mingo/cfs-scheduler/ For example start 10 CPU hogs at once from a shell: for (( N=0; N < 10; N++ )); do ( while :; do :; done ) & done [ type 'killall bash' in the same shell to get rid of them. ] then watch their CPU usage via 'top'. While the system is otherwise idle you should get something like this after half a minute of runtime: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2689 mingo 20 0 5968 560 276 R 10.0 0.1 0:03.45 bash 2692 mingo 20 0 5968 564 280 R 10.0 0.1 0:03.45 bash 2693 mingo 20 0 5968 564 280 R 10.0 0.1 0:03.45 bash 2694 mingo 20 0 5968 564 280 R 10.0 0.1 0:03.45 bash 2695 mingo 20 0 5968 564 280 R 10.0 0.1 0:03.45 bash 2698 mingo 20 0 5968 564 280 R 10.0 0.1 0:03.45 bash 2690 mingo 20 0 5968 564 280 R 9.9 0.1 0:03.45 bash 2691 mingo 20 0 5968 564 280 R 9.9 0.1 0:03.45 bash 2696 mingo 20 0 5968 564 280 R 9.9 0.1 0:03.45 bash 2697 mingo 20 0 5968 564 280 R 9.9 0.1 0:03.45 bash with each task having exactly the same 'TIME+' field in top. (the more equal those fields, the more precise/fair the scheduler is. In the above output each got its precise share of 3.45 seconds of CPU time.) then as a next phase of testing please run various things on the system (without stopping these loops) and try to get CFS "out of balance" - you'll succeed if you manage to get an unequal 'TIME+' field for them. Please try _really_ hard to break it. You can run any workload. Or try massive_intr.c from: http://lkml.org/lkml/2007/3/26/319 which uses a much less trivial scheduling pattern to test a scheduler's precision of scheduling) $ ./massive_intr 9 10 002765 00000125 002767 00000125 002762 00000125 002769 00000125 002768 00000126 002761 00000126 002763 00000126 002766 00000126 002764 00000126 (the second column is runtime - the more equal, the more precise/fair the scheduler.) Ingo

Related Links:

It will be cool If we can

Anonymous (not verified)
on
May 15, 2007 - 5:34am

It will be cool If we can have a nice paper wich describe this scheduler. (not the little .txt we have now...)

...and some graphs which

Anonymous (not verified)
on
May 15, 2007 - 8:29am

...and some graphs which make clear how it works.

BTW: IS CFS meant for systems running server apps or multimedia/desktop usage? Does it perform better with many processes or with few? What about older hardware?

Desktop Usage

jldugger (not verified)
on
May 15, 2007 - 10:01am

It should be fairly clear from the implication that time slice goes down with load that this isn't a scheduler for servers.

Nonsense

Anonymous (not verified)
on
May 15, 2007 - 2:52pm

Timeslice must go down with load in every scheduler. If it doesn't that leads to starvation.

You should be building your

Anonymous (not verified)
on
May 16, 2007 - 8:15am

You should be building your server hardware and manage the installed software such that resources don't become starved...

So is it something I would

Anonymous (not verified)
on
May 16, 2007 - 10:23am

So is it something I would want to use on my Zenwalk (ex Minislax, thus Slackware) DESKTOP? I dont run too many apps, to the time slices should be fairly "large".
htop says that i have 53 tasks and 2-3 running. My CPU isn't powerful: 500MHz PIII Coppermine.

Dream-world

slamb
on
May 25, 2007 - 8:58pm

Well, in a dream world you only run toy systems on hugely overpowered machines so there's no contention. Any scheduler will do.

In the real world, there's contention and you need to make choices.

It doesn't

slamb
on
May 25, 2007 - 8:56pm

The timeslice doesn't go down with load. The virtual clock's rate of advance does, but it's only used to pick which task gets to run next. Once a task is picked, I believe the time it runs is constant. That's why Ting Yang describe there being a lag of +/- one timeslice between the "ideal" amount of work done and the actual work done.

If the timeslices did go down with load, this scheduler would be worthless. Once the system got busy, it would just spend all of its time rescheduling processes and getting cache misses. It would never recover from the high load.

i agree. could anyone tell

Anonymous (not verified)
on
October 17, 2007 - 11:44pm

i agree.

could anyone tell me what "rq->fair_clock - p->wait_runtime" mean and
why rb-tree is sorted by this value?

How refreshing

Anonymous (not verified)
on
May 16, 2007 - 1:00pm

After reading the usual stories about patents, copyright, ya boo, etc. it was a refreshing change to read a discussion and explanation on fairness.

Scheduling in Cygwin

Ysangkok
on
May 19, 2007 - 11:39am


PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3116 Janus Tr 8 0 2068 2048 32 R 20 0.2 0:03.09 bash
3568 Janus Tr 8 0 2068 2052 32 R 20 0.2 0:03.43 bash
3416 Janus Tr 8 0 2068 2048 32 R 20 0.2 0:03.28 bash
3324 Janus Tr 8 0 2068 2052 32 R 20 0.2 0:04.93 bash
252 Janus Tr 8 0 2068 2052 32 R 20 0.2 0:02.56 bash
3272 Janus Tr 8 0 2068 2052 32 R 20 0.2 0:01.89 bash
3480 Janus Tr 8 0 2068 2052 32 R 19 0.2 0:03.59 bash
3544 Janus Tr 8 0 2068 2048 32 R 19 0.2 0:03.76 bash
3164 Janus Tr 8 0 2068 2052 32 R 19 0.2 0:03.51 bash
2916 Janus Tr 8 0 2068 2052 32 R 19 0.2 0:02.79 bash

I guess that means Windows sucks? :P (Okay, I know it's not fair to run it in Cygwin)

I couldn't compile massive_intr though (still in Cygwin, I don't know if I'm supposed to do that):

$ gcc -g -O0 -Wall -o massive_intr massive_intr.c
massive_intr.c: In function `main':
massive_intr.c:205: warning: implicit declaration of function `shm_open'
massive_intr.c:207: warning: implicit declaration of function `shm_unlink'
/cygdrive/c/DOCUME~1/JANUST~1/LOKALE~1/Temp/ccjXSf2h.o: In function `main':
/home/Janus Troelsen/massive_intr.c:205: undefined reference to `_shm_open'
/home/Janus Troelsen/massive_intr.c:207: undefined reference to `_shm_unlink'
collect2: ld returned 1 exit status

hello, i patched the kernel

Anonymous (not verified)
on
June 10, 2007 - 1:55am

hello, i patched the kernel with sched-cfs-v2.6.20.13-v16.patch, and now i have a question - how can i enable the CFS. there is nothing about it when i do the xconfig.
thank u for answears.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.