Quote: Sacrificing Performance To Achieve Determinism

Submitted by Jeremy
on May 21, 2008 - 2:38pm

"An RTOS (Real-time Operating System) will sacrifice performance to achieve determinism (low latencies). Several key features to an RT system usually come with a performance cost. A non RT system will perform 99% of the time faster than an RTOS. But all it takes is that one time to miss a deadline to make an RT system crash. An RTOS may be slightly slower, but it will not have those outliers that a normal desktop system would have."
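To put made-up numbers on that trade-off, here is a minimal Python sketch. The sample latencies, the 5 ms deadline and the `summarize` helper are all assumptions for illustration, not measurements from any real system.

```python
from statistics import mean


def summarize(latencies_ms, deadline_ms):
    """Return the average, the worst case, and the number of missed deadlines."""
    return {
        "mean": mean(latencies_ms),
        "worst": max(latencies_ms),
        "missed": sum(1 for x in latencies_ms if x > deadline_ms),
    }


# Hypothetical samples: 100 response times each, in milliseconds.
desktop = [1.0] * 99 + [50.0]  # faster 99% of the time, with one outlier
rtos = [2.0] * 100             # slower on average, but no outliers

print("desktop:", summarize(desktop, deadline_ms=5.0))
print("rtos:   ", summarize(rtos, deadline_ms=5.0))
```

The "desktop" list wins on the average, but only the "rtos" list never misses the deadline, which is the property the quote is about.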

I wonder is there any

Anonymous (not verified)
on
May 22, 2008 - 11:21am

I wonder whether there is any benefit to running an RT kernel on a desktop, say KDE 3.5. I know this is really not the purpose of an RT kernel, and non-time-critical applications wouldn't take any advantage of a realtime kernel, but would there be an overall increase in the responsiveness of the system, so that applications may load slightly slower but menus and other graphical elements would be rendered faster, especially on hardware that doesn't provide any rendering acceleration? And is this trade-off worth anything?

a bit pointless

Anonymouse (not verified)
on
May 22, 2008 - 6:38pm

RT systems are usually very specialized embedded devices - for example, a real-time video stream encoder such as the ones you might find in modern cinematography cameras. If you miss some data, you've screwed up a frame. Sure, odds are no one will notice, but that doesn't make it OK. There are numerous scientific measurement applications where losing a single piece of data in about 16 million measurements renders the entire measurement absolutely useless.

On your desktop systems, you may notice problems with video over USB or even audio over USB. An RT patch can help there - provided only that audio and video contend for the RT interrupts - if every other extremely low-priority USB device also contends for those interrupts you're screwed anyway (and in practice they will contend because the interrupts are shared). What really happens of course is that people try to escalate the priority, and then just put up with the remaining glitches in the audio and video.

Real time doesn't necessarily mean fast

Mr_Z
on
May 23, 2008 - 9:59am

I don't think it's pointless for a UI, personally. RT just means predictable latency. There is no specific latency that is considered "real time."

There are certainly hard real time systems for which late is truly, truly wrong, and for which the deadlines are incredibly tight. The servo control loop in your hard drive is an example. If it doesn't update the parameters in a timely fashion, you could bang the head against the wall of the case and destroy the hard drive. Furthermore, the deadlines are measured in microseconds in such a system. That is one of the harshest real time environments.

In the video recording example you give, the deadlines are sub-millisecond for receiving and encoding portions of frames. Many times a frame you receive the next batch of pixels and process them. How often this happens is a function of the buffering available to you. If you're recording to disk, you can buffer quite a bit, and so it's just the receive interrupts that matter. If you're in a video conference, you need to also get the encoding done and the bits on the wire to keep acceptable end-to-end latency. In either case, where you do have RT deadlines, the tightest are sub-millisecond. For the audio example you give, if you have low-latency (5ms or less) mixing, you have deadlines that are in the single-digit millisecond territory. Both are tough.
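As a rough worked example of how buffering sets the deadline for audio: if you assume the deadline is one buffer period, it is just buffer size divided by sample rate. The 48 kHz rate, the buffer sizes and the `buffer_period_ms` helper below are my own illustrative assumptions.

```python
def buffer_period_ms(frames_per_buffer, sample_rate_hz):
    """Time covered by one buffer, i.e. how long you have to refill it."""
    return 1000.0 * frames_per_buffer / sample_rate_hz


# Assumed 48 kHz audio; smaller buffers mean lower latency but tighter deadlines.
for frames in (64, 128, 256, 1024):
    period = buffer_period_ms(frames, 48_000)
    print(f"{frames:5d} frames -> {period:.2f} ms per buffer")
```

Under these assumptions, staying at or below 5 ms means a buffer of 240 frames or fewer, so every refill has to land inside that window.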

Human interaction in a UI doesn't generally require such tight deadlines. That doesn't mean human interaction should have NO deadlines. I don't think it's too much to ask that whenever I push a button on my mouse or keyboard, the UI reflects that input within, say, 100ms. Current UIs reflect my input anywhere from microseconds later to multiple whole seconds later. Depending on what's happening, the response time is effectively unbounded, and that's unacceptable when considered from a "real time" perspective. Designing a UI that decouples itself from any source of latency that keeps it from guaranteeing a response time of 100ms or better is still an RT design. It's just not so dramatically hardcore as a servo control or even a low latency audio mixer. The deadlines are a couple of orders of magnitude more lax. The point, though, is that I think there should be deadlines.
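A minimal sketch of that decoupling idea, in Python: hand the slow work to a background thread and answer the user within the budget either way. The `respond` and `slow_job` names and the placeholder strings are hypothetical, and plain Python threads give no hard guarantee, so this only illustrates the structure, not a real RT implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

UI_DEADLINE_S = 0.1  # the 100 ms budget suggested above


def respond(executor, slow_work):
    """Start slow_work in the background and answer within the deadline."""
    future = executor.submit(slow_work)
    try:
        return "done: " + str(future.result(timeout=UI_DEADLINE_S))
    except TimeoutError:
        # Acknowledge the input now; the real result can be shown later.
        return "working..."


def slow_job():
    """Stand-in for something with unpredictable latency (disk, network)."""
    time.sleep(0.5)
    return 42


with ThreadPoolExecutor(max_workers=2) as pool:
    print(respond(pool, lambda: 1 + 1))  # finishes well inside the budget
    print(respond(pool, slow_job))       # does not, so the UI says so
```

The user-facing call returns after at most roughly the deadline, whatever the background work is doing.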

--
Program Intellivision and play Space Patrol!

distributed

olecom
on
May 24, 2008 - 2:59pm

The servo control loop in your hard drive is an example. If it doesn't update the parameters in a timely fashion, you could bang the head against the wall of the case and destroy the hard drive.

There are many servo controllers there, e.g. spindle rotation control (uniform platter rotation rate), precise positioning of the heads on the desired track (circular position), and a quick transition from track to track. Digital control is done by sampling control systems, where sampling frequency is the main parameter. High response performance is generally required, which is limited by available DSP/CPU, algorithms and implementation.
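A toy illustration of why sampling frequency matters so much, in Python. The plant here is a made-up pure integrator with a proportional controller, nothing like a real drive servo; `simulate`, the gain and the step sizes are assumptions chosen only to show the effect.

```python
def simulate(kp, dt, steps, setpoint=1.0):
    """Proportional control of a pure integrator, updated once per sample."""
    x = 0.0
    for _ in range(steps):
        u = kp * (setpoint - x)  # controller output, held until the next sample
        x += u * dt              # plant integrates the held output over dt
    return x


# Same gain, different sampling periods.
print("dt=0.01:", simulate(kp=5.0, dt=0.01, steps=200))  # settles near 1.0
print("dt=0.5: ", simulate(kp=5.0, dt=0.5, steps=20))    # oscillates and grows
```

With this model the error is multiplied by (1 - kp*dt) every sample, so the loop is only stable while kp*dt stays below 2. Sample too slowly for the gain and the same controller becomes unstable.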

Unlike a PC, where many subsystems (NIC, HDD, VGA, peripherals) all run through a single virtual/physical memory, a single interrupt controller, and CPU(s) with one kernel and userspace, they deal with the behaviour of so-called plants, which can generally be studied in a lab. Thus closed-loop, optimal, adaptive control can be developed. If the working conditions are met (g-shock, vibration, EMI, other noise), the controllers will do their task. Their number may vary even for a single plant.

Add the many load/usage patterns of a PC, and the fun starts. Centralization isn't the way to go. This is like networking, where increasing load is now handled in the NIC (hardware flow/IRQ control), Ethernet switches, etc.

I'd rather see the kernel as a set of firmware modules, uploaded to each device, controlling performance and power, and doing the job.

Network printers, cameras and other devices, that can download stuff directly to HDD, HDD itself with simple Norton-Commander like capability/USB/Ethernet, etc -- this is the way.

x86(s) after all, can be just a "DSP" for running gcc and friends, controlled by "thin-client" controller. GPU/VGA has X-Server as firmware, keyboard/mouse controller do TTY/pointing supervising job.

I hope that Open Source will finally bring Open Hardware, so all this will be possible. E.g. game console and compatible hardware to build ordinary PC functionality, set up kernel, upload its firmware modules, run some distribution and have another fun.

Like with GSM modems or HDD, many parts will be closed and must be closed, but pushing management closer to them, interconnecting with something like ethernet or multilayer bus matrix is quite reasonable imho. Maybe such a distributed design doesn't fit the current fashion, but i hope it will.

It exists today, somewhat

Mr_Z
on
May 25, 2008 - 5:55pm

You've roughly described how embedded SoCs work today, such as OMAP. You have dedicated processors for all the different elements, working as a loose federation.

There is still a centralizing non-real-time element, though, that annoys me. The various hard-real-time pieces--the cellular modem, the voice codecs, music playback, etc.--all behave properly and are truly real time. The UI running on the ARM, though, is frustratingly not so. (This goes back to my earlier complaint of "at least make my phone work.") The hardware is all there for it but the software lags behind.

i'm not sure

olecom
on
May 30, 2008 - 3:03am

I'm not sure, i haven't used them much. All i can say is that my schema is more like laptop (msft or linux or...) + server (linux devel., gcc, etc.) connected via Ethernet.

But i want it all-in-one-box with more flexible hardware. For example i'd like to have a keyboard controller based on MSP430 http://www.ti.com/litv/pdf/slaa139 so i can reprogram it as i like; same with VGA, where i can put my basic text-based user interface, which was running OK on i286 back in 9X and which will be real time for me, because i'll connect VGA and MSP430 directly (using one highest priority master slot on the control bus matrix).

Everything is programmable and main control interface is common wide bus like Ethernet (not PCI or USB). More bandwidth for video or audio can be done with parallel Ethernet 100Mbit or 1Gbit or something like that.

I'd like to see someone try it

Mr_Z
on
May 22, 2008 - 1:05pm

I would *love* it if someone would put together an RT UI toolkit and build an RT desktop environment around it. I don't care if processing tasks take 20% or 30% longer, or even 4× as long, as long as the desktop and UI remain rock solid responsive.

Now, just slapping an RT scheduler under KDE or GNOME won't do it. You have a non-RT coding methodology in user space just as in kernel space. You'd need to design your GUI for RT from the get-go, threading portions and decoupling user-facing aspects from those with more unpredictable latencies. Furthermore, you'd have to probably lock significant portions of the UI into RAM to avoid paging penalties. This would require carefully containing the UI elements, and not just linking random libraries willy-nilly.
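For the "lock the UI into RAM" part, the usual tool on Linux is mlockall(2). Here is a hedged Python sketch that calls it through ctypes. The `lock_all_memory` helper is hypothetical, and the MCL_* values are the ones I'd assume for x86 Linux, so check your platform's headers. It needs enough RLIMIT_MEMLOCK or the right privileges, otherwise it just reports failure.

```python
import ctypes
import os

# Assumed flag values (x86 Linux); other architectures may differ.
MCL_CURRENT = 1  # lock pages that are mapped right now
MCL_FUTURE = 2   # also lock pages mapped later


def lock_all_memory():
    """Try to pin this process's memory in RAM. Returns (ok, message)."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        rc = libc.mlockall(MCL_CURRENT | MCL_FUTURE)
    except (OSError, AttributeError, TypeError) as exc:
        return False, "mlockall not available: " + str(exc)
    if rc != 0:
        return False, os.strerror(ctypes.get_errno())
    return True, "locked"
```

A UI process would call `lock_all_memory()` once at startup and keep the lock for its lifetime. That is also why the UI has to be carefully contained: everything it links in gets pinned too.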

It's doable, though. Heck, the Atari 2600 provided the ultimate in RT user interface. It computed each pixel to display just microseconds before it was beamed onto the face of the CRT. Now that is real time! If a 1MHz 6507 can do it, my multi-GHz x86-64 system ought to be able to.

And even if you don't want to do it on my PC, can you at least do it on my phone? I hate the unpredictable response time higher-end phones seem to have. I love my Nokia N82, but it has some maddening pauses in its UI.

tty

olecom
on
May 22, 2008 - 4:57pm

I like text mode. It worked for me on i286 and, i think, it should be so even decade before that, it works for me now (but in consoles on all kinds of terminal emulators).

TTY layer (with all that BKL fun) -- is an anachronism. Emulation of a dumb terminal on a high tech CPU. man console_codes, man stty is all it (and any X-term emulator) does. I have no profiling evidence, but i think pure TERM=linux makes `mplayer` music stop if a big text console chunk has arrived from ssh. OOM times, heavy load, etc. -- just have fun waiting for your user I/O in console.

What i don't like is that there's no TUI widget library, while in 9X Turbo Vision on any CPU was perfect. I think text editing (one key press per 0.1-1 second) or paging/scrolling of 4k (screen size) of text isn't a big deal for hardware or RT scheduling. Yet it is.

Also i'd like to have tty layer at least with basic TUI widgets/scroll buffers/whatever in all that today's video RAM and shaders. So i can use it with ESC codes in shell without useless complexitty and slowdown -- yes hardware `more` or `less`, keyboard input is handled just by wires, no kernel support needed. What's the problem? Something like do-it-yourself, it's-to-much-for-(n)-cool-CPU(s), to-userspace. Ouu! It's another hardware terminal already.

After this somebody wants graphics in kernel? Something is wrong here (we need another sched or fs:). Maybe project-VGA guys can help with this? Basically what i need is a simple uC with FLASH right next to high-res video DAC or DVI backend. Any simple uC now has more MHz than the i286 (or whatever older) had; RAM, ROM as all IBM PC peripherals (color text mode adapter, keyboard controller, etc).
____

Also i'd like to have tty

intgr
on
May 23, 2008 - 2:14am

Also i'd like to have tty layer at least with basic TUI widgets/scroll buffers/whatever in all that today's video RAM and shaders. So i can use it with ESC codes in shell without useless complexitty and slowdown

If you think utilizing the GPU for console drawing is less complex than what we have now, then I'm afraid you are severely mistaken.

RT?

ccurtis
on
May 22, 2008 - 5:02pm

I think Alan Cox is tackling the TTY layer problems. I wish him well.

And what do you mean by a TUI? For scripting, there's dialog and whiptail; and there's always ncurses, including its perl interface.

As for graphics in the kernel - this is a Good Thing, inasmuch as what's currently happening with X, modeswitching, and APM/ACPI is not. It's unfortunate that GGI/KGI seems to be bitrotting.

What any of this has to do with RT: I have no idea.

RT means determinism. On

olecom
on
May 22, 2008 - 5:59pm

RT means determinism. On modern CPU and hardware, slow user input determinism is not an option. I'm asking for a hard limit of 1/10 of a second for keyboard input and new kilos of text on screen. It's even slower than the modern internet. If this doesn't work, then why not create programmable hardware add-ons to offload the bottleneck of using the CPU(s)?

This one-CPU (e.g. all IRQ lines), one-kernel (any dumb LED or key needs a driver to run on the CPU) thing is getting nowhere on personal computers for me. It seems like history must repeat itself: microprocessor for I/O, microprocessor for ACPI, microprocessor for RT kernel, and so on.

BTW, comparing featured text user interface libraries like Turbo Vision to ncurses with friends is kind of irrelevant.

TurboVision

ccurtis
on
May 22, 2008 - 7:05pm

I must be tired. Either way, I proclaim ignorance of TurboVision. It seems to have been ported to Linux though: http://tvision.sourceforge.net/

Would this make for a better links than elinks?

Hmm

Anonymous (not verified)
on
May 22, 2008 - 4:59am

This probably is impossible...

But what if you could have two CPU schedulers.
One RT and one normal.

RT for one core, normal like CFS for the other core.
Then you can have RT and fast system?

CPU shielding

rostedt
on
May 22, 2008 - 6:17am

That is a common practice and is called CPU shielding: you isolate the RT tasks on one CPU and run the non-RT tasks on another CPU.
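From user space, the basic building block looks roughly like this in Python. `split_cpus` and `pin_current_process` are hypothetical helper names; `os.sched_setaffinity` is Linux-only, and affinity by itself does not keep other tasks or interrupts off the shielded CPU, so treat this as a sketch of the partitioning only.

```python
import os


def split_cpus(cpus, n_shielded):
    """Reserve the highest-numbered CPUs for RT work; the rest get everything else."""
    ordered = sorted(cpus)
    if not 0 < n_shielded < len(ordered):
        raise ValueError("need at least one CPU on each side")
    return set(ordered[-n_shielded:]), set(ordered[:-n_shielded])


def pin_current_process(cpus):
    """Restrict the calling process to the given CPUs (Linux only)."""
    os.sched_setaffinity(0, cpus)


if hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)
    if len(allowed) > 1:
        rt_cpus, other_cpus = split_cpus(allowed, 1)
        print("RT CPUs:", rt_cpus, "other CPUs:", other_cpus)
        # An RT task would call pin_current_process(rt_cpus);
        # everything else would be pinned to other_cpus.
```

A full shield also has to move the non-RT tasks and IRQs away, which is normally done with cpusets or boot-time CPU isolation rather than per-process calls.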

But just separating the RT tasks to their own CPUs does not cut it for RT tasks in general. You usually have more than one RT task and they vary in priorities. The RT tasks of higher priority still need to have fast reaction times and thus you still need short latencies.

To achieve low latencies, the RT patched kernel converts spinlocks into sleeping mutexes. A spinlock under contention will simply spin and never schedule away. This is done because spinlocks are supposed to be held for small periods of time. Unfortunately, there are places in the kernel that can hold spinlocks for milliseconds.

Since all tasks use the same code, we cannot just convert the spinlocks into mutexes for just the RT tasks running on the RT CPU, especially since these locks will still contend with the non-RT tasks on the other CPUs. This conversion is what causes the overhead on non-RT tasks. It is quicker to spin than to schedule out, but to do so would increase the latencies of other tasks.

This is why the adaptive spinlocks helped out a lot. When contention happened, instead of simply scheduling away, a check would be made to see if the owner of the lock was still running. If so, the blocked task would spin, with preemption on, and would sleep if the owner slept, or grab the lock when the owner released it.
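The decision a blocked task makes can be written down as a small toy model in Python. This is not kernel code: `adaptive_wait` and the state names are made up, and the "owner states" are simply what the waiter would observe each time it checks on the lock owner.

```python
def adaptive_wait(owner_states):
    """Toy model of a waiter on a contended adaptive lock.

    owner_states is what the waiter sees on each check of the lock owner:
    "running", "sleeping", or "released".
    """
    spins = 0
    for state in owner_states:
        if state == "released":
            return ("acquired", spins)  # owner let go while we were spinning
        if state == "sleeping":
            return ("slept", spins)     # owner is off-CPU, so spinning is wasted
        spins += 1                      # owner still running: keep spinning
    return ("spinning", spins)


# Owner finishes quickly: the waiter gets the lock without a context switch.
print(adaptive_wait(["running", "running", "released"]))
# Owner goes to sleep while holding the lock: the waiter sleeps too.
print(adaptive_wait(["running", "sleeping"]))
```

The first case is where the overhead is saved: short hold times are covered by a brief spin instead of a schedule-out and wake-up.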

Wikipedia

Anonymous (not verified)
on
May 23, 2008 - 9:42am

Please help improve the Wikipedia article on CPU shielding.
* http://en.wikipedia.org/wiki/CPU_shielding