New Kernel Speed up Patch File from Mike Galbraith?

Has anyone seen this stuff about a kernel speed-up patch? Here is one article:

The Linux desktop may soon be a lot faster - Computerworld Blogs

And yet another, even stranger, kernel speed-up here:

Alternative To The “200 Lines Kernel Patch That Does Wonders” Which You Can Use Right Away ~ Web Upd8: Ubuntu / Linux blog

I have no idea if any of this stuff is true, but let’s see some of our high-power guys check it out and let us know what is true or not.

Thank You,

There have actually been threads about this issue, and yes, it’s “true” (there are some pretty impressive benchmarks on it). This patch will be part of upcoming Linux kernels.

SJVN may be waxing a bit optimistic. I believe the purpose of the original patch and the followup shortcut is to improve interactivity under hefty load, e.g. large compiles. If you are not doing anything strenuous on your CPU, you probably won’t notice any improvement. So as such it isn’t a kernel speed up patch, but rather an interactivity improvement patch. This is the original post and the comments attached to it are more informative than second or third hand speculations.

LKML: Mike Galbraith: [RFC/RFT PATCH] sched: automated per tty task groups

There has been a lot on this ‘new’ faster kernel. Yet some say (such as Lennart Poettering, a Red Hat developer) that one does not even need that patch in the new kernel to get the better performance.

I am still trying to wrap my head around this interesting post (which is in DIRECT reply to the faster kernel claims): Alternative To The “200 Lines Kernel Patch That Does Wonders” Which You Can Use Right Away ~ Web Upd8: Ubuntu / Linux blog which jdmcdaniel3 also quotes. In the end, I decided I was simply too thick and not experienced enough in Linux to understand the alternative to the 200 line kernel patch. Is that applicable to all kernels? Is it applicable to openSUSE in addition to Red Hat ?

… I’m hoping someone can provide some interesting guidance here.

Hmmm … how about rendering video?

There are times when I set up a batch job and have my Core i7 920 busy rendering multiple videos for 12 hours or more at a time.

Yes, transcoding video and music would probably load the CPU.

I found this, which was reported to be the speed-up patch, but I surely could be wrong. If one had the source code for a kernel, how would this patch be applied for testing?

    
Subject    [RFC/RFT PATCH] sched: automated per tty task groups
From    Mike Galbraith <>
Date    Tue, 19 Oct 2010 11:16:04 +0200

Greetings,

Comments, suggestions etc highly welcome.

This patch implements an idea from Linus, to automatically create task groups
per tty, to improve desktop interactivity under hefty load such as kbuild.  The
feature is enabled from boot by default.  The default setting can be changed via
the boot option ttysched=0, and can be turned on or off on the fly via
echo [01] > /proc/sys/kernel/sched_tty_sched_enabled.

A 100% hog overhead measurement proggy pinned to the same CPU as a make -j10:

pert/s:      229 >5484.43us:       41 min:  0.15 max:12069.42 avg:2193.81 sum/s:502382us overhead:50.24%
pert/s:      222 >5652.28us:       43 min:  0.46 max:12077.31 avg:2248.56 sum/s:499181us overhead:49.92%
pert/s:      211 >5809.38us:       43 min:  0.16 max:12064.78 avg:2381.70 sum/s:502538us overhead:50.25%
pert/s:      223 >6147.92us:       43 min:  0.15 max:16107.46 avg:2282.17 sum/s:508925us overhead:50.49%
pert/s:      218 >6252.64us:       43 min:  0.16 max:12066.13 avg:2324.11 sum/s:506656us overhead:50.27%

Signed-off-by: Mike Galbraith <efault@gmx.de>

---
 drivers/char/tty_io.c |    2 
 include/linux/sched.h |   14 +++++
 include/linux/tty.h   |    3 +
 init/Kconfig          |   13 +++++
 kernel/sched.c        |    9 +++
 kernel/sched_tty.c    |  128 ++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/sched_tty.h    |    7 ++
 kernel/sysctl.c       |   11 ++++
 8 files changed, 186 insertions(+), 1 deletion(-)
Index: linux-2.6.36.git/include/linux/sched.h
===================================================================
--- linux-2.6.36.git.orig/include/linux/sched.h
+++ linux-2.6.36.git/include/linux/sched.h
@@ -1900,6 +1900,20 @@ int sched_rt_handler(struct ctl_table *t
 
 extern unsigned int sysctl_sched_compat_yield;
 
+#ifdef CONFIG_SCHED_DESKTOP
+int sched_tty_sched_handler(struct ctl_table *table, int write,
+        void __user *buffer, size_t *lenp,
+        loff_t *ppos);
+
+extern unsigned int sysctl_sched_tty_sched_enabled;
+
+void tty_sched_create_group(struct tty_struct *tty);
+void tty_sched_destroy_group(struct tty_struct *tty);
+#else
+static inline void tty_sched_create_group(struct tty_struct *tty) { }
+static inline void tty_sched_destroy_group(struct tty_struct *tty) { }
+#endif
+
 #ifdef CONFIG_RT_MUTEXES
 extern int rt_mutex_getprio(struct task_struct *p);
 extern void rt_mutex_setprio(struct task_struct *p, int prio);
Index: linux-2.6.36.git/include/linux/tty.h
===================================================================
--- linux-2.6.36.git.orig/include/linux/tty.h
+++ linux-2.6.36.git/include/linux/tty.h
@@ -327,6 +327,9 @@ struct tty_struct {
     /* If the tty has a pending do_SAK, queue it here - akpm */
     struct work_struct SAK_work;
     struct tty_port *port;
+#ifdef CONFIG_SCHED_DESKTOP
+    struct task_group *tg;
+#endif
 };
 
 /* Each of a tty's open files has private_data pointing to tty_file_private */
Index: linux-2.6.36.git/kernel/sched.c
===================================================================
--- linux-2.6.36.git.orig/kernel/sched.c
+++ linux-2.6.36.git/kernel/sched.c
@@ -78,6 +78,7 @@
 
 #include "sched_cpupri.h"
 #include "workqueue_sched.h"
+#include "sched_tty.h"
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/sched.h>
@@ -612,11 +613,16 @@ static inline int cpu_of(struct rq *rq)
  */
 static inline struct task_group *task_group(struct task_struct *p)
 {
+    struct task_group *tg;
     struct cgroup_subsys_state *css;
 
     css = task_subsys_state_check(p, cpu_cgroup_subsys_id,
             lockdep_is_held(&task_rq(p)->lock));
-    return container_of(css, struct task_group, css);
+    tg = container_of(css, struct task_group, css);
+
+    tty_sched_check_attach(p, &tg);
+
+    return tg;
 }
 
 /* Change a task's cfs_rq and parent entity if it moves across CPUs/groups */
@@ -1920,6 +1926,7 @@ static void deactivate_task(struct rq *r
 #include "sched_idletask.c"
 #include "sched_fair.c"
 #include "sched_rt.c"
+#include "sched_tty.c"
 #ifdef CONFIG_SCHED_DEBUG
 # include "sched_debug.c"
 #endif
Index: linux-2.6.36.git/drivers/char/tty_io.c
===================================================================
--- linux-2.6.36.git.orig/drivers/char/tty_io.c
+++ linux-2.6.36.git/drivers/char/tty_io.c
@@ -185,6 +185,7 @@ void free_tty_struct(struct tty_struct *
 {
     kfree(tty->write_buf);
     tty_buffer_free_all(tty);
+    tty_sched_destroy_group(tty);
     kfree(tty);
 }
 
@@ -2823,6 +2824,7 @@ void initialize_tty_struct(struct tty_st
     tty->ops = driver->ops;
     tty->index = idx;
     tty_line_name(driver, idx, tty->name);
+    tty_sched_create_group(tty);
 }
 
 /**
Index: linux-2.6.36.git/kernel/sched_tty.h
===================================================================
--- /dev/null
+++ linux-2.6.36.git/kernel/sched_tty.h
@@ -0,0 +1,7 @@
+#ifdef CONFIG_SCHED_DESKTOP
+static inline void
+tty_sched_check_attach(struct task_struct *p, struct task_group **tg);
+#else
+static inline void
+tty_sched_check_attach(struct task_struct *p, struct task_group **tg) { }
+#endif
Index: linux-2.6.36.git/kernel/sched_tty.c
===================================================================
--- /dev/null
+++ linux-2.6.36.git/kernel/sched_tty.c
@@ -0,0 +1,128 @@
+#ifdef CONFIG_SCHED_DESKTOP
+#include <linux/tty.h>
+
+unsigned int __read_mostly sysctl_sched_tty_sched_enabled = 1;
+
+void tty_sched_create_group(struct tty_struct *tty)
+{
+    tty->tg = sched_create_group(&init_task_group);
+    if (IS_ERR(tty->tg)) {
+        tty->tg = &init_task_group;
+         WARN_ON(1);
+    }
+}
+EXPORT_SYMBOL(tty_sched_create_group);
+
+void tty_sched_destroy_group(struct tty_struct *tty)
+{
+    if (tty->tg && tty->tg != &init_task_group)
+        sched_destroy_group(tty->tg);
+}
+EXPORT_SYMBOL(tty_sched_destroy_group);
+
+static inline void
+tty_sched_check_attach(struct task_struct *p, struct task_group **tg)
+{
+    struct tty_struct *tty;
+    int attach = 0, enabled = sysctl_sched_tty_sched_enabled;
+
+    rcu_read_lock();
+    tty = p->signal->tty;
+    if (!tty)
+        goto out_unlock;
+
+    if (enabled && *tg == &root_task_group) {
+        *tg = p->signal->tty->tg;
+        attach = 1;
+    } else if (!enabled && *tg == tty->tg) {
+        *tg = &root_task_group;
+        attach = 1;
+    }
+
+    if (attach && !p->se.on_rq) {
+        p->se.vruntime -= cfs_rq_of(&p->se)->min_vruntime;
+        p->se.vruntime += (*tg)->cfs_rq[task_cpu(p)]->min_vruntime;
+    }
+
+out_unlock:
+    rcu_read_unlock();
+}
+
+void tty_sched_move_task(struct task_struct *p, struct task_group *tg)
+{
+    struct sched_entity *se = &p->se;
+    struct rq *rq;
+    unsigned long flags;
+    int on_rq, running, cpu;
+
+    rq = task_rq_lock(p, &flags);
+
+    running = task_current(rq, p);
+    on_rq = se->on_rq;
+    cpu = rq->cpu;
+
+    if (on_rq)
+        dequeue_task(rq, p, 0);
+    if (unlikely(running))
+        p->sched_class->put_prev_task(rq, p);
+
+    if (!on_rq)
+        se->vruntime -= cfs_rq_of(se)->min_vruntime;
+
+    se->cfs_rq = tg->cfs_rq[cpu];
+    se->parent = tg->se[cpu];
+
+    p->rt.rt_rq  = tg->rt_rq[cpu];
+    p->rt.parent = tg->rt_se[cpu];
+
+    if (!on_rq)
+        se->vruntime += cfs_rq_of(se)->min_vruntime;
+
+    if (unlikely(running))
+        p->sched_class->set_curr_task(rq);
+    if (on_rq)
+        enqueue_task(rq, p, 0);
+
+    task_rq_unlock(rq, &flags);
+}
+
+int sched_tty_sched_handler(struct ctl_table *table, int write,
+        void __user *buffer, size_t *lenp,
+        loff_t *ppos)
+{
+    struct task_struct *p, *t;
+    struct task_group *tg;
+    unsigned long flags;
+    int ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+
+    if (ret || !write)
+        return ret;
+
+    read_lock_irqsave(&tasklist_lock, flags);
+
+    rcu_read_lock();
+    for_each_process(p) {
+        tg = task_group(p);
+        tty_sched_move_task(p, tg);
+        list_for_each_entry_rcu(t, &p->thread_group, thread_group) {
+            tty_sched_move_task(t, tg);
+        }
+    }
+    rcu_read_unlock();
+
+    read_unlock_irqrestore(&tasklist_lock, flags);
+
+    return 0;
+}
+
+static int __init setup_tty_sched(char *str)
+{
+    unsigned long val;
+
+    val = simple_strtoul(str, NULL, 0);
+    sysctl_sched_tty_sched_enabled = val ? 1 : 0;
+
+    return 1;
+}
+__setup("ttysched=", setup_tty_sched);
+#endif
Index: linux-2.6.36.git/kernel/sysctl.c
===================================================================
--- linux-2.6.36.git.orig/kernel/sysctl.c
+++ linux-2.6.36.git/kernel/sysctl.c
@@ -384,6 +384,17 @@ static struct ctl_table kern_table[] = {
         .mode        = 0644,
         .proc_handler    = proc_dointvec,
     },
+#ifdef CONFIG_SCHED_DESKTOP
+    {
+        .procname    = "sched_tty_sched_enabled",
+        .data        = &sysctl_sched_tty_sched_enabled,
+        .maxlen        = sizeof(unsigned int),
+        .mode        = 0644,
+        .proc_handler    = sched_tty_sched_handler,
+        .extra1        = &zero,
+        .extra2        = &one,
+    },
+#endif
 #ifdef CONFIG_PROVE_LOCKING
     {
         .procname    = "prove_locking",
Index: linux-2.6.36.git/init/Kconfig
===================================================================
--- linux-2.6.36.git.orig/init/Kconfig
+++ linux-2.6.36.git/init/Kconfig
@@ -652,6 +652,19 @@ config DEBUG_BLK_CGROUP
 
 endif # CGROUPS
 
+config SCHED_DESKTOP
+    bool "Desktop centric group scheduling"
+    depends on EXPERIMENTAL
+    select CGROUPS
+    select CGROUP_SCHED
+    select FAIR_GROUP_SCHED
+    select RT_GROUP_SCHED
+    select BLK_CGROUP
+    help
+      This option optimizes the group scheduler for common desktop workloads,
+      by creating separate per tty groups. This separation of workloads isolates
+      aggressive CPU burners (like build jobs) from desktop applications.
+
 config MM_OWNER
     bool

If this is bulletin sheet, then we need to know.
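
For anyone who does end up booting a kernel built with this patch, the runtime switch described in the patch notes could be wrapped in a couple of shell helpers. This is only a minimal sketch and not part of the patch: the helper names tty_sched_status and tty_sched_set and the TTY_SCHED_KNOB variable are made up for illustration, and the /proc file exists only on a kernel carrying this RFC patch with CONFIG_SCHED_DESKTOP enabled.

```shell
#!/bin/sh
# Path taken from the patch description; it is absent on kernels that do not
# carry this RFC patch (assumption: CONFIG_SCHED_DESKTOP is enabled).
TTY_SCHED_KNOB=${TTY_SCHED_KNOB:-/proc/sys/kernel/sched_tty_sched_enabled}

# Print "enabled", "disabled", "unknown", or "unavailable" for the knob file.
tty_sched_status() {
    knob=${1:-$TTY_SCHED_KNOB}
    if [ ! -r "$knob" ]; then
        echo "unavailable"
        return 0
    fi
    case "$(cat "$knob")" in
        1) echo "enabled" ;;
        0) echo "disabled" ;;
        *) echo "unknown" ;;
    esac
}

# Write 0 or 1 to the knob file (needs root on a real system).
tty_sched_set() {
    value=$1
    knob=${2:-$TTY_SCHED_KNOB}
    case "$value" in
        0|1) echo "$value" > "$knob" ;;
        *) echo "value must be 0 or 1" >&2; return 1 ;;
    esac
}

tty_sched_status
```

On a stock kernel the last line simply reports "unavailable", which is a quick way to tell whether the running kernel has the feature at all.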

Thank You,

On 11/19/2010 04:36 PM, jdmcdaniel3 wrote:
>
> I found this which was reported to be the speed up patch, but I surely
> could be wrong. If one had the source code for a kernel, how would this
> patch be applied for testing.

It can only be applied if you have a 2.6.37-rcX kernel. Then you use the usual
patch mechanism. If that does not mean anything to you, then you probably do not
have the skills necessary.

The patch does not speed up the kernel. A batch job that takes 2 hours, will
still take close to 2 hours. What it does is improve the responsiveness of the
kernel by modifying the scheduler. Interactive jobs will be affected the most.
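
The difference between speed and responsiveness can be put into rough numbers. The sketch below assumes ideal fair sharing on a single CPU, with every task CPU-bound and equally weighted, which real workloads only approximate; the function names are hypothetical. It compares the share one interactive task gets against a make -j10 when all tasks are peers with the share it gets when each tty forms its own group, which is in line with the roughly 50% overhead figures in the patch posting.

```shell
#!/bin/sh
# Back-of-envelope CPU shares for one interactive task competing with N
# equally weighted, CPU-bound batch tasks on a single CPU.
# Assumption: ideal fair sharing, all tasks runnable all the time.

# Flat scheduling: every task is a peer, so the interactive task gets 1/(N+1).
flat_share_pct() {
    batch_tasks=$1
    echo $(( 100 / (batch_tasks + 1) ))
}

# Per-tty grouping: the CPU is first split evenly between the groups (ttys)
# that have runnable tasks, then evenly among the tasks inside each group.
grouped_share_pct() {
    groups=$1              # number of ttys with runnable tasks
    tasks_in_my_group=$2   # runnable tasks on the interactive task's tty
    echo $(( 100 / (groups * tasks_in_my_group) ))
}

echo "flat, make -j10 plus 1 task:          $(flat_share_pct 10)%"
echo "grouped, 2 ttys, 1 task on its own:   $(grouped_share_pct 2 1)%"
```

Under these assumptions the interactive task goes from about 9% to 50% of the CPU. The total work the CPU can do is unchanged, so the batch job does not finish any sooner.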

> It can only be applied if you have a 2.6.37-rcX kernel. Then you use the usual
> patch mechanism. If that does not mean anything to you, then you probably do not
> have the skills necessary.

Yes, I guess you are right. Only people born with the knowledge should be compiling kernels with patches.

Thank You,

On 11/19/2010 10:06 PM, jdmcdaniel3 wrote:
>
>> It can only be applied if you have a 2.6.37-rcX kernel. Then you use the
>> usual patch mechanism. If that does not mean anything to you, then you
>> probably do not have the skills necessary.
>
> Yes, I guess you are right. Only people born with the knowledge should
> be compiling kernels with patches.

You don’t need to be born with that knowledge. In fact, computers had not yet
been invented when I was born, but I have worked very hard since 1995 in
building kernels for special purposes, and since 2000 in debugging kernels and
drivers. It does require some effort to learn, and when I see a question like
yours, it is clear that you are not quite ready. To answer your question, one
would use the “patch” utility. One thing, patch is finicky, and one does need to
know how to recover when it fails.
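
Since patch is finicky, one cautious routine is to do a dry run first and touch the tree only if that succeeds. The following is a minimal sketch under stated assumptions: try_patch and revert_patch are illustrative names, the patch file is given as an absolute path (the helper changes directory), and the patch uses the path layout seen in the posting, where one leading directory component has to be stripped.

```shell
#!/bin/sh
# Try a patch against a source tree without modifying anything first;
# apply it for real only if the dry run is clean.
# Usage: try_patch /path/to/linux-tree /absolute/path/to/file.patch
try_patch() {
    tree=$1
    patchfile=$2
    (
        cd "$tree" || exit 1
        # -p1 strips the leading path component (e.g. "linux-2.6.36.git/").
        if patch -p1 --dry-run < "$patchfile" >/dev/null; then
            patch -p1 < "$patchfile"
        else
            echo "patch does not apply cleanly; tree left untouched" >&2
            exit 1
        fi
    )
}

# Back the change out again (only meaningful after a successful apply).
# Usage: revert_patch /path/to/linux-tree /absolute/path/to/file.patch
revert_patch() {
    ( cd "$1" && patch -p1 -R < "$2" )
}
```

A patch copied out of a web page often has its tabs turned into spaces, which is a common reason for a dry run to fail; fetching the raw message from the mailing list archive is usually the safer route.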

lwfinger, I would like to say that the primary difference between people of average intelligence is the tools they have to use and their knowledge of how those tools work. What that means to me is that those who know how a tool works need merely lay out the order and the tools to be used in order to bestow that knowledge on someone else. For various reasons, no one is obligated to tell someone else what they know, but I soundly reject the notion that, quote, “If that does not mean anything to you, then you probably do not have the skills necessary.” I also wonder about the statement “It can only be applied if you have a 2.6.37-rcX kernel.” when I can see in the text of the patch the kernel name linux-2.6.36.git.orig.

I know that you are very smart and knowledgeable about Linux and Linux kernels and that indeed you try to help people here every day to the best of your ability. If you don’t think the average Joe should be compiling or patching kernels, I think I would just keep silent on the subject and leave everyone else to their own devices. Perhaps I got up on the wrong side of the bed yesterday, but this one response on the subject of patching kernels just rubbed me the wrong way, and I apologize in advance for anything I have said in this message that might be wrong.

Thank You,

Just give it a go, just make sure you don’t replace your working kernel but install it alongside.

How to go about it? I know STFW sounds like RTFM, but really sometimes that is the best starting point you have. A search phrase like “how to apply source patch Linux kernel” will get you quite a few hits. Assess each webpage for appropriateness. Pages that are too old, talking about 2.4 or even 2.2 kernels, might be outdated. Some pages will talk about other distros. They could still be useful, if you note the parts that are specific to those distros. And how will you know that? Well, you’ll have to do more digging.

It might sound like a never-ending task but at some point you will decide you have enough tentative information to proceed. Just do it, run into trouble, read some stuff again, read some more, ask questions, and eventually it will make sense. That’s the way it is really, there is no royal road, or tute that will give you the exact steps all the time. But the mistakes you make will teach you faster.

Luckily this is software. Worst comes to worst, delete the directories and start again. Not so easy with hardware when you have let the smoke out of the chip.

No need to be sarcastic; we all know how people can get themselves into trouble by trying to invoke the alleged magic. Larry is reluctant for a reason. To me it’s a bit funny that you write kernel compiling scripts, yet do not know how to deal with a kernel patch. Nothing cynical, just puzzled.

At least one of the articles I read had some instructions about applying the patch to the kernel-sources, couldn’t find it straight away, but Google will certainly bring you there.

Hello Knurpht. Until September 4th, 2010 I had never compiled a Linux kernel in my life. Until now, there has been no need to install or attempt to install a patch. And I can’t even say for sure there is a reason to do it now. In this thread I was looking for more information as to whether this new kernel speed-up is of any value or not. It seems to have garnered a lot of publicity for there to be nothing to it, but I surely could be wrong there.

My overall philosophy on Linux is literally Power To The People. I don’t feel it necessary to save everyone from themselves, but rather to empower everyone with as much knowledge on the subject of Linux as one can. I don’t think that it is necessary to give away any secrets that are the basis of what you do to make your living. However, if one makes a reasonable effort to make a procedure sound and safe for others to use, then it should be provided to the masses. Such an effort would be the sakc script I wrote. Nothing included in it was created by me from scratch; all of it was learned online here in the openSUSE forums, including info read from messages left by lwfinger.

I shall continue to look through the forums here and the internet for the information that I seek and I thank everyone for their suggestions.

Thank You,

On 11/20/2010 09:06 AM, ken yap wrote:
>
> Luckily this is software. Worst comes to worst, delete the directories
> and start again. Not so easy with hardware when you have let the smoke
> out of the chip.

Unfortunately, software can have a drastic effect on parts that you would not
expect. There is an active thread in the wireless mailing list where the ath9k
driver is hitting a DMA error that is corrupting the ext4 file systems so badly
that fsck cannot recover them.

As to applying that patch, I would not touch it until it has been approved by
the big guns in Linux, and accepted into the mainline kernel. I have not
followed that on-line discussion, but I know that Linus has some objections at
the moment.

> As to applying that patch, I would not touch it until it has been approved by
> the big guns in Linux, and accepted into the mainline kernel. I have not
> followed that on-line discussion, but I know that Linus has some objections at
> the moment.

lwfinger, I highly respect your opinion on the subject and shall follow your advice.

Thank You,

On the face of it, the patch looks like it addresses exactly the previous posts I’ve made in these forums searching for ways to modify application and process priorities, and the fact that MS Windows “Desktop Interaction” has been a, if not the, major difference between Server and Desktop versions for the past 15 years.

So, notice that the earlier comments in this thread about “better performance under load” and specific applications like transcoding probably aren’t completely accurate…

Instead, I would say that “better Desktop performance” probably applies to applications which are clearly running in a Desktop running/security context, like launched within the Gnome or KDE desktops, and you likely would see better performance when many GUI/Desktop applications are running at the cost of any daemon processes. The advantage of the patch would be like applying an overall policy change increasing the priority of Desktop apps.

To me, the current alternative to improving Desktop performance is to renice applications with higher priority than competing applications. In any case renice likely should be the clear preferred procedure when running specific applications like transcoding as opposed to more generally improving the performance of anything running in a Desktop instead of daemon context.
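
As a rough illustration of the nice/renice approach: raising a program's priority normally needs root, so the sketch below does the unprivileged mirror image and starts the batch job at a lower priority instead. The helper names run_batch and demote_pid and the increment of 10 are arbitrary choices for illustration, not a recommendation from this thread.

```shell
#!/bin/sh
# Start a command at a lower priority (higher niceness) so that it yields
# to interactive programs. The increment of 10 is an arbitrary example value.
run_batch() {
    nice -n 10 "$@"
}

# Lower the priority of an already running process by PID.
# Note: whether renice treats -n as an absolute value or an increment
# differs between implementations, so check the local man page first.
demote_pid() {
    renice -n 10 -p "$1"
}

# With GNU coreutils, nice without arguments prints the niceness it runs at,
# so this shows the value a batch command started this way would inherit.
run_batch nice
```

A long transcode could then be started as run_batch followed by the usual command line, leaving desktop applications at their default priority.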

IMO this is just another step in the dance between Linux and Windows as the years go by… Windows all the time subsumes UNIX characteristics, and Linux does the same with Windows.

IMO,
Tony

Hello there tsu2 and thanks for your comments. I would like to add that openSUSE has already proven to be faster in every way when compared to Windows up through and including Windows 7, which I can say is an OK OS to use. Still, it is interesting to take hardware like a laptop, that can barely get out of its own way running Windows XP (Which is VERY OLD) and play multimedia files like a champ in Linux. To be able to get the Linux desktop even faster on the same setup is hard to believe, but trying it is very tempting to be sure. I am hoping it does materialize in kernel 2.6.38, but that will be too late to make it in openSUSE 11.4 I would think.

Thank You,

James, here is a post by brian_j that explains it all: “How would I apply the Phoronix patch?”

Clarifying my post in case anyone misunderstood: increasing Linux Desktop performance with the patch **in no way relates to any Windows vs Linux comparison**; it only appears to affect Desktop vs background (daemon) performance **within the same OS**.
Tony