Avoid the percore allocations if HT is not supported or disabled.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
---
arch/x86/kernel/cpu/perf_event_intel.c | 44 +++++++++++++++++++++++++++----
1 files changed, 38 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index bc4afb1..354d1de 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1053,9 +1053,38 @@ static __initconst const struct x86_pmu core_pmu = {
.event_constraints = intel_core_event_constraints,
};
+/*
+ * Check if HT is capable and enabled
+ * TBD: move it to generic place, so it can be used by others
+ */
+
+static bool ht_enabled(int cpu)
+{
+ int total_logical_processors, total_cores;
+ unsigned int eax, ebx, ecx, edx;
+
+ cpuid(1, &eax, &ebx, &ecx, &edx);
+
+ /* Bit 28 in EDX indicates if it's HT capable */
+ if (!(edx & 0x10000000))
+ return false;
+
+ total_logical_processors = (ebx >> 16) & 0xff;
+
+ ecx = 0;
+ cpuid(4, &eax, &ebx, &ecx, &edx);
+ total_cores = ((eax >> 26) & 0x3f) + 1;
+
+ /* Thread nums per core */
+ return (total_logical_processors / total_cores) > 1;
+}
+
static int intel_pmu_cpu_prepare(int cpu)
{
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
+
+ if (!ht_enabled(cpu))
+ return NOTIFY_OK;
cpuc->per_core = kzalloc_node(sizeof(struct intel_percore),
GFP_KERNEL, cpu_to_node(cpu));
@@ -1073,6 +1102,15 @@ static void intel_pmu_cpu_starting(int cpu)
int core_id = topology_core_id(cpu);
int i;
+ init_debug_store_on_cpu(cpu);
+ /*
+ * Deal with CPUs that don't clear their LBRs on power-up.
+ */
+ intel_pmu_lbr_reset();
+
+ if (!ht_enabled(cpu))
+ return;
+
for_each_online_cpu(i) {
struct intel_percore *pc = per_cpu(cpu_hw_events, i).per_core;
@@ -1085,12 +1123,6 @@ static void intel_pmu_cpu_starting(int cpu)
cpuc->per_core->core_id = core_id;
...This function can be simplified as below,
static bool ht_enabled()
{
if (!cpu_has(&boot_cpu_data, X86_FEATURE_HT))
return false;
return smp_num_siblings > 1;
}
But this still can't detect if HT is on or off.
smp_num_siblings is always 2 even if HT is disabled in BIOS.
Any idea how to detect if HT is on or not?
Thanks,
--
Not quite sure, the intel docs aren't really clear on how the HW supports HT, has 2 siblings but BIOS disabled it thing works. I just tried reading the arch/x86 code but that only got me more confused. hpa, could you comment? --
... or you just keep the original code which works fine in any case without this. -Andi --
Zeroeth of all: anyone who writes () in a function prototype in C needs to get severely napalmed, maimed, hung and then really hurt. It is (void) in C, () means (...) which is literally NEVER what you want. Now, *first* of all: smp_num_siblings as it sits is obviously broken; the whole notion of a global variable for what is inherently a per-cpu property is just braindead. At least theoretically there could be cores with and without HT or with different thread count in the same system. Second, perf should get this from /proc/cpuinfo, not by grotting around cpuid by itself. Third, the code in the kernel is indeed pretty confusing, and it also has the global variable braindamage... but either it works (in which case getting the data from /proc/cpuinfo works) or it needs fixing in addition to the global variable issue. -hpa --
Its the kernel space bit that wants to know if HT is enabled on CPU So you failed to address the primary question, how do we tell if HT is enabled for a particular CPU? X86_FEATURE_HT tells us the CPU supports telling us about HT, smp_num_siblings > 1 tells us the CPU is capable of HT. But Lin found that if you disable HT in the BIOS both those are true but we still don't have HT enabled. The problem we have to address is that Intel has some MSRs shared between threads and if HT is enabled we need to allocate some memory to manage this shared resource. The simple check outlined above X86_FEATURE_HT && smp_num_siblings > 1, will get us most of the way and will I think be good enough (false positives but no false negatives). But it did get us wondering about how all this works. So could you, irrespective of Linux implementation details tell us how to detect if HT is available and enabled from a HW and BIOS perspective? --
Hi,
I had to deal with this issue with perfmon back in 2.6.30 and earlier kernels.
I remember that I was surprised not to find any easy helper function to figure
this out back then. Being HT capable is different from having HT enabled.
Seems like the situation has not improved since then.
My solution at the time (2.6.30) was to do:
ht_enabled = cpumask_weight(__get_cpu_var(cpu_sibling_map)) > 1;
Not too convinced the per-cpu variable is necessary because I don't
know of a BIOS that would allow you to turn on HT on only some of the
cores (AFAIK, this is an all or nothing feature).
--
Won't that report a machine a HT disabled when you offline a sibling? Which kinda defeats the purpose of our usage here, since we need to know it before either sibling comes online. --
Then, it seems the only hope is to peek at a MSR that reports the BIOS setting. But I don't know which one it is. Couldn't you simply over-provision, and then when the CPU is online, use my ht_enabled statement to figure out whether or not you need to handle the sharing issue? --
That's pretty much what Lin's latest does: X86_FEATURE_HT && smp_num_siblings > 1 shows the CPU is capable of HT and will allocate the needed resources. --
Ok, but once the CPU is online, you can still check whether or not HT is enabled. If not, then you don't need to bother with the MSR sharing code. So you may allocate a data structure which you won't use and I think that's fine given we don't have a good way of figuring out whether HT is enabled or not. --
From the sound of it, it doesn't seem like it's worth optimizing out the fairly small memory allocation. -hpa --
A subtle but important distinction. Several laptops ago, I had a P4-M chip that had FEATURE_HT, but num_siblings == 1. ISTR trying to boot an SMP kernel on it and it dying gloriously trying to bring up the non-existent other sibling.
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/7] introduce GIT_WORK_TREE to specify the work tree |
| Alexander Sulfrian | [RFC/PATCH] RE: git calls SSH_ASKPASS even if DISPLAY is not set |
| Junio C Hamano | Re: Rss produced by git is not valid xml? |
| Linux Kernel Mailing List | iSeries: fix section mismatch in iseries_veth |
| Linux Kernel Mailing List | ixbge: remove TX lock and redo TX accounting. |
| Linux Kernel Mailing List | ixgbe: fix several counter register errata |
| Linux Kernel Mailing List | b43: fix build with CONFIG_SSB_PCIHOST=n |
| Linux Kernel Mailing List | 9p: block-based virtio client |
