David.Mentre@irisa.fr
Linux works on SMP (Symmetric Multi-Processors) machines. SMP support was introduced with kernel version 2.0, and has improved steadily ever since. The kernel locking granularity is much finer in 2.2.x than in 2.0.x, which enables better performance when processes are accessing the kernel!
HOWTO maintained by David Mentré ( David.Mentre@irisa.fr). The latest edition of this HOWTO can be found at
If you want to contribute to this HOWTO, I would prefer a diff against the
SGML version of this
document, but any remarks (in plain text) will be greatly
appreciated. If you send me an email about this HOWTO, please include a
tag like [Linux SMP HOWTO]
in the Subject:
field of your
e-mail. It helps me to automatically sort mails (and you will have a faster
reply ;)
).
This HOWTO is an improvement of a first draft made by Chris Pirih.
All information contained in this HOWTO is provided "as is." All warranties, expressed, implied or statutory, concerning the accuracy of the information of the suitability for any particular use are hereby specifically disclaimed. While every effort has been taken to ensure the accuracy of the information contained in this HOWTO, the authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.
Yes. Processes and kernel-threads are distributed among processors. User-space threads are not.
SMP is supported in 2.0 on the hypersparc (SS20, etc.) systems and Intel 486, Pentium or higher machines which are Intel MP1.1/1.4 compliant. Richard Jelinek adds: right now, systems have been tested up to 4 CPUs and the MP standard (and so Linux) theoretically allows up to 16 CPUs.
SMP support for UltraSparc, SparcServer, Alpha and PowerPC machines is in available in 2.2.x.
MIPS, m68k and ARM does not support SMP; the latter two probly won't ever.
That is, I'm going to hack on MIPS-SMP as soon as I get a SMP box ...
Most Linux distributions don't provide a ready-made SMP-aware kernel, which means that you'll have to make one yourself. If you haven't made your own kernel yet, this is a great reason to learn how. Explaining how to make a new kernel is beyond the scope of this document; refer to the Linux Kernel Howto for more information. (C. Polisher)
In kernel series 2.0 up to but not including 2.1.132, uncomment the
SMP=1
line in the main Makefile (/usr/src/linux/Makefile
).
In the 2.2 version, configure the kernel and answer "yes" to the question "Symmetric multi-processing support" (Michael Elizabeth Chastain).
AND
enable real time clock support by configuring the "RTC support" item (from Robert G. Brown). Note that inserting RTC support actually doesn't afaik prevent the known problem with SMP clock drift, but enabling this feature prevents lockup when the clock is read at boot time. A note from Richard Jelinek says also that activating the Enhanced RTC is necessary to get the second CPU working (identified) on some original Intel Mainboards.
AND
(x86 kernel) do NOT enable APM (advanced power management)! APM and SMP
are not compatible, and your
system will almost certainly (or at least probably ;)
) crash
while booting if APM is enabled (Jakob Oestergaard). Alan
Cox confirms this : 2.1.x turns APM off for SMP boxes. Basically
APM is undefined in the presence of SMP systems, and anything could
occur.
AND
(x86 kernel) enable "MTRR (Memory Type Range Register) support". Some BIOS are buggy as they do not activate cache memory for the second processor. The MTRR support contains code that solves such processor misconfiguration.
You must rebuild all your kernel and kernel modules when changing to and
from SMP mode. Remember to make modules
and make
modules_install
(from Alan Cox).
If you get module load errors, you probably did not rebuild and/or re-install your modules. Also with some 2.2.x kernels people have reported problems when changing the compile from SMP back to UP (uni-processor). To fix this, save your .config file, do make mrproper, restore your .config file, then remake your kernel (make dep, etc.) (Wade Hampton). Do not forget to run lilo after copying your new kernel.
Recap:
make config # or menuconfig or xconfig make dep make clean make bzImage # or whatever you want # copy the kernel image manually then RUN LILO # or make lilo make modules make modules_install
In the 2.0 series, comment the SMP=1
line in the main
Makefile (/usr/src/linux/Makefile).
In the 2.2 series, configure the kernel and answer "no" to the question "Symmetric multi-processing support" (Michael Elizabeth Chastain).
You must rebuild all your kernel and kernel modules when changing to and
from SMP mode. Remember to make modules
and make
modules_install
and remember to run lilo. See notes above about
possible configuration problems.
cat /proc/cpuinfo
Typical output (dual PentiumII):
processor : 0 cpu : 686 model : 3 vendor_id : GenuineIntel [...] bogomips : 267.06 processor : 1 cpu : 686 model : 3 vendor_id : GenuineIntel [...] bogomips : 267.06
Linux kernel version 2.2 has signal handling, interrupts and some I/O stuff fine grain locked. The rest is gradually migrating. All the scheduling is SMP safe.
Kernel version 2.3 (next 2.4) has really fine grained locking. In the 2.3 kernels the usage of the big kernel lock has basically disappeared, all major Linux kernel subsystems are fully threaded: networking, VFS, VM, IO, block/page caches, scheduling, interrupts, signals, etc. (Ingo Molnar)
No and Yes. There is no way to force a process onto specific CPU's but the linux scheduler has a processor bias for each process, which tends to keep processes tied to a specific CPU.
Yes. Look at PSET - Processor Sets for the Linux kernel:
The goal of this project is to make a source compatible and functionally equivalent version of pset (as defined by SGI - partially removed from their IRIX 6.4 kernel) for Linux. This enables users to determine which processor or set of processors a process may run on. Possible uses include forcing threads to separate processors, timings, security (a `root' only CPU?) and probably more.
It is focused around the syscall sysmp(). This function takes a number of parameters that determine which function is requested. Functions include:
Please report bugs to linux-smp@vger.rutgers.edu
.
If you want to gauge the performance of your SMP system, you can run some tests made by Cameron MacKinnon and available at http://www.phy.duke.edu/brahma/benchmarks.smp.
If you have to ask, you probably don't. :)
Generally, multi-processor systems can provide better performance
than uni-processor systems, but to realize any gains you need
to consider many other factors besides the number of CPU's.
For instance, on a given system, if the processor is generally
idle much of the time due to a slow disk drive, then this system
is "input/output bound", and probably won't benefit from additional
processing power. If, on the other hand, a system has many
simultaneously executing processes, and CPU utilization is very
high, then you are likely to realize increased system performance.
SCSI disk drives can be very effective when used with multiple
processors, due to the way they can process multiple commands
without tying up the CPU. (C. Polisher)
This depends on the application, but most likely not. SMP adds
some overhead that a faster uniprocessor box would not incur
(Wade Hampton).
:)
Thanks to Samuel S. Chessman, here are some useful utilities:
http://www.cs.inf.ethz.ch/~rauch/procps.html
Basically, it's procps v1.12.2 (top, ps, et. al.) and some patches to support SMP.
For 2.2.x, Gregory R. Warnes as made a patch available at http://queenbee.fhcrc.org/~warnes/procps
xosview-1.5.1 supports SMP. And kernels above 2.1.85 (included) the cpuX entry in /proc/stat file.
The official homepage for xosview is: http://lore.ece.utexas.edu/~bgrayson/xosview.html
You'll find a version patched for 2.2.x kernels by Kumsup Lee : http://www.ima.umn.edu/~klee/linux/xosview-1.6.1-5a1.tgz
The various Forissier's kernel patches are at: http://www-isia.cma.fr/~forissie/smp_kernel_patch/
By the way, you can't monitor processor scheduling precisely with xosview, as xosview itself causes a scheduling perturbation. (H. Peter Anvin)
use:
# make [modules|zImage|bzImages] MAKE="make -jX" where X=max number of processes. WARNING: This won't work for "make dep".
With a 2.2 like kernel, see also the file
/usr/src/linux/Documentation/smp.txt
for specific instruction.
BTW, since running multiple compilers allows a machine with sufficient
memory to use use the otherwise wasted CPU time during I/O caused delays,
make MAKE="make -j 2" -j 2
actually helps even on uniprocessor
boxes (from Ralf Bächle).
time
command inaccurate?
(from Joel Marchand)
In the 2.0 series, the result given by the time
command is
false. The sum user+system is right *but* the spreading between user and
system time is false.
More precisely: "The explanation is, that all time spent in processors other than the boot cpu is accounted as system time. If you time a program, add the user time and the system time, then you timing will be almost right, except for also including the system time that is correctly accounted for" (Jakob Østergaard).
This bug is corrected in 2.2 kernels.
Section by Jakob Østergaard.
This section is intended to outline what works, and what doesn't when it comes to programming multi-threaded software for SMP Linux.
Since both fork() and PVM/MPI processes usually do not share memory, but either communicate by means of IPC or a messaging API, they will not be described further in this section. They are not very specific to SMP, since they are used just as much - or more - on uniprocessor computers, and clusters thereof.
Only POSIX Threads provide us with multiple threads sharing ressources like - especially - memory. This is the thing that makes a SMP machine special, allowing many processors to share their memory. To use both (or more ;) processors of an SMP, use a kernel-thread library. A good library is the LinuxThreads, a pthread library made by Xavier Leroy which is now integrated with glibc2 (aka libc6). Newer Linux distributions include this library by default, hence you do not have to obtain a separate package to use kernel threads.
There are implementations of threads (and POSIX threads) that are application-level, and do not take advantage of the kernel-threading. These thread packages keep the threading in a single process, hence do not take advantage of SMP. However, they are good for many applications and tend to actually run faster than kernel-threads on single processor systems.
Multi-threading has never been really popular in the UN*X world though. For some reason, applications requiring multiple processes or threads, have mostly been written using fork(). Therefore, when using the thread approach, one runs into problems of incompatible (not thread-ready) libraries, compilers, and debuggers. GNU/Linux is no exception to this. Hopefully the next few sections will sched a little light over what is currently possible, and what is not.
Older C libraries are not thread-safe. It is very important that you use GNU LibC (glibc), also known as libc6. Earlier versions are, of course possible to use, but it will cause you much more trouble than upgrading your system will, well probably :)
If you want to use GDB to debug your programs, see below.
There is a wealth of programming languages available for GNU/Linux, and many of them can be made to use threads one way or the other (some languages like Ada and Java even have threads as primitives in the language).
This section will, however, currently only describe C and C++. If you have experience in SMP Programming with other languages, please enlighten us.
GNU C and C++, as well as the EGCS C and C++ compilers work with the thread support from the standard C library (glibc). There are however a few issues:
The GNU Debugger GDB as of version 4.18, should handle threads correctly. Most Linux distribution offer a patched, thread-aware gdb.
It is not necessary to patch glibc in any way just to make it work with threads. If you do not need to debug the software (this could be true for all machines that are not development workstations), there is no need to patch glibc.
Note that core-dumps are of no use when using multiple threads. Somehow, the core dump is attached to one of the currently running threads, and not to the program as a whole. Therefore, whenever you are debugging anything, run it from the debugger.
Hint: If you have a thread running haywire, like eating 100% CPU time, and you cannot seem to figure out why, here is a nice way to find out what's going on: Run the program straight from the shell, no GDB. Make the thread go haywire. Use top to get the PID of the process. Run GDB like gdb program pid. This will make GDB attach itself to the process with the PID you specified, and stop the thead. Now you have a GDB session with the offending thread, and can use bt and the like to see what is happening.
ElectricFence: This library is not thread safe. It should be possible, however, to make it work in SMP environments by inserting mutex locks in the ElectricFence code.
Look at the Linux Parallel Processing HOWTO
Lots of useful information can be found at Parallel Processing using Linux
Look also at the Linux Threads FAQ
Yes. For programs, you should look at:
Multithreaded programs on linux (I love hyperlinks, did
you know that ? ;)
)
As far as library are concerned, there are:
Thanks to David Buccarelli, Andreas Schiffler and Emil Briggs, it exists in a multithreaded version (right now [1998-05-11], there is a working version that provides speedups of 5-30% on some OpenGL benchmarks). The multithreaded stuff is now included in the regular Mesa distribution as an experimental option. For more information, look at the Mesa library
Pentium Pro Optimized BLAS and FFTs for Intel Linux
Multithreaded BLAS routines are not available right now, but a dual proc library is planned for 1998-05-27, see Blas News for details.
Emil Briggs, the same guy who is involved in multithreaded Mesa, is also working on multithreaded The GIMP plugins. Look at http://nemo.physics.ncsu.edu/~briggs/gimp/index.html for more info.
Short answer: no.
Long answer: Intel claims ownership to the APIC SMP scheme, and unless a company licenses it from Intel they may not use it. There are currently no companies that have done so. (This of course can change in the future) FYI - Both Cyrix and AMD support the non-proprietary OpenPIC SMP standard but currently there are no motherboards that use it.
Put it into MP1.1/1.4 compliant mode.
check "Configure Hardware" -> "View / Edit details" -> "Advanced mode" (F7 I think) for a configuration option "APIC mode" and set this to "full Table mode". This is an official Compaq recommandation. (Daniel Roesen)
(Adrian Portelli)To do this:
From Robert Hyatt : ALR Revolution quad-6 seems quite safe, while some older revolution quad machines without P6 processors seem "iffy"...
From Alan Cox: If one of your CPU's is reporting a very low bogomips value the cache is not enabled on it. Your vendor probably provides a buggy BIOS. Get the patch to work around this or better yet send it back and buy a board from a competent supplier.
A 2.0 kernel (> 2.0.36) contains the MTRR patch which should solve this problem (select option "Handle buggy SMP BIOSes with bad MTRR setup" in the "General setup" menu).
I think buggy SMP BIOS handling is automatic in latest 2.2 kernels.
Some IBM machines have the MP1.4 bios block in the EBDA, allowed but not supported below 2.2 kernels.
There is an old 486SLC based IBM SMP box. Linux/SMP requires hardware FPU support.
Nope (according to Alan :)
), 1.4 is just a stricker specs of 1.1.
This is known problem with IRQ handling and long kernel locks in the 2.0 series kernels. Consider upgrading to a later 2.2 kernel.
From Jakob Oestergaard: Or, consider running xntpd. That should keep your clock right on time. (I think that I've heard that enabling RTC in the kernel also fixes the clock drift. It works for me! but I'm not sure whether that's general or I'm just being lucky)
There are some kernel fixes in the later 2.2.x series that may fix this.
The CPU number is assigned by the MB manufacturer and doesn't mean anything. Ignore it.
(Doug Ledford) Try recompiling LILO with LARGE_EBDA support and then making sure to always use make bzImage when compiling the kernel. That appears to have fixed the SMP boot hangs here on Intel multi-Xeon boards. However, please note that this also appears to break LILO in that the root= option no longer works, so make sure you rdev your kernel image at the same time you run lilo to make sure that the kernel loads the correct root filesystem at boot.
(Robert M. Hyatt) With 3 cpus, do you have a terminator in the 4th slot?
Try boot options "noapic" (John Aldrich) and/or "reboot=bios" (Terry Shull).
Try the later 2.2.x kernels and the knfsd patches. This is currently under investigation. (Wade Hampton)
If you are using kernels 2.2.11 or 2.2.12, get the latest kernel. For example 2.2.13 has a number of SMP fixes. Several people have reported these kernels to be unstable for SMP. These same kernels may have NFS problems that can cause lockups. Also, use a serial console to capture your oops messages. (Wade Hampton)
If the problem remains (and the other suggestions on this list didn't help either), then you could try the latest 2.3 kernels. They have more verbose (and more robust) SMP/APIC code, and automatic hard-lockup-prevention code which will produce meaningful oopses instead of a silent hang. (Ingo Molnar)
(Osamu Aoki) You MUST also disable all BIOS related power save features. Example of good configuration (Dual Celeron 466 Abit BP6):
POWER MANAGEMENT SETUP. ACPI: Disabled POWER MANAGEMENT: Disabled PM CONTROL by APM: No
(item by Wade Hampton)
A good means of debugging lockups is to get the ikd patch from Andrea Arcangeli: ftp://ftp.suse.com/pub/people/andrea/kernel-patches
There are several of debug options, but do NOT use the
soft lockup option! For newer SMP boxes,
turn kernel debugging then turn on the NMI oopser.
To verify that the NMI oopser is working, after booting the
new kernel,
/cat /proc/interrupts
and verify that you are getting
NMIs. When the box locks up, you should get an OOPS.
You may also try the %eip option. This allows the kernel to print on the console the %eip address every time a kernel function is called. When the box locks up, write down the first column ordered by the second column then lookup the addresses in the System.map file. This works only in console mode.
Also note that the use of a serial console can greatly facilitate debugging kernel lockups, not just SMP kernel lockups!
A message like:
APIC error interrupt on CPU#0, should never happen. ... APIC ESR0: 00000002 ... APIC ESR1: 00000000
In this section you'll find some possible reasons for a crash of an SMP machine (credits are due to Jakob Østergaard for this part). As far as I (David) know, theses problems are Intel specific.
From Ralf Bächle: [Related to case size and fans] It's important that the air is flowing. It of course can't where cables etc. are preventing this like in too small cases. On the other side I've seen oversized cases causing big problems. There are some tower cases on the market that actually are worse for cooling than desktops. In short, the right thing is thinking about aerodynamics in the case. Extra cases for hot peripherals are usefull as well.
Of course you can always go to Radio Shack (or similar) and get another fan. You can use the lm_sensors to monitor the CPU temperature of newer PII and PIII processors. This might help you to determine if heat is a problem. (Wade Hampton)
Don't buy cheap RAM and don't use mixed RAM modules on a motherboard that is picky about it.
Especially Tyan motherboards are known to be picky about RAM speed (see the Tyan paragraph below for a possible solution).
There have been some report of 10ns PC100 RAM being sold with motherboards where the CPU really needs 8ns RAM. (Wade Hampton)
Check /proc/cpuinfo
to see that your CPUs are same stepping.
...and even if it is stable, DON'T overclock.
From Ralf Bächle: Overclocking causes very subtle problems. I have a nice example, one of my overclocked old machines misscomputes a couple of pixels of a 640 x 400 fractal. The problem is only visible when comparing them using tools. So better say never, nuncas, jamais, niemals overclock.
2.0.x kernels on high performance fast ethernet systems have significant (and known) problems with a race/deadlock condition in the networking interrupt handler.
The solution is to get the latest 100BT development drivers from CESDIS Linux Ethernet device drivers site (ones that define SMPCHECK).
If you had a system using the 440FX chipset then your problem with the lockups was possibly due to a documented errata in the chipset. Here is a reference
References: Intel 440FX PCIset 82441FX (PMC) and 82442FX (DBX) Specification Update. pg. 13
http://www.intel.com/design/pcisets/specupdt/297654.htm
The problem can be fixed with a BIOS workaround (Or a kernel patch) and in fact David Wragg wrote a patch that's included with Richard Gooch's MTTR patch. For more information and a fix look here:
http://nemo.physics.ncsu.edu/~briggs/vfix.html
From Mark Duguid, dumb rule #1 with W6LI
motherboards. ;)
Sometime, some cards are not recognized or can trigger IRQ conflicts. Try shuffling cards on slots in different ways and possibly moving them to different IRQs.
Contributed by hASCII : removing an " append="hisax=9,2,3"" line in lilo.conf allowed using a kernel from the 2.1.xx series with activated ISDN + Hisax support. Kernels from the 2.0.xx series doesn't make problems like this.
Try also to set BIOS setup option like "MP 1.4 mode" or "route PCI interrupts through IOAPIC", or "OS Type" not set to DOS neither Novell (Ingo Molnar).
If you lockup when trying to access the floppy (for example
while sound is playing) you may have to edit drivers/pci/quirks.c
and set /int isa_dma_bridge_buggy = 1;
This is a problem with my Dell WS400 dual PII/300, 2.2.x, SMP
(Wade Hampton).
Please note: Some more specific information can be found with the list of Motherboards rumored to run Linux SMP
(Stéphane Écolivet)
The lowest cost SMP Linux boxes with nowadays buyable processors are dual Celeron systems. Such a system is not officially possible according to Intel. Better think about the second generation of Celeron, those with 128 Kb L2 cache.
Official answer from Intel: no, Celeron cannot work in SMP mode.
Practical answer: it is possible, but requires hardware alteration for Slot 1 processors. Alteration is described by Tomohiro Kawada on his Dual Celeron System page. Of course, this kind of modification removes warranties... Some versions of Celeron processor are also available in Socket 370 format. In that case, alteration may just be done on the Socket 370 to Slot 1 adapter or may even be sold pre-wired for SMP use. (Andy Poling, Hans - Erik Skyttberg, James Beard)
There is also a motherboard (ABIT BP6) allowing two Celerons in Socket 370 format to be inserted (Martijn Kruithof, Ryan McCue). ABIT Computer BP6 verified tested and native to linux with dual ppga socket 370 (Andre Hedrick).
Fine, thank you.
It may work. However, overclocking this kind of system is not as easy as overclocking a mono-processor one. It is definitly not a good idea for a production system. For personal use, dual Celeron 300A systems running rock-solid at 450 MHz have been reported. (numerous people)
It is impossible. Celeron processors have nearly the same features as basic Pentium II chips. If you want more than 2 processors in your system, you'll have to look at Pentium Pro, Pentium II Xeon or Pentium III (?) boxes.
A system using a "re-enable" Celeron processor and a Pentium II processor with the same steppings may theorically work.
Alexandre Charbey as made such a system:
Quoting the UltraLinux web page (only SMP systems):
UltraLinux has ran on a 14 CPUs machine (see the dmesg output).
(David Miller) There should not be any worries.
The only known problem, and one we don't intend to fix, is that if you build an SMP kernel for 32-bit (ie. non-ultrasparc) systems, this kernel will not work on sun4c systems.
(David Miller) There is a bug in the include/linux/tasks.h header file, it needs to define NR_CPUS to 64 on UltraSparc as this is the upper limit for the hardware we support :-)
(Cort Dougan) Not supported: PPC RS/6000 systems
Nothing. Usual SMP compiling (see above). As usual, be aware, modules are specific either for UP or SMP. Recompile them. (Paul Mackerras)
(Geerten Kuiper) SMP works for most, if not all, AXP servers.
(Jay A Estabrook) SMP does seem to work on most of our [Compaq] boxes with 2 or more CPUs. That includes :
It does not include :
None (really ? :-)
To subscribe, send subscribe linux-smp
in the message body
at
majordomo@vger.rutgers.edu
To unsubscribe, send unsubscribe linux-smp
in the message body at
majordomo@vger.rutgers.edu
Linux SMP archives at progressive-comp.com
linux-smp@vger.rutgers.edu
Many thanks to those who help me to maintain this HOWTO: