acpi=off in default install of 2.4.0
Removing acpi=off, or changing it to acpi=on, allows trixbox 1.2.xxx as well as 2.4.0.1 and later to use dual-core, quad-core and multi-CPU hardware.
Nice to see all the motors spinning once that is resolved :)
P.S. You can find the config file to change this permanently in /boot/grub.
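Before editing anything, it can help to confirm whether the running kernel was actually booted with the option. A minimal sketch follows; the exact config path /boot/grub/grub.conf is an assumption (the post only says the file lives in /boot/grub), and has_acpi_off is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if an exact "acpi=off" token appears in a kernel command line.
# $1 is a file holding the command line (defaults to /proc/cmdline).
has_acpi_off() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each whitespace-separated option on its own line so that only
    # the exact token matches.
    tr -s ' \t' '\n\n' < "$cmdline_file" | grep -qx 'acpi=off'
}

if has_acpi_off; then
    echo "running kernel was booted with acpi=off"
else
    echo "acpi=off is not on the running kernel's command line"
fi

# Show which lines of the GRUB config still carry the option.
# Assumed path; override with GRUB_CONF=/path/to/file if yours differs.
GRUB_CONF="${GRUB_CONF:-/boot/grub/grub.conf}"
if [ -r "$GRUB_CONF" ]; then
    grep -n 'acpi=off' "$GRUB_CONF" || echo "no acpi=off found in $GRUB_CONF"
fi
```

A change to the config file only takes effect after a reboot, so the two checks can disagree until then.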
I am not using this for multi-proc, but purely for more efficient processing of interrupts. I was not aware of the multi-CPU/core bit. See below regarding interrupt handling.
# cat /proc/interrupts
CPU0 CPU1
0: 8393785 88197 IO-APIC-edge timer
1: 4 4 IO-APIC-edge i8042
8: 2 1 IO-APIC-edge rtc
9: 0 0 IO-APIC-level acpi
12: 3 1 IO-APIC-edge i8042
50: 104836 2772 IO-APIC-level libata
58: 144 559249 PCI-MSI eth0
169: 29431 9277017 IO-APIC-level wanpipe1, wanpipe2, wanpipe3
225: 9 14 IO-APIC-level uhci_hcd:usb1, uhci_hcd:usb3, ehci_hcd:usb4
233: 0 0 IO-APIC-level uhci_hcd:usb2
NMI: 0 0
LOC: 8321870 8321877
ERR: 0
MIS: 0
/Hyp
to have problems with acpi=on. It throws ACPI errors on boot, which makes me nervous, but even with acpi=off in /etc/grub.conf, I can still see all four cores in cat /proc/interrupts.
On our Dell PowerEdge-1800 systems, which have Hyper-Threading Xeons in them, cat /proc/interrupts only shows one core if acpi=off is in grub.conf. It's hard to tell whether turning it on and enabling the other core (actually a pipeline in the HT chip) is much of an improvement, but I have it turned on and running on our production machine and have had no problems whatsoever. It works quite well.
Several places I have read have stated that Hyperthreading and Asterisk don't mix, but we have 4 machines installed right now (including our office machine) in production and they all work great - with HT turned on.
In the 2.2 builds, I noticed that acpi=off did indeed seem to turn off SMP. I noticed this on a quad-core machine that we installed: when I pulled up cat /proc/interrupts, it only showed one core. Bummer, but I removed acpi=off and rebooted and they all showed.
Maybe it's more of a kernel-version thing as far as what gets turned off.
Greg
solution before I go on and apply it on new systems. I got burned too many times adopting ideas here. I will give it a few weeks before I go on and apply it. After all, we are in the telco business. People don't like their brand new PBX to drop calls and deadlock. acpi=off IMHO is designed to shut down (disable) the power-saving chip.
Isaac
I would not say this should be a default setting one way or the other. Whether or not you require acpi=off will be 100% dependent on your hardware. The systems we use are crippled by acpi=off, where some systems are crippled without it. The best thing to do is try it on a non-production version of your hardware and see what works for you. If you are having issues with USB locks or SMP, I would say this would be a good troubleshooting option.
Linux ACPI
Introduction
The goal of this project is to enable Linux to take advantage of platforms that support ACPI (Advanced Configuration & Power Interface). ACPI has been supported on virtually all high-volume i386, x86_64, and ia64 systems, since 1999.
ACPI is an abstraction layer between the OS and platform firmware and hardware. This abstraction allows the OS and the platform to evolve independently. Not only should a new OS be able to handle old hardware, but an old OS should be able to handle new hardware.
The latest ACPI specification is published on the ACPI home page: http://www.acpi.info.
The core of the Linux ACPI implementation comes from ACPICA (ACPI Component Architecture). ACPICA includes an ACPI Machine Language (AML) interpreter that is resident in the Linux kernel. Several other operating systems use the same ACPICA core interpreter, including BSD and OpenSolaris. ACPICA also comes with a simulator, test suites, and a compiler, to translate ACPI Source Language (ASL) into AML.
Kerry,
As stated above, one way is not right for everyone and you have a 50/50 shot at it. Maybe do a poll and see what the majority wants and use that as your basis. Some group is going to lose out and have to make the edit. You could always not add it and say that this is the Linux default. I think time could probably be better spent on bugs and improvements that aren't split in half by hardware and that can be resolved in seconds by whichever group needs whatever setting.
I am guessing it is a somewhat more conservative kernel setting to have it turned off?
I would leave it alone and let people decide for themselves.
For the implementers, just do a cat /proc/interrupts and see if you are getting all the cores you paid for and go from there.
Greg
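The check Greg describes can be sketched as a small script that counts the CPU columns on the header line of /proc/interrupts. This is only an illustration; count_irq_cpus is a hypothetical helper name, and it takes an optional file argument so that a saved capture can be checked as well.

```shell
#!/bin/sh
# Count the CPUn columns on the header line of an interrupts listing.
# $1 is the file to read (defaults to /proc/interrupts).
count_irq_cpus() {
    awk 'NR == 1 {
        n = 0
        for (i = 1; i <= NF; i++)
            if ($i ~ /^CPU[0-9]+$/) n++
        print n
        exit
    }' "${1:-/proc/interrupts}"
}

echo "CPUs listed in /proc/interrupts: $(count_irq_cpus)"
```

If the number printed is lower than the number of cores in the box, that is the cue to try booting without acpi=off on a test machine.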
I'm low on the totem pole here in terms of experience, however I have spent nearly four months trying to get the additional cores and additional processor running and didn't figure it out. I gave up and was told by people "in the business" that it would be fixed in Asterisk 1.4 and to just wait. Well, when you made the jump to a stable 2.4.x I jumped with you and was surprised to find that my problem wasn't solved.
While blabbing on the phone with a friend, I happened to be logged into my actual terminal during a reboot and noticed acpi=off. I asked him about it (he's a m$ guy) and he explained that it could well be the root of my evils. And behold, it was.
Perhaps it could be useful to have an option at boot which would, once chosen, be the default startup option...?
It's an easy check for the difference.
With acpi=on, boot CentOS and run:
cat /proc/interrupts
Then do the same with acpi=off. On my Dell R200 system the results differ at interrupt 2 (XT-PIC cascade); the output is at the end of this post. I am unsure whether the appearance of interrupt 2 actually affects anything in CentOS operation, but a difference does occur when the acpi setting is changed.
On the CentOS release with tb 2.4.x, apic=off is an invalid startup option (the boot logs show this), whereas acpi=off IS valid. So to completely disable ALL apic/acpi functions, googling about gave me these grub.conf boot options for acpi and apic:
noapic nolapic noacpi acpi=off
acpi=on:-
CPU0 CPU1
0: 81926678 960934 IO-APIC-edge timer
1: 4 4 IO-APIC-edge i8042
8: 2 1 IO-APIC-edge rtc
9: 0 0 IO-APIC-level acpi
12: 3 1 IO-APIC-edge i8042
50: 976434 2772 IO-APIC-level libata
58: 144 2519369 PCI-MSI eth0
169: 29431 90068687 IO-APIC-level wanpipe1, wanpipe2, wanpipe3
225: 9 14 IO-APIC-level uhci_hcd:usb1, uhci_hcd:usb3, ehci_hcd:usb4
233: 0 0 IO-APIC-level uhci_hcd:usb2
NMI: 0 0
LOC: 81330160 81330188
ERR: 0
MIS: 0
acpi=off:-
CPU0 CPU1
0: 132557 39276 IO-APIC-edge timer
1: 5 4 IO-APIC-edge i8042
2: 0 0 XT-PIC cascade
8: 1 0 IO-APIC-edge rtc
12: 1 3 IO-APIC-edge i8042
113: 29388 136764 IO-APIC-level wanpipe1, wanpipe2, wanpipe3
145: 0 0 IO-APIC-level uhci_hcd:usb2
153: 9 14 IO-APIC-level uhci_hcd:usb1, uhci_hcd:usb3, ehci_hcd:usb4
169: 6192 2702 IO-APIC-level libata
217: 129 526 PCI-MSI eth0
NMI: 0 0
LOC: 168933 168942
ERR: 0
MIS: 0
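Comparing two listings like these by eye is tedious, so here is a rough sketch that turns each saved capture into a device-to-IRQ table and diffs the two. It assumes the older layout shown above (one count per CPU, then the controller type, then the device names); irq_map and irq_diff are hypothetical helper names, and the capture file names are made up.

```shell
#!/bin/sh
# Print "devices<TAB>irq<TAB>controller" for each numbered IRQ in a saved
# copy of /proc/interrupts, sorted by device so two captures line up.
irq_map() {
    awk '
        NR == 1 { ncpu = NF; next }      # header line: one field per CPU
        $1 ~ /^[0-9]+:$/ {
            irq = $1
            sub(/:$/, "", irq)
            type = $(ncpu + 2)           # field right after the CPU counts
            dev = ""
            for (i = ncpu + 3; i <= NF; i++)
                dev = dev (dev == "" ? "" : " ") $i
            printf "%s\t%s\t%s\n", dev, irq, type
        }
    ' "$1" | sort
}

# Diff the IRQ tables of two captures.
irq_diff() {
    dir=$(mktemp -d) || return 1
    irq_map "$1" > "$dir/a"
    irq_map "$2" > "$dir/b"
    diff "$dir/a" "$dir/b"
    rm -rf "$dir"
}

# Usage: sh irqdiff.sh irq-acpi-on.txt irq-acpi-off.txt
if [ "$#" -eq 2 ]; then
    irq_diff "$1" "$2"
fi
```

To collect the inputs, run cat /proc/interrupts > irq-acpi-on.txt after booting with acpi=on, and do the same into a second file after booting with acpi=off.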
no one is confusing acpi and apic please read the whole thread....
ACPI, Advanced Configuration and Power Interface, is a standardised set
of interfaces for letting the operating system know what hardware is
there and how to set it up. It also tells the OS how to put it into a
low power (or no power) mode when the OS is being suspended (to RAM or
disk, for example).
An interrupt is a way for the processor and the hardware to let each
other know that there's data waiting to be serviced. So, for example,
when an Ethernet card detected that it had received data intended for
that PC, it would send an interrupt. The interrupt would (normally)
trigger the processor to stop running the current program, and switch
control to the operating system. The OS would work out that the Ethernet
device had sent the interrupt, arrange for the incoming data to be
copied somewhere safe, and return control to the program. (Later, the OS
would decode the data it had been sent).
A PIC, a Programmable Interrupt Controller, is something that receives
interrupts from a number of devices, sends an interrupt to the
processor, and then can be queried by the processor to find out which
device sent the interrupt in the first place. (That simplifies the
processor interface, you see.)
The original IBM PC and its immediate successor, the PC XT defined a
rather simple PIC scheme with a number of limitations. So an Advanced
Programmable Interrupt Controller was defined, that can cope with more
devices and share them evenly among multiple processors.
Spotted this thread yesterday. It seems that tb2.4.x with acpi=off is affecting others in a similar way to how it affected my Dell R200.
http://www.trixbox.org/forums/trixbox-forums/open-discussion/tb-2...
I must agree that it seems like a 50/50 issue. I don't think an option during installation will help. Given that this is specific to each make/model etc., it's something that must be watched and tested. An example is my previous server, an IBM x305 with tb2.2.x, which had a known hardware bug with APIC across the series. To avoid hardware crashes, it was necessary to force acpi and apic off with:
noapic nolapic noacpi acpi=off
Don't you wish the hardware manufacturers would just get it right?
/Hyp
I think it should be something that everyone needs to test when problems come up. In Hyperus' case his PSTN card was given a higher-priority IRQ with acpi=off, which could be a definite gain if he had certain Digium cards.
As for extra CPU cores showing up: do people see performance gains with acpi=on? Or maybe the better question: do they need performance gains? I'm not really sure how acpi is directly tied to how many CPUs/cores are used by the OS. I thought this was just done by using an SMP kernel??
I think we will categorize this under Troubleshooting in the wiki.
The Dell 1950 dual dual-core Xeon based systems appear to run smoother from my perspective with "acpi=off" removed at start up.
I can see where disabling ACPI would work better on some hardware, but honestly I believe Trixbox would be better off booting with no special options. A quick FAQ about what you might want to try if you have issues with stability after an installation would not hurt.
Let's face it: Almost any new system you purchase today will have at least two processor cores. Anything you purchase in the server segment is going to have PCI-X or PCIe busses, modern chipsets (complex, lots of integrated features/functions), multiple NICs, etc. New systems are better off running with ACPI enabled.
Every single Trixbox I have running has had the "acpi=off" removed, regardless of the number of processors or T1 interface cards.
From Konrad - 11/1/07 8:00 AM
If you are seeing increasing "overruns" when you run "ifconfig" and you have audio quality issues, check whether or not your system is using XT-PIC or IO-APIC. You can check this by running the command "cat /proc/interrupts". XT-PIC is an older interrupt controller that can't always handle the number of interrupts generated by real-time communications.
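A quick way to answer the XT-PIC versus IO-APIC question without reading the whole listing is to count the lines for each controller type. This is just a sketch; pic_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Count how many lines of an interrupts listing mention each controller.
# $1 is the file to read (defaults to /proc/interrupts).
pic_summary() {
    f="${1:-/proc/interrupts}"
    xt=$(grep -c 'XT-PIC' "$f")      # grep -c prints 0 when nothing matches
    io=$(grep -c 'IO-APIC' "$f")
    echo "XT-PIC=$xt IO-APIC=$io"
}

pic_summary
```

A result with a non-zero XT-PIC count and IO-APIC=0 means the devices are going through the older controller.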
For Trixbox 2.2.3
You need to put acpi=on apic=on at the end of the kernel line under the title entry:
title CentOS-4 i386 (2.6.9-34.0.2.ELsmp)
root (hd0,0)
kernel /vmlinuz-2.6.9-34.0.2.ELsmp ro root=LABEL=/1 acpi=on apic=on
initrd /initrd-2.6.9-34.0.2.ELsmp.img
I have an A200 and an A500 in my Dell PowerEdge R200, running the very latest TB 2.6, fully yummed. I have pasted the current ifconfig output below. No issues here with overruns, and that's for about 3 days of running in this example.
I do, however, still get a crazy NMI event about every 7 days that leaves CentOS "dazed and confused", though Asterisk keeps functioning. It changes the server's "blue, all OK" front panel light into an "orange, bad stuff has happened" state. CentOS reports that a memory DIMM has most likely failed, but the RAM all tests OK in diagnostics. Sangoma say that the PCI-e bus gets bad data on it and have been able to show me this in emails. Dell are washing their hands of it, as they say they don't support CentOS and claim that the issue is with the Sangoma cards.
I would be interested in hearing from anyone experiencing similar.
/Hyp
anyone got anything like this happening ?
[somewhere.com.au ~]# uptime
09:39:15 up 3 days, 23:07, 2 users, load average: 0.00, 0.02, 0.00
[somewhere.com.au ~]# uptime
09:39:38 up 3 days, 23:07, 2 users, load average: 0.00, 0.01, 0.00
[somewhere.com.au ~]# ifconfig
eth0 Link encap:Ethernet HWaddr 00:19:B9:F9:39:C5
inet addr:203.26.72.30 Bcast:203.26.72.63 Mask:255.255.255.192
inet6 addr: fe80::219:b9ff:fef9:39c5/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3328725 errors:0 dropped:0 overruns:0 frame:0
TX packets:3206882 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:711725889 (678.7 MiB) TX bytes:708467208 (675.6 MiB)
Interrupt:169
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:12669215 errors:0 dropped:0 overruns:0 frame:0
TX packets:12669215 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1326303965 (1.2 GiB) TX bytes:1326303965 (1.2 GiB)
w1g1 Link encap:Point-to-Point Protocol
UP POINTOPOINT RUNNING NOARP MTU:80 Metric:1
RX packets:34242943 errors:0 dropped:0 overruns:2 frame:2
TX packets:34242943 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:2739435440 (2.5 GiB) TX bytes:2739435440 (2.5 GiB)
Interrupt:169 Memory:f8ac0000-f8ac1fff
w2g1 Link encap:Point-to-Point Protocol
UP POINTOPOINT NOARP MTU:80 Metric:1
RX packets:0 errors:0 dropped:0 overruns:2 frame:2
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Interrupt:169 Memory:f8ac0000-f8ac1fff
w3g1 Link encap:Point-to-Point Protocol
UP POINTOPOINT RUNNING NOARP MTU:8 Metric:1
RX packets:342422263 errors:0 dropped:0 overruns:0 frame:0
TX packets:342422263 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:2739378104 (2.5 GiB) TX bytes:2739378104 (2.5 GiB)
Interrupt:169 Memory:f8b20000-f8b21fff
[somewhere.com.au ~]#
Konrad,
I've followed your instructions and my machine is using XT-PIC. I had to pass noapic as a kernel parameter because otherwise CentOS would not start on my AMD machine, and if I put ACPI=ON APIC=ON the problem persists and Linux won't boot.
How can I solve this problem ?
Regards,
[localhost.localdomain ~]# cat /proc/interrupts
CPU0 CPU1
0: 1137290 0 XT-PIC timer
1: 348 0 XT-PIC i8042
2: 0 0 XT-PIC cascade
6: 6 0 XT-PIC floppy
7: 2 0 XT-PIC ehci_hcd:usb2
8: 1143241 0 XT-PIC rtc
9: 0 0 XT-PIC acpi
10: 1 0 XT-PIC ohci_hcd:usb1
12: 4 0 XT-PIC i8042
14: 9981 0 XT-PIC ide0
15: 9860 0 XT-PIC sata_nv
89: 1256 136364 PCI-MSI eth0
NMI: 0 0
LOC: 1137226 1137231
ERR: 1
MIS: 0
I was having major errors in that I could only get Zaptel OR Asterisk to run. Asterisk wouldn't run if wanrouter or Zaptel were running, and Zaptel wouldn't run if Asterisk was running.
After 2 weeks of hunting, checking and editing, James pointed me to a site that mentioned removing that bit of code, and it worked.
One command and deleting 8 characters. It was as simple as that to fix my machine (2.6.1). Stupid, isn't it?
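Assuming the eight characters in question were acpi=off, the edit can be previewed safely before touching the real file. The sketch below only prints what would change; the path /boot/grub/grub.conf is an assumption, and strip_acpi_off is a hypothetical helper name.

```shell
#!/bin/sh
# Print a GRUB (legacy) config with the standalone "acpi=off" token removed
# from its kernel lines. $1 is the config file; nothing is modified.
strip_acpi_off() {
    sed '/^[[:space:]]*kernel[[:space:]]/{
s/[[:space:]]acpi=off$//
s/[[:space:]]acpi=off\([[:space:]]\)/\1/
}' "$1"
}

# Assumed path; override with GRUB_CONF=/path/to/file if yours differs.
GRUB_CONF="${GRUB_CONF:-/boot/grub/grub.conf}"
if [ -r "$GRUB_CONF" ]; then
    # Show the difference between the current file and the stripped version.
    strip_acpi_off "$GRUB_CONF" | diff "$GRUB_CONF" -
fi
```

If the diff looks right, make a backup copy of the config, apply the same change by hand in an editor, and reboot on a non-production box first.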


