[PlanetCCRMA] NTP problems with planetccrma kernels

Fernando Lopez-Lezcano nando@ccrma.Stanford.EDU
Mon Dec 12 18:15:03 2005


On Mon, 2005-12-12 at 21:51 +0100, Nigel Henry wrote:
> On Friday 02 December 2005 08:08, Tracey Hytry wrote:
> > Hi, Nigel
> >
> > It's me again, the careful one that has all of the machines around here
> > running the planet's FC3.  We run a local time server here and have the
> > rest of the machines use it.
> >
> > A few weeks back there was an FC3 update that included NTP.  Everything
> > broke here until we noticed that the update used different setup files for
> > NTP.  Also, I have noticed a few different bugs with various programs while
> > using the edge kernels.  This sometimes complicates things.
> >
> > Last week I downloaded the source for the 2.6.14.3 kernel from kernel.org
> > and built a nice little kernel but didn't try it out because of lack of
> > time and not feeling like diverging too far from the planet by building the
> > newest alsa to go with the kernel.  Right now I am busy with a bunch of
> > other things, and even though I would like to try a newer kernel/alsa setup,
> > I have more pressing things to do.
> >
> > I really need to debug the problem with the broken snd_mtpva(?) driver
> > that's included with the 2.6.12-0.21.rdt.rhfc3.ccrma kernel sometimes, or
> > at least see if it's fixed with the .14 kernel (and newest alsa) with and
> > without Ingo's patches.  The kernel does an oops when it tries to register
> > the interrupt while running the latest planet FC3 edge kernel.
> >
> > We tried a number of planet FC3 edge kernels and found that 2.6.12-0.21.rdt
> > is fantastic for speed, stability, and low sound latencies while running
> > openGL apps with the Nvidia driver.  This is cool, but the broken driver
> > isn't.  It's not such a big deal, as the machine in the music room doesn't
> > need openGL.  What this all comes down to is that there is quite a bit of
> > variation in how things work with the various edge kernels.
> >
> > I'm kinda waiting for Fernando to release a new kernel and alsa for FC3 to
> > see if the problems go away and no new ones come up.  I'm trying to run a
> > pristine* planet setup here so that when bugs come up I know it's not
> > because of something I loaded on the machines.
> >
> > *If you want to do a little bit of exploration check out the .config file
> > for the planet kernels.  You'll notice that there is a lot of stuff
> > specific to Fernando's build machine in the kernels.  It can be fun
> > checking out what hardware he uses :)
> >
> > Tracey.
> 
> Hi Tracey. Sorry about the slow reply. I've been busy playing with NTP. I'm 
> still getting around to looking at the ccrma kernel .config files.
> 
> Anyway. I'm trying to find a script that I can run 
> in /home/user/.kde/autostart. Someone on another list made a suggestion, but 
> as I somehow need to run /etc/init.d/ntpd start, which needs to be run as 
> root, I'm not sure which way to go. Any ideas?
> 
> By way of interest, I've set out below the two relevant parts of the messages
> log, for the original kernel (2.6.5-1.358) and for the particular ccrma kernel
> (2.6.10-0.4.rdt.rhfc2.ccrma) that's running on FC2 at the moment. Mind you, the
> same problem appears to be there with other ccrma kernels on FC2 and FC3.
> I haven't tried FC1, and have only just finished downloading the ISOs for FC4,
> and need to find somewhere to install it.
> 
> This messages log is for the original kernel, where ntpd starts and continues 
> to run.
> 
> Dec 12 20:55:47 localhost xinetd[2044]: pmap_set failed. service=sgi_fam 
> program=391002 version=2
> Dec 12 20:55:47 localhost ntpd:  succeeded   (opening firewall 192.168.0.228)
> Dec 12 20:55:47 localhost ntpd:  succeeded   (opening firewall 192.168.0.230)
> Dec 12 20:55:46 localhost ntpdate[2064]: step time server 192.168.0.230 offset 
> -1.047769 sec   ( this is the synching with server line)
> Dec 12 20:55:46 localhost ntpd:  succeeded   
> Dec 12 20:55:46 localhost ntpd[2068]: ntpd 4.2.0@1.1161-r Thu Mar 11 11:46:39 
> EST 2004 (1)
> Dec 12 20:55:46 localhost ntpd: ntpd startup succeeded
> Dec 12 20:55:46 localhost ntpd[2068]: precision = 2.000 usec
> Dec 12 20:55:46 localhost ntpd[2068]: kernel time sync status 0040
> Dec 12 20:55:46 localhost ntpd[2068]: frequency initialized -200.730 PPM 
> from /var/lib/ntp/drift
> Dec 12 20:55:47 localhost su(pam_unix)[2081]: session opened for user root by 
> (uid=0)
> 
> This next messages log is from booting with the ccrma kernel. The 4th line,
> with "cap_set_proc failed", is particularly interesting. Do you think
> cap_set_proc is a kernel-related thing?
> 
> 
> Dec 12 20:14:10 localhost xinetd[2718]: Started working: 0 available services
> Dec 12 20:14:15 localhost ntpd:  succeeded
> Dec 12 20:14:15 localhost ntpd:  succeeded
> Dec 12 20:14:16 localhost ntpdate[2739]: cap_set_proc failed.  (THIS LINE)

AHA!!

> Dec 12 20:14:16 localhost ntpd:  failed
> Dec 12 20:14:16 localhost ntpd[2744]: ntpd 4.2.0@1.1161-r Thu Mar 11 11:46:39 
> EST 2004 (1)
> Dec 12 20:14:16 localhost ntpd: ntpd startup succeeded
> Dec 12 20:14:16 localhost ntpd[2744]: precision = 1.000 usec
> Dec 12 20:14:16 localhost ntpd[2744]: kernel time sync status 0040
> Dec 12 20:14:16 localhost ntpd[2744]: frequency initialized -200.730 PPM 
> from /var/lib/ntp/drift
> Dec 12 20:14:16 localhost ntpd[2744]: cap_set_proc failed.
> Dec 12 20:28:23 localhost su(pam_unix)[2758]: session opened for user root by 
> (uid=0)
> 
> So what do you think of that lot?  My only thoughts are trying to find a way 
> of running /etc/init.d/ntpd start from KDE when it starts, because manually 
> restarting the daemon from the CLI does work, and it keeps running. All the 
> best. 

_Now_ I remember!

I know what you need to do. First, an explanation of what's happening.
The problem in those Planet CCRMA kernels is a configuration
incompatibility with the Fedora kernels. To build the realtime lsm as a
kernel module I had to configure something else (can't remember what
right now) as a module too, and that something is built into the Fedora
kernels (hardwired, not a module).

Now, ntpd uses capabilities, which are hardwired in a Fedora kernel but
have to be loaded as a module in the Planet CCRMA kernel. I think that
loading the realtime lsm also loads whatever ntpd needs, and that is
why things work fine once the machine has booted (by that time the
lsm module is loaded, as well as the capabilities support that ntpd
needs).
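
One way to check that guess is to look at which modules are loaded at a
given moment. The following is only a minimal sketch: the function name
is made up for illustration, and the module names "capability" and
"realtime" used in the usage line are assumptions, not something
confirmed in this thread, so substitute whatever lsmod shows on your
machine.

```shell
# Succeeds if the named module appears in a /proc/modules-style listing.
# $1: module name
# $2: listing file (optional, defaults to /proc/modules)
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}
```

For example (module name assumed, see above):
  module_loaded capability && echo "capability is loaded"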

What you need to do is change the startup order so that the rtload
script runs _before_ ntpd tries to start. That's done with the
"# chkconfig:" comment line in /etc/rc.d/init.d/rtload: change the
second field after "chkconfig:" (the start priority) to be lower than
the start priority in ntpd's startup script. Then you will need to run:
  /sbin/chkconfig rtload off
  /sbin/chkconfig rtload on
so that the links for the proper runlevels are recreated with the new
priority. See "man chkconfig" for more details.

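To compare the two start priorities before and after the change,
something like this sketch could help. It assumes both init scripts
carry the usual "# chkconfig: <runlevels> <start> <stop>" header line;
the function names are made up for illustration.

```shell
# Print the start priority from an init script's "# chkconfig:" header.
# Header format: # chkconfig: <runlevels> <start priority> <stop priority>
start_priority() {
    awk '/^#[ \t]*chkconfig:/ {
        sub(/^#[ \t]*chkconfig:[ \t]*/, "")   # drop the prefix, keep the fields
        print $2                              # field 1 = runlevels, field 2 = start
        exit
    }' "$1"
}

# Succeeds if the first script has a lower start priority than the second,
# i.e. it would be started earlier in the boot sequence.
starts_before() {
    a=$(start_priority "$1")
    b=$(start_priority "$2")
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" -lt "$b" ]
}
```

For example:
  starts_before /etc/rc.d/init.d/rtload /etc/rc.d/init.d/ntpd \
      && echo "rtload starts before ntpd"
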
-- Fernando