Issue
COS 7.2 (kernel) issues
Hi all,
I used to run COS 6 headless in gateway mode on this hardware without any problems.
Since I installed COS 7.1 (updated to 7.2) I have several problems, but let's just start with one.
Now I am required to connect a screen to my box, because the default kernel (option 2 below) does not boot.
I get the following options when booting:
ClearOS (3.10.0-327.10.1.v7.x86_64) 7 (Final) with debugging
ClearOS (3.10.0-327.10.1.v7.x86_64) 7 (Final)
ClearOS, with Linux 3.10.0-229.7.2.v7.x86_64
ClearOS, with Linux 0-rescue-d9b02baf62884e1993f90d404dc5a4b2
Only option 4 boots (beyond: Starting ClearOS API ...) and gives me Internet access on my clients, but not without problems.
A few of the problems are (not sure if they are kernel related):
Most webconfig pages show an "Additional Info: Connection Failure" error.
I am unable to start the "Intrusion Prevention System".
The Market Place shows a "DNS lookup failed." error.
Events and Notifications show a "Firewall has entered panic mode" error (I didn't edit the firewall rules manually).
Sometimes I have to manually start "Windows Networking (Samba)" on the webconfig
At another location I used the same hardware and DVD without these problems.
Please tell me what you need to assist me with these issues.
John
In Installation
Responses (25)
-
Accepted Answer
@John
"There is not much that I was able to troubleshoot."
How about reading the hints on the 'emergency mode' screen and letting us know the results from following them? - the log in particular might provide a clue.
Also please check the /var/log/messages log and provide the last 10 (and only 10) lines added to the log before the boot stalls. Please put between code tags, and thanks for fixing up the code tags in your recent append.
What procedure do you use when you get the 'emergency mode' screen to initiate a reboot to try again?
A short summary of the hardware would also be appreciated. -
Accepted Answer
Hi all,
Thanks for your responses and sorry about the late reply.
The problem is that on my COS box at home, I have to reboot several times before it boots up normally. With the COS box at work the problem only happened once.
By default the latest / first kernel is selected, but as you can see in the pictures it initially boots up in emergency mode.
There is not much that I was able to troubleshoot. Before the last big update I was able to boot up with the rescue kernel.
I have checked the disk space and I did not find any problems:
Filesystem                      Size  Used  Available  Use %  Mounted
/dev/mapper/clearos-root        147G   37G       110G    25%  /
/dev/mapper/sil_caacaecebjeb1   497M  229M       269M    47%  /boot
Please assist,
John -
Accepted Answer
Insufficient space in /var can also cause weird intermittent problems, often a result of one or more extra large log files. Later the problem disappears when they are rotated out...
As Nick intimated - without more detail John, we are just staring into a very cloudy crystal ball - and making "educated" guesses. If you want help you need to first give us decent information to work with... otherwise your request "Can anyone help me to prevent this problem from returning" is virtually impossible to accomplish... -
Accepted Answer
Hi all,
Since the latest reboot I discovered that the problem has partially returned on two separate COS boxes. Our web & mail server was temporarily off-line because of it.
I was unable to reproduce the problem, but to be sure that it does not happen again, I changed the default kernel to the rescue kernel.
grub2-set-default #
Can anyone help me to prevent this problem from returning and to use the latest kernel instead of the rescue one by default ... ?!?
Please assist,
John -
Accepted Answer
Hi John,
I may have hit the same issue and I think I've fixed it. The fundamental cause seems to be /boot running out of space. I had three kernels installed, just due to automatic updates. When I tried removing a NIC driver I saw a repeating error message:
dracut: creation of /boot/initramfs-3.10.0-327.10.1.v7.x86_64.tmp failed
When I next booted the box it failed and none of the three kernels were bootable, just the recovery one. Looking more at the system I did a:
df -h
and saw my 200MB /boot partition was at 91% so I suspected it ran out of space when trying to write the new initramfs. The solution was to remove the latest kernel and the one before then reboot - still to the recovery kernel then update the kernel again through yum. This was successful and left me with two kernels installed. The latest one now boots and I am about to delete the oldest one I did not remove earlier.
Can you look at "df -h", and also, how big is your /boot partition?
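A quick way to check whether /boot is close to full is sketched below. It is only a sketch: the helper name usage_pct is made up for this example, and it assumes the POSIX "df -P" output format, where the fifth column of the second line is the capacity.

```shell
# usage_pct: hypothetical helper, not part of ClearOS.
# Reads `df -P <mountpoint>` output on stdin and prints the capacity
# column of the first filesystem line as a bare number (e.g. 91).
usage_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Typical use on the box itself (commented out so that sourcing this
# file has no side effects):
#   df -P /boot | usage_pct
#   rpm -q kernel        # lists the installed kernel packages
```

If the number is close to 100, a new initramfs may not fit, which would match the dracut error quoted above.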
I was fortunate as I am building my system to replace the current one and I have a separate data partition which is mirroring the live machine. I was able to delete the partition and move sda2 and sda3 a bit so I could increase the size of sda1 (/boot) to make it less likely to fail again. I could then recreate my data partition and re-sync it with the live installation. If you only have three partitions it is not so easy as you can't shrink an xfs partition. You can just move or increase it. -
Accepted Answer
Thanks Tony,
For now this is a solution I can work with.
# egrep ^menuentry /etc/grub2.cfg | cut -f 2 -d \'
ClearOS (3.10.0-327.10.1.v7.x86_64) 7 (Final) with debugging
ClearOS (3.10.0-327.10.1.v7.x86_64) 7 (Final)
ClearOS, with Linux 3.10.0-229.7.2.v7.x86_64
ClearOS, with Linux 0-rescue-d9b02baf62884e1993f90d404dc5a4b2
# grub2-set-default 3
It does not explain why the default kernel (2nd option) does not boot, but for now I can at least run my box headless.
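For anyone repeating this: grub2-set-default counts menu entries from 0, which is why 3 selects the fourth (rescue) entry in the list above. The sketch below prints each entry with its index so the number does not have to be counted by hand. The function name list_menu_entries is an assumption made for this example, not an existing tool.

```shell
# list_menu_entries FILE: hypothetical helper.
# Prints the title of every line starting with "menuentry " in FILE,
# prefixed with its zero-based position, i.e. the number that
# grub2-set-default expects.
list_menu_entries() {
    # Split on single quotes so that $2 is the quoted entry title.
    awk -F"'" '/^menuentry / { printf "%d  %s\n", n++, $2 }' "$1"
}

# Typical use (commented out so that sourcing this file does nothing):
#   list_menu_entries /etc/grub2.cfg
```

This only looks at top-level menuentry lines, which is all the configuration in this thread contains; entries nested inside a submenu would need different handling.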
If anyone encountered a similar problem and has a more definite solution, please respond.
Greetings,
John -
Accepted Answer
# dmraid -r
/dev/sdc: sil, "sil_caacaecebjeb", mirror, ok, 312579760 sectors, data@ 0
/dev/sdb: sil, "sil_caacaecebjeb", mirror, ok, 312579760 sectors, data@ 0
Everything looks ok and I don't want to break anything.
I would like to know how to automatically choose another default kernel, so I can disconnect my screen from my box. I do not intend to uninstall kernels, because that could mean that I would not be able to connect to the Internet at all anymore.
Please assist,
John -
Accepted Answer
Thanks John for the update. As suspected you were using the fake raid BIOS to provide the raid function. Having no first-hand experience with fake raid (avoid it at all costs), I am not sure what your options are. Do the BIOS routines contain any code to check the integrity of the raid (i.e. the contents of each disk are identical) or other useful utilities? If the content of the two disks is identical - then the raid must be working, and my concern above is negated. I just do not know what to expect to see with fake raid.
Alternatively, the dmraid program that should be installed in your ClearOS system may provide help. Dmraid is supposed to work with the fake raid BIOS in Linux. I'm not familiar with it - use the man pages (man dmraid) to discover what options you have. "dmraid -r" might be a good starting place. There is also a "dmraid-events-logwatch" rpm that provides dmraid logwatch-based email reporting. Is this installed and working on your system? There may also be some good tutorials on the web for dmraid - I'll leave you to do the research. If your raid is working correctly and the integrity of the disks is established - then I really don't have much in the way of suggestions to solve your initial problem. If the initial kernel loads OK then you can always use yum to remove the later kernels and try the latest one again.
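As a rough sketch of how "dmraid -r" output could be checked, the helper below prints any member whose status field is not "ok". It assumes one line per member in the form /dev/sdX: format, "set name", type, status, N sectors, data@ 0, and the name raid_members_not_ok is invented for this example. It only looks at the status dmraid reports, so it says nothing about whether the two disks actually hold identical data.

```shell
# raid_members_not_ok: hypothetical helper.
# Reads `dmraid -r` style output on stdin and prints "<device> <status>"
# for every member whose fourth comma-separated field is not "ok".
raid_members_not_ok() {
    awk -F', *' 'NF >= 4 && $4 != "ok" {
        split($1, dev, ":")     # "/dev/sdb: sil" -> "/dev/sdb"
        print dev[1], $4
    }'
}

# Typical use (commented out so that sourcing this file does nothing):
#   dmraid -r | raid_members_not_ok
```

No output means every listed member reported "ok".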
Not sure how familiar you are with software raid and mdadm. If you are, then one option is to save your data, split the disks (i.e. break the raid1) and do a fresh install onto the two separate disks, using mdadm software raid to create the raid1. This depends on your skill level and how comfortable you are with using software raid. -
Accepted Answer
Hi Tony,
Thanks for the info and for making clear that it's not hardware raid and that it is set up using LVM.
No worries, I am already glad that you took the time to try and figure out what is going on with my box.
I created the raid before I installed COS 6 and I didn't change it before installing COS 7.
I pressed F4 using the routine in the raid controller's BIOS and then I got the option to set it up in raid 1.
Not sure how this will help, but I hope that it does.
Please assist,
John
Ps. I tried to upload a picture of my raid BIOS, but it didn't work. -
Accepted Answer
OK - it's gone midnight here and bed-time, but some initial thoughts - will look at this again tomorrow...
1) Sil 3132 controller. That's a fake raid controller that uses firmware in the BIOS chip to produce software raid using your CPU, not real hardware on the card.
2) Although you stated you were not - you are in fact using LVM - that is the default for ClearOS 7.x - even if you choose your own partitioning. Something that is not immediately obvious during the install and isn't really made clear.
3) The listing of duplicates in the LVM command "pvdisplay" worries me. It almost looks like the initial install recognized the fake raid and installed on both disks as a raid1 mirror, but sometime later that broke, and when the system now boots it sees the two disks as separate disks and uses one physical disk only, not a raid1 mirror? If it was seeing a proper raid1 mirror it surely should have found and reported only one disk - the raid1 mirror.
How did you create the raid initially - using the routine in the raid controller's BIOS before you booted the install media? -
Accepted Answer
Hi Tony,
Thanks for trying to help.
I have tried to figure out what the hardware specs of the raid controller are, but after physically disconnecting it, all I could read on it was this:
Sweex
PU203 06300658346
IE-S13-B602-00-01391
F05
94V-0
E132041
0831 PI4132-10X2A
By entering the raid utility I found this:
SiI 3132 SATARaid BIOS Version 7.4.01
Copyright (C) 1997-2006 Silicon Image, Inc.
cat /etc/fstab:
# Created by anaconda on Fri Apr 15 15:24:03 2016
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/clearos-root / xfs defaults 0 0
UUID=787e9d8c-6997-43d5-8c75-add29197ec8e /boot xfs defaults 0 0
/dev/mapper/clearos-swap swap swap defaults 0 0
/usr/sbin/pvdisplay:
Found duplicate PV rVt1kd997DWxwxSezK2UFOdjusn8god5: using /dev/sdb2 not /dev/sda2
Using duplicate PV /dev/sdb2 which is last seen, replacing /dev/sda2
Found duplicate PV rVt1kd997DWxwxSezK2UFOdjusn8god5: using /dev/mapper/sil_caacaecebjeb2 not /dev/sdb2
Using duplicate PV /dev/mapper/sil_caacaecebjeb2 from subsystem DM, replacing /dev/sdb2
--- Physical volume ---
PV Name /dev/mapper/sil_caacaecebjeb2
VG Name clearos
PV Size 148.56 GiB / not usable 0
Allocatable yes
PE Size 4.00 MiB
Total PE 38031
Free PE 11
Allocated PE 38020
PV UUID rVt1kd-997D-Wxwx-SezK-2UFO-djus-n8god5
Please advise,
John -
Accepted Answer
John, You said you were using hardware raid. What raid controller is it? (make and model please)...
If the raid is integrated with the motherboard - make and model of motherboard please...
Looking to see if it is genuine hardware raid or fake-raid...
Also can you let us know how the disk(s) are partitioned (cat /etc/fstab)
and finally show the output of /usr/sbin/pvdisplay -
Accepted Answer
I would not know if that was interesting. I marked it as text and I changed the value to "text", but that did not change anything. I doubt that my situation is unique. AFAIK if this had anything to do with partitioning, the rescue kernel would also not be able to boot. All I can say is that I don't have a clue how to solve this.
Please assist,
John -
Accepted Answer
Interesting. There's one grub option that is included in the standard boot entries, but not in the rescue boot entry:
set gfxpayload=keep
Which has something to do with the graphics mode - see https://www.gnu.org/software/grub/manual/html_node/gfxpayload.html -
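One way to make this kind of comparison less error-prone is sketched below: it prints only the "set ..." lines of a single menu entry, so two entries can be put side by side. The function name entry_settings is hypothetical, and the sketch assumes each entry starts with a line beginning "menuentry '<title>'" and ends with a line beginning "}".

```shell
# entry_settings FILE TEXT: hypothetical helper.
# Prints the "set ..." lines (leading whitespace removed) of the first
# menu entry in FILE whose quoted title contains the literal TEXT.
entry_settings() {
    awk -F"'" -v pat="$2" '
        /^menuentry /           { inside = (!found && index($2, pat) > 0)
                                  if (inside) found = 1
                                  next }
        inside && /^}/          { inside = 0; next }
        inside && /^[ \t]*set / { sub(/^[ \t]+/, ""); print }
    ' "$1"
}

# Typical use (commented out so that sourcing this file does nothing):
#   entry_settings /etc/grub2.cfg "0-rescue"
#   entry_settings /etc/grub2.cfg "7 (Final)"
```

Running it once per entry and comparing the two outputs shows which settings appear in one entry but not the other. It does not look at the linux16 kernel command lines, which would need a separate comparison.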
Accepted Answer
Thanks Peter,
It's not my expertise either and I only use 2 mirrored HDs rather than LVM.
I'm not sure if this will help, but here is the contents of "/etc/grub2.cfg":
#
# DO NOT EDIT THIS FILE
#
# It is automatically generated by grub2-mkconfig using templates
# from /etc/grub.d and settings from /etc/default/grub
#
### BEGIN /etc/grub.d/00_header ###
set pager=1
if [ -s $prefix/grubenv ]; then
load_env
fi
if [ "${next_entry}" ] ; then
set default="${next_entry}"
set next_entry=
save_env next_entry
set boot_once=true
else
set default="${saved_entry}"
fi
if [ x"${feature_menuentry_id}" = xy ]; then
menuentry_id_option="--id"
else
menuentry_id_option=""
fi
export menuentry_id_option
if [ "${prev_saved_entry}" ]; then
set saved_entry="${prev_saved_entry}"
save_env saved_entry
set prev_saved_entry=
save_env prev_saved_entry
set boot_once=true
fi
function savedefault {
if [ -z "${boot_once}" ]; then
saved_entry="${chosen}"
save_env saved_entry
fi
}
function load_video {
if [ x$feature_all_video_module = xy ]; then
insmod all_video
else
insmod efi_gop
insmod efi_uga
insmod ieee1275_fb
insmod vbe
insmod vga
insmod video_bochs
insmod video_cirrus
fi
}
terminal_output console
if [ x$feature_timeout_style = xy ] ; then
set timeout_style=menu
set timeout=5
# Fallback normal timeout code in case the timeout_style feature is
# unavailable.
else
set timeout=5
fi
### END /etc/grub.d/00_header ###
### BEGIN /etc/grub.d/00_tuned ###
set tuned_params=""
### END /etc/grub.d/00_tuned ###
### BEGIN /etc/grub.d/10_linux ###
menuentry 'ClearOS (3.10.0-327.10.1.v7.x86_64) 7 (Final) with debugging' --class clearos --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-229.7.2.v7.x86_64-advanced-d612ef2b-f5cb-42af-95bf-ad1b168314a5' {
load_video
set gfxpayload=keep
insmod gzio
insmod part_msdos
insmod xfs
set root='hd0,msdos1'
if [ x$feature_platform_search_hint = xy ]; then
search --no-floppy --fs-uuid --set=root --hint='hd0,msdos1' 787e9d8c-6997-43d5-8c75-add29197ec8e
else
search --no-floppy --fs-uuid --set=root 787e9d8c-6997-43d5-8c75-add29197ec8e
fi
linux16 /vmlinuz-3.10.0-327.10.1.v7.x86_64 root=/dev/mapper/clearos-root ro rd.lvm.lv=clearos/swap crashkernel=auto rd.lvm.lv=clearos/root rd.dm.uuid=sil_caacaecebjeb rhgb quiet systemd.log_level=debug systemd.log_target=kmsg LANG=en_US.UTF-8
initrd16 /initramfs-3.10.0-327.10.1.v7.x86_64.img
}
menuentry 'ClearOS (3.10.0-327.10.1.v7.x86_64) 7 (Final)' --class clearos --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-229.7.2.v7.x86_64-advanced-d612ef2b-f5cb-42af-95bf-ad1b168314a5' {
load_video
set gfxpayload=keep
insmod gzio
insmod part_msdos
insmod xfs
set root='hd0,msdos1'
if [ x$feature_platform_search_hint = xy ]; then
search --no-floppy --fs-uuid --set=root --hint='hd0,msdos1' 787e9d8c-6997-43d5-8c75-add29197ec8e
else
search --no-floppy --fs-uuid --set=root 787e9d8c-6997-43d5-8c75-add29197ec8e
fi
linux16 /vmlinuz-3.10.0-327.10.1.v7.x86_64 root=/dev/mapper/clearos-root ro rd.lvm.lv=clearos/swap crashkernel=auto rd.lvm.lv=clearos/root rd.dm.uuid=sil_caacaecebjeb rhgb quiet LANG=en_US.UTF-8
initrd16 /initramfs-3.10.0-327.10.1.v7.x86_64.img
}
menuentry 'ClearOS, with Linux 3.10.0-229.7.2.v7.x86_64' --class clearos --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-229.7.2.v7.x86_64-advanced-d612ef2b-f5cb-42af-95bf-ad1b168314a5' {
load_video
set gfxpayload=keep
insmod gzio
insmod part_msdos
insmod xfs
set root='hd0,msdos1'
if [ x$feature_platform_search_hint = xy ]; then
search --no-floppy --fs-uuid --set=root --hint='hd0,msdos1' 787e9d8c-6997-43d5-8c75-add29197ec8e
else
search --no-floppy --fs-uuid --set=root 787e9d8c-6997-43d5-8c75-add29197ec8e
fi
linux16 /vmlinuz-3.10.0-229.7.2.v7.x86_64 root=/dev/mapper/clearos-root ro rd.lvm.lv=clearos/swap crashkernel=auto rd.lvm.lv=clearos/root rd.dm.uuid=sil_caacaecebjeb rhgb quiet
initrd16 /initramfs-3.10.0-229.7.2.v7.x86_64.img
}
menuentry 'ClearOS, with Linux 0-rescue-d9b02baf62884e1993f90d404dc5a4b2' --class clearos --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-0-rescue-d9b02baf62884e1993f90d404dc5a4b2-advanced-d612ef2b-f5cb-42af-95bf-ad1b168314a5' {
load_video
insmod gzio
insmod part_msdos
insmod xfs
set root='hd0,msdos1'
if [ x$feature_platform_search_hint = xy ]; then
search --no-floppy --fs-uuid --set=root --hint='hd0,msdos1' 787e9d8c-6997-43d5-8c75-add29197ec8e
else
search --no-floppy --fs-uuid --set=root 787e9d8c-6997-43d5-8c75-add29197ec8e
fi
linux16 /vmlinuz-0-rescue-d9b02baf62884e1993f90d404dc5a4b2 root=/dev/mapper/clearos-root ro rd.lvm.lv=clearos/swap crashkernel=auto rd.lvm.lv=clearos/root rd.dm.uuid=sil_caacaecebjeb rhgb quiet
initrd16 /initramfs-0-rescue-d9b02baf62884e1993f90d404dc5a4b2.img
}
### END /etc/grub.d/10_linux ###
### BEGIN /etc/grub.d/20_linux_xen ###
### END /etc/grub.d/20_linux_xen ###
### BEGIN /etc/grub.d/20_ppc_terminfo ###
### END /etc/grub.d/20_ppc_terminfo ###
### BEGIN /etc/grub.d/30_os-prober ###
### END /etc/grub.d/30_os-prober ###
### BEGIN /etc/grub.d/40_custom ###
# This file provides an easy way to add custom menu entries. Simply type the
# menu entries you want to add after this comment. Be careful not to change
# the 'exec tail' line above.
### END /etc/grub.d/40_custom ###
### BEGIN /etc/grub.d/41_custom ###
if [ -f ${config_directory}/custom.cfg ]; then
source ${config_directory}/custom.cfg
elif [ -z "${config_directory}" -a -f $prefix/custom.cfg ]; then
source $prefix/custom.cfg;
fi
### END /etc/grub.d/41_custom ###
Please assist,
John -
Accepted Answer
Hi Peter,
That is going to be a problem. I use a hardware RAID 1 setup and during the install I used the default partitioning.
I looked at /etc/grub2.cfg, but I do not know what I am supposed to compare.
The number that repeats itself in all kernel options is: 787e9d8c-6997-43d5-8c75-add29197ec8e
Please assist,
John
Ps.
All network problems were solved after restarting the DNS with the following command:
service dnsmasq start
-
Accepted Answer
Yikes... that has something to do with partitioning and hard disks. That stuff is over my head, but here's one suggestion -- take a look at /etc/grub2.cfg. It looks like you may have already done this, but compare the "Linux 0-rescue-d9b02baf62884e1993f90d404dc5a4b2" configuration to the other boot entries that don't work. -
Accepted Answer
Hi Peter,
Thanks for responding.
When I attempt to boot with the second kernel I get the following:
Welcome to emergency mode! After logging in, type "journalctl -xb" to view
system logs, "systemctl reboot" to reboot, "systemctl default" or ^D to
try again to boot into default mode.
Give root password for maintenance
(or type Control-D to continue): [ 30.696540] device-mapper: table: 253:3: linear: dm-linear: Device lookup failed
[ 30.696930] device-mapper: table: 253:4: linear: dm-linear: Device lookup failed
Pressing Control-D eventually results in the same error as mentioned above, after briefly showing this:
Error getting authority: Error initializing authority: Could not connect: No such file or directory: (g-io-error-quark, 1)
Sometimes it shows the load bar for a few moments before returning to the same error.
When I press Alt-2 it tries to continue booting but ends with the following error:
[FAILED] Failed to start Crash recovery kernel arming.
See 'systemctl status kdump.service' for details.
I have Bandwidth and QoS installed, but I never used it before. I don't know how to disable it.
Eventually I will create another thread under network for the other issues.
Greetings,
John -