Troubleshooting Common Linux Boot Issues
Diagnosing Linux Boot Problems
When a Linux system fails to complete the boot process, several common causes should be investigated first:
Corrupted Bootloader Configuration
The bootloader is the first piece of software launched after the firmware (BIOS or UEFI) hands off control. It is responsible for loading the Linux kernel into memory so the rest of the operating system can start. The most common bootloader used by Linux distributions is GRUB (Grand Unified Bootloader).
If the GRUB configuration file becomes corrupted or modified incorrectly, it may fail to detect the Linux kernel and root filesystem on boot. Symptoms include immediate error messages about missing partitions or operating systems, or the system getting stuck at a blank screen after POST.
Failed Hard Drive Mount
For Linux to load fully, the root filesystem must be mounted read-write so that all the system binaries and libraries can be accessed. If the partition holding the root directory is damaged and won’t mount, startup fails with errors about being unable to access critical files.
Missing Critical System Files
Linux relies on thousands of files that must be present under the root folder for normal operation. If critical boot components like the init daemon, systemd binary, or shared libraries required early in the process are corrupted or missing, Linux will be unable to finish booting up.
Full Root Partition
Linux needs free space in the root filesystem to function and update properly. If the root partition becomes 100% full, typically from logs and temporary files accumulating after heavy usage, Linux may fail to boot with out-of-space errors even though space exists on other partitions.
Recovering GRUB Bootloader
When GRUB has a bad configuration or doesn’t detect the Linux kernel, it must be recovered before the operating system can be started again. This requires entering a Linux live environment, mounting the installed OS partitions, and reinstalling the bootloader with good settings.
Using a Live CD to Reinstall GRUB
The easiest way to recover a broken GRUB install is to boot from a Linux live CD distribution and reconfigure everything from there. Some common choices are SystemRescueCD or Ubuntu/Debian/Fedora install media.
After booting into the live environment, the partition holding the Linux install must be mounted. This is commonly /dev/sda1 or another sda device. Use lsblk or fdisk -l to identify partitions, since mount and df -h only list filesystems that are already mounted:
# mkdir /mnt/linux
# mount /dev/sda1 /mnt/linux
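If it is unclear which partition to mount, filtering the lsblk listing by filesystem type can narrow the candidates. The following is a minimal sketch: the function name linux_partitions is made up for illustration, and the list of filesystem types is an assumption that may need extending for a given system.

```shell
# Print block devices whose filesystem type looks like a Linux root candidate.
# Reads "NAME FSTYPE" pairs on stdin, as produced by: lsblk -rno NAME,FSTYPE
# (linux_partitions is a hypothetical helper, not a standard tool.)
linux_partitions() {
    awk '$2 ~ /^(ext2|ext3|ext4|xfs|btrfs)$/ { print $1 " " $2 }'
}

# Typical use in the live environment (requires lsblk from util-linux):
#   lsblk -rno NAME,FSTYPE | linux_partitions
```

Devices without a recognized filesystem, such as whole disks or swap partitions, are dropped from the output.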
With the Linux filesystem mounted under /mnt/linux, GRUB can be reinstalled using a command like this:
# grub-install --boot-directory=/mnt/linux/boot /dev/sda
The grub-install tool writes the bootloader itself to the MBR or a partition boot sector, depending on system setup. It does not scan for Linux installs or create menu entries; those are generated by grub-mkconfig.
Configuring GRUB to Detect Linux Installs
In addition to reinstalling GRUB, it’s often useful to regenerate the configuration file so that the menu entries are rebuilt from the kernels actually present on disk. Bind-mount /dev, /proc, and /sys into the mounted root, chroot into it, and run:
# grub-mkconfig -o /boot/grub/grub.cfg
After exiting the chroot, unmount file systems and reboot. GRUB should now load properly and show menu entries for all installed Linux OS instances.
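The whole recovery sequence can be sketched as a small function that only prints the commands, so they can be reviewed before being run as root. The function name grub_repair_plan, its arguments, and the default device names are assumptions for illustration. Some distributions name the tools grub2-install and grub2-mkconfig and keep the configuration under /boot/grub2, and UEFI systems also need the EFI System Partition mounted inside the chroot.

```shell
# Print (not run) the commands for a chroot-based GRUB repair.
# Arguments are illustrative assumptions: root partition, target disk,
# and mount point. Review the output, then run the commands as root.
grub_repair_plan() (
    root_part=${1:-/dev/sda1}
    disk=${2:-/dev/sda}
    mnt=${3:-/mnt/linux}

    echo "mount $root_part $mnt"
    # The chroot needs the live system's device, process, and sysfs trees.
    for fs in dev proc sys; do
        echo "mount --bind /$fs $mnt/$fs"
    done
    echo "chroot $mnt grub-install $disk"
    echo "chroot $mnt grub-mkconfig -o /boot/grub/grub.cfg"
    # Unmount in reverse order before rebooting.
    for fs in sys proc dev; do
        echo "umount $mnt/$fs"
    done
    echo "umount $mnt"
)
```

Calling grub_repair_plan with no arguments prints the plan for /dev/sda1 mounted at /mnt/linux. Printing rather than executing is a deliberate choice here, since a wrong device name passed to grub-install can overwrite another disk's boot sector.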
Checking Filesystem Integrity
If Linux will not boot due to disk errors, filesystem corruption is often the cause. The filesystem checker tool fsck can perform repairs and get damaged partitions into a bootable state again.
Using fsck to Check Filesystems
When boot issues occur, fsck should be run on each filesystem before trying to access or recover data. This verifies disk integrity and fixes common errors that could otherwise cause further problems.
fsck should only be run on an unmounted filesystem, which for the root partition generally requires a boot disk or live CD environment:
# umount /dev/sda1
# fsck -y /dev/sda1
The -y argument tells fsck to answer yes to every repair prompt. Without -y, fsck asks for confirmation before each change, which gives more control over what is modified but becomes tedious on a badly damaged filesystem.
Ideally, repeat the fsck process on each filesystem from your Linux install until every one reports clean.
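To tell whether a run came back clean, the exit status of fsck can be decoded. The status is a sum of bit flags documented in the fsck(8) man page; the helper below is a minimal sketch, and the name describe_fsck_status is made up for illustration.

```shell
# Decode an fsck exit status, which is a sum of the bit flags
# listed in the fsck(8) man page.
# (describe_fsck_status is a hypothetical helper name.)
describe_fsck_status() (
    status=$1
    [ "$status" -eq 0 ] && { echo "no errors"; exit 0; }
    [ $((status & 1)) -ne 0 ] && echo "errors corrected"
    [ $((status & 2)) -ne 0 ] && echo "reboot recommended"
    [ $((status & 4)) -ne 0 ] && echo "errors left uncorrected"
    [ $((status & 8)) -ne 0 ] && echo "operational error"
    [ $((status & 16)) -ne 0 ] && echo "usage or syntax error"
    [ $((status & 32)) -ne 0 ] && echo "canceled by user"
    [ $((status & 128)) -ne 0 ] && echo "shared-library error"
    exit 0
)

# Typical use right after a check:
#   fsck -y /dev/sda1; describe_fsck_status $?
```

A status that includes "errors left uncorrected" suggests the filesystem needs another pass or manual attention before it is mounted again.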
Fixing Full Root Partitions
If the root filesystem becomes completely full, Linux may fail to finish booting with errors about out of space. The solution requires identifying what is filling up the disk and removing unneeded files.
Checking Partition Usage
First, identify what path is full using df. When booted from a live CD on the troubled system, mount the Linux root FS to /mnt and run:
# df -h /mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       9.7G  9.5G     0 100% /mnt
This shows the root partition is completely full. We need to delete content until space is freed up.
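To see what is actually consuming the space, du can rank the top-level directories by size. This is a rough sketch, assuming the root filesystem is mounted at /mnt; the function name largest_dirs is hypothetical.

```shell
# List the N largest first-level directories under a path, sizes in KiB.
# -x keeps du on one filesystem so other mounted partitions are not counted.
# (largest_dirs is a hypothetical helper name.)
largest_dirs() (
    dir=${1:-/mnt}
    count=${2:-10}
    du -xk -d 1 "$dir" 2>/dev/null | sort -n | tail -n "$count"
)

# Typical use from the live environment:
#   largest_dirs /mnt 10
```

The total for the path itself also appears in the list. Running the function again on the largest entry, for example /mnt/var, narrows the search one level at a time.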
Removing Unused Packages and Files
Start by removing unnecessary packages with the package manager from within the chrooted disk environment:
# chroot /mnt
# apt-get remove --purge unwanted-package
Look in other filesystem locations for log files, temp files, and cached content that can safely be deleted, such as:
# rm -rf /tmp/*
# rm /var/log/*.log
# rm -rf ~/.cache/*
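A more cautious variant is to list old files first and delete only after reviewing the list. The sketch below assumes a 30-day cutoff, which is an arbitrary choice, and the function name old_files is made up for illustration.

```shell
# List regular files under a directory not modified in the last N days.
# Nothing is deleted; review the list before removing anything.
# (old_files is a hypothetical helper name; 30 days is an arbitrary default.)
old_files() (
    dir=${1:-/var/log}
    days=${2:-30}
    find "$dir" -xdev -type f -mtime +"$days" -print
)

# Typical use inside the chroot:
#   old_files /var/log 30
```

From outside the chroot, the same check applies to the mounted path, for example /mnt/var/log.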
Freeing about 10% of the partition is typically enough to allow Linux to finish booting again.
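Whether enough space has been freed can be checked with a small test around df. This is a sketch under the assumption that the root filesystem is mounted at /mnt and that its device name contains no spaces; the function name has_free_space is hypothetical.

```shell
# Succeed if the filesystem holding a path has at least MIN percent free.
# (has_free_space is a hypothetical helper name.)
has_free_space() (
    path=${1:-/mnt}
    min_free=${2:-10}
    # df -P prints POSIX-format output; column 5 is the used percentage.
    used=$(df -P "$path" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    [ $((100 - used)) -ge "$min_free" ]
)

# Typical use after cleaning up:
#   has_free_space /mnt 10 && echo "enough space to retry booting"
```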
Expanding Partition Size
If no additional files can be deleted, the next option is expanding the partition’s size. This requires specialized tools like parted, gparted, or fdisk. Caution should be taken when attempting partition resizing to avoid data loss or further issues.
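As an example, boot from gparted media, select the root partition, click Resize/Move, expand the size by 10%, confirm with the Resize/Move button in the dialog, and then click Apply to carry out the change. Growing a partition requires unallocated space directly next to it. After rebooting, the partition should have more free space.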
Verifying Critical System Binaries
If Linux boots to a prompt but appears unstable, or critical commands like ls, cp, and bash are missing or malfunctioning, system files have likely become corrupted.
Reinstalling System Binaries and Libraries
The package manager can be used to reinstall missing or problematic executables. First identify the broken application, for example the bash shell:
# bash
bash: error while loading shared libraries: libc.so.6: cannot open shared object file: No such file or directory
Next, refresh the package index if needed, and attempt a reinstallation:
# apt-get update
# apt-get install --reinstall bash
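This overwrites the package's files with fresh copies from the repositories. When the error names a shared library rather than the executable, as in the libc.so.6 example above, reinstall the package that provides that library instead (libc6 on Debian-based systems). Repeat this for any critical components showing errors.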
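To find out which libraries a binary cannot resolve, the output of ldd can be filtered for entries marked "not found". This is a minimal sketch that assumes ldd itself still runs on the damaged system; the function name missing_libs is made up for illustration.

```shell
# Print shared libraries that ldd reports as missing.
# Reads ldd output on stdin, for example: ldd /bin/bash | missing_libs
# (missing_libs is a hypothetical helper name.)
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

On Debian-based systems, passing a reported library name to dpkg -S should show which package owns it, and that package can then be reinstalled.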
For non-packaged executables, restore them from backups or reinstall them from source, following the application vendor's guidance.