Troubleshooting Common Linux Boot Issues

Diagnosing Linux Boot Problems

When a Linux system fails to complete the boot process, there are several common culprits that should be investigated first:

Corrupted Bootloader Configuration

The bootloader is the first piece of software the firmware (BIOS or UEFI) launches when a computer starts up. It is responsible for loading the Linux kernel into memory so the rest of the operating system can start. The most common bootloader used by Linux distributions is GRUB (GRand Unified Bootloader).

If the GRUB configuration file becomes corrupted or modified incorrectly, it may fail to detect the Linux kernel and root filesystem on boot. Symptoms include immediate error messages about missing partitions or operating systems, or the system getting stuck at a blank screen after POST.

Failed Hard Drive Mount

For Linux to finish loading, the root filesystem must be mounted so that the system binaries and libraries can be accessed. If the disk partition holding the root directory is damaged and won’t mount, startup will fail with errors about being unable to access critical files.

Missing Critical System Files

Linux relies on thousands of files that must be present under the root directory for normal operation. If critical boot components such as the init program (systemd on most current distributions) or shared libraries required early in the process are corrupted or missing, Linux will be unable to finish booting.

Full Root Partition

Linux needs free space in the root filesystem to function and update properly. If the partition becomes 100% full, typically from logs and temporary files accumulating after heavy usage, Linux may fail to boot with out-of-space errors even though space exists on other partitions.

Recovering GRUB Bootloader

When GRUB has a bad configuration or doesn’t detect the Linux kernel, it must be recovered before the operating system can be started again. This requires entering a Linux live environment, mounting the installed OS partitions, and reinstalling the bootloader with good settings.

Using a Live CD to Reinstall GRUB

The easiest way to recover a broken GRUB install is to boot from a Linux live CD distribution and reconfigure everything from there. Some common choices are SystemRescueCD or Ubuntu/Debian/Fedora install media.

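After booting into the live environment, the partition holding the Linux install must be mounted. This is commonly /dev/sda1 or another sda device, though NVMe disks use names like /dev/nvme0n1p1. Partitions that are not yet mounted do not appear in mount or df -h output, so use lsblk or fdisk -l to identify them: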

# mkdir /mnt/linux
# mount /dev/sda1 /mnt/linux
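
Before going further, it can help to confirm that the partition just mounted really is the installed system's root filesystem. The following is a minimal sketch: the function name looks_like_linux_root is hypothetical, and checking for etc/fstab and a boot directory is only a heuristic, not an exhaustive test.

```shell
# Heuristic check: does the directory in $1 look like a Linux root filesystem?
# looks_like_linux_root is a hypothetical helper name used for illustration.
looks_like_linux_root() {
    # A typical install has /etc/fstab and a /boot directory
    # (the latter may be empty if /boot lives on a separate partition).
    [ -f "$1/etc/fstab" ] && [ -d "$1/boot" ]
}
```

For example, running looks_like_linux_root /mnt/linux && echo found prints found only when both paths exist.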

With the Linux filesystem mounted under /mnt/linux, GRUB can be reinstalled using a command like this:

# grub-install --boot-directory=/mnt/linux/boot /dev/sda

The grub-install tool writes the bootloader itself to the MBR or to a partition boot sector, depending on system setup. It does not generate menu entries; those come from the GRUB configuration file, which is regenerated in the next step.
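
Where the bootloader has to be written depends on how the machine boots. As a rough sketch, the presence of /sys/firmware/efi in the live environment indicates that it was started in UEFI mode. The helper name firmware_mode and its optional argument are assumptions made for illustration.

```shell
# Report whether the running system was booted via UEFI or legacy BIOS.
# The optional argument overrides the sysfs root; it exists only so the
# function can be exercised without a real /sys (an assumption of this sketch).
firmware_mode() {
    if [ -d "${1:-/sys}/firmware/efi" ]; then
        echo uefi
    else
        echo bios
    fi
}
```

On a UEFI system, grub-install is usually pointed at the mounted EFI system partition with --efi-directory rather than at a disk device. The exact invocation varies by distribution, so check its documentation before running it.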

Configuring GRUB to Detect Linux Installs

In addition to reinstalling GRUB, it’s often useful to regenerate its configuration file. This forces GRUB’s scripts to rescan for installed kernels and rebuild the menu, even if the old configuration listed none. Bind-mount /dev, /proc, and /sys into the mounted system, then chroot into it and run:

# chroot /mnt/linux
# grub-mkconfig -o /boot/grub/grub.cfg

After exiting the chroot, unmount file systems and reboot. GRUB should now load properly and show menu entries for all installed Linux OS instances.
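
The chroot step can be sketched as a small script. This is an illustration under stated assumptions, not a definitive procedure: the function names run and regenerate_grub_cfg are hypothetical, the mount point defaults to /mnt/linux to match the earlier mount example, and some distributions (Fedora, for instance) name the tool grub2-mkconfig and keep the file under /boot/grub2. By default the script only prints the commands; set DRY_RUN=0 to execute them as root.

```shell
# Print (DRY_RUN=1, the default) or execute (DRY_RUN=0) one command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Bind-mount the virtual filesystems, regenerate grub.cfg inside the chroot,
# then unmount again. $1 is the mount point of the installed system;
# the /mnt/linux default is an assumption taken from the mount example.
regenerate_grub_cfg() {
    target=${1:-/mnt/linux}
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$target/$fs"
    done
    run chroot "$target" grub-mkconfig -o /boot/grub/grub.cfg
    # Unmount in reverse order, even if grub-mkconfig reported an error.
    for fs in sys proc dev; do
        run umount "$target/$fs"
    done
}
```

Calling regenerate_grub_cfg /mnt/linux with DRY_RUN unset lists the seven commands in order, which makes it easy to review them before running anything for real.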

Checking Filesystem Integrity

If Linux will not boot due to disk errors, filesystem corruption is often the cause. The filesystem checker tool fsck can perform repairs and get damaged partitions into a bootable state again.

Using fsck to Check Filesystems

When boot issues occur, fsck should be run on each filesystem before trying to access or recover data. This verifies disk integrity and fixes common errors that could otherwise cause further problems.

fsck should only be run on an unmounted partition, which for the root filesystem generally means working from a boot disk or live CD environment:

# umount /dev/sda1
# fsck -y /dev/sda1

The -y argument tells fsck to answer yes to every repair prompt, so fixes are attempted automatically. Without -y, fsck asks for confirmation before each change, which becomes tedious when many errors are found.

Ideally, repeat the fsck process on each filesystem belonging to your Linux install until every one is reported clean.
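
The exit status of fsck says whether another pass is needed. According to the fsck(8) manual page, the status is a sum of bit flags, and the sketch below decodes it into text. The function name describe_fsck_status is hypothetical.

```shell
# Translate an fsck exit status (a bitmask, per fsck(8)) into readable text.
# describe_fsck_status is a hypothetical helper name.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $((status & 1)) -ne 0 ] && echo "errors corrected"
    [ $((status & 2)) -ne 0 ] && echo "reboot recommended"
    [ $((status & 4)) -ne 0 ] && echo "errors left uncorrected"
    [ $((status & 8)) -ne 0 ] && echo "operational error"
    [ $((status & 16)) -ne 0 ] && echo "usage or syntax error"
    [ $((status & 32)) -ne 0 ] && echo "cancelled by user"
    [ $((status & 128)) -ne 0 ] && echo "shared-library error"
    return 0
}
```

A typical use is to run fsck -y /dev/sda1 and then describe_fsck_status $? on the next line. A result of "errors left uncorrected" means the filesystem needs further attention before it can be trusted.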

Fixing Full Root Partitions

If the root filesystem becomes completely full, Linux may fail to finish booting with errors about out of space. The solution requires identifying what is filling up the disk and removing unneeded files.

Checking Partition Usage

First, identify what path is full using df. When booted from a live CD on the troubled system, mount the Linux root FS to /mnt and run:

# df -h /mnt
Filesystem             Size  Used Avail Use% Mounted on
/dev/sda1             9.7G  9.5G     0 100% /mnt

This shows the root partition is completely full. We need to delete content until space is freed up.
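
To see what is actually consuming the space, du can rank the top-level directories by size. The following is a minimal sketch: the function name largest_entries is hypothetical, and the glob skips hidden entries directly under the given directory.

```shell
# List the largest entries directly under directory $1, biggest first.
# $2 limits how many lines are shown (default 10). Sizes are in KiB.
largest_entries() {
    dir=$1
    count=${2:-10}
    # -x stays on one filesystem, -k reports KiB, -s gives one total per entry.
    du -xks "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

Running largest_entries /mnt shows where to look first. Repeating it on the biggest directory (for example /mnt/var) narrows the search further.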

Removing Unused Packages and Files

Start by removing unnecessary packages with the package manager from within a chroot of the mounted disk:

# chroot /mnt
# apt-get remove --purge unwanted-package

Next, look for log files, temporary files, and cached content that can safely be deleted, for example:

# rm -rf /tmp/*
# rm /var/log/*.log
# rm -rf ~/.cache/*

Freeing up roughly 10% of the partition is typically enough to allow Linux to finish booting again.
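
Whether the cleanup has freed enough space can be checked with df. This sketch assumes the POSIX output format of df -P, where the usage percentage is the fifth column. The function names used_percent and has_headroom are hypothetical.

```shell
# Print the Use% figure (without the % sign) for the filesystem holding $1.
# df -P forces the single-line POSIX output format, so the value is column 5.
used_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if at least $2 percent (default 10) of the filesystem is free.
has_headroom() {
    used=$(used_percent "$1")
    [ $((100 - used)) -ge "${2:-10}" ]
}
```

From the live environment, has_headroom /mnt 10 && echo enough prints enough once at least a tenth of the root partition is free.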

Expanding Partition Size

If no additional files can be deleted, the next option is to expand the partition. This requires partitioning tools such as parted, gparted, or fdisk, and the filesystem inside the partition must be grown as well. Back up important data first, since a mistake while resizing can cause data loss.

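As an example, boot from GParted media, select the root partition, and click Resize/Move. Expand the size by 10% (this requires unallocated space adjacent to the partition), click Resize/Move to confirm, then click Apply to carry out the change. After rebooting, the partition should have more free space.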
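
The same operation can be sketched on the command line. This is an illustration under several assumptions: the filesystem is ext2/3/4, the unallocated space sits directly after the partition, the disk uses sdX-style names, and the partition is unmounted. The function names run and grow_ext_partition are hypothetical. By default the commands are only printed; set DRY_RUN=0 to execute them as root.

```shell
# Print (DRY_RUN=1, the default) or execute (DRY_RUN=0) one command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Grow partition number $2 on disk $1 to the end of the disk, then grow
# the ext filesystem inside it to fill the new space.
grow_ext_partition() {
    disk=$1                 # e.g. /dev/sda
    number=$2               # e.g. 1
    part="$disk$number"     # sdX naming only; NVMe devices add a "p" separator
    run parted "$disk" resizepart "$number" 100%
    # resize2fs expects a freshly checked filesystem when run offline.
    run e2fsck -f "$part"
    run resize2fs "$part"
}
```

Running grow_ext_partition /dev/sda 1 in the default dry-run mode prints the three commands so they can be reviewed against the actual disk layout first.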

Verifying Critical System Binaries

If Linux boots to a prompt but appears unstable, or critical commands like ls, cp, and bash are missing or malfunctioning, system files have likely become corrupted.

Reinstalling System Binaries and Libraries

The package manager can be used to reinstall missing or damaged executables and libraries. First identify the broken component. For example, if the bash shell fails to start:

# bash
bash: error while loading shared libraries: libc.so.6: cannot open shared object file: No such file or directory

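The error here names a missing shared library, libc.so.6, rather than bash itself, so the package providing that library is the one to reinstall (libc6 on Debian and Ubuntu). Refresh the package index if needed, then attempt the reinstallation:

# apt-get update
# apt-get install --reinstall libc6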

This will overwrite the current binaries and libraries with fresh copies from the repositories. Repeat this for any critical components showing errors.
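
To find out which libraries a failing program cannot load, ldd can be asked to resolve its dependencies. A minimal sketch, assuming ldd is available in the environment; the function name missing_libs is hypothetical.

```shell
# List shared libraries that the dynamic loader cannot resolve for binary $1.
# Prints nothing when every dependency is found.
missing_libs() {
    # Unresolved dependencies appear as "libname => not found" in ldd output.
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}
```

For example, missing_libs /bin/bash prints one library name per line. On Debian-based systems, dpkg -S followed by a library name then reports which installed package owns that file.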

For executables that were not installed from packages, restore them from backups or reinstall them from source, following the application vendor’s guidance.
