Thread: RH5 Bare Metals Failure

  1. #1
    Junior Member
    Join Date
    Dec 2010
    Location
    Rochester, NY
    Posts
    7

    RH5 Bare Metals Failure

    Hi There;

    I am a fairly new user of Unitrends, and I hit a snag trying to do a Bare Metals restore of a Red Hat 5 Web server. This is going to get long, but I want to document the whole situation.

    The server (called Web1) was giving me trouble, so I made a Virtual clone of it, took it offline, and moved it to my testing network. I renamed the server (Web2) and gave it a new IP address to correspond with the new subnet and not conflict with the Virtual clone. I was able to get the server up long enough to make a Bare Metals ISO disk (with the credentials as Web2 and the test network IP) as well as replace the failing hard drives.

    I moved the server back onto the production network, changed its name and IP address back to Web1, took the Virtual machine offline, and started regular backups with the Unitrends agent. Of course, a few days later, the server failed again, and this time in an unrecoverable state.

    I again brought up the Virtual copy of the server as Web1, moved the physical server to the test network, and replaced all of the hard drives in the failing RAID. I restarted the server with the Bare Metals CD. The Bare Metals Interface came up and I tried the Test. The test failed, saying (paraphrasing) it was unable to make a TCP connection. I checked the Hosts file and it was correct. I checked the IP/Gateway/Unit IPs and they were all correct (they showed the Test network address and Web2). I dropped into the shell and confirmed the right NIC had the right IP address. I was able to ping my own address and my gateway, but every time I tried to ping off the subnet (namely the Unitrends unit), I got a Destination Unreachable error.

    Unitrends support was closed yesterday, so I decided to at least try manually rebuilding the filesystems. I did the Restore > Format Partitions, which seemed to work. I then tried to create the filesystems, but it errored out into what looked to me like an fsck command. It stalled on inode blocks multiple times and I finally bailed out of the process (maybe not the best idea).

    This morning I attempted to start the server again from the Bare Metals CD, but now I can't even get to the Restore interface. Below is a summary of the error messages that flash by, as best as I can catch them, but basically it just gives me a bash prompt and fails to start.

    So now I'm at a loss and not sure how to proceed, besides starting with a fresh RH5 install and trying a folder-by-folder restore.



    modprobe: fatal error inserting hid_dummy

    udevd-event[3264]: wait_for_sysfs: waiting for /sys/devices/ ... ioerr_cnt failed

    /sbin/init: line 59: 3378 segmentation fault

    mdadm: no arrays found in config file

    /dev/cdrom: open failed: Read-only file system

    can't find device uuid

    refusing activation of partial LV LogVol00 use --partial to override

    Found volume groups Volgroup00
    Found volume groups Volgroup01

    Starting SSH
    could not load host key /etc/ssh/ssh_host_key
    Disabling protocol version 1

  2. #2
    Junior Member
    Join Date
    Dec 2010
    Location
    Rochester, NY
    Posts
    7
    So to continue with the saga:

    On the suggestion of Unitrends tech support, I moved the Backup appliance to the same subnet as the virtual Web server (with the correct name and IP address), made a new Bare Metals ISO using the virtual version of the Web server, and then connected just the physical server and the backup appliance using a small hub. So now the information for the physical web server contained on the Bare Metals ISO matched the Web server client information on the Backup unit, and both were on the same subnet. Unfortunately, the Bare Metals ISO still did not load properly. I then tried booting a completely different server from both of the Bare Metals ISO disks, and the Bare Metals interface loaded with no problem. This showed that something was written to the disks on the Web server during my attempts at manual recovery that the ISO was reading, and that this was what kept the ISO from loading and running properly.

    I ended up simply reinstalling Red Hat on the physical server and did a file-by-file restore and made the necessary configuration changes to get the Web server back online. Not the ideal situation, but at least it is now up and running.

    Has anyone else had problems with the Linux Bare Metals not communicating via a gateway or across a subnet?

  3. #3
    Junior Member
    Join Date
    Dec 2010
    Location
    Rochester, NY
    Posts
    7
    I then decided to do a little testing. I took an old server, connected it to my DMZ subnet (where the web servers are), and put a fresh copy of Red Hat on it. I installed the Unitrends Agent on it, confirmed that it was separated from the appliance by a firewall, opened the necessary ports (1743 and 1745) in the server's firewall software, and set up the client on the server. I then successfully did a Master backup of the test server and created a Bare Metals ISO disk for it. I then booted the server from the disk and again it was unable to communicate with anything off the local subnet, and the Test failed. For these tests I left the Unitrends appliance on version 5.0.2-1 (which was where it was when I was working with the Web servers). I then upgraded the Unit to 5.1.0 (which supposedly included updates to the Bare Metals process) and repeated the entire test (including making a new Bare Metals ISO disk). I ended up with the same results: communication was fine between the test server and the backup appliance when the server was booted into the OS, but failed when it was booted from the Bare Metals ISO.

    So that's where I am so far. My plan today is to move my test server to the same subnet as the Backup appliance, make a new Bare Metals ISO, and try doing a restore from there to see if it works.

  4. #4
    Junior Member
    Join Date
    Dec 2010
    Location
    Rochester, NY
    Posts
    7
    Well, I did some more testing today. I moved my test server from the DMZ network to an internal subnet. My goal was to see whether the problem was a firewall issue, with a port closed that the Recovery process used (even though Backups and Bare Metal ISO creation were working fine). That would not explain the inability to ping between the Backup appliance and the server booted into the ISO, but it was worth a shot. So now the test server and the Backup appliance were separated only by a Layer 3 switch that acts purely as a Gateway and does no filtering or packet inspection. I created a new Bare Metals ISO disk with the new credentials. I booted the server with the new disk, confirmed it had the right network settings, and again the ISO was unable to contact anything off the local subnet. So that at least ruled out the firewall as the cause.

    Finally, I moved the test server to the SAME subnet as the Backup appliance, burned a new Bare Metals ISO, and booted up the server from that disk. I did notice something different flash by on the screen when the Bare Metals Recovery interface was loading: there was a moment when I saw IP Address, Subnet Mask, and Gateway go by, all set to 0.0.0.0. I can't say for certain whether that was present when I started up any of the other ISOs, but I thought it might be a clue. From the Bare Metals Recovery interface I was then able to successfully run a Test and start the Restore process.

  5. #5
    Junior Member
    Join Date
    Dec 2010
    Location
    Rochester, NY
    Posts
    7
    I think I have finally figured out what is going on here. After some more tests and a few discussions with Tech Support, it would seem that the Bare Metals OS really does not have a Default Gateway. From the Hot Bare Metal menu, I looked under "View Info" --> "Network routes", and this was the result:

    Networks:
    eth0 10.x.x.149 10.x.x.255 255.255.255.0 43100000
    lo 127.0.0.1 255.0.0.0 255.0.0.0 49000000
    Routes:
    0.0.0.0 0.0.0.0 10.x.x.254 09cd09c0
    10.x.x.0 255.255.255.0 0.0.0.0 09cd0900
    169.254.0.0 255.255.0.0 0.0.0.0 09cd0960

    So to me that sure looked like the default route (0.0.0.0 0.0.0.0) was pointing to the correct gateway address. On the advice of Tech Support, I dropped into the shell of the boot CD and looked at the routes there. From the shell, the "route" command returned this:

    sh-3.2#route
    Kernel IP routing table
    Destination Gateway Genmask Flags Metric Ref Use Iface
    10.x.x.0 * 255.255.255.0 U 0 0 0 eth0

    No default gateway address. I manually added the address and checked the routes again:

    sh-3.2#route add default gw 10.x.x.254
    sh-3.2#route
    Kernel IP routing table
    Destination Gateway Genmask Flags Metric Ref Use Iface
    10.x.x.0 * 255.255.255.0 U 0 0 0 eth0
    default 10.x.x.254 0.0.0.0 UG 0 0 0 eth0

    After adding the route, I was able to communicate with devices off-subnet and with the DPU. I have not tried a test Restore yet (that is next), but I am pretty confident that this should at least be an easy work-around for now.
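
    For anyone who wants to script the check, here is a minimal sketch of the same steps. The function names has_default_route and ensure_default_route are hypothetical helpers, not part of the Bare Metals CD, and the sketch assumes that awk and the net-tools "route" command are available in the boot CD shell, which has not been verified here. The gateway address is passed in as an argument; use the gateway for your own subnet.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the routing table text on stdin
# contains a default route, fails (exit 1) otherwise.
# Plain `route` prints the default route with Destination "default";
# `route -n` prints it as Destination 0.0.0.0 with Genmask 0.0.0.0.
has_default_route() {
    awk '($1 == "default") || ($1 == "0.0.0.0" && $3 == "0.0.0.0") { found = 1 }
         END { exit !found }'
}

# Hypothetical helper: adds a default route through the gateway given as $1,
# but only if the kernel routing table does not already have one.
# Needs root, like the manual `route add` shown above.
ensure_default_route() {
    if [ -z "$1" ]; then
        echo "usage: ensure_default_route GATEWAY_IP" >&2
        return 2
    fi
    if route -n | has_default_route; then
        echo "default route already present"
    else
        echo "no default route found; adding gateway $1"
        route add default gw "$1"
    fi
}
```

    After sourcing the file in the shell, running "ensure_default_route" followed by your gateway address should leave the routing table in the same state as the manual "route add default gw" command above. The check is kept separate from the change so that it can be tried against pasted "route" output first.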
