Customise UVic's CernVM Images for OpenStack

Download Images

  • Download and unzip version 4 of UVic's CernVM 2.5.1 images (dual-hypervisor: KVM+Xen):
    • Production:
      $ wget http://repoman.heprc.uvic.ca/api/images/raw/crlb/kvm/cernvm-batch-node-2.5.1-3-1-x86_64-v4.img.gz
      $ gunzip cernvm-batch-node-2.5.1-3-1-x86_64-v4.img.gz
    • Test:
      $ wget http://repoman.heprc.uvic.ca/api/images/raw/crlb/kvm/cernvm-batch-node-test-2.5.1-3-1-x86_64-v4.img.gz
      $ gunzip cernvm-batch-node-test-2.5.1-3-1-x86_64-v4.img.gz

Rename the Images

  • Change the image names:
    $ mv cernvm-batch-node-2.5.1-3-1-x86_64-v4.img cernvm-batch-nectar-2.5.1-3-1-x86_64-v4.img
    $ mv cernvm-batch-node-test-2.5.1-3-1-x86_64-v4.img cernvm-batch-nectar-test-2.5.1-3-1-x86_64-v4.img

Mount CernVM Image

  • Use the following commands to mount and unmount the image for further modification:
    $ kpartx -av cernvm-batch-nectar-test-2.5.1-3-1-x86_64-v4.img
    $ mount /dev/mapper/loop0p1 /mnt
    ...
    $ umount /mnt
    $ kpartx -d cernvm-batch-nectar-test-2.5.1-3-1-x86_64-v4.img
  • Please NOTE:
    • This recipe uses the test image throughout, on Ubuntu 11.10, running as root.
    • You will have to repeat the same steps for the production image.
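  • The partition map name (loop0p1 above) comes from the kpartx output. A small sketch that picks it up automatically instead of hard-coding it (a convenience, not part of the original recipe):
    #!/bin/bash
    # Sketch: mount the first partition of the image, taking the map name
    # (e.g. loop0p1) from the third field of the kpartx output.
    IMG=cernvm-batch-nectar-test-2.5.1-3-1-x86_64-v4.img
    PART=$(kpartx -av "$IMG" | awk '{print $3; exit}')
    mount /dev/mapper/"$PART" /mnt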

Change Image's Label

  • Modify the file /mnt/.image.metadata and change the name attribute as follows:
    $ vi /mnt/.image.metadata
    hypervisor: kvm,xen
    name: cernvm-batch-nectar-test-2.5.1-3-1-x86_64-v4.img.gz

Change root's Password

  • Change root's password so that we can log in if we need to boot a VM from the image locally for package installation:
    $ chroot /mnt
    $ passwd
    Changing password for user root.
    New UNIX password:
    Retype new UNIX password:
    passwd: all authentication tokens updated successfully.
    $ exit

Turn Off Firewall Rules

  • Turn off the iptables service by default; OpenStack security groups take care of the access rules.
    $ chroot /mnt
    $ chkconfig iptables off
    $ chkconfig --list | grep iptables
    iptables           0:off    1:off    2:off    3:off    4:off    5:off    6:off
    $ exit

Configure OpenSSH

  • Modify the file /mnt/etc/ssh/sshd_config and make sure the following options are set as shown:
    $ vi /mnt/etc/ssh/sshd_config
    ....
    PermitRootLogin without-password
    RSAAuthentication yes
    PubkeyAuthentication yes
    PasswordAuthentication no
    ChallengeResponseAuthentication no
    UsePAM no
    ....
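  • To apply these settings non-interactively, sed can be used instead of an editor (a sketch; it assumes each option occurs at most once, possibly commented out):
    $ sed -i 's/^#\?PermitRootLogin .*/PermitRootLogin without-password/' /mnt/etc/ssh/sshd_config
    $ sed -i 's/^#\?PasswordAuthentication .*/PasswordAuthentication no/' /mnt/etc/ssh/sshd_config
    $ sed -i 's/^#\?UsePAM .*/UsePAM no/' /mnt/etc/ssh/sshd_config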

Turn On OpenSSH Service

  • The sshd service is enabled on startup by default. Make sure it has the proper run-levels:
    $ chroot /mnt
    $ chkconfig sshd on
    $ chkconfig --list | grep sshd
    sshd               0:off    1:off    2:on    3:on    4:on    5:on    6:off
    $ exit

Remove All SSH Keys Embedded in the Image

  • Nectar strongly recommends removing all SSH keys from the image and instead using its metadata service to download the public key and add it to root's authorized_keys file during VM instantiation.
  • To remove them:
    $ rm -rf /mnt/root/.ssh/authorized_keys
    $ rmdir /mnt/root/.ssh/

Create Startup Scripts

format_mount_vdb

We no longer use this script, as OpenStack already provides a formatted drive with the label “ephemeral0”. We just add a line to /etc/fstab to mount LABEL=ephemeral0 on /scratch (see the sketch below).
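A minimal sketch of that fstab entry (assuming the /scratch mount point exists in the image; the filesystem type is left to auto-detection):

    LABEL=ephemeral0    /scratch    auto    defaults    0 0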

  • This script formats and mounts the on-instance/secondary/ephemeral storage of the 30 GB (per CPU core) disk.
  • Create an init script /mnt/etc/init.d/format_mount_vdb:
    #!/bin/bash
    #
    # format_mount_vdb  Format and mount the on-instance/secondary/ephemeral storage
    #
    # chkconfig: 2345 00 99
    # description: Format and mount the on-instance/secondary/ephemeral storage
    #              of the 30 GB (per CPU core) disk.
    #
    ### BEGIN INIT INFO
    # Provides:          format_mount_vdb
    # Required-Start:
    # Required-Stop:
    # Default-Start:     2 3 4 5
    # Default-Stop:      0 1 6
    # Short-Description: format_mount_vdb daemon, formatting and mounting the
    #                    secondary storage.
    # Description:       The format_mount_vdb daemon is a script which creates
    #                    the partition table for the secondary storage of 30
    #                    GB per CPU core disk (/dev/vdb); formats it and then
    #                    mounts it to /scratch directory.
    ### END INIT INFO
     
    lockfile=/var/log/format_mount_vdb
     
    # Carry out specific functions when asked to by the system
    case "$1" in
      start)
        if [ ! -f $lockfile ] ; then
     
          # Create partition table for the secondary disk
          (echo n; echo p; echo 1; echo ; echo; echo w) | /sbin/fdisk /dev/vdb
     
          # Format and mount the secondary disk
          /bin/sleep 3 && /sbin/mkfs.ext2 -L blankpartition0 /dev/vdb1 && /bin/mount -t ext2 /dev/vdb1 /scratch
     
     
          touch $lockfile
        fi
        ;;
      stop)
        echo "Stopping script format_mount_vdb"
        ;;
      status)
        if [ -f $lockfile ]; then
          echo "The secondary storage /dev/vdb1 has been formatted and mounted into /scratch by format_mount_vdb."
        else
          echo "The secondary storage /dev/vdb1 hasn't been formatted and mounted into /scratch by format_mount_vdb."
        fi
        ;;
      *)
        echo "Usage: /etc/init.d/format_mount_vdb {start|stop|status}"
        exit 1
        ;;
    esac
     
    exit 0
  • Set correct file permissions:
    $ chmod 755 /mnt/etc/init.d/format_mount_vdb
  • Create symbolic links:
    $ chroot /mnt
    $ ln -s /etc/init.d/format_mount_vdb /etc/rc.d/rc0.d/K99format_mount_vdb
    $ ln -s /etc/init.d/format_mount_vdb /etc/rc.d/rc1.d/K99format_mount_vdb
    $ ln -s /etc/init.d/format_mount_vdb /etc/rc.d/rc2.d/S00format_mount_vdb
    $ ln -s /etc/init.d/format_mount_vdb /etc/rc.d/rc3.d/S00format_mount_vdb
    $ ln -s /etc/init.d/format_mount_vdb /etc/rc.d/rc4.d/S00format_mount_vdb
    $ ln -s /etc/init.d/format_mount_vdb /etc/rc.d/rc5.d/S00format_mount_vdb
    $ ln -s /etc/init.d/format_mount_vdb /etc/rc.d/rc6.d/K99format_mount_vdb
    $ exit
  • Enable the init script:
    $ chroot /mnt
    $ chkconfig --add format_mount_vdb
    $ chkconfig --list | grep format_mount_vdb
    format_mount_vdb    0:off    1:off    2:on    3:on    4:on    5:on    6:off
    $ exit

fix_hostname

  • This script fixes the wrong default hostname assigned by Nectar. The Nectar cloud platform does not yet provide a DNS service for VMs; a DNS solution is under investigation and will eventually be rolled out to all Nectar cloud nodes. Until then, this script serves as a workaround.
  • Create an init script /mnt/etc/init.d/fix_hostname:
    #!/bin/bash
    #
    # fix_hostname  Set correct hostname based on FQDN after networking
    #
    # chkconfig: 2345 11 91
    # description: Set correct hostname based on FQDN at boot time.
    #
    ### BEGIN INIT INFO
    # Provides:          fix_hostname
    # Required-Start:    $network
    # Required-Stop:
    # Default-Start:     2 3 4 5
    # Default-Stop:      0 1 6
    # Short-Description: fix_hostname daemon, setting correct hostname FQDN
    # Description:       The fix_hostname daemon is a script which fixes the
    #   wrong hostname assigned by the Nectar cloud by default.  We want it to
    #   be active in runlevels 2, 3, 4 and 5, the same runlevels as the
    #   networking service.
    ### END INIT INFO
     
    lockfile=/var/log/fix_hostname
     
    # Carry out specific functions when asked to by the system
    case "$1" in
      start)
        if [ ! -f $lockfile ] ; then
     
          echo "Starting script fix_hostname"
          # EC2 Metadata server
          EC2_METADATA=169.254.169.254
     
          # Get the IP address assigned
          # Note: The IP addresses assigned in the Research Cloud are listed as
          #       private-ips as they are automatically assigned rather than elastic
          #       IPs.
          IP_ADDRESS=`curl -m 10 -s http://$EC2_METADATA/latest/meta-data/local-ipv4`
     
          # Get host name based on FQDN
          # eg. host 115.146.94.139 | grep 'domain name pointer' | awk '{print $5}'
          # eg. nslookup 115.146.94.139 | grep 'name =' | awk '{print $4}'
          EXTHOSTNAME=`nslookup $IP_ADDRESS | grep 'name =' | awk '{print $4}'`
          # Remove last character
          EXTHOSTNAME=${EXTHOSTNAME%?}
     
          # Fix hostname settings
          echo $IP_ADDRESS $EXTHOSTNAME >> /etc/hosts
          sed -i "s/^HOSTNAME=.*$/HOSTNAME=$EXTHOSTNAME/" /etc/sysconfig/network
     
          # Set hostname to correct host FQDN and set it to external
          echo $EXTHOSTNAME
          hostname $EXTHOSTNAME
          touch $lockfile
     
        fi
        ;;
      stop)
        echo "Stopping script fix_hostname"
        #rm -f $lockfile
        ;;
      status)
        if [ -f $lockfile ]; then
          echo "Real hostname has been set to external by fix_hostname."
        else
          echo "Hostanme hasn't been changed by fix_hostname."
        fi
        ;;
      *)
        echo "Usage: /etc/init.d/fix_hostname {start|stop|status}"
        exit 1
        ;;
    esac
     
    exit 0
  • Set correct file permissions:
    $ chmod 755 /mnt/etc/init.d/fix_hostname
  • Create symbolic links:
    $ chroot /mnt
    $ ln -s /etc/init.d/fix_hostname /etc/rc.d/rc0.d/K91fix_hostname
    $ ln -s /etc/init.d/fix_hostname /etc/rc.d/rc1.d/K91fix_hostname
    $ ln -s /etc/init.d/fix_hostname /etc/rc.d/rc2.d/S11fix_hostname
    $ ln -s /etc/init.d/fix_hostname /etc/rc.d/rc3.d/S11fix_hostname
    $ ln -s /etc/init.d/fix_hostname /etc/rc.d/rc4.d/S11fix_hostname
    $ ln -s /etc/init.d/fix_hostname /etc/rc.d/rc5.d/S11fix_hostname
    $ ln -s /etc/init.d/fix_hostname /etc/rc.d/rc6.d/K91fix_hostname
    $ exit
  • Enable the init script:
    $ chroot /mnt
    $ chkconfig --add fix_hostname
    $ chkconfig --list | grep fix_hostname
    fix_hostname    0:off    1:off    2:on    3:on    4:on    5:on    6:off
    $ exit

rc.local

  • Create a startup script as /mnt/etc/rc.d/rc.local to complete the remaining setup at boot:
    • Download the SSH public key from the OpenStack metadata service and store it in place during VM startup (the listing below).
    • Format and mount the 30 GB (per CPU core) ephemeral disk (now covered by the fstab entry described above).
    • Fix the hostname issue (covered by the fix_hostname init script above).
  • Simply copy the source code listed below into /mnt/etc/rc.d/rc.local:
    #!/bin/sh
    #
    # This script will be executed *after* all the other init scripts.
    # You can put your own initialization stuff in here if you don't
    # want to do the full Sys V style init stuff.
     
    touch /var/lock/subsys/local
     
    #------------------------------------------------------------------#
    #     Download SSH public key from Openstack Metadata service      #
    #------------------------------------------------------------------#
     
    # Create .ssh directory if it doesn't exist
    if [ ! -d /root/.ssh ] ; then
      mkdir -p /root/.ssh
      chmod 700 /root/.ssh
    fi
     
    # Fetch public key from Nectar cloud metadata service
    wget -O /tmp/my-key http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key
    if [ $? -eq 0 ] ; then
      cat /tmp/my-key >> /root/.ssh/authorized_keys
      chmod 600 /root/.ssh/authorized_keys
      rm /tmp/my-key
    fi
  • Make sure /mnt/etc/rc.d/rc.local is executable:
    $ ll /mnt/etc/rc.d/rc.local
    -rwxr-xr-x 1 root root 1596 2012-11-19 21:52 /mnt/etc/rc.d/rc.local

Fix Python Version

  • Two Python versions exist in the image. CernVM defaults to Python 2.7.3, which is not the one we need to use:
    $ chroot /mnt
    $ /usr/local/bin/python -V
    Python 2.7.3
    $ /usr/bin/python -V
    Python 2.4.3
    $ which python
    /usr/local/bin/python
    $ python -V
    Python 2.7.3
    $ exit

New way of dealing with multiple Python versions

The only script which references /usr/local/bin/python is /usr/local/bin/contexthelper, so I did the following to make sure contexthelper could still run:

$ rm /usr/local/bin/python
$ cp /usr/bin/python /usr/local/bin/python-cs
$ vi /usr/local/bin/contexthelper

   Change:

   #!/usr/local/bin/python

   To:

   #!/usr/local/bin/python-cs
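
   Equivalently, a one-line sed rewrites the interpreter line (plain sed usage, shown for convenience):

   sed -i '1s|/usr/local/bin/python$|/usr/local/bin/python-cs|' /usr/local/bin/contexthelper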

Rucio pilots

The default CernVM image did not work with Rucio pilots. This was caused by Python being unable to import the 'hashlib' module. On regular SL5, Python 2.4 also lacks 'hashlib', but 'md5' is used as a fallback; 'md5' did not work on the CernVM image either, so 'hashlib' was installed.

Sean compiled hashlib and copied the files into /usr/lib64/python2.4/site-packages.

Hashlib was downloaded from https://pypi.python.org/pypi/hashlib/20081119; the file downloaded was hashlib-20081119.zip:

$ unzip hashlib-20081119.zip
$ cd hashlib-20081119
$ python setup.py build
$ python setup.py install

This installs into /usr/lib64/python2.4/site-packages:

_md5.so
_sha.so
_sha256.so
_sha512.so
hashlib.py
hashlib.pyc

You should now be able to import the hashlib module in Python:

[root@vm-118-138-241-196 build]# python
Python 2.4.3 (#1, Sep 20 2011, 06:04:31)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-51)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from hashlib import md5
>>>

CVMFS Configuration

  • Specify the CVMFS HTTP proxies and CVMFS servers to use:
    $ vi /mnt/etc/cvmfs/default.local
    CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch
    CVMFS_QUOTA_LIMIT=20000
    CVMFS_HTTP_PROXY="http://rcsquid1.atlas.unimelb.edu.au:3128|http://rcsquid2.atlas.unimelb.edu.au:3128;http://cernvm-webfs.atlas-canada.ca:3128"
    CVMFS_CACHE_BASE=/scratch/cvmfs
    $ vi /mnt/etc/cvmfs/domain.d/cern.ch.local
    CVMFS_SERVER_URL="http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs-stratum-one.cern.ch:8000/opt/@org@"
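  • To sanity-check the configuration later, from a booted test VM you can use the standard CVMFS client tool (a sketch; the available subcommands vary between CVMFS releases):
    $ cvmfs_config chksetup
    $ ls /cvmfs/atlas.cern.ch    # should trigger a mount of the ATLAS repository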

Fix Failed VM Instance Registration with Condor (In progress)

  • There are some workarounds to enable registration of VM instances on the Nectar cloud with Condor:
    • Modify the Condor init script, /mnt/etc/init.d/condor
      • Add the following function to /mnt/etc/init.d/condor (a sketch of the replace_or_append helper it uses appears after this section):
        setup_on_nectar() {
                local_file=`get_condor_config_val LOCAL_CONFIG_FILE`
         
                # Get the IP address assigned by Openstack
                # Note: The IP addresses assigned in the Nectar Cloud are listed as
                #       private-ips as they are automatically assigned rather than
                #       elastic IPs.
                public_ip=`curl -m 10 -s http://$EC2_METADATA/latest/meta-data/local-ipv4`
                test $public_ip != "0.0.0.0" > /dev/null 2>&1
                HAS_PUBLIC_IP=$?
         
                curl -m 10 -s http://$EC2_METADATA/ >/dev/null 2>&1
                IS_EC2=$?
                if [ $IS_EC2 -eq 0 ] ; then
         
                        # Get host name based on FQDN
                        EXTHOSTNAME=`nslookup $public_ip | grep 'name =' | awk '{print $4}'`
                        # Remove last character
                        EXTHOSTNAME=${EXTHOSTNAME%?}
         
                        # Fix hostname settings
                        echo $public_ip $EXTHOSTNAME >> /etc/hosts
                        sed -i "s/^HOSTNAME=.*$/HOSTNAME=$EXTHOSTNAME/" /etc/sysconfig/network
         
                        # Set hostname to correct host FQDN and set it to external
                        hostname $EXTHOSTNAME
         
                        if [ $HAS_PUBLIC_IP -eq 0 ] ; then
                            private_network_name=nectar-`curl -s http://$EC2_METADATA/latest/meta-data/placement/availability-zone`
                            replace_or_append "PRIVATE_NETWORK_NAME" "PRIVATE_NETWORK_NAME=$private_network_name" $local_file
         
                            tcp_forwarding_host=`curl -s http://$EC2_METADATA/latest/meta-data/local-ipv4`
                            replace_or_append "TCP_FORWARDING_HOST" "TCP_FORWARDING_HOST=$tcp_forwarding_host" $local_file
         
                            private_network_interface=`curl -s http://$EC2_METADATA/latest/meta-data/local-ipv4`
                            replace_or_append "PRIVATE_NETWORK_INTERFACE" "PRIVATE_NETWORK_INTERFACE=$private_network_interface" $local_file
                        else
                            private_network_interface=`curl -s http://$EC2_METADATA/latest/meta-data/local-ipv4`
                            replace_or_append "PRIVATE_NETWORK_INTERFACE" "PRIVATE_NETWORK_INTERFACE=$private_network_interface" $local_file
                        fi
                fi
        }
      • Modify the start() function in /mnt/etc/init.d/condor to call the new function:
                # setup_on_ec2
                setup_on_nectar
    • Install the ec2contexthelper script in the image.
      • Boot a VM instance from the image you are currently working on. You can use virt-manager to instantiate it.
      • After the VM is booted, log in as root using the password you set in a previous step.
      • Get the source code from https://github.com/hep-gc/cloud-scheduler/tree/master/scripts/ec2contexthelper and install it in the image.
        $ mkdir -p /root/Git/ec2contexthelper
        $ cd /root/Git/ec2contexthelper
        $ wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/ec2contexthelper/context --no-check-certificate
        $ wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/ec2contexthelper/contexthelper --no-check-certificate
        $ wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/ec2contexthelper/setup.py --no-check-certificate
        $ python setup.py install
        $ chkconfig context on
        $ chmod a+x /etc/init.d/context
      • Fix the CONTEXT_HELPER line at the top of /etc/init.d/context:
        $ vi /etc/init.d/context
        ...
        CONTEXT_HELPER="/usr/local/bin/contexthelper"
        ...
      • Shut down the VM instance:
        $ shutdown -h now
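  • For reference, setup_on_nectar() relies on helper functions such as get_condor_config_val and replace_or_append that ship inside the stock Condor init script. If your version of the init script lacks replace_or_append, a minimal sketch might look like this (hypothetical; adapt it to the init script's conventions):
    # Hypothetical sketch of replace_or_append: rewrite an existing KEY=... line
    # in a config file, or append the line if the key is not present yet.
    replace_or_append() {
        key="$1" ; line="$2" ; file="$3"
        if grep -q "^$key" "$file" ; then
            sed -i "s|^$key.*|$line|" "$file"
        else
            echo "$line" >> "$file"
        fi
    }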

Modify .bashrc for users

  • Modify the file .bashrc under each ATLAS user's home directory (e.g. /home/atlas01/.bashrc):
# Workaround for condor not setting $HOME.
# voms-proxy-info requires this.
if [[ -z "$HOME" ]] ; then
  export HOME=/home/`whoami`
fi

## Set up grid environment:
## Option 1: gLite 3.1 in CernVM
#. /opt/external/etc/profile.d/grid-env.sh
## Option 2: gLite 3.2 in AtlasLocalRootBase
shopt -s expand_aliases
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'
setupATLAS --quiet
localSetupEmi
# Fix for using AtlasLocalRootBase with a kit
unset  AtlasSetupSite
rm ~/.asetup

# Site-specific variables (e.g. Frontier and Squid servers)
# are set based on ATLAS_SITE_NAME.
export ATLAS_SITE_NAME=Australia-NECTAR
# This auto-setup is only temporarily needed, and will soon become automatic
. /cvmfs/atlas.cern.ch/repo/sw/local/bin/auto-setup
  • The major changes are setting ATLAS_SITE_NAME and using localSetupEmi instead of localSetupGlite.

Repair swapfile

  • The original swapfile /mnt/var/swap in the image is corrupt and can't be used. Below is the result from a test VM instance booted from the original image:
    [root@i-000029ac ~]# swapon /var/swap
    swapon: Skipping file /var/swap - it appears to have holes.
  • Create a new 1 GB swapfile and move it into the image:
    $ dd if=/dev/zero of=/var/swap bs=1024 count=1048576
    $ mkswap -c -v1 /var/swap
    $ mv /mnt/var/swap /mnt/var/swap.old
    $ mv /var/swap /mnt/var/swap
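  • To verify the repair, boot a test VM from the modified image and check that the swapfile now activates cleanly (a quick sketch of the check):
    $ swapon /var/swap
    $ swapon -s    # /var/swap should now be listed with type "file"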

Network Checking List

  • There are a couple of things you need to check in your image.

Remove the network persistence rules

  • If you don't remove the network persistence rules then the network interface will not come up as eth0 and you won't be able to connect to it in the cloud.
  • Simply check whether the network persistence rules exist in your image. If so, delete them:
    $ rm -rf /mnt/etc/udev/rules.d/70-persistent-net.rules

Remove the line of HWADDR= from eth0 configuration file

  • The operating system records the MAC address of the virtual ethernet card in /etc/sysconfig/network-scripts/ifcfg-eth0 when the instance is created. However, each time the image boots, the virtual ethernet card will have a different MAC address, so this information must be deleted from the configuration file.
  • Edit /mnt/etc/sysconfig/network-scripts/ifcfg-eth0 and remove the HWADDR= line.
    DEVICE=eth0
    BOOTPROTO=dhcp
    NM_CONTROLLED=yes
    ONBOOT=yes
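  • Rather than editing the file by hand, a one-line sed from the host does the same (plain sed usage):
    $ sed -i '/^HWADDR=/d' /mnt/etc/sysconfig/network-scripts/ifcfg-eth0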

Filesystem Performance Tuning (Future work)

EXT4 Filesystem

  • Use virt-manager to boot a VM instance and log in with the root password.
  • Install package e4fsprogs:
    $ conary install e4fsprogs:runtime

XFS Filesystem

  • Use virt-manager to boot a VM instance and log in with the root password.
  • Install package xfsprogs:
    $ conary install xfsprogs:runtime