This guide describes how to power up a previously installed CORD POD that has been powered down (cleanly or otherwise).  The end goal of the power-up procedure is a fully functioning CORD POD.  This guide applies to the CORD 3.0 release.

Boot the head node

Physical POD: power on the head node

CiaB: bring up the prod VM:

$ cd ~/cord/build; vagrant up prod
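
To confirm that the VM came up and is reachable, you can check its state and log in (a quick sanity check, assuming the default CiaB Vagrant setup):

$ cd ~/cord/build; vagrant status prod
$ vagrant ssh prod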

Check the head node services

  1. Verify that the mgmtbr and fabric interfaces are up and have IP addresses (example commands follow this list)

  2. Verify that MAAS UI is running and accessible:

    1. Physical POD: http://<head-node>/MAAS

    2. CiaB: http://<ciab-server>:8080/MAAS

      Troubleshooting: MAAS UI not available on CiaB 

      If you are running a CiaB and there is no webserver on port 8080, it may be necessary to refresh port forwarding to the prod VM.  Run ps ax | grep 8080 and look for an SSH command similar to the following:

      31353 pts/5    S      0:00 ssh -o User=vagrant -o Port=22 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o ForwardX11=no -o IdentityFile="/users/acb/cord/build/targets/cord-in-a-box/.vagrant/machines/prod/libvirt/private_key" -L *:8080:192.168.121.14:80 -N 192.168.121.14

      A workaround is to kill this process, and then copy and paste the command above into another window on the CiaB server to set up a new SSH port forwarding connection.

  3. Verify that the following Docker containers are running: mavenrepo, switchq, automation, provisioner, generator, harvester, storage, allocator, registry

  4. Use sudo lxc list to ensure that the Juju LXC containers are running. If any are stopped, use sudo lxc start <name> to restart them.
  5. Verify that Juju services are running with juju status --format tabular. They should all have a status of Unit is ready. If a service does not have this status, run juju resolved --retry <servicename>/0 with that service's name, and then rerun juju status to see if the issue is resolved.
  6. Run: source /opt/cord_profile/admin-openrc.sh

  7. Verify that the following OpenStack commands work:

    1. keystone user-list

    2. nova list --all-tenants

    3. neutron net-list

      Troubleshooting: OpenStack commands give SSL error

      Sometimes Keystone starts up in a strange state and OpenStack commands will fail with various SSL errors.  To fix this, it is often sufficient to run:

      $ ssh ubuntu@keystone sudo service apache2 restart 
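
The commands below sketch one way to check steps 1 and 3 above; the interface names and container list are taken from this guide, and the exact output will vary by deployment:

$ ip addr show mgmtbr
$ ip addr show fabric
$ docker ps --format '{{.Names}}'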

Power on leaf and spine switches

On a physical POD: power on the switches.  

On CiaB: bring up the switch VMs:

$ cd ~/cord/build; vagrant up leaf-1 leaf-2 spine-1

Check the switches

On the head node (i.e., prod VM for CiaB):

  1. Get the switch IPs by running cord prov list
  2. Verify that you can ping all of the switch IPs (see the example below)
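
For example, assuming cord prov list reported switch IPs 10.6.0.1 and 10.6.0.2 (hypothetical addresses; substitute the ones from your POD):

$ cord prov list
$ for ip in 10.6.0.1 10.6.0.2; do ping -c 3 $ip; done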

Boot the compute nodes

On both a physical POD and CiaB: log into the MAAS UI and power on the compute node(s).

Check the compute nodes

Once the compute nodes are up:

  1. Login to the head node
  2. Run: source /opt/cord_profile/admin-openrc.sh
  3. Verify that nova service-list shows the compute node as “up”.

    It may take a few minutes until the node's status is updated in Nova.
  4. Verify that you can log into the compute nodes from the head node as the ubuntu user (see the example below)
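
A minimal check from the head node, assuming a compute node hostname of compute-node-1 (hypothetical; use the hostname reported by nova service-list or cord prov list):

$ source /opt/cord_profile/admin-openrc.sh
$ nova service-list | grep nova-compute
$ ssh ubuntu@compute-node-1 hostname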

Check XOS

Verify that XOS UI is running and accessible:

  • Physical POD: http://<head-node>/xos

  • CiaB: http://<ciab-server>:8080/xos

If it's not working, try restarting XOS (replace rcord with the name of your profile):
$ cd /opt/cord_profile; docker-compose -p rcord restart
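
To see which of the profile's containers are running before or after the restart, you can also list them (assuming the rcord profile name as above):

$ cd /opt/cord_profile; docker-compose -p rcord ps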

Check VTN

Verify that VTN is initialized correctly:
  1. Run onos> cordvtn-nodes
  2. Make sure the compute nodes have COMPLETE status.
  3. Prior to rebooting existing OpenStack VMs:
    1. Run onos> cordvtn-ports
    2. Make sure some ports show up
    3. If not, try this:
      1. onos> cordvtn-sync-neutron-states <keystone-url> admin admin <password>
      2. onos> cordvtn-sync-xos-states <xos-url> xosadmin@opencord.org <password>

Boot OpenStack VMs

To bring up OpenStack VMs that were running before the POD was shut down:

  1. source /opt/cord_profile/admin-openrc.sh
  2. Get list of VM IDs: nova list --all-tenants
  3. For each VM (a worked example follows this list):
    1. nova start <vm-id>
    2. nova console-log <vm-id>
    3. Inspect the console log to make sure that the network interfaces get IP addresses.
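
A worked example with a hypothetical VM ID (use the IDs returned by nova list --all-tenants):

$ source /opt/cord_profile/admin-openrc.sh
$ nova list --all-tenants
$ nova start 3a7c1f62-9d2e-4c1b-8f00-2b1d6f8a9e77      # hypothetical VM ID
$ nova console-log 3a7c1f62-9d2e-4c1b-8f00-2b1d6f8a9e77 | tail -50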

To restart a vSG inside the vSG VM:

  • SSH to the vSG VM

  • Run: sudo rm /root/network_is_setup

  • Save the vSG Tenant in the XOS UI

  • Once the synchronizer has re-run, make sure you can ping 8.8.8.8 from inside the vSG container (see the note after this list for finding the container name)

    • sudo docker exec -ti vcpe-222-111 ping 8.8.8.8
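
The vSG container name varies per subscriber (vcpe-222-111 above is only an example); inside the vSG VM you can list the running vSG containers to find the right name:

$ sudo docker ps --format '{{.Names}}' | grep vcpe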