This guide describes how to power up a previously installed CORD POD that has been powered down (cleanly or otherwise).  The end goal of the power-up procedure is a fully functioning CORD POD.  This guide applies to the CORD 3.0 release.

Boot the head node

Physical POD: power on the head node

CiaB: bring up the prod VM:

$ cd ~/cord/build; vagrant up prod
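
To confirm that the VM came up and is reachable, you can check its state and log in (a quick sanity check, assuming the default CiaB Vagrant setup):

$ cd ~/cord/build; vagrant status prod
$ vagrant ssh prod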

Check the head node services

  1. Verify that the mgmtbr and fabric interfaces are up and have IP addresses (example commands follow this list)

  2. Verify that MAAS UI is running and accessible:

    1. Physical POD: http://<head-node>/MAAS

    2. CiaB: http://<ciab-server>:8080/MAAS

      Troubleshooting: MAAS UI not available on CiaB 

      If you are running a CiaB and there is no webserver on port 8080, it may be necessary to refresh port forwarding to the prod VM.  Run ps ax | grep 8080 and look for an SSH command similar to the following:

      31353 pts/5    S      0:00 ssh -o User=vagrant -o Port=22 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o ForwardX11=no -o IdentityFile="/users/acb/cord/build/targets/cord-in-a-box/.vagrant/machines/prod/libvirt/private_key" -L *:8080:192.168.121.14:80 -N 192.168.121.14

      A workaround is to kill this process, and then copy and paste the command above into another window on the CiaB server to set up a new SSH port forwarding connection.

  3. Verify that the following Docker containers are running: mavenrepo, switchq, automation, provisioner, generator, harvester, storage, allocator, registry

  4. Use sudo lxc list to ensure that the Juju LXC containers are running. If any are stopped, use sudo lxc start <name> to restart them.
  5. Verify that Juju services are running with juju status --format tabular. They should all have a status of Unit is ready. If a service does not have this status, run juju resolved --retry <servicename>/0 with that service's name, and then rerun juju status to see if the issue is resolved.
  6. Run: source /opt/cord_profile/admin-openrc.sh

  7. Verify that the following OpenStack commands work:

    1. keystone user-list

    2. nova list --all-tenants

    3. neutron net-list

      Troubleshooting: OpenStack commands give SSL error

      Sometimes Keystone starts up in a strange state and OpenStack commands will fail with various SSL errors.  To fix this, it is often sufficient to run:

      $ ssh ubuntu@keystone sudo service apache2 restart 
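
The commands below sketch one way to check steps 1 and 3 above; the interface names and container list are taken from this guide, and the exact output will vary by deployment:

$ ip addr show mgmtbr
$ ip addr show fabric
$ docker ps --format '{{.Names}}'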

Power on leaf and spine switches

On a physical POD: power on the switches.  

On CiaB: bring up the switch VMs:

$ cd ~/cord/build; vagrant up leaf-1 leaf-2 spine-1

Check the switches

On the head node (i.e., prod VM for CiaB):

  1. Get the switch IPs by running cord prov list
  2. Verify that you can ping all of the switch IPs (see the example below)
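
For example, assuming cord prov list reported switch IPs 10.6.0.1 and 10.6.0.2 (hypothetical addresses; substitute the ones from your POD):

$ cord prov list
$ for ip in 10.6.0.1 10.6.0.2; do ping -c 3 $ip; done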

Boot the compute nodes

On both a physical POD and CiaB: log into the MAAS UI and power on the compute node(s).

Check the compute nodes

Once the compute nodes are up:

  1. Login to the head node
  2. Run: source /opt/cord_profile/admin-openrc.sh
  3. Verify that nova service-list shows the compute node as “up”.

    It may take a few minutes until the node's status is updated in Nova.
  4. Verify that you can log into the compute nodes from the head node as the ubuntu user (see the example below)
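
A minimal check from the head node, assuming a compute node hostname of compute-node-1 (hypothetical; use the hostname reported by nova service-list or cord prov list):

$ source /opt/cord_profile/admin-openrc.sh
$ nova service-list | grep nova-compute
$ ssh ubuntu@compute-node-1 hostname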

Check XOS

Verify that XOS UI is running and accessible:

  • Physical POD: http://<head-node>/xos

  • CiaB: http://<ciab-server>:8080/xos

If it's not working, try restarting XOS (replace rcord with the name of your profile):
$ cd /opt/cord_profile; docker-compose -p rcord restart
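
To see which of the profile's containers are running before or after the restart, you can also list them (assuming the rcord profile name as above):

$ cd /opt/cord_profile; docker-compose -p rcord ps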

Check VTN

Verify that VTN is initialized correctly:
  1. Run onos> cordvtn-nodes
  2. Make sure the compute nodes have COMPLETE status.
  3. Prior to rebooting existing OpenStack VMs:
    1. Run onos> cordvtn-ports
    2. Make sure some ports show up
    3. If not, try this:
      1. onos> cordvtn-sync-neutron-states <keystone-url> admin admin <password>
      2. onos> cordvtn-sync-xos-states <xos-url> xosadmin@opencord.org <password>

Boot OpenStack VMs

To bring up OpenStack VMs that were running before the POD was shut down:

  1. source /opt/cord_profile/admin-openrc.sh
  2. Get list of VM IDs: nova list --all-tenants
  3. For each VM (a worked example follows this list):
    1. nova start <vm-id>
    2. nova console-log <vm-id>
    3. Inspect the console log to make sure that the network interfaces get IP addresses.
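
A worked example with a hypothetical VM ID (use the IDs returned by nova list --all-tenants):

$ source /opt/cord_profile/admin-openrc.sh
$ nova list --all-tenants
$ nova start 3a7c1f62-9d2e-4c1b-8f00-2b1d6f8a9e77      # hypothetical VM ID
$ nova console-log 3a7c1f62-9d2e-4c1b-8f00-2b1d6f8a9e77 | tail -50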

To restart a vSG inside the vSG VM:

  • SSH to the vSG VM

  • Run: sudo rm /root/network_is_setup

  • Save the vSG Tenant in the XOS UI

  • Once the synchronizer has re-run, make sure you can ping 8.8.8.8 from inside the vSG container (see the note after this list for finding the container name)

    • sudo docker exec -ti vcpe-222-111 ping 8.8.8.8
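
The vSG container name varies per subscriber (vcpe-222-111 above is only an example); inside the vSG VM you can list the running vSG containers to find the right name:

$ sudo docker ps --format '{{.Names}}' | grep vcpe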