The most current version of this doc is here: https://docs.google.com/document/d/1nou2c8AsRzhaDJmA_eYvFgd0Y33KiCsioveU77AOVCI/edit#
Best to comment in the google-doc or on the SEBA mailing list
Assumptions
Target for Fall 2018 described below. Derived from SEBA FCAPS User Stories.pdf
All software and hardware pre-installed as a SEBA Pod (possibly by system-integrator)
This includes internal physical connectivity and internal private addressing between software and hardware components.
SEBA Pod internally includes
NEM, VOLTHA and ONOS services deployed on Kubernetes (k8s) using an internal (local) Docker registry
1 physical AGG switch (Tomahawk based)
Up to 16 EdgeCore OLTs with a single NNI port connected to the AGG switch and up to 64 ONUs on each PON port
3 compute nodes connected to AGG switch
1 GE management switch to which all OLTs, AGG and compute node management ports are connected. This management switch is ONLY for internal pod connectivity; it has no connectivity outside of the pod
In the AT&T workflow, there will be a Gateway VM instantiated on one of the compute nodes (outside of k8s deployment)
In the AT&T workflow, there will be an additional container which will present one or more Abstract OLT view(s) of the Pod (see details below). SEBA is constructed such that other operators can choose to use the Abstract OLT view or choose to directly use the NEM northbound APIs.
Pod will be racked up and connected to external BNG by operator
The Pod’s Gateway VM will request a single IP address via DHCP through the AGG switch on VLAN 4093. This VLAN is recognized as a ‘management’ VLAN in the external AT&T network. This IP is the only address through which the Pod will be accessible.
The Gateway VM will be responsible for NATing traffic meant to reach external systems on the management network (e.g. a RADIUS server)
SEBA Pod representation to External OSS: SEBA pod must be represented to AT&T OSS as a traditional chassis OLT in order to interwork with existing OSS system. This kind of a representation is sometimes referred to as an “Abstract OLT” where the AGG switch forms the backplane of the abstract OLT and each “Slot” of the abstract OLT is a separate EdgeCore physical OLT device, or possibly another “Host Switch” that houses SFP OLT modules (eg. Tibit)
The Abstract OLT will be assigned a name by the external OSS - eg. PLANOTX3400OLT1; All communication with external OSS will reference this name.
Ports on the AGG switch will be either “Uplink ports” connected to the external BNG or ports that connect to the “Slots” (OLT Devices).
There will be a maximum of 16 such slots connected to the AGG switch
Each slot will have a maximum of 16 PON ports
Each PON port will have a maximum of 64 ONUs.
A single SEBA Pod may contain more than one Abstract OLT. In that case each additional Abstract OLT will use a different physical AGG switch.
S & C VLAN tags will be unique within 1 Abstract OLT, but need not be unique across multiple Abstract OLTs.
EdgeCore OLTs have multiple NNI ports that can be used in active-backup mode; however, this functionality will not be used. As a result, each Slot connects to the AGG switch with only a single NNI port.
The AGG switch uplink to the external BNG is a single 10G port. Multiple such ports can be LAG-ed together
In the AT&T setup, the external BNG requires that a single port connected to the AGG switch must still be part of a single-member LAG group with the use of LACP control.
The Abstract OLT container will interact northbound with the external OSS (via the gateway VM). Southbound it will trigger various NEM apis.
NEM is responsible for lifecycle-management of internal software; external OSS or other software will NOT reach into pod to upgrade anything
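The Abstract OLT limits listed above (16 slots, 16 PON ports per slot, 64 ONUs per PON port) can be captured in a small data model. The following is only a sketch: the class name `OnuLocation`, its field names, and the choice of 1-based indices are assumptions made for illustration and are not taken from NEM or the Abstract OLT container.

```python
from dataclasses import dataclass

# Limits taken from the assumptions above.
MAX_SLOTS = 16          # physical OLTs ("slots") per Abstract OLT
MAX_PON_PORTS = 16      # PON ports per slot
MAX_ONUS_PER_PON = 64   # ONUs per PON port


@dataclass(frozen=True)
class OnuLocation:
    """Hypothetical address of an ONU inside one Abstract OLT.

    Indices are assumed to be 1-based; the real numbering scheme is
    defined by the OSS and the Abstract OLT model.
    """
    olt_name: str   # name assigned by the external OSS
    slot: int
    pon_port: int
    onu: int

    def validate(self) -> None:
        """Raise ValueError if any index exceeds the Abstract OLT limits."""
        checks = (
            ("slot", self.slot, MAX_SLOTS),
            ("pon_port", self.pon_port, MAX_PON_PORTS),
            ("onu", self.onu, MAX_ONUS_PER_PON),
        )
        for name, value, limit in checks:
            if not 1 <= value <= limit:
                raise ValueError(f"{name}={value} outside 1..{limit}")


def max_onus_per_abstract_olt() -> int:
    """Upper bound on ONUs behind one Abstract OLT."""
    return MAX_SLOTS * MAX_PON_PORTS * MAX_ONUS_PER_PON
```

With these limits a single Abstract OLT can address at most 16 x 16 x 64 = 16384 ONUs.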
[TODO: work-in service-types, technology-profiles, speed-profiles, FCAPS and eventually multicast in the workflow below]
Workflow high-level
ONT discovered bottom-up
If the ONT serial number is not allowed or is unknown (i.e. it has NOT been provisioned by the OSS), disable the ONT and generate an event to the external OSS that an ONT has been discovered but not yet provisioned. If the ONT’s physical location does not match the location specified for it by the OSS, disable the ONT and generate an event to the external OSS that an ONT with a valid serial number has been discovered at the wrong location.
When OSS provisions the ONT, re-enable it & program 802.1x flow - UNI port(s) will be UP
Note that RG authentication and DHCP cannot proceed here without subscriber/ service-binding provisioning by external OSS. Assume subscriber provisioning happens from external OSS
An 802.1x EAPOL message comes from the RG; the ONOS AAA app adds options and sends it to the RADIUS server. Options are pulled from Sadis/NEM. NASPortId is required here.
If RG authentication fails, allow it to keep trying (in the future consider redirection to captive / self-help portal). DHCP should not succeed since RG authentication has failed
If RG authentication succeeds, ONOS AAA app notifies via an event on the kafka bus that authentication has succeeded
NEM can listen for the event, and trigger the programming of the data plane in both AGG switch and the PON for the subscriber’s C and S vlan info and the DHCP trap flows. DHCP process can start
DHCP L2 relay -> add option 82, learn public IP address, forward via dataplane to external DHCP server
If RG is disconnected from UNI port, force authentication again (even if subscriber/service-binding has been provisioned by OSS). Upon reconnection to UNI port, RG must re-authenticate before DHCP/other-traffic can flow on the provisioned VLANs.
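The high-level workflow above can be read as a small state machine per UNI port. The sketch below is illustrative only: the state and event names are invented for this example, and it assumes the subscriber/service-binding has already been provisioned by the OSS by the time authentication succeeds.

```python
from enum import Enum, auto


class State(Enum):
    DISABLED = auto()        # discovered, but unknown serial or wrong location
    AWAITING_AUTH = auto()   # ONT enabled, 802.1x trap flow programmed
    AUTHENTICATED = auto()   # RG authenticated, VLANs and DHCP trap flows programmed


class Event(Enum):
    OSS_PROVISIONED = auto()   # OSS provisions the ONT top-down
    AUTH_SUCCEEDED = auto()
    AUTH_FAILED = auto()
    RG_DISCONNECTED = auto()   # RG unplugged from the UNI port


# (current state, event) -> next state; unlisted pairs leave the state unchanged.
TRANSITIONS = {
    (State.DISABLED, Event.OSS_PROVISIONED): State.AWAITING_AUTH,
    (State.AWAITING_AUTH, Event.AUTH_SUCCEEDED): State.AUTHENTICATED,
    (State.AWAITING_AUTH, Event.AUTH_FAILED): State.AWAITING_AUTH,  # RG keeps trying
    (State.AUTHENTICATED, Event.RG_DISCONNECTED): State.AWAITING_AUTH,  # force re-auth
}


def step(state: State, event: Event) -> State:
    """Return the next state for the given event."""
    return TRANSITIONS.get((state, event), state)


def dhcp_allowed(state: State) -> bool:
    """DHCP must only succeed after RG authentication."""
    return state is State.AUTHENTICATED
```

Keeping the transitions in a table makes it easy to see that a disconnect always returns the port to the unauthenticated state, even when the subscriber is still provisioned.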
Workflow details
Each section below details steps in various workflow items. Note: some workflow items are common to all operators using SEBA, but some items are significantly different. SEBA needs to be designed to handle these variations. Also this is WIP - some steps can and will likely change as we learn more, and accommodate variations in operator requirements. Other details like FCAPS and Technology profiles have to be worked into the steps below.
OLT bringup
Operator creates an Abstract OLT record in NEM. In the Abstract OLT it includes the Slot information for the actual physical OLT.
This call will NOT contain the management IP of the physical OLT because the external OSS does not have this internal private address - we need to find a way for NEM to know this address (perhaps via static IPs or an internal DHCP server)
NEM also needs to know the port of the AGG-switch where the OLT NNIs are connected. Note: the need to configure NNI ports and their connectivity will go away when the same ONOS cluster is used for VOLTHA & AGG switch control, and their connectivity is discovered as is the norm in SDN.
NEM’s vOLT service pre-provisions the OLT - makes a ‘preprovision-olt’ call to VOLTHA.
NEM’s vOLT service brings up the OLT - makes an ‘enable’ call to VOLTHA.
As part of this the OLT’s PON and NNI ports are brought up, and the logical OpenFlow switch is created by VOLTHA and connected to ONOS
NEM’s vOLT service queries VOLTHA after OLT enable to get OLT information and PON ports to build its inventory.
OLT config record will be sent to ONOS from NEM’s vOLT service.
Logical device ID - retrieved from VOLTHA by NEM after OLT enable
OF port number of NNI port
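The OLT bringup steps above could be sequenced roughly as follows. This is a sketch against an in-memory stand-in: `FakeVoltha`, its method names, and every returned value are hypothetical placeholders and do not reflect the real VOLTHA API or real device identifiers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeVoltha:
    """In-memory stand-in for VOLTHA (hypothetical method names)."""
    calls: List[str] = field(default_factory=list)

    def preprovision_olt(self, mgmt_addr: str) -> str:
        self.calls.append("preprovision-olt")
        return "device-1"  # placeholder device id

    def enable(self, device_id: str) -> None:
        self.calls.append("enable")

    def query(self, device_id: str) -> Dict[str, object]:
        self.calls.append("query")
        return {
            "logical_device_id": "logical-1",  # placeholder
            "nni_of_port": 1,                  # placeholder
            "pon_ports": [1, 2],               # placeholder
        }


def bring_up_olt(voltha: FakeVoltha, mgmt_addr: str) -> Dict[str, object]:
    """Pre-provision, enable, and query the OLT, then build the ONOS record."""
    device_id = voltha.preprovision_olt(mgmt_addr)
    voltha.enable(device_id)
    info = voltha.query(device_id)  # only valid after the OLT is enabled
    # OLT config record that NEM's vOLT service would send to ONOS.
    return {
        "logical_device_id": info["logical_device_id"],
        "nni_of_port": info["nni_of_port"],
    }
```

A call such as `bring_up_olt(FakeVoltha(), "192.0.2.10")` (a documentation-range address used here as a placeholder) shows the ordering constraint: the logical device ID is only available after the enable call.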
ONU bringup
The customer connects and boots up their ONU. The ONU will be detected on the PON by the OLT and be brought up automatically in a bottom-up discovery model
This will trigger a PORT_STATUS message to ONOS indicating that a new port has been added to the logical switch. The port will be added in an administratively enabled state.
ONOS sends an event to NEM noting that a new UNI port with a particular ONU serial number has been added. NEM will determine whether the ONU is valid by consulting local preprovisioned data (i.e. a ‘whitelist’)
A whitelist includes information that the OSS has already pre-provisioned for an ONU serial number and has allocated a name and OLT slot/pon-port (‘location’) according to the Abstract OLT model
If the ONU serial number IS valid and the location is correct, there is nothing else to be done as the UNI port is already up (enabled). ONOS will be informed that the ONU is allowed, and it will program the 802.1x trap flow for the ONU.
If the ONU serial number is NOT in the whitelist or its location (slot/pon-port) does not match what is configured:
Disable the ONU with a call to VOLTHA. This should in turn result in PORT_DOWNs sent to ONOS by VOLTHA for all uni ports on that ONU
Generate an event to the external OSS saying that an ONU has been discovered which has not been provisioned yet, or there is a mismatch in config vs. discovered location.
At a later time, when the OSS ‘top-down’ provisions an ONU, NEM will enter the ONU in its whitelist, and check if the ONU has already been ‘bottom-up’ discovered (and disabled previously). If so, NEM will re-enable the ONU, and ONOS will program the 802.1x trap flow
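The whitelist decision described in the ONU bringup steps can be sketched as a pure function. The names `Decision` and `validate_onu`, and the representation of a location as a `(slot, pon_port)` tuple, are assumptions for illustration; the serial numbers used with it would be whatever the OSS pre-provisions.

```python
from enum import Enum
from typing import Dict, Tuple

Location = Tuple[int, int]  # (slot, pon_port) in the Abstract OLT model


class Decision(Enum):
    ALLOW = "allow"                             # program the 802.1x trap flow
    DISABLE_UNKNOWN = "unknown-serial"          # disable ONU, notify OSS
    DISABLE_WRONG_LOCATION = "wrong-location"   # disable ONU, notify OSS


def validate_onu(whitelist: Dict[str, Location],
                 serial: str,
                 discovered: Location) -> Decision:
    """Compare a bottom-up discovered ONU against top-down provisioning."""
    expected = whitelist.get(serial)
    if expected is None:
        return Decision.DISABLE_UNKNOWN
    if expected != discovered:
        return Decision.DISABLE_WRONG_LOCATION
    return Decision.ALLOW
```

Both disable outcomes would result in a call to VOLTHA and an event to the external OSS; keeping them distinct lets the event say which of the two conditions occurred.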
Add Subscriber
OSS will add subscriber/service-binding info to NEM. Sample config of a new subscriber account, provisioned from OSS
SubscriberID, S-Tag & C-Tag are assigned by OSS system
NasPortId
CircuitId
Subscriber-SLA-profile
Describes the type of the service, high speed internet, with or without VOIP, and/or IP Multicast.
traffic details, such as speed tier of the bandwidth for internet access.
subscriber status, such as ‘user suspended’ ‘user registered’, etc.
Detailed information is captured in AT&T’s Functional Policy Document for BNG and RG
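A subscriber/service-binding record carrying the fields listed above might look like the sketch below. The field names, the JSON shape, and all example values are hypothetical; the actual NEM model and the OSS-prescribed strings may differ. The VLAN range check reflects the usable 802.1Q VLAN ID range of 1 to 4094.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SubscriberRecord:
    """Hypothetical subscriber/service-binding record provisioned by the OSS."""
    subscriber_id: str
    s_tag: int          # outer (service) VLAN, assigned by the OSS
    c_tag: int          # inner (customer) VLAN, assigned by the OSS
    nas_port_id: str    # string prescribed by the OSS
    circuit_id: str
    sla_profile: str    # service type, speed tier, etc.
    status: str         # e.g. "user registered", "user suspended"

    def __post_init__(self) -> None:
        for name in ("s_tag", "c_tag"):
            vid = getattr(self, name)
            if not 1 <= vid <= 4094:
                raise ValueError(f"{name}={vid} is not a usable VLAN ID")


# Placeholder values only; none of these come from a real deployment.
record = SubscriberRecord(
    subscriber_id="sub-0001",
    s_tag=100,
    c_tag=200,
    nas_port_id="<prescribed-by-OSS>",
    circuit_id="<prescribed-by-OSS>",
    sla_profile="hsia-placeholder",
    status="user registered",
)
payload = json.dumps(asdict(record), sort_keys=True)
```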
RG auth / UNI port auth
After ONU is enabled, trap flows have already been added for 802.1x (EAPOL) in the ONU bringup step.
Radius server is outside pod - ONOS AAA app needs to reach the radius server using a management vlan <4093> and configured radius server IP. This communication happens via the Gateway VM.
Note that DHCP from the RG should fail here because
subscriber/service-binding has not been provisioned by OSS yet or
If the subscriber/service-binding has been provisioned by OSS, it should still fail, because DHCP is being attempted before RG authentication
The RG begins the EAPOL authentication exchange. EAPOL packets are trapped at the OLT and sent up to ONOS. The AAA app encapsulates them in RADIUS and forwards them to the RADIUS server.
AAA app needs to encapsulate info as a specially formatted string, for Abstract OLT name/slot/port as well as ONU id and serial-number in the Radius message
Such a string could be generated by NEM/ONOS or prescribed by OSS
We chose to have the NasPortId string prescribed by the OSS as part of subscriber provisioning by the external OSS
Radius completes authentication and ONOS AAA app sends events concerning subscriber authentication status into message bus (e.g. AUTHENTICATED, DENIED)
NEM (subscriber-service) listens for auth status events, and updates state in subscriber record
If RG authentication fails, let it keep trying (in the future consider redirection to captive / self-help portal). DHCP should not succeed since RG authentication has failed
If RG is successfully authenticated, NEM already has been pre-provisioned by external OSS. It can now:
create the subscriber service instance with the vlans (and speed/service-type information)
DHCP and IGMP trap flows are also programmed at this point.
Agg switches are also programmed by ONOS to L2 forward QinQ VLANs to external BNG
Need to work in speed-profile, technology-profile and service-type (residential/business) for the subscriber
If RG is disconnected from UNI port, force authentication again (even if subscriber/service-binding has been provisioned by OSS). Upon reconnection to UNI port, RG must re-authenticate before DHCP/other-traffic can flow on the provisioned VLANs.
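The way NEM's subscriber-service might react to authentication events can be sketched as follows. The JSON event shape (`uni_port`, `status`), the `auth_state` field, and the action names are assumptions for this example; only the AUTHENTICATED and DENIED status values come from the text above. No Kafka client is used here, so the function simply takes the raw message string.

```python
import json
from typing import Dict, List

# Hypothetical labels for the follow-up steps described above.
DATAPLANE_ACTIONS = [
    "create-subscriber-service-instance",
    "program-dhcp-and-igmp-trap-flows",
    "program-agg-qinq-forwarding",
]


def handle_auth_event(raw: str, subscribers: Dict[str, dict]) -> List[str]:
    """Update the subscriber record and return the actions to trigger."""
    event = json.loads(raw)
    record = subscribers.get(event["uni_port"])
    if record is None:
        # Subscriber not provisioned by the OSS yet: nothing to program.
        return []
    record["auth_state"] = event["status"]
    if event["status"] == "AUTHENTICATED":
        return list(DATAPLANE_ACTIONS)
    # DENIED (or anything else): keep trap flows only, the RG may retry.
    return []
```

Returning the actions instead of executing them keeps the sketch free of side effects; a real service would call into ONOS and VOLTHA at this point.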
DHCP L2-relay
BNG is deployed as an external workload (physical BNG). DHCP server (that allocates subscriber IP addresses) is the BNG itself or some external server beyond the BNG.
DHCP packets are trapped at the OLT and sent to the DHCP L2 relay app in ONOS.
DHCP packets are C-tagged going to the ONOS DHCP relay app
Option 82 is added by the DHCP relay app (calling-station id, etc)
DHCP packets should be returned to the data-plane (via packet-out) and should be double-tagged upstream of the OLT - here is where the subscriber information is necessary for C and S tags
They will be forwarded by the AGG switches to the external BNG
DHCP L2 relay app should send events into Kafka regarding the DHCP state machine for each subscriber (including IP + MAC info).
These events are for FCAPS purposes and are picked up by NEM
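For reference, the Relay Agent Information option (option 82, RFC 3046) is a TLV that wraps sub-options, with sub-option 1 carrying the Agent Circuit ID and sub-option 2 the Agent Remote ID. The sketch below shows the byte layout only; which subscriber values the ONOS relay app actually places in each sub-option is an assumption here and is not stated in this document.

```python
OPTION_82 = 82          # Relay Agent Information option code
SUB_CIRCUIT_ID = 1      # Agent Circuit ID sub-option
SUB_REMOTE_ID = 2       # Agent Remote ID sub-option


def build_option_82(circuit_id: bytes, remote_id: bytes) -> bytes:
    """Encode option 82 with a Circuit ID and a Remote ID sub-option."""
    # Each sub-option is code (1 byte) + length (1 byte) + value.
    total = 2 + len(circuit_id) + 2 + len(remote_id)
    if total > 255:
        raise ValueError("option 82 payload does not fit in one length byte")
    payload = (
        bytes([SUB_CIRCUIT_ID, len(circuit_id)]) + circuit_id
        + bytes([SUB_REMOTE_ID, len(remote_id)]) + remote_id
    )
    return bytes([OPTION_82, len(payload)]) + payload
```

For example, `build_option_82(b"ab", b"c")` yields the nine bytes `52 07 01 02 61 62 02 01 63` in hex.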
Suspend Subscriber
Done at the external BNG, at the subscriber configuration level, using IP filtering and redirection to move the subscriber’s data traffic to a particular walled garden.
Or simply disable UNI port?
Subscriber/ONU removal
ONU record is marked for removal in NEM - assignment to slot/port removed
NEM (vOLT service) instructs ONOS to remove the trap flows for the UNI port
NEM (vOLT service) instructs ONOS to set the UNI port to the administratively disabled state
ONU serial number may be moved later to a new slot/port for a different subscriber
Another case: ONU may go bad, in which case the ONU serial number would have to be updated at the same slot/port for the same subscriber
OLT removal
Rare event
OLT record is marked for removal in NEM
NEM checks that there are no subscribers on the OLT; removal can only be done if that is the case
NEM vOLT service disables OLT by making ‘disable’ call to VOLTHA
NEM vOLT service removes OLT by making ‘delete’ call to VOLTHA (depends on VOL-908)
Slot may get assigned new serial number if the EdgeCore box is replaced
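The OLT removal steps above amount to a guarded two-step sequence. In this sketch the function name, the exception type, and the `calls` list that records the VOLTHA requests are all hypothetical; the real 'disable' and 'delete' calls would go to VOLTHA.

```python
from typing import List


class OltHasSubscribers(Exception):
    """Raised when removal is attempted on an OLT that still has subscribers."""


def remove_olt(olt_id: str, subscriber_count: int, calls: List[str]) -> None:
    """Disable and then delete an OLT, but only if it has no subscribers."""
    if subscriber_count > 0:
        raise OltHasSubscribers(
            f"{olt_id} still has {subscriber_count} subscriber(s)"
        )
    calls.append(f"disable {olt_id}")  # stands in for the VOLTHA 'disable' call
    calls.append(f"delete {olt_id}")   # stands in for the VOLTHA 'delete' call
```

Performing the subscriber check before either call means a rejected removal leaves the OLT untouched.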
Attachments:
SEBA Abstract OLT.pdf (application/pdf)
SEBA FCAPS User Stories.pdf (application/pdf)
seba-att2.png (image/png)