CORD : CORD Monitoring Service


Introduction

The XOS Monitoring service provides a generic platform to support real time network observability for SDN fixed and mobile networks in a Telco Central Office, such that any 3rd party Analytic solution can be brought into this framework.

The platform is being developed as part of CORD (Central Office Re-architected as a Datacenter) and focuses on the fixed/residential network environment in its initial phases. The main goals of this platform are:

  • A generic “Analytics” platform inside XOS infrastructure framework

  • MUST be scalable and support multi-tenancy

  • MUST be possible to instrument (control probing level on) services in addition to compute and network devices

  • MUST be possible to adjust the level of probing in the underlying devices

  • MUST be possible to aggregate probing information

  • MUST be possible to redirect data streams through a "probe VM" for deeper level of instrumentation that is not otherwise available from underlying devices

As shown in Figure 1, the Monitoring service fetches probe information from different CORD network elements, including compute servers, white-box switches, I/O devices, and software-based services running on the compute servers, and makes it available to one or more Analytics applications running on top of CORD. Initially, this platform is being integrated with CORD residential services such as vSG (Virtualized Subscriber Gateway), vOLT (Virtualized OLT), and vRouter (Virtualized Router), although architecturally it is extensible to other services such as Mobility and Enterprise services.

Figure 1. High-level organization of the XOS Monitoring Service.

Architecture

The Monitoring service leverages the open source OpenStack Ceilometer framework (http://docs.openstack.org/developer/ceilometer/) and enhances it with a few key features, as depicted in Figure 2.

Figure 2. OpenStack Ceilometer with A-CORD enhancements.

As shown in Figure 3, the service provides both Notification-based and Polling-based measurement collection mechanisms on its southbound side (interfacing to the underlying services and devices), and provides Query-based and Publish/Subscribe-based interfaces towards analytics applications on its northbound side. The analytics applications correlate events coming from different network sources into more meaningful events for real-time closed-loop feedback operations, and may also feed those events back into the Monitoring framework so that other applications can make use of them in their analysis.

In addition, the Monitoring service provides a platform to dynamically control the probe information available in the underlying services and devices. As part of this functionality, it also gives analytics applications the flexibility to instantiate “application-specific” probe functions in the network and, potentially, to mirror selected traffic to those functions. This is done to perform deeper levels of instrumentation that would otherwise not be available from the underlying devices. The probe functions can feed the probe data directly to applications or feed it back into the Monitoring framework, so that other applications can make use of those events.

Figure 3. Internal implementation based on OpenStack’s Ceilometer.

In line with the XOS core principle of Everything-as-a-Service (XaaS), the Monitoring framework is imported into XOS as an elastically scalable, multi-tenant service. It does this by taking advantage of the XOS service construction toolkit (e.g., Data Model, Synchronizer). Each of the components can then be scaled up and down based on workload. Multi-tenancy, as shown in Figure 4, is achieved by creating a lightweight proxy container for every tenant of the Monitoring service, such that each tenant can access only the instrumentation data of the network resources belonging to that tenant. Typically, the tenants are the analytics applications running on top of this service.

Figure 4. Monitoring service integrated into XOS.

The following sections describe how to use the XOS Monitoring service and briefly touch on its internal details.

Setting Up the Service

Refer to “Build A-CORD components” for instructions on setting up the Monitoring service.

Creation of Service Tenants

Tenants of the Monitoring service can be created using either the GUI or the REST API (REST URL: http://<xos endpoint>/xoslib/monitoringchannel/).

Example REST APIs:

Create Tenant: curl -X POST -v -G -H "Accept: application/json; indent=4" -u padmin@vicci.org:pass http://<xos endpoint>/api/tenant/monitoring/monitoringchannel/

Query Tenants: curl -v -G -H "Accept: application/json; indent=4" -u padmin@vicci.org:pass http://<xos endpoint>/api/tenant/monitoring/monitoringchannel/
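The same two calls can be prepared from a tenant application. The following is a minimal Python sketch, assuming the endpoint path and HTTP Basic credentials used in the curl examples above; the function name build_channel_request and the host "xos.example.org:9999" are hypothetical, and the request is only constructed here, not sent.

```python
import base64
import urllib.request

# Path taken from the curl examples above.
CHANNEL_PATH = "/api/tenant/monitoring/monitoringchannel/"


def build_channel_request(xos_endpoint, user, password, method="GET"):
    """Build (but do not send) a request against the monitoring channel API.

    method="POST" mirrors "Create Tenant"; method="GET" mirrors "Query Tenants".
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"http://{xos_endpoint}{CHANNEL_PATH}",
        method=method,
        headers={
            "Accept": "application/json; indent=4",
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical endpoint; pass the request to urllib.request.urlopen() to send it.
create_request = build_channel_request(
    "xos.example.org:9999", "padmin@vicci.org", "pass", method="POST"
)
```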


When a Monitoring service tenant is created, the backend synchronizer picks an available OpenStack resource, launches a Docker container for this tenant, and pushes the list of access controls to be applied for this tenant.

The tenant container hosts a simple proxy web server that relays Ceilometer queries between the tenant application and the backend OpenStack Ceilometer module after applying the configured access controls.

The URL of the proxy web server, along with the list of authorized OpenStack project IDs, is returned to the tenant as part of the Monitoring service tenant creation (the tenant application should poll for this information using the GET method). Example ceilometer tenant info is shown below:

Monitoring Channel Info
{
    "humanReadableName": "ceilometer-tenant-8",
    "id": 8,
    "provider_service": 4,
    "service_specific_attribute": "{\"instance_id\": 4, \"creator_id\": 1}",
    "ceilometer_url": "http://10.11.10.20:8888/",
    "tenant_list_str": "f28d2e522e1f466a95194c10869acd0c, d41cdd2ade394e599b40b9b50d9cd623",
    "creator": "padmin@vicci.org",
    "instance": "mysite_ceilometer-4",
    "computeNodeName": "cp-1.xos-ceilometer.xos-pg0.clemson.cloudlab.us"
} 
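A tenant application has to wait until "ceilometer_url" is populated and then split "tenant_list_str" into individual project IDs. The following is a minimal sketch of that polling step; the function names, the retry count, and the assumption that the URL is empty or null until the container is ready are all hypothetical. The fetch argument is any callable that returns the JSON text of the channel info.

```python
import json
import time


def parse_channel_info(body):
    """Extract the proxy URL and the authorized project IDs from channel info JSON."""
    info = json.loads(body)
    url = info.get("ceilometer_url")
    raw_tenants = info.get("tenant_list_str") or ""
    tenants = [t.strip() for t in raw_tenants.split(",") if t.strip()]
    return url, tenants


def wait_for_channel(fetch, attempts=10, delay=1.0):
    """Poll fetch() until the channel info carries a proxy URL (assumed readiness signal)."""
    for _ in range(attempts):
        url, tenants = parse_channel_info(fetch())
        if url:
            return url, tenants
        time.sleep(delay)
    raise TimeoutError("monitoring channel was not ready in time")
```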

 

The tenant application should use the "ceilometer_url" specified in the return value to access the ceilometer service. Below are a few example ceilometer service queries:
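As an illustration only, the sketch below builds query URLs in the style of the OpenStack Ceilometer v2 Web API (q.field / q.op / q.value filters). It assumes the tenant proxy accepts the same v2 paths as Ceilometer itself, which this page does not confirm; the helper name build_query_url is hypothetical.

```python
from urllib.parse import urlencode, urljoin


def build_query_url(ceilometer_url, path, filters=()):
    """Build a Ceilometer v2-style query URL.

    filters is a sequence of (field, operator, value) triples.
    """
    params = []
    for field, op, value in filters:
        params += [("q.field", field), ("q.op", op), ("q.value", value)]
    url = urljoin(ceilometer_url, path)
    return f"{url}?{urlencode(params)}" if params else url


base = "http://10.11.10.20:8888/"  # "ceilometer_url" from the channel info

# List the meters visible to this tenant.
meters_url = build_query_url(base, "v2/meters")

# Statistics for the "vcpe" meter, restricted to one authorized project.
stats_url = build_query_url(
    base,
    "v2/meters/vcpe/statistics",
    [("project_id", "eq", "f28d2e522e1f466a95194c10869acd0c")],
)
```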


Adding New Meters

Ceilometer supports two mechanisms for collecting statistics from different components, as shown in Figure 4 below:

  • Push Mechanism

    • Preferred approach

    • Data is pushed from source to ceilometer through AMQP (RabbitMQ) notification bus

  • Pull Mechanism

    • Less preferred way

    • Ceilometer will poll for the data by invoking the component’s APIs

    • Polling is done at a configured interval

Figure 4. Openstack Ceilometer data collection mechanism.

The following section describes the mechanisms used to integrate meters from the CORD “vcpe” service into ceilometer. (The “vcpe” service is now called vSG.) The same procedure can be reused for other services as well.

The Push mechanism is used to integrate meters from CORD “vcpe” services into ceilometer. A high-level workflow and the set of components involved in the collection of “vcpe” measurements are shown below:

Figure 5. Components involved in vcpe data collection.

As shown above, adding new meters to the ceilometer framework using Push mechanism needs the following components:

  • RabbitMQ broker: The default messaging system used by ceilometer to collect the data.  

    • Exchange: Exchanges are RabbitMQ entities where messages are sent. Exchanges take a message and route it into zero or more queues. Ceilometer notification framework uses “topic” type exchanges where the routing_key embedded in the message determines the target queue.

    • Topic/Queue: A queue that is bound to an Exchange with a routing_key

  • Notification Listener: Listens to RabbitMQ Notification bus on a specified exchange for any events/meters and grabs messages off the bus if the event types are of interest to this notification listener. The messages are handed over to corresponding Endpoints for further processing.   

  • Endpoints/Meter handlers: Endpoints specify the event types they are interested in and a callback for processing the matching messages. The Notification listeners filter the incoming messages based on their event type before passing them to the specified Endpoint callback, so Endpoints only receive events they have expressed an interest in.

  • Notifier: Sends the event/meter info to Ceilometer Notification bus bound to a specific Exchange.
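To make the “topic” exchange behaviour concrete, the following self-contained sketch reimplements AMQP topic matching in plain Python: a binding key is a dot-separated pattern in which "*" matches exactly one word and "#" matches zero or more words. It is an illustration of the routing rule only, not RabbitMQ or Ceilometer code, and the function name topic_matches is hypothetical.

```python
def topic_matches(binding_key, routing_key):
    """Return True if routing_key matches the AMQP topic pattern binding_key."""
    pattern = binding_key.split(".")
    words = routing_key.split(".")

    def match(p, w):
        if p == len(pattern):
            # Pattern exhausted: match only if all words were consumed too.
            return w == len(words)
        if pattern[p] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(p + 1, k) for k in range(w, len(words) + 1))
        if w == len(words):
            return False
        if pattern[p] == "*" or pattern[p] == words[w]:
            return match(p + 1, w + 1)
        return False

    return match(0, 0)


# A queue bound with "notifications.info" receives messages published with
# that routing key, but not those published with "notifications.error".
delivered = topic_matches("notifications.info", "notifications.info")
```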

Typical workflow to define new meters using Push mechanism:

  • Messages between the Notification listener and the Notifier use the following message format:

    • event_type: ‘type of event’

    • message_id: ‘unique message id’

    • publisher_id: ‘unique notifier id’

    • timestamp: ‘timestamp of event’

    • priority: ‘rabbitmq queue priority eg:INFO, ERROR...etc’

    • payload: ‘message specific data in dictionary format’

    • An example “vcpe” notification message is shown in the code snippet below:

      RabbitMQ Event Format
      import datetime
      import uuid

      import six

      msg = {'event_type': 'vcpe',
             'message_id': six.text_type(uuid.uuid4()),
             'publisher_id': 'vcpe_publisher_on_host1',
             'timestamp': datetime.datetime.now().isoformat(),
             'priority': 'INFO',
             'payload': {'vcpe_id': 'vcpe-382',
                         'user_id': '1e3ce043029547f1a61c1996d1a531a2',
                         'tenant_id': '3a397e70f64e4e40b69b6266c634d9d0'
                        }
            }
  • The Notification listener processes the received notification messages and converts them to Samples of the following format. It is entirely up to the Notification listener which fields of the notification message are used to generate the Sample data.
    • name: 'meter name'

    • type: 'type of meter eg: gauge, cumulative or delta'

    • unit: 'name of unit eg: MB, entries...etc'

    • volume: 'measurable value'

    • resource_id: 'resource id'

    • project_id: 'project id'

    • Sample code logic of the “vcpe” notification listener that converts the notification message to the ceilometer sample format is shown in the code snippet below:

      Sample Format
      sample.Sample.from_notification(
          name=message['event_type'],
          type=sample.TYPE_GAUGE,
          unit='vcpe',
          volume=1,
          user_id=message['payload']['user_id'],
          project_id=message['payload']['tenant_id'],
          resource_id=message['payload']['vcpe_id'],
          message=message)
  • Define Ceilometer Notification Listener and Endpoints for the service.

    • NOTE: The OpenStack “Liberty” release introduces a new approach that avoids writing a notification handler for every newly added meter. Instead, new meters can be added to the framework simply by defining them in a separate configuration file, “ceilometer/meter/data/meters.yaml”. Until the Monitoring Service is migrated completely to the OpenStack Liberty release, the procedures below apply.
    • Ceilometer defines an extensible notification framework in which new notification listeners can be added simply by extending the existing interface and overriding a few of its methods. A code snippet for an example “vcpe” notification listener is shown below:

      Ceilometer Notification Plugin
      import oslo.messaging
      from oslo_config import cfg
      from ceilometer.agent import plugin_base
      from ceilometer import sample
      
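      # Optional: the default ceilometer exchange can also be used here.
      OPTS = [
          cfg.StrOpt('vcpeservice_control_exchange',
                     default='vcpeservice',
                     help="Exchange name for VCPE notifications."),
      ]
      # Register the option so that conf.vcpeservice_control_exchange
      # can be resolved in get_targets().
      cfg.CONF.register_opts(OPTS)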
      
      class VCPENotification(plugin_base.NotificationBase):
          resource_name = None
          event_types = ['vcpe$']
      
          @staticmethod
          def get_targets(conf):
              """Return a sequence of oslo.messaging.Target
              This sequence is defining the exchange and topics to be connected for
              this plugin.
              """
              return [oslo.messaging.Target(topic=topic,
                                            exchange=conf.vcpeservice_control_exchange)
                      for topic in conf.notification_topics]
      
          def process_notification(self, message):
              yield sample.Sample.from_notification(
                  name='vcpe',
                  type=sample.TYPE_GAUGE,
                  unit='vcpe',
                  volume=1,
                  user_id=message['payload']['user_id'],
                  project_id=message['payload']['tenant_id'],
                  resource_id=message['payload']['vcpe_id'],
                  message=message)

  

    • As shown above, the notification listener should extend the ceilometer “NotificationBase” class and override the “get_targets” and “process_notification” methods.
    • The Ceilometer Notification framework loads one or more listener plugins using the namespace “ceilometer.notification” defined in the entry_points.txt file. All notification listeners defined for a given service should therefore be added to this file under the “ceilometer.notification” namespace. A sample snippet for the “vcpe” notification listeners is given below:

      entry_points.txt
      [ceilometer.notification]
      .
      .
      .
      vcpe= ceilometer.network.ext_services.vcpe.notifications:VCPENotification
      vcpe.dns.cache.size= ceilometer.network.ext_services.vcpe.notifications:VCPEDNSCacheSize
      vcpe.dns.total_instered_entries= ceilometer.network.ext_services.vcpe.notifications:VCPEDNSTotalInsertedEntries
      vcpe.dns.replaced_unexpired_entries= ceilometer.network.ext_services.vcpe.notifications:VCPEDNSReplacedUnexpiredEntries
      vcpe.dns.queries_answered_locally= ceilometer.network.ext_services.vcpe.notifications:VCPEDNSQueriesAnsweredLocally
      vcpe.dns.queries_forwarded= ceilometer.network.ext_services.vcpe.notifications:VCPEDNSQueriesForwarded
      vcpe.dns.server.queries_sent= ceilometer.network.ext_services.vcpe.notifications:VCPEDNSServerQueriesSent
      vcpe.dns.server.queries_failed= ceilometer.network.ext_services.vcpe.notifications:VCPEDNSServerQueriesFailed



    • Ensure “ceilometer-notification-agent” daemon is restarted after making any of the above changes
  • Define the Notifier that sends data to Ceilometer notification bus.

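    Data Publisher
    import datetime
    import uuid

    import six
    from kombu import Exchange, Producer
    from kombu.connection import BrokerConnection

    # rabbitmq_host, rabbitmq_user, rabbitmq_password and cpe_publisher_id
    # are assumed to be defined elsewhere (for example, read from configuration).

    def setup_rabbit_mq_channel():
        # The exchange name must match the exchange the notification listener
        # is bound to (vcpeservice_control_exchange, default 'vcpeservice').
        vcpeservice_exchange = Exchange("vcpeservice", "topic", durable=False)
        connection = BrokerConnection(rabbitmq_host, rabbitmq_user, rabbitmq_password)
        channel = connection.channel()
        # Return the producer so that the caller can publish with it.
        return Producer(channel, exchange=vcpeservice_exchange, routing_key='notifications.info')

    def publish_vcpe_stats(producer):
        # Basic "vcpe" meter.
        msg = {'event_type': 'vcpe',
               'message_id': six.text_type(uuid.uuid4()),
               'publisher_id': cpe_publisher_id,
               'timestamp': datetime.datetime.now().isoformat(),
               'priority': 'INFO',
               'payload': {'vcpe_id': 'vcpe-382',
                           'user_id': '1e3ce043029547f1a61c1996d1a531a2',
                           'tenant_id': '3a397e70f64e4e40b69b6266c634d9d0'
                          }
              }
        producer.publish(msg)
        # DNS cache size meter; the measured value travels in the payload.
        msg = {'event_type': 'vcpe.dns.cache.size',
               'message_id': six.text_type(uuid.uuid4()),
               'publisher_id': cpe_publisher_id,
               'timestamp': datetime.datetime.now().isoformat(),
               'priority': 'INFO',
               'payload': {'vcpe_id': 'vcpe-382',
                           'user_id': '1e3ce043029547f1a61c1996d1a531a2',
                           'tenant_id': '3a397e70f64e4e40b69b6266c634d9d0',
                           'cache_size': 150.0
                          }
              }
        producer.publish(msg)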
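
The listener side of this workflow can be tried out without RabbitMQ or Ceilometer installed. The sketch below is a simplified stand-in, not the Ceilometer implementation: it assumes event types are matched as regular expressions (as the 'vcpe$' pattern suggests), represents a Sample as a plain dictionary, and uses hypothetical function names.

```python
import re

# Same pattern as event_types in the "vcpe" notification plugin.
EVENT_TYPES = ["vcpe$"]


def is_interesting(event_type, patterns=EVENT_TYPES):
    """Return True if the event type matches one of the endpoint's patterns."""
    return any(re.match(pattern, event_type) for pattern in patterns)


def to_sample(message):
    """Map a notification message onto the sample fields described above."""
    payload = message["payload"]
    return {
        "name": message["event_type"],
        "type": "gauge",
        "unit": "vcpe",
        "volume": 1,
        "user_id": payload["user_id"],
        "project_id": payload["tenant_id"],
        "resource_id": payload["vcpe_id"],
    }


def process(messages):
    """Convert only the messages this endpoint has expressed an interest in."""
    return [to_sample(m) for m in messages if is_interesting(m["event_type"])]
```

In this simplified model, a message with event type 'vcpe.dns.cache.size' is skipped by this endpoint, because 'vcpe$' only matches the exact event type 'vcpe'; it would be handled by a separate endpoint with its own pattern.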


Software Components

The XOS Monitoring service has the following components:

  • Django data models that define the authoritative state for the service

  • Front-end APIs for users to interact with the service and update the model’s state

  • A Synchronizer that keeps the backend system that implements the service in sync with the model state

  • Ceilometer service tenant instances and Openstack Ceilometer components running inside the XOS slice associated with the service

Figure 6. Software components of XOS Monitoring service.

Models

There are two Django models defined for the Ceilometer service: CeilometerService and MonitoringChannel (also known as CeilometerTenant). The relationship of these two models to the XOS core models is shown below.

Figure 7. Ceilometer service data model.

The MonitoringChannel model inherits from the core XOS data model TenantWithContainer and holds the list of authorized tenants along with the ceilometer proxy URL for a given tenant.

Synchronizer

The role of the Ceilometer synchronizer is to monitor the associated data model for changes and ensure that the state of the backend system is up to date with the latest data model changes. The Ceilometer synchronizer uses an Ansible playbook to apply data model changes to the backend.

When a new Ceilometer tenant data model is created or updated, the synchronizer performs the following operations:

  • The synchronizer finds a VM that is designated to host ceilometer tenant Docker containers. If it does not find any VM, or the existing VMs are fully utilized, the synchronizer instantiates a new VM and installs any necessary software (Python, Docker, pipework, etc.) on that VM.

  • The synchronizer launches a new Docker container for the tenant on the VM and creates port forwarding from a VM port to Docker container port 8000. It also writes the tenant-related authorization information (such as the authorized tenant list) into the tenant Docker container.

    • The Docker image is pre-loaded with a web server that runs on port 8000

    • The web server is a ceilometer proxy server that waits for Ceilometer queries from the tenant applications and forwards them to the OpenStack ceilometer service after applying the configured access controls.
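The access-control step of the proxy can be pictured as follows. This is a hypothetical sketch, not the proxy's actual code: it assumes queries carry (field, operator, value) filters, that a query naming an unauthorized project is rejected, and that a query naming no project is fanned out to one backend query per authorized project.

```python
def apply_access_control(filters, authorized_tenants):
    """Return the list of filter sets to forward to the backend ceilometer service.

    Raises PermissionError if the query names a project the tenant may not see.
    """
    requested = [value for field, _op, value in filters if field == "project_id"]
    denied = [project for project in requested if project not in authorized_tenants]
    if denied:
        raise PermissionError("not authorized for project(s): " + ", ".join(denied))
    if requested:
        # The tenant already restricted the query to authorized projects.
        return [list(filters)]
    # No project filter given: restrict the query to each authorized project.
    return [
        list(filters) + [("project_id", "eq", tenant)]
        for tenant in authorized_tenants
    ]
```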

Front-end APIs

Like any other XOS service, Ceilometer provides three front-end mechanisms for interacting with the service:

  • RESTful interface

  • Admin GUI interface

  • TOSCA Resource Templates

Summary

The Monitoring service runs in the CORD infrastructure, providing real-time instrumentation data from different elements. This enables network operators to run one or more Analytics applications to monitor and react to various aspects of the network, including anomaly detection and mitigation, congestion avoidance in the fabric, customer care troubleshooting/diagnosis, and so on.
