Introduction
The XOS Monitoring service provides a generic platform for real-time network observability in SDN-based fixed and mobile networks in a Telco Central Office, such that any third-party analytics solution can be brought into this framework.
The platform is being developed as part of CORD (Central Office Re-architected as a Datacenter) and focuses on the fixed/residential network environment in its initial phases. The main goals of this platform are:
A generic “Analytics” platform inside XOS infrastructure framework
MUST be scalable and support multi-tenancy
MUST be possible to instrument (control probing level on) services in addition to compute and network devices
MUST be possible to adjust the level of probing in the underlying devices
MUST be possible to aggregate probing information
MUST be possible to redirect data streams through a "probe VM" for deeper level of instrumentation that is not otherwise available from underlying devices
As shown in Figure 1, the Monitoring service fetches probe information from different CORD network elements, including compute servers, white-box switches, I/O devices, and software-based services running on the compute servers, and makes it available to one or more analytics applications running on top of CORD. Initially, this platform is being integrated with CORD residential services such as vSG (Virtualized Subscriber Gateway), vOLT (Virtualized OLT), and vRouter (Virtualized Router), although architecturally it can be extended to other services such as Mobility and Enterprise services.
Figure 1. High-level organization of the XOS Monitoring Service.
Architecture
The Monitoring service leverages the open source OpenStack Ceilometer framework (http://docs.openstack.org/developer/ceilometer/) and enhances it with a few key features, as depicted in Figure 2.
Figure 2. Openstack Ceilometer with A-CORD enhancements.
As shown in Figure 3, the service provides both Notification-based and Polling-based measurement collection mechanisms on its southbound side (interfacing with the underlying services and devices). On the northbound side, it provides Query-based and Publish/Subscribe-based interfaces towards analytics applications. These applications correlate the events coming from different network sources into more meaningful events for performing real-time closed-loop feedback operations, and may also feed those events back into the Monitoring framework so that other applications can use them in their own analysis.
In addition, the Monitoring service provides a platform to dynamically control the probe information available in the underlying services and devices. As part of this functionality, it also gives analytics applications the flexibility to instantiate “application-specific” probe functions in the network and, potentially, to mirror selected traffic to those functions. This is done to perform deeper levels of instrumentation that would otherwise not be available from the underlying devices. The probe functions can feed the probe data directly to applications or feed it back into the Monitoring framework so that other applications can make use of those events.
Figure 3. Internal implementation based on OpenStack’s Ceilometer.
In keeping with the XOS core principle of Everything-as-a-Service (XaaS), the Monitoring framework is imported into XOS as an elastically scalable, multi-tenant service. It does this by taking advantage of the XOS service construction toolkit (e.g., Data Model, Synchronizer). Each of the components can then be scaled up and down based on workload. Multi-tenancy, as shown in Figure 4, is achieved by creating a lightweight proxy container for every tenant of the Monitoring service, such that each tenant can access only the instrumentation data of the network resources belonging to that tenant. Typically, the tenants are the analytics applications running on top of this service.
Figure 4. Monitoring service integrated into XOS.
The following sections detail the usage of the XOS Monitoring service and briefly touch on its internals.
Setting Up the Service
Refer to Build A-CORD components for setting up Monitoring service.
Creation of Service Tenants
Tenants for the Monitoring service can be created using either the GUI or the REST API (REST URL: http://<xos endpoint>/xoslib/monitoringchannel/).
Example REST APIs:
Create Tenant: curl -X POST -v -G -H "Accept: application/json; indent=4" -u padmin@vicci.org:pass http://<xos endpoint>/api/tenant/monitoring/monitoringchannel/
Query Tenants: curl -v -G -H "Accept: application/json; indent=4" -u padmin@vicci.org:pass http://<xos endpoint>/api/tenant/monitoring/monitoringchannel/
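The two curl calls above can also be issued from an application. The following is a minimal sketch using only the Python standard library; it builds the requests without sending them. The helper name `build_channel_request` and the host `xos.example.org` are placeholders introduced for illustration, while the path, header, and credentials are copied from the curl examples.

```python
import base64
import urllib.request

# Path copied from the curl examples; the endpoint host is supplied by the caller.
CHANNEL_PATH = "/api/tenant/monitoring/monitoringchannel/"


def build_channel_request(xos_endpoint, username, password, method="GET"):
    """Build (but do not send) a request against the monitoring channel API.

    Hypothetical helper: it mirrors the curl flags used above (HTTP basic
    authentication and a JSON Accept header).
    """
    url = "http://" + xos_endpoint.rstrip("/") + CHANNEL_PATH
    credentials = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    headers = {
        "Accept": "application/json; indent=4",
        "Authorization": "Basic " + token,
    }
    return urllib.request.Request(url, headers=headers, method=method)


# "xos.example.org" stands in for <xos endpoint>.
create_req = build_channel_request("xos.example.org", "padmin@vicci.org", "pass", "POST")
query_req = build_channel_request("xos.example.org", "padmin@vicci.org", "pass")
```

Passing either object to `urllib.request.urlopen` would perform the call; that step is left out here so the sketch does not depend on a reachable XOS instance.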
When a Monitoring service tenant is created, the backend synchronizer picks an available OpenStack resource, launches a Docker container for the tenant, and pushes the list of access controls to be applied for that tenant.
The tenant container hosts a simple proxy web server that relays Ceilometer queries between the tenant application and the backend OpenStack Ceilometer module after applying the configured access controls.
The URL of the proxy web server, along with the list of authorized OpenStack project IDs, is returned to the tenant as part of the Monitoring service tenant creation (the tenant application should poll for this information using the GET method). Example Ceilometer tenant info is shown below:
{
    "humanReadableName": "ceilometer-tenant-8",
    "id": 8,
    "provider_service": 4,
    "service_specific_attribute": "{\"instance_id\": 4, \"creator_id\": 1}",
    "ceilometer_url": "http://10.11.10.20:8888/",
    "tenant_list_str": "f28d2e522e1f466a95194c10869acd0c, d41cdd2ade394e599b40b9b50d9cd623",
    "creator": "padmin@vicci.org",
    "instance": "mysite_ceilometer-4",
    "computeNodeName": "cp-1.xos-ceilometer.xos-pg0.clemson.cloudlab.us"
}
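A tenant application typically needs only two fields from this response: the proxy URL and the list of authorized project IDs. The sketch below shows one way to extract them. The function name is hypothetical, and treating a missing or empty "ceilometer_url" as "not ready yet, poll again" is an assumption rather than documented behavior.

```python
import json


def parse_tenant_info(raw):
    """Return (proxy_url, project_ids), or None if no proxy URL is present.

    Assumption: a missing or empty "ceilometer_url" means the tenant
    container is not ready and the caller should poll again.
    """
    info = json.loads(raw)
    url = info.get("ceilometer_url")
    if not url:
        return None
    # "tenant_list_str" is a comma-separated string of project IDs.
    project_ids = [
        item.strip()
        for item in info.get("tenant_list_str", "").split(",")
        if item.strip()
    ]
    return url.rstrip("/"), project_ids


raw_response = json.dumps({
    "id": 8,
    "ceilometer_url": "http://10.11.10.20:8888/",
    "tenant_list_str": "f28d2e522e1f466a95194c10869acd0c, d41cdd2ade394e599b40b9b50d9cd623",
})
base_url, project_ids = parse_tenant_info(raw_response)
meters_url = base_url + "/v2/meters"
```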
The tenant application should use the "ceilometer_url" specified in the return value to access the Ceilometer service. Below are a few example Ceilometer service queries:
Example ceilometer queries
GET all tenant meters: curl http://10.11.10.20:8888/v2/meters
GET all tenant resources: curl http://10.11.10.20:8888/v2/resources
GET statistics for a given meter type: curl http://10.11.10.20:8888/v2/meters/vcpe/statistics
GET samples for a given meter type: curl 'http://10.11.10.20:8888/v2/samples?q.field=meter&q.op=eq&q.value=vcpe'
GET statistics for a given meter type based on a query: curl 'http://10.11.10.20:8888/v2/meters/vcpe/statistics?q.field=timestamp&q.field=timestamp&q.op=ge&q.op=le&q.type=&q.type=&q.value=2015-10-19+01%3A08%3A11.478998%2B00%3A00&q.value=2015-10-20+01%3A08%3A11.478998%2B00%3A00&period=86400'
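The last query is hard to write by hand because of the URL escaping. As a sketch, the same URL can be assembled with the standard library; the helper name `statistics_url` is hypothetical, and the parameter order simply mirrors the curl example above.

```python
from urllib.parse import urlencode


def statistics_url(base_url, meter, start, end, period=None):
    """Build a statistics query bounded by a start and end timestamp."""
    # Repeated keys are emitted in the same order as the curl example:
    # all fields, then all operators, then types, then values.
    params = [
        ("q.field", "timestamp"),
        ("q.field", "timestamp"),
        ("q.op", "ge"),
        ("q.op", "le"),
        ("q.type", ""),
        ("q.type", ""),
        ("q.value", start),
        ("q.value", end),
    ]
    if period is not None:
        params.append(("period", str(period)))
    return "{}/v2/meters/{}/statistics?{}".format(
        base_url.rstrip("/"), meter, urlencode(params)
    )


url = statistics_url(
    "http://10.11.10.20:8888/",
    "vcpe",
    "2015-10-19 01:08:11.478998+00:00",
    "2015-10-20 01:08:11.478998+00:00",
    period=86400,
)
```

`urlencode` escapes the spaces, colons, and plus signs in the timestamps, so callers can pass them in their readable form.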
Adding New Meters
Ceilometer supports two mechanisms for collecting statistics data from different components, as shown in Figure 4 below:
Push Mechanism
Preferred approach
Data is pushed from source to ceilometer through AMQP (RabbitMQ) notification bus
Pull Mechanism
Less preferred way
Ceilometer will poll for the data by invoking the component’s APIs
Polling is done in a configured interval
Figure 4. Openstack Ceilometer data collection mechanism.
The following section describes the mechanisms used to integrate meters from the CORD “vcpe” service into ceilometer. (The “vcpe” service is now called vSG.) The same procedure can be reused for other services as well.
The Push mechanism is used to integrate meters from the CORD “vcpe” service into ceilometer. A high-level workflow and the set of components involved in the collection of “vcpe” measurements are shown below:
Figure 5. Components involved in vcpe data collection.
As shown above, adding new meters to the ceilometer framework using the Push mechanism requires the following components:
RabbitMQ broker: The default messaging system used by ceilometer to collect the data.
Exchange: Exchanges are RabbitMQ entities where messages are sent. Exchanges take a message and route it into zero or more queues. Ceilometer notification framework uses “topic” type exchanges where the routing_key embedded in the message determines the target queue.
Topic/Queue: A queue that is bound to an Exchange with a routing_key
Notification Listener: Listens to RabbitMQ Notification bus on a specified exchange for any events/meters and grabs messages off the bus if the event types are of interest to this notification listener. The messages are handed over to corresponding Endpoints for further processing.
Endpoints/Meter handlers: Endpoints specify the event types they’re interested in and a callback for processing messages accordingly. The Notification listeners filter the incoming messages based on their event type value before being passed to the specified Endpoint callback so the Endpoints only receive events they have expressed an interest in seeing.
Notifier: Sends the event/meter info to Ceilometer Notification bus bound to a specific Exchange.
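How these components interact can be illustrated with a small in-memory model that involves no real broker. The sketch below imitates topic-exchange routing (where `*` matches exactly one word and `#` matches zero or more words) and a listener that keeps only the event types it registered for. All class and function names are hypothetical, and matching event types with regular expressions is an assumption suggested by patterns such as `vcpe$`.

```python
import re


def topic_matches(binding_key, routing_key):
    """AMQP-style topic match: '*' is one word, '#' is zero or more words."""
    def match(pattern, words):
        if not pattern:
            return not words
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # '#' may absorb any number of leading words, including none.
            return any(match(rest, words[i:]) for i in range(len(words) + 1))
        if not words:
            return False
        if head == "*" or head == words[0]:
            return match(rest, words[1:])
        return False

    return match(binding_key.split("."), routing_key.split("."))


class TopicExchange:
    """Routes each message into zero or more bound queues."""

    def __init__(self):
        self.bindings = []  # list of (binding_key, queue) pairs

    def bind(self, binding_key):
        queue = []
        self.bindings.append((binding_key, queue))
        return queue

    def publish(self, routing_key, message):
        for binding_key, queue in self.bindings:
            if topic_matches(binding_key, routing_key):
                queue.append(message)


class Listener:
    """Takes messages off a queue and keeps those with a matching event type."""

    def __init__(self, event_patterns):
        self.patterns = [re.compile(p) for p in event_patterns]
        self.handled = []

    def drain(self, queue):
        while queue:
            message = queue.pop(0)
            if any(p.match(message["event_type"]) for p in self.patterns):
                self.handled.append(message)


exchange = TopicExchange()
info_queue = exchange.bind("notifications.info")
error_queue = exchange.bind("notifications.error")

exchange.publish("notifications.info", {"event_type": "vcpe"})
exchange.publish("notifications.info", {"event_type": "vcpe.dns.cache.size"})
exchange.publish("notifications.debug", {"event_type": "vcpe"})  # no binding, dropped

listener = Listener([r"vcpe$"])
listener.drain(info_queue)
```

In this run the listener ends up with only the plain `vcpe` event: the DNS event reaches the same queue but does not match the listener's pattern, and the message published with an unbound routing key is never queued.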
Typical workflow to define new meters using Push mechanism:
Messages exchanged between the Notifier and the Notification listener use the following format:
event_type: ‘type of event’
message_id: ‘unique message id’
publisher_id: ‘unique notifier id’
timestamp: ‘timestamp of event’
priority: ‘rabbitmq queue priority, e.g., INFO, ERROR, etc.’
payload: ‘message specific data in dictionary format’
An example “vcpe” notification message is shown in the code snippet below:
RabbitMQ Event Format
msg = {'event_type': 'vcpe',
       'message_id': six.text_type(uuid.uuid4()),
       'publisher_id': 'vcpe_publisher_on_host1',
       'timestamp': datetime.datetime.now().isoformat(),
       'priority': 'INFO',
       'payload': {'vcpe_id': 'vcpe-382',
                   'user_id': '1e3ce043029547f1a61c1996d1a531a2',
                   'tenant_id': '3a397e70f64e4e40b69b6266c634d9d0'
                   }
       }
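Since every notification carries the same six top-level fields, a publisher may find it convenient to assemble and check messages in one place. The following sketch is one way to do that with the standard library only; `build_notification` and `validate_notification` are hypothetical helpers, not part of ceilometer or XOS.

```python
import datetime
import uuid

REQUIRED_FIELDS = ("event_type", "message_id", "publisher_id",
                   "timestamp", "priority", "payload")


def build_notification(event_type, publisher_id, payload, priority="INFO"):
    """Assemble a notification dictionary with the six fields listed above."""
    return {
        "event_type": event_type,
        "message_id": str(uuid.uuid4()),
        "publisher_id": publisher_id,
        "timestamp": datetime.datetime.now().isoformat(),
        "priority": priority,
        "payload": dict(payload),
    }


def validate_notification(msg):
    """Return a list of problems; an empty list means the message is well formed."""
    problems = ["missing field: " + name
                for name in REQUIRED_FIELDS if name not in msg]
    if "payload" in msg and not isinstance(msg["payload"], dict):
        problems.append("payload must be a dictionary")
    return problems


good = build_notification(
    "vcpe",
    "vcpe_publisher_on_host1",
    {"vcpe_id": "vcpe-382",
     "user_id": "1e3ce043029547f1a61c1996d1a531a2",
     "tenant_id": "3a397e70f64e4e40b69b6266c634d9d0"},
)
bad = {"event_type": "vcpe", "payload": "not-a-dict"}
```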
- After processing, the Notification listener converts the received notification messages into Samples of the following format. It is entirely up to the Notification listener which fields of the notification message are used to generate the Sample data.
name: 'meter name'
type: 'type of meter, e.g., gauge, cumulative, or delta'
unit: 'name of unit, e.g., MB, entries, etc.'
volume: 'measurable value'
resource_id: 'resource id'
project_id: 'project id'
The logic a “vcpe” notification listener uses to convert the notification message format into the ceilometer sample format is shown in the code snippet below:
Sample Format
{
    name=message['event_type'],
    type=sample.TYPE_GAUGE,
    unit='vcpe',
    volume=1,
    user_id=message['payload']['user_id'],
    project_id=message['payload']['tenant_id'],
    resource_id=message['payload']['vcpe_id'],
    message=message
}
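The same mapping can be tried out without a ceilometer installation. The sketch below reproduces it with plain dictionaries; the function name is hypothetical, the string "gauge" stands in for `sample.TYPE_GAUGE`, and the unit "entries" used for the DNS cache meter is an assumption made for illustration.

```python
def notification_to_sample(message, unit="vcpe", volume=1):
    """Map a notification dictionary onto the sample fields listed above."""
    payload = message["payload"]
    return {
        "name": message["event_type"],
        "type": "gauge",  # stand-in for sample.TYPE_GAUGE
        "unit": unit,
        "volume": volume,
        "user_id": payload["user_id"],
        "project_id": payload["tenant_id"],
        "resource_id": payload["vcpe_id"],
    }


vcpe_msg = {
    "event_type": "vcpe",
    "payload": {"vcpe_id": "vcpe-382",
                "user_id": "1e3ce043029547f1a61c1996d1a531a2",
                "tenant_id": "3a397e70f64e4e40b69b6266c634d9d0"},
}
dns_msg = {
    "event_type": "vcpe.dns.cache.size",
    "payload": dict(vcpe_msg["payload"], cache_size=150.0),
}

vcpe_sample = notification_to_sample(vcpe_msg)
# For a measured quantity, the volume is taken from the payload instead of
# being the constant 1 used for the plain "vcpe" meter.
dns_sample = notification_to_sample(
    dns_msg, unit="entries", volume=dns_msg["payload"]["cache_size"]
)
```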
Define Ceilometer Notification Listener and Endpoints for the service.
- NOTE: The OpenStack “Liberty” release introduces a new approach that avoids writing a notification handler for every newly added meter; instead, new meters can be added to the framework simply by defining them in a separate configuration file, “ceilometer/meter/data/meters.yaml”. Until the Monitoring service is migrated completely to the OpenStack Liberty release, the procedures below apply.
Ceilometer defines an extensible notification framework: new notification listeners can be added simply by extending the existing interface and overriding a few of its methods. A code snippet for an example “vcpe” notification listener is shown below:
Ceilometer Notification Plugin
import oslo.messaging
from oslo_config import cfg

from ceilometer.agent import plugin_base
from ceilometer import sample

# Optional: the default ceilometer exchange can also be used here
OPTS = [
    cfg.StrOpt('vcpeservice_control_exchange',
               default='vcpeservice',
               help="Exchange name for VCPE notifications."),
]


class VCPENotification(plugin_base.NotificationBase):

    resource_name = None
    event_types = ['vcpe$']

    @staticmethod
    def get_targets(conf):
        """Return a sequence of oslo.messaging.Target

        This sequence is defining the exchange and topics to be connected
        for this plugin.
        """
        return [oslo.messaging.Target(topic=topic,
                                      exchange=conf.vcpeservice_control_exchange)
                for topic in conf.notification_topics]

    def process_notification(self, message):
        yield sample.Sample.from_notification(
            name='vcpe',
            type=sample.TYPE_GAUGE,
            unit='vcpe',
            volume=1,
            user_id=message['payload']['user_id'],
            project_id=message['payload']['tenant_id'],
            resource_id=message['payload']['vcpe_id'],
            message=message)
- As shown above, the notification listener should extend the Ceilometer “NotificationBase” class and override the “get_targets” and “process_notification” methods.
Ceilometer Notification framework loads one or more listener plugins, using the namespace “ceilometer.notification” defined in entry_points.txt file. So all the defined notification listeners for a given service should be added to this file under the “ceilometer.notification” namespace. Sample code snippet for “vcpe” notification listeners is given below:
entry_points.txt
[ceilometer.notification]
. . .
vcpe = ceilometer.network.ext_services.vcpe.notifications:VCPENotification
vcpe.dns.cache.size = ceilometer.network.ext_services.vcpe.notifications:VCPEDNSCacheSize
vcpe.dns.total_instered_entries = ceilometer.network.ext_services.vcpe.notifications:VCPEDNSTotalInsertedEntries
vcpe.dns.replaced_unexpired_entries = ceilometer.network.ext_services.vcpe.notifications:VCPEDNSReplacedUnexpiredEntries
vcpe.dns.queries_answered_locally = ceilometer.network.ext_services.vcpe.notifications:VCPEDNSQueriesAnsweredLocally
vcpe.dns.queries_forwarded = ceilometer.network.ext_services.vcpe.notifications:VCPEDNSQueriesForwarded
vcpe.dns.server.queries_sent = ceilometer.network.ext_services.vcpe.notifications:VCPEDNSServerQueriesSent
vcpe.dns.server.queries_failed = ceilometer.network.ext_services.vcpe.notifications:VCPEDNSServerQueriesFailed
- Ensure that the “ceilometer-notification-agent” daemon is restarted after making any of the above changes.
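Before restarting the agent, it can help to confirm that the new entries are well formed. Because entry_points.txt uses an INI-like syntax, a quick check is possible with the standard library alone. The sketch below is a hypothetical helper, not a ceilometer tool; it parses an abbreviated copy of the file and lists each plugin name with the module and class it points to.

```python
import configparser

# Abbreviated copy of the section shown above, used as sample input.
ENTRY_POINTS = """\
[ceilometer.notification]
vcpe = ceilometer.network.ext_services.vcpe.notifications:VCPENotification
vcpe.dns.cache.size = ceilometer.network.ext_services.vcpe.notifications:VCPEDNSCacheSize
"""


def list_notification_plugins(text, namespace="ceilometer.notification"):
    """Return {entry point name: (module path, class name)} for one namespace."""
    # Only '=' is treated as a delimiter because values contain ':'.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep entry point names exactly as written
    parser.read_string(text)
    plugins = {}
    for name, target in parser.items(namespace):
        module, _, attr = target.partition(":")
        plugins[name] = (module.strip(), attr.strip())
    return plugins


plugins = list_notification_plugins(ENTRY_POINTS)
```

An entry whose class name comes back empty is missing the `module:Class` separator and would not load as a plugin.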
Define the Notifier that sends data to Ceilometer notification bus.
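Data Publisher
import datetime
import uuid

import six
from kombu import Exchange, Producer
from kombu.connection import BrokerConnection

# rabbitmq_host, rabbitmq_user, rabbitmq_password and cpe_publisher_id are
# assumed to be defined elsewhere in the module (for example, read from a
# configuration file).
producer = None


def setup_rabbit_mq_channel():
    # Make the producer visible to publish_vcpe_stats().
    global producer
    # The exchange name must match the exchange the notification listener is
    # bound to ('vcpeservice' is the default of vcpeservice_control_exchange).
    vcpeservice_exchange = Exchange("vcpeservice", "topic", durable=False)
    connection = BrokerConnection(rabbitmq_host, rabbitmq_user, rabbitmq_password)
    channel = connection.channel()
    producer = Producer(channel, exchange=vcpeservice_exchange,
                        routing_key='notifications.info')


def publish_vcpe_stats():
    msg = {'event_type': 'vcpe',
           'message_id': six.text_type(uuid.uuid4()),
           'publisher_id': cpe_publisher_id,
           'timestamp': datetime.datetime.now().isoformat(),
           'priority': 'INFO',
           'payload': {'vcpe_id': 'vcpe-382',
                       'user_id': '1e3ce043029547f1a61c1996d1a531a2',
                       'tenant_id': '3a397e70f64e4e40b69b6266c634d9d0'
                       }
           }
    producer.publish(msg)
    msg = {'event_type': 'vcpe.dns.cache.size',
           'message_id': six.text_type(uuid.uuid4()),
           'publisher_id': cpe_publisher_id,
           'timestamp': datetime.datetime.now().isoformat(),
           'priority': 'INFO',
           'payload': {'vcpe_id': 'vcpe-382',
                       'user_id': '1e3ce043029547f1a61c1996d1a531a2',
                       'tenant_id': '3a397e70f64e4e40b69b6266c634d9d0',
                       'cache_size': 150.0
                       }
           }
    producer.publish(msg)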
Software Components
The XOS Monitoring service has the following components:
Django data models that define the authoritative state for the service
Front-end APIs for users to interact with the service and update the model’s state
A Synchronizer that keeps the backend system that implements the service in sync with the model state
Ceilometer service tenant instances and Openstack Ceilometer components running inside the XOS slice associated with the service
Figure 6. Software components of XOS Monitoring service.
Models
Two Django models are defined for the Ceilometer service: CeilometerService and MonitoringChannel (also known as CeilometerTenant). The relationship of these two models to the XOS core models is shown below.
Figure 7. Ceilometer service data model.
The MonitoringChannel model inherits from the core XOS data model TenantWithContainer and holds the list of authorized tenants along with the Ceilometer proxy URL for a given tenant.
Synchronizer
The role of the Ceilometer synchronizer is to monitor the associated data model for changes and ensure that the state of the backend system is up to date with the latest data model changes. The synchronizer uses an Ansible playbook to apply data model changes to the backend.
When a new Ceilometer tenant data model is created or updated, the synchronizer performs the following operations:
The synchronizer finds a VM that is designated to host Ceilometer tenant Docker containers. If it does not find any such VM, or the existing VMs are fully utilized, the synchronizer instantiates a new VM and installs any necessary software (python, docker, pipework, etc.) on that VM.
The synchronizer launches a new Docker container for the tenant on the VM and sets up port forwarding from a VM port to Docker container port 8000. It also writes the tenant-related authorization information (such as the authorized tenant list) into the tenant Docker container.
The Docker image is pre-loaded with a web server that runs on port 8000.
The web server is a Ceilometer proxy server that waits for Ceilometer queries from the tenant applications and forwards them to the OpenStack Ceilometer service after applying the configured access controls.
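This document does not spell out how the proxy applies access controls. Purely as an illustration, one plausible rule is to restrict every query to the project IDs written into the container: a query with no project filter is narrowed to all authorized projects, and a query naming specific projects is narrowed to the authorized subset. Every name in the sketch below is hypothetical.

```python
class AccessDenied(Exception):
    """Raised when a query targets only projects the tenant may not see."""


def restrict_projects(requested, authorized):
    """Return the project IDs a query may be forwarded with.

    Assumed rule (not taken from the proxy implementation):
    - no explicit project filter: use every authorized project;
    - explicit filter: keep only the authorized projects;
    - nothing left after filtering: reject the query.
    """
    allowed = list(authorized)
    if not requested:
        return allowed
    permitted = [project for project in requested if project in allowed]
    if not permitted:
        raise AccessDenied("no authorized project in query")
    return permitted


AUTHORIZED = ["f28d2e522e1f466a95194c10869acd0c",
              "d41cdd2ade394e599b40b9b50d9cd623"]

unfiltered = restrict_projects([], AUTHORIZED)
narrowed = restrict_projects(
    ["d41cdd2ade394e599b40b9b50d9cd623", "someone-elses-project"], AUTHORIZED
)
try:
    restrict_projects(["someone-elses-project"], AUTHORIZED)
    denied = False
except AccessDenied:
    denied = True
```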
Front-end APIs
Like any other XOS service, the Ceilometer service provides three front-end mechanisms for interacting with it:
RESTful interface
Admin GUI interface
TOSCA Resource Templates
Summary
The Monitoring service runs in the CORD infrastructure, providing real-time instrumentation data from different elements. This enables network operators to run one or more analytics applications that monitor and react to various aspects of the network, including anomaly detection and mitigation, congestion avoidance in the fabric, customer care troubleshooting/diagnosis, and so on.