ELK Stack
In order to use Elastic Stack the
logging
helm-chart needs to be installed. That is part of thecord-platform
helm-chart, but if you need to install it, please refer to this guide.
CORD uses ELK Stack for logging information at all levels. CORD’s ELK Stack logger collects information from several components, including the XOS Core, API, and various Synchronizers. Together with logs events and alarms are collected in ELK Stack.
On a running POD, ELK can be accessed at http://<pod-ip>:30601
.
For most purposes, the logs in ELK Stack should contain enough information to diagnose problems. Furthermore, these logs thread together facts across multiple components by using the identifiers of XOS data model objects.
Important!
To start using Kibana, you must create an index under Management > Index
Patterns. Create one with a name of logstash-*
, then you can search for
events in the Discover section.
Examples Query
More information about using Kibana to access ELK Stack logs is available elsewhere, but to illustrate how the logging system is used in CORD, consider the following example quieries.
XOS Related queries
The first example query enlists log messages in the implementation of a particular service synchronizer, in a given time range:
+synchronizer_name:rcord-synchronizer AND +@timestamp:[now-1h TO now]
A second query gets log messages that are linked to the Network data model across all services:
+model_name: RCordSubscriber
The same query can be refined to include the identifier of the specific Network object in question. You can obtain the object id from the object’s page in the XOS GUI.
+model_name: RCordSubscriber AND +pk:68
A final example lists log messages in a service synchronizer that contain Python exceptions, and will usually correspond to anomalous execution:
+synchronizer_name:rcord-synchronizer AND +exception
REST APIs based queries
The first thing you need to do to use the REST APIs is to find the port on which the service is exposed. At the moment the official chart won't let us specify a port, so that will change for every deployment.
To find the correct port, run this command anywhere you have the kubectl
tool installed:
export ELK_PORT=$(kubectl get svc logging-elasticsearch-client -o json | jq .spec.ports[0].nodePort)
You can then query the REST API on that port, for example:
curl -XGET "http://localhost:$ELK_PORT"
{
"name" : "logging-elasticsearch-client-587599fbdc-bkfhn",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "Nwc4HpeBQrOOcL4IWVyRAw",
"version" : {
"number" : "6.5.4",
"build_flavor" : "oss",
"build_type" : "tar",
"build_hash" : "d2ef93d",
"build_date" : "2018-12-17T21:17:40.758843Z",
"build_snapshot" : false,
"lucene_version" : "7.5.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
Following are some SEBA related querying example. This examples are executed from within the cluster,
if you want to run them from a client machine replace localhost
with the cluster-ip
.
Get current authentication status for a particular ONU
curl -XGET "http://localhost:$ELK_PORT/_search" -H 'Content-Type: application/json' -d'
{
"size": 1,
"sort": {
"timestamp": "desc"
},
"query": {
"bool": {
"must": [
{
"match": {
"serialNumber": "PSMO12345678"
}
}
],
"filter": {
"term": {
"kafka_topic": "authentication.events"
}
}
}
}
}' | jq .hits.hits[0]
Example response:
{
"_index": "logstash-2019.05.30",
"_type": "doc",
"_id": "Kw_bCWsBPqzdKVIdSbxC",
"_score": null,
"_source": {
"authenticationState": "APPROVED",
"deviceId": "of:0000aabbccddeeff",
"@version": "1",
"portNumber": "128",
"kafka_topic": "authentication.events",
"@timestamp": "2019-05-30T17:48:14.343Z",
"timestamp": "2019-05-30T17:48:14.308Z",
"kafka_key": "%{[@metadata][kafka][key]}",
"serialNumber": "PSMO12345678",
"type": "cord-kafka",
"kafka_timestamp": "1559238494311"
},
"sort": [
1559238494308
]
}
Get all the events regarding a particular ONU
curl -XGET "http://localhost:$ELK_PORT/_search" -H 'Content-Type: application/json' -d'
{
"sort": {
"timestamp": "desc"
},
"query": {
"bool": {
"must": [
{
"match": {
"serialNumber": "PSMO12345678"
}
}
]
}
}
}' | jq .hits.hits
Get all the events regarding a particular action
If you want to list all the authentication events, regardless fo the ONU serial number:
curl -XGET "http://localhost:$ELK_PORT/_search" -H 'Content-Type: application/json' -d'
{
"sort": {
"timestamp": "desc"
},
"query": {
"bool": {
"filter": {
"term": {
"kafka_topic": "authentication.events"
}
}
}
}
}
' | jq .hits.hits
Get Operational status of the RADIUS server
Operational status can be queried from Elastic search using Rest API calls. Three possible values for operational status are: 1) In Use 2) Unknown 3) Unavailable
curl -XGET "http://localhost:$ELK_PORT/_search" -H 'Content-Type: application/json' -d'
{
"query": {
"bool": {
"filter": {
"term": {
"kafka_topic": "radiusoperationalstatus.events"
}
}
}
}
}' | jq .hits.hits[0]._source.radiusoperationalstatus
Example Response:
"In Use"