ELK Stack: Elasticsearch, Logstash and Kibana

Operating systems and applications alike produce log and line-based data in great quantity. At present, this data is stored in separate log files, scattered across different machines and directories. As a result, retrieving a particular piece of information either requires thorough knowledge of where the logs are stored or results in a lengthy search.
The common approach to the problem of scattered log files is to use a centralized logging system, such as rsyslogd. However, this does not solve the problem that log data is often distributed over several files and may come in differing formats.

Since ERST Technology is not the only organisation facing these problems, a number of solutions, both commercial and open source, have been developed. This paper explores the characteristics of the ELK tool stack, a popular open source implementation.


ELK consists of three applications that handle input, storage and analysis of log data. The components are:
  • Elasticsearch
  • Logstash
  • Kibana
Each component can be used without the others, but the combination of the three is common and hence the name ELK.

A typical dataflow in this environment would look as follows:

Illustration 1: Typical ELK workflow




A number of servers produce log lines in an arbitrary format. Usually this is syslog-like, though JSON would reduce the complexity of further processing steps. These log lines are then sent to Logstash, which offers various ways to ingest them; the most interesting ones for ERST are TCP, syslog and reading from actual files.
When Logstash receives these lines, it applies filter plugins to extract the relevant data fields, such as timestamp and source. Filters can also be used to add, remove or alter data fields or tags. After the filters have been applied, Logstash passes the data on through output plugins, the most important being “elasticsearch”. The “elasticsearch” plugin produces the JSON format that is required by Elasticsearch.
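To make the filter step more concrete, the following is a minimal Python sketch of the kind of transformation a filter performs: a syslog-like line goes in, a structured JSON event comes out. It is only an illustration and not how Logstash is implemented. The regular expression, the field names `host`, `program` and `pid`, and the `_parsefailure` tag are assumptions made up for this example.

```python
import json
import re

# Hypothetical pattern for a classic syslog-style line such as
# "Mar  3 14:07:55 web01 sshd[4242]: Connection closed".
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<program>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)


def parse_line(line, tags=()):
    """Split one log line into named fields and attach optional tags."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        # Keep the raw line and mark it so unparsed events can be found later.
        # The tag name is made up for this sketch.
        return {"message": line, "tags": ["_parsefailure"]}
    # Drop optional groups that did not match (e.g. a missing pid).
    event = {key: value for key, value in match.groupdict().items() if value is not None}
    event["tags"] = list(tags)
    return event


if __name__ == "__main__":
    sample = "Mar  3 14:07:55 web01 sshd[4242]: Connection closed"
    # The JSON document printed here is the shape that would be stored.
    print(json.dumps(parse_line(sample, tags=["auth"]), indent=2))
```

Lines that do not match the pattern are kept and tagged rather than dropped, so they remain searchable after ingestion.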
Once fed into Elasticsearch, the data is indexed for fast retrieval. This speed is a consequence of Elasticsearch being built on top of the Lucene search library. Queries are expressed in Elasticsearch's own JSON-based query language and submitted via an HTTP API.
Finally, Kibana uses HTTP requests to this API to provide flexible web-based access to Elasticsearch. Searches can be narrowed down dynamically based on fields found in the log data, and individual search configurations can be stored.
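As a rough illustration of what "submitted via an HTTP API" means in practice, the sketch below builds a search request using only the Python standard library. It assumes an Elasticsearch node on `localhost:9200` (the default port) and indices matching `logstash-*` (the default naming of the Logstash output); both have to be adapted to the actual environment. The helper names `build_search_request` and `run_search` are made up for this example.

```python
import json
import urllib.request


def build_search_request(query, base_url="http://localhost:9200", index="logstash-*"):
    """Wrap a query in a search body and prepare the HTTP request.

    base_url and index are assumptions; adjust them to the real cluster.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(request):
    """Send the prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Full-text match on the message field; nothing is sent until run_search is called.
    request = build_search_request({"match": {"message": "connection error"}})
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Separating request construction from sending keeps the query easy to inspect and to reuse outside Kibana.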


Requirements

No formal installation of the ELK components is required. For a basic setup it is enough to download and unpack the current versions. However, some of the components have dependencies on other projects:
  • Elasticsearch
    • Java JRE
  • Logstash
    • Java JRE
    • Ruby
  • Kibana
    • Webserver (e.g. httpd)


Use Case: Reducing complexity for supporters

Objective:
ERST support wants to provide information concerning a customer's request (e.g. a certain file was not transmitted).
Preconditions:
A full ELK stack is in place and is fed by the logs of all servers concerned. Logstash splits the log data into sufficiently meaningful fields.
Steps:
  1. A customer files a ticket with support, claiming that a certain file was not received on their side yesterday.
  2. ERST support performs a search in Kibana, restricting results to the last 48 hours and to the filename given in the ticket.
  3. The system returns a handful of log entries that show how the file in question was processed.
  4. ERST support answers the ticket with evidence of what really happened, without needing to find the correct log file or even log into a specific server.
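The search in step 2 could also be expressed directly as an Elasticsearch request body. The following sketch assumes that Logstash has extracted a field called `filename`, that events carry the standard Logstash `@timestamp` field, and that the Elasticsearch version in use supports the `bool` query with a `filter` clause. The function name and the example filename are hypothetical.

```python
import json


def file_lookup_query(filename, hours=48):
    """Build a search body for all events about one file in the recent past."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Date math: everything from `hours` hours ago until now.
                    {"range": {"@timestamp": {"gte": f"now-{hours}h"}}},
                    # Assumes a field named "filename" was extracted by Logstash.
                    {"match_phrase": {"filename": filename}},
                ]
            }
        },
        # Oldest entries first, so the processing of the file reads top to bottom.
        "sort": [{"@timestamp": {"order": "asc"}}],
        "size": 50,
    }


if __name__ == "__main__":
    # "invoice_0815.xml" is a made-up filename for illustration.
    print(json.dumps(file_lookup_query("invoice_0815.xml"), indent=2))
```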

Use Case: Running statistical analysis

Objective:
A customer requires detailed statistics on the number of calls to a certain client system, the number of files sent there per day and hour, and how many connection errors occur on average during that period of time.
Preconditions:
A complete ELK stack that formats the IntraNect application logs and feeds them into Elasticsearch has to be in place.
Steps:
Three questions have to be answered for the customer, and all of them can be answered using Kibana. If a textual representation of the output is required, the query from Kibana can be sent directly to Elasticsearch via the API.
Number of calls to a certain system:
  1. Filter by the field target_system
  2. Keep only those entries that contain the field filename
Number of files sent to this system per day / hour:
  1. Modify the query to incorporate an aggregation over unique filenames. This has to be done manually and might even require configuration changes in Kibana, but there are working examples for this case on Stack Exchange.
Number of connection errors:
  1. Filter the “error message” field for the term “connection error”.
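The three questions above could be combined into a single request body sent straight to Elasticsearch. The sketch below is a hedged example, not a tested production query. It assumes that `target_system` and `filename` are indexed as exact (not analyzed) values and that the error text lives in a field named `error_message`, a hypothetical spelling of the “error message” field. The `calendar_interval` parameter is used by newer Elasticsearch versions, while older ones expect `interval` instead. Note also that the `cardinality` aggregation returns an approximate count of distinct values.

```python
import json


def statistics_query(target_system, interval="hour"):
    """Build a body that buckets events per hour or day for one target system."""
    return {
        "size": 0,  # only totals and aggregations are needed, no individual hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"target_system": target_system}},
                    {"exists": {"field": "filename"}},
                ]
            }
        },
        "aggs": {
            "per_interval": {
                # Older Elasticsearch versions use "interval" instead.
                "date_histogram": {"field": "@timestamp", "calendar_interval": interval},
                "aggs": {
                    # Approximate number of distinct filenames per bucket.
                    "unique_files": {"cardinality": {"field": "filename"}},
                    # "error_message" is an assumed field name.
                    "connection_errors": {
                        "filter": {"match_phrase": {"error_message": "connection error"}}
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    # "client-a" is a made-up system name for illustration.
    print(json.dumps(statistics_query("client-a", interval="day"), indent=2))
```

In the response, the total hit count would answer the first question, while each bucket of `per_interval` would carry the file count and the number of connection errors for that hour or day.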

Use Case: Speeding up the Control Panel

Objective:
ERST wants to improve loading times and responsiveness of the IntraNect Control Panel.
Preconditions:
A complete ELK stack is installed, splitting application log data as required.
Steps:
Since this is not a use case in the strict sense, there are no steps for a user to take. Instead, the idea of how to implement the speed-up is sketched below:
When the Control Panel loads a specific page, it often has to run an SQL query as well as parse log files to find the required information. While querying the database is arguably a rather quick operation, file access is not. However, in the current environment it is a necessary evil, because MySQL is not well suited to retrieving semi-structured data such as logs.
The idea is to load all required data into Elasticsearch, which is designed for exactly this kind of query. Instead of waiting for an SQL query to finish and then searching through files, the whole process basically becomes a single query to Elasticsearch. Moreover, the results could be received asynchronously, which means the page could start displaying something before all data has been acquired.
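The asynchronous part of this idea can be illustrated with a small Python sketch. It says nothing about how the Control Panel is actually built. The function `fetch_section` merely simulates an Elasticsearch query with a delay, and the section names and payloads are invented for the example. The point is only that each page section is rendered as soon as its own data arrives.

```python
import asyncio


async def fetch_section(name, delay, payload):
    """Stand-in for a non-blocking Elasticsearch query (simulated by a sleep)."""
    await asyncio.sleep(delay)
    return name, payload


async def render_page(sections):
    """Start all queries at once and render each section when its data is ready."""
    tasks = [asyncio.create_task(fetch_section(*section)) for section in sections]
    rendered = []
    for finished in asyncio.as_completed(tasks):
        name, payload = await finished
        print(f"rendering {name}: {payload}")
        rendered.append(name)
    return rendered


if __name__ == "__main__":
    # Hypothetical sections: the slow one does not block the fast one.
    order = asyncio.run(
        render_page(
            [
                ("transfer_history", 0.2, "120 entries"),
                ("status_summary", 0.01, "all systems reachable"),
            ]
        )
    )
    print(order)
```

With these simulated delays the fast section is displayed first, although the slow query was started before it.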


If you are interested in the entire whitepaper, just send us an e-mail (see Contact Details).
