Introduction
This is the second in a series of blog posts outlining how to use open source ELK (Elasticsearch, Logstash, Kibana) to visualize performance of a system. If you will be implementing the monitoring described in this post you will first want to follow the steps in the first post: Performance Monitoring w/ ELK - Part I: Installing ELK
In this post we will cover how to visualize an HCP cluster's performance using the HCP REST/S3/Swift gateway access logs. To accomplish this you will need an HCP cluster running version 7.3 or greater. For real-time monitoring you must have network connectivity between your HCP cluster and the server running the ELK software. In this post we will configure ELK to receive HCP access logs over syslog port 514, and we will configure the HCP cluster to send the access logs to the ELK server.
Logstash Configuration
In the previous post we installed the Logstash software. Logstash is a software component which can be configured to receive messages, A.K.A. "events", and process those events in a particular way. Events can be messages received over the network, outputs from running commands, or the result of tailing log files, just to name a few. Logstash parses events according to the instructions in a configuration file. For each event, values parsed from the event message are assigned to named fields which will be passed to an indexing service, in this case Elasticsearch. In this section we break down a Logstash configuration to better understand each of its elements.
A Logstash configuration file has 3 main sections: inputs, filters, and outputs. The input section of the Logstash configuration file defines one or more inputs from which events are received. The filter section defines how that event is processed, and the output section defines one or more outputs to receive the results. This is referred to as the "processing pipeline". The contents of each section are defined using plugins which are described in detail at the following links for input plugins, filter plugins, and output plugins. Let's have a look at examples of each of these sections from a configuration designed to receive access logs from HCP over syslog, parse the access log events, and send them to the Elasticsearch index.
Input Section
The input section specifies where the Logstash events are coming from. In our case we will be listening on UDP port 514 to receive HTTP gateway access logs via syslog:
input {
udp {
port => 514
type => syslog
}
# stdin {
# type => syslog
# }
}
In this example you see two input plugins being used, udp and stdin. The udp plugin will listen on the specified port. The type field will set the event's type to the value specified. Using event type you can have multiple input plugins in the same configuration, and use the type field to distinguish among events received by different input plugins. The stdin plugin is useful for debugging as you can trigger an event by pasting a message into the Logstash terminal without having to wait for some external event. You will notice the stdin plugin is commented out using the # symbol.
Filter Section
The filter section is where the real work happens and will constitute the majority of code in any Logstash configuration. The following filter code is a bit simplified from what you will find in the configuration file attached to this article, in order to allow us to focus on the most meaningful aspects. Take a look at the code and the sample event, and we will break it down below.
filter {
if [type] == "syslog" {
# Sample event:
# 172.18.12.24 - myuser [22/Sep/2018:10:58:16 -0400] "GET /folder/object.png HTTP/1.1" 200 210 MyNamespace.Mytenant 007 0
grok {
match => {
"message" => "%{IPORHOST:clientip} %{HTTPDUSER:ident} +%{HTTPDUSER:auth} +\[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response:int} (?:%{NUMBER:response_bytes:int}|-) ((?[a-zA-Z0-9-]+)\.)?(?[a-zA-Z0-9-]+)(@hs3)? %{NUMBER:latency:int}( %{INT:node_number:int})?( %{INT:request_bytes:int})?.*"
}
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
}
The first thing you will notice is conditional logic if [type] == "syslog". Type is an event field that was set by the input plugin. You can learn more about using conditionals and accessing event fields here.
The next line after the if statement is the beginning of a grok filter plugin definition. Grok plugins are very useful for taking unstructured text like web server logs and turning it into structured data that can be indexed and queried. In this example the match option is matching the content of the message event field1 on the left with the provided expression on the right. The pattern string matches using a combination of predefined Logstash patterns and regular expressions. Predefined Logstash patterns are specified as %{<pattern name>.<event field name>.<data type>}. Grok assigns matched values to the specified event field name and casts the event field datatype if specified. For example:
%{HTTPDATE:timestamp} Finds a date that matches the Apache HTTP access log date format (see grok-patterns) and assigns the text value to the timestamp event field. All fields are text by default.
%{NUMBER:response:int} Matches a number and assigns the value to the response event field, casting the field as an integer.
The other plugin in the filter section is the date filter plugin. The date filter is used for parsing dates from fields, and then using that date or timestamp as the Logstash timestamp for the event. For example:
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ] Converts the text stored in the timestamp event field into a date using the provided joda-time date pattern, and assigns that timestamp to the Logstash event.
1 Logstash Puts the data from the input event into the message field automatically.
Output Section
Finally the output section describes what to do with the data which has been stored in the event fields. Typically, and in the following example, data is sent to an Elasticsearch index.
output {
elasticsearch {
hosts => ["localhost:9200"]
user => elastic
password => changeme
index => "access_logs-%{+yyMMdd}"
}
# stdout { codec => rubydebug }
}
The Elasticsearch output plugin specifies the host and port where the Elasticsearch index is running, the username and password, and the name of the index to add the record to. If the index does not exist it will be created. In this example if the event timestamp were September 7th 2018 the index name would be access_logs-180907 .
The stdout output plugin is a simple output which prints to the stdout of the shell running Logstash. This is useful for debugging as you can see how the input event data has mapped to the output event.
Elasticsearch Indexes and Managing Indexes w/ Kibana
Elasticsearch is a distributed, RESTful search and analytics engine based on Apache Lucene index and search technology. Elasticsearch is very scalable and can provide near real-time search. Elasticsearch is distributed which means indices can be divided into shards which are distributed among the nodes in a cluster to provide horizontal scaling, while replicas of each shard are distributed to provide high availability.
As you read in the previous section, we use Logstash to receive unstructured log messages, parse the messages, and output a structured index document which we send to the Elasticsearch index. We can then use Kibana to manage the indexes we have created. Kibana has a tab called Dev Tools which offers a console where you can run live commands against the Elasticsearch RESTful API. In this blog I will show you some useful commands for managing your indexes with Kibana.
Visualizing Elasticsearch Data w/ Kibana
Kibana is a data visualization software for Elasticsearch. It lets you visualize the content indexed in Elasticsearch in several formats including: bar, line or pie charts, tables, and heat maps. Following the steps in this post you will use Kibana to create index patterns for Elasticsearch indexes, explore the data in your index, build visualizations to represent your data in a meaningful way, and finally to collect your visualizations on dashboards.
Monitoring HCP with ELK - Step by Step
In order to follow the steps in this section you must have a running HCP, and have installed the ELK stack. Elasticsearch and Kibana should be running as services. To configure a running ELK system you can refer to the fist blog in this series Performance Monitoring w/ ELK - Part I: Installing ELK.
Step 1: Configure HCP for Monitoring
The first thing you will need to do is configure HCP to output HTTP gateway access log event messages via syslog. To do this log into your HCP System Management Console as a user with the "Administrator" role, and go to the Monitoring => Syslog page.
1a: Check the "Enable syslog" and "Send log messages for HTTP-based data access requests" check boxes and click "Update Settings".
input {
# udp {
# port => 514
# type => syslog
# }
stdin {
type => syslog
}
}
...
output {
# elasticsearch {
# hosts => ["localhost:9200"]
# user => elastic
# password => changeme
# index => "access_logs-%{+yyMMdd}"
# }
stdout {
codec => rubydebug
}
}
2c: Now you are ready to run Logstash! SSH to your ELK host and run the following command:
/usr/share/logstash/bin/logstash -f /etc/logstash/manual_configs/logstash.hcpsyslog-allversions.conf -w 6 --path.data /usr/share/logstash/hcpsyslog/
2d: It will take a few seconds for Logstash to start, when it is ready you will see the following message. You can ignore a Warning about the logstash.yml file if you see one.
The stdin plugin is now waiting for input:
2e: Paste the following text into the command shell where you are running Logstash:
<142>Sep 22 10:58:17 HCP_System_Event: 0019025005 172.18.12.24 - - [22/Sep/2018:10:58:16 -0400] "GET /folder/object.png HTTP/1.1" 200 210 MyNamespace.Mytenant 007 0
If everything is working you should see the following output, though the fields may not be listed in the same order. If you do not see this output you will need to troubleshoot following the guidance in the Logstash Troubleshooting section of this article.
{
"response_Mbytes_per_second" => 0,
"request_bytes_per_second" => 0,
"message" => "<142>Sep 22 10:58:17 HCP_System_Event: 0019025005 172.18.12.24 - - [22/Sep/2018:10:58:16 -0400] \"GET /folder/object.png HTTP/1.1\" 200 210 MyNamespace.Mytenant 007 0",
"request_Mbytes_per_second" => 0,
"type" => "apache_access",
"auth" => "-",
"tenant" => "Mytenant",
"node_number" => 0,
"verb" => "GET",
"clientip" => "172.18.12.24",
"host" => "tmpelk.tech.archivas.com",
"response_bytes" => 210,
"response" => 200,
"namespace" => "MyNamespace",
"response_bytes_per_second" => 30000,
"@timestamp" => 2018-09-22T14:58:16.000Z,
"@version" => "1",
"request" => "/folder/object.png",
"latency" => 7
}
2f: At this stage you can edit your input filter to listen to the UDP port, reversing the change made in step 2b. After you do press ctrl-c to stop Logstash, and rerun the Logstash command to restart. Once Logstash is running issue some REST requests to a namespace (GET, PUT, HEAD, DELETE, etc.) using HCP's Namespace Browser, cURL, Cloudberry, or any other method you choose. Every request you issue against the HCP should result in output very much like the output above. If you do not see output when you issue REST requests to a namespace you will need to troubleshoot connectivity between the HCP node and your ELK server. Refer to the General Troubleshooting section of this post for tips.
2g: Now that you have validated your configuration file and you have validated that you are receiving HCP syslog messages it is time to start indexing. Edit your configuration file to comment out the stdout plugin and uncomment the elasticsearch plugin. After you edit the configuration press ctrl-c to stop Logstash, and rerun the Logstash command to restart. Once Logstash is running issue some REST requests to a namespace in HCP. In the next section I will show you how to confirm that your index has been created. If your index is not created when you issue REST requests to a namespace you will need to troubleshoot connectivity between Logstash and your Elasticsearch service. Refer to the General Troubleshooting section of this post for tips.
2h: Optional: You can run Logstash in the background using the following command:
nohup /usr/share/logstash/bin/logstash -f /etc/logstash/manual_configs/logstash.hcpsyslog-allversions.conf -w 6 --path.data /usr/share/logstash/hcpsyslog/ &
Step 3: Confirm Index Creation
Before we move on we must confirm we are successfully indexing. All of the steps from this point forward will be done within the Kibana UI. Make sure you issued some REST requests after enabling the elasticsearch output plugin as described in step 2g above.
3a: Log into your Kibana web application at http://172.18.12.24:5601/ where 172.18.12.24 is your ELK server's IP address. If you followed my instructions from the first blog post you will not need a username or password.
3b: In the left pane you will see a link for Dev Tools. Click the link to be taken to a page where you have a console to execute queries against Elasticsearch.
3c: In the left hand pane of the console enter the following query
GET _cat/indices?v
To the right of your query (or the selected query if there are multiple) you will see a green play button.
Click the play button to execute the query against Elasticsearch. If you are indexing successfully you will see output similar to the following where you have an index named access_logs-yymmdd. If you do not have an index named access_logs-yymmdd you must go back to step 2g and troubleshoot.
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open access_logs-180723 wSbM0h_WRqex8vDZPsOntQ 5 1 2 0 51.8mb 51.8mb
yellow open .kibana muTxANaNSlqhpdvsU6TuTg 1 1 45 1 70.1kb 70.1kb
3d: Now that you have confirmed everything is working it will be good to have some data in the index to look at. If your HCP is already busy handling REST requests you can skip this step. If not, and if you have a way to drive load to your HCP, go ahead and kick that off now, that way when you get to the final step you will actually have some data to look at.
Step 4: Create Your Index Pattern
Once you are successfully indexing HCP access log records in Elasticsearch you will want to create an index pattern which is required for our visualizations.
4a: Click the Management link in the left Kibana navigation pane.
4b: Click the Index Patterns link to go to the Create index pattern page in Kibana.
4c: In the field labeled Index pattern enter access_logs-*
4d: If you entered the correct text you should see a success message indicating your pattern matches at least 1 index. Click the Next step > button to proceed to the Configure settings page for your index pattern.
4e: In the Time Filter field name dropdown selector, select @timespamp
as your your time field. You may recall capturing this time with the date filter plugin in your configuration. This is an important field as most of the visualizations you will use are time series visualizations.
4f: Expand the advanced option by clicking the Show advanced options link to reveal a field labeled Custom index pattern ID. Enter the value access-logs-index-pattern
4g: Click the Create index pattern button to create your index pattern.
4h: You should now find yourself at a page displaying the details of your index pattern. You will see all the fields Kibana discovered in the index and their data types.
One thing to note at this point in the process. The configuration file is designed to work with 7.3+ HCP systems. If you create your index pattern with index documents from an HCP 7.3.x system, and you later begin indexing 8.0+ access log data you will need to recreate your index pattern. This is because certain fields did not exist in 7.3 and would not be discovered in Kibana unless you recreate the index pattern after you have indexed some 8.0+ documents. In this case, just delete your pattern and follow the steps above again.
Step 5: Import The Visualizations and the Dashboard
The final step before you are up and running.
5a: Download the attached kibana-access-log-exports.zip and extract the 3 enclosed files to the workstation where you have Kibana loaded in the browser.
5b: Click the Management link in the left Kibana navigation pane.
5c: Click the Saved Objects link to go to the Edit Saved Objects page
5d: Import your visualizations. In the upper right corner click the Import link and select the access-log-index-visualizations.json file that you extracted in step 5a. If prompted click Yes to overwrite existing objects.
5e: Import your dashboard. In the upper right corner click the Import link and select the access-log-index-performance-dashboard.json file that you extracted in step 5a. If prompted click Yes to overwrite existing objects.
5f: Optionally import Timelion Visualizations. In the upper right corner click the Import link and select the access-log-index-timelion-visualizations.jsonfile that you extracted in step 5a. If prompted click Yes to overwrite existing objects.
Step 6: View The Dashboard
6a: Click the Dashboard link in the left Kibana navigation pane.
6b: Click the AL Performance Dashboard link to open the Access Log Performance Dashboard.
6c: Don't see your data? Click the time selector in the top right and expand the time window to include your data.
On this dashboard you will see a number of visualizations that I find useful for example Latency by Count and Size below. This visualization allows you to see the impact object size may have on latency (total request time), or increased latency may have on count of operations.
DELETE access_logs-180919
DELETE access_logs-*
2. You can also query the index directly in the Dev Tools tab. For example this query shows a count of documents where the verb is GET or HEAD:
GET /access_logs-*/_count
{
"query" : {
"bool" : {
"must_not" : {
"terms" : { "verb" : ["GET", "HEAD"]}
}
}
}
}
You can also use direct queries to delete specific rows from your indexes. This will require you read the documentation to learn more about it.
Kibana
1. When in Kibana, hover over the white-space in any time-series visualization until you see a large + symbol. Click and drag across your visualization to zoom in on the selected time window.
Logstash Troubleshooting
When your Logstash configuration isn't working as expected it can be a bit tricky to pinpoint exactly what the error is, here are some pointers to help get you started. Effectively troubleshooting or writing Logstash configurations may require that you know regular expressions or are willing to learn.
- Start with the error message. These are not super helpful but sometimes help identify the line where things went wrong. Sometimes it is helpful to copy to a text editor like notepad++ to replace "\n" with actual newline characters for readability. Look for where it says "expected <some character> after <some location>", often the issue is in the line preceding the specified location.
- Run the Logstash configuration interactively. Comment out the external input plugins and uncomment the stdin input plugin. Comment out the external output plugins and uncomment the stdout output plugin. Now run the configuration and copy paste messages/events into the terminal window.
- Recursively modify the filter element of your configuration. This is where the bugs pretty much always are and how debugging gets done. Typically the issue will be in a grok match expression.
- Eliminate from the filter the bit you think is causing the problem. You can comment out the entire filter body, a block of code, an individual line, or selectively remove parts of a grok match expression. For example if this is your grok expression:
match => {
"access_log" => "%{IPORHOST:clientip} %{HTTPDUSER:ident} +%{HTTPDUSER:auth}"
}
Remove the suspected troublesome bits of the expression and replace them with .*
match => {
"access_log" => "%{IPORHOST:clientip}.*"
}
- Save your changes and rerun Logstash. If it works without error this confirms that you have removed the troublesome code.
- Make small incremental changes to add the code back that you removed. Each time you make a small change you save it, restart logstash, and process an event.
- Continue doing step 3 until you pinpoint specifically what code addition causes the error to return.
- Fix the broken code following the same process outlined above.
General Troubleshooting
- Make sure you have disabled your firewall or opened the ports on the firewall as described in the first post in this series.
- Make sure you have configured Kibana and Elasticsearch to listen on the public IP address of the ELK server as described in the previous post.
- Verify your Kibana and Elasticsearch service are running as described in the previous post.
- Verify network connectivity between the HCP and the ELK server.
- Verify that you have configured HCP as described in step 1 of this blog post, and that the IP address is correct, and the port is the same as the port specified in your Logstash configuration.
- Be sure that your Logstash configuration input and output plugins are configured properly and that you did not accidentally leave something commented out. For example, Elasticsearch indexes will not be created if the elasticsearch output plugin has been commented out.
Conclusion
The ELK tools are very flexible, powerful, and reasonably intuitive. While there is much more to know about these tools than can be covered by these guides, I hope that this series gives you a head start to set up your own ELK instance and configure it to monitor HCP.
If you do follow the instructions in this series I would love to hear from you. Please comment below to say you tried and it worked, tried and failed, or to provide any feedback which I can incorporate to improve the content for future readers.
Thanks!