A QA team should be able to add a monitoring system to their machines with a modest amount of effort. I did this on my own, on my own time. It took me a few weeks and a few retries to get ELK set up, but once it was working I could track memory, CPU, and disk usage on each QA machine.
Better Than Top
The problem with top is that it only shows what's happening right now. What if new code comes in that appears to use a bit more memory, or CPU spikes while the automation is running? Top can show the issue as it happens, but it can't compare against the last known good run (master) from two weeks prior.
These dashboards are also very visual. We can clearly see what is in memory and what percentage each item consumes, or which process is hitting the CPU hardest. Then we can scroll back through the history and compare against any other moment in time.
But ELK doesn’t just deal with performance metrics…
ELK – what it is and what it does
ELK consumes logs. Any logs. It has recipes for parsing the most common log varieties (Apache, Tomcat, Suricata, HAProxy, etc.). The logs are parsed, and each piece of data becomes a field in a data object. These data objects are easily added to a visual display to study relationships. An analyst can dig into the processes running on a machine and spot an indicator of compromise. Or someone might discover a memory leak that surfaced several weeks ago but went unnoticed until now. ELK is about log data consumption and analysis.
ELK stands for Elasticsearch, Logstash, and Kibana. Those are the three primary utilities that need to be installed.
To set up a monitor for QA, you’ll need to add a performance monitor. ELK has many monitoring add-ons called “beats”, which run on a monitored system and report back to ELK. To monitor CPU, memory, and disk, we’ll need to add Metricbeat. There are many other beats; a full list can be found at https://www.elastic.co/beats
Installing ELK as a QA Monitor
To simply monitor QA machines for CPU/Memory and Disk load, we’ll need to install:
- Elasticsearch (the core backend)
- Kibana (the front end)
- Filebeat (primary log shipper)
- Metricbeat (metric data shipper)
ELK should be installed in a central location – that is, on its own server. I run it at home, and at work, on its own VM. Each machine you are monitoring will have Metricbeat installed, which is configured (through a .yml file) to talk to the ELK VM/server.
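To make that relationship concrete, here is a minimal sketch of the output section each shipper's .yml file points at the ELK server with; the IP 192.168.1.50 is just a placeholder for your own ELK server, and the password is the one generated during the Elasticsearch install (covered below):

# /etc/metricbeat/metricbeat.yml (excerpt) – 192.168.1.50 is a placeholder for your ELK server's IP
output.elasticsearch:
  hosts: ["192.168.1.50:9200"]
  username: "elastic"
  password: "the password generated during the Elasticsearch install"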
Installing ELK 7.17.x
The following instructions walk through the commands to set up and configure ELK version 7.17.x. Keep in mind that ELK is on version 8 at the time of writing; to install the most recent version of ELK, I would refer you to the official documentation.
# most of this is from https://www.digitalocean.com/community/tutorials/how-to-build-a-siem-with-suricata-and-elastic-stack-on-debian-11
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
sudo apt update
sudo apt install elasticsearch=7.17.8
sudo apt install kibana=7.17.8
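# optional: hold the packages at 7.17.8 so a routine apt upgrade doesn't pull in 8.x unexpectedly
sudo apt-mark hold elasticsearch kibana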
sudo nano /etc/elasticsearch/elasticsearch.yml
## Modify the network section to add:
network.bind_host: ["127.0.0.1", "your_private_ip"]
## At end of file append:
discovery.type: single-node
xpack.security.enabled: true
# without the following daemon-reload I couldn't start the elasticsearch service.
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch.service
sudo systemctl start elasticsearch.service
# elasticsearch needs to be running in order to do the next part:
cd /usr/share/elasticsearch/bin
sudo ./elasticsearch-setup-passwords auto
# Log the output passwords someplace; they will be used later on
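# optional sanity check: with the passwords set, elasticsearch should answer an authenticated request
# (assumes the default port 9200; enter the elastic password logged above when prompted)
curl -u elastic http://127.0.0.1:9200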
cd /usr/share/kibana/bin/
sudo ./kibana-encryption-keys generate -q
# save the xpack output for next step
sudo nano /etc/kibana/kibana.yml
# paste the xpack generated data at the end of this file
# while in kibana.yml, find the server.host setting and change it to your private IP, e.g. server.host: "your_private_ip"
sudo ./kibana-keystore add elasticsearch.username
# when prompted for the elasticsearch.username, input:
> kibana_system
sudo ./kibana-keystore add elasticsearch.password
# when prompted for a password, enter the kibana_system password that was generated earlier on.
sudo systemctl start kibana.service
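# optionally enable Kibana at boot, and check the journal if the service fails to come up
sudo systemctl enable kibana.service
sudo journalctl -u kibana.service --no-pager | tail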
Installing Filebeat
Filebeat is a log shipper. It ships logs from a monitored machine to the ELK server, and needs to be installed on each individual machine you wish to monitor. In this case, QA machines.
I would attempt this on one QA machine first and verify the data is flowing into ELK (under the Discover link in the left menu). Once it works on one machine, move on to the rest.
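If you prefer a command-line check, one way (assuming the default port 9200 and the elastic password generated earlier) is to ask Elasticsearch whether any filebeat indices have been created yet:

curl -u elastic "http://IP of your ELK server:9200/_cat/indices/filebeat-*?v"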
### Filebeat install
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
sudo apt update
sudo apt install filebeat=7.17.8
sudo nano /etc/filebeat/filebeat.yml
# Depending on your OS and the logs you want to monitor, you may need to change paths.
# my paths value is below:
  paths:
    - /var/log/*.log
# search for the Kibana host section and add:
host: "IP of your ELK server:5601"
# search for the output.elasticsearch section and add a line:
hosts: ["IP of your ELK server:9200"]
# after this add the username for elastic and the password generated during the elasticsearch install
username: "elastic"
password: "the password generated during the elasticsearch install"
# save and exit the yml file
# the suricata module comes from the tutorial this guide is based on; only enable it if Suricata is installed on the machine
sudo filebeat modules enable suricata
sudo filebeat setup
sudo systemctl start filebeat.service
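Before moving on, Filebeat can check its own configuration and its connection to the ELK server; these subcommands are built into Filebeat:

sudo filebeat test config
sudo filebeat test output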
Filebeat Modules
By default Filebeat is looking in your OS log folder (/var/log), but there is also specialty software it supports. There are recipes for Fortinet, Squid, SonicWall, Tomcat, Suricata, Kafka, Juniper, ActiveMQ, Apache, MySQL, NGINX, Cisco, and more. To enable one of these modules, do the following:
sudo filebeat modules enable [module name]
# after a module is enabled, you need to edit its config.
# module configs live in /etc/filebeat/modules.d/
sudo nano /etc/filebeat/modules.d/apache.yml
# The basic setup here is to update the config file with paths to the logs specific to that software. For example, to check nginx:
- module: nginx
  # Access logs
  access:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths:
      - /var/log/nginx/access_post.log
      - /var/log/nginx/access.log
  # Error logs
  error:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths:
      - /var/log/nginx/error.log
  # Ingress-nginx controller logs. This is disabled by default. It could be used in Kubernetes environments.
  ingress_controller:
    enabled: false
# After updating all config files, restart filebeat:
sudo systemctl restart filebeat
If you check the filebeat, elasticsearch, and kibana services (sudo systemctl status [service name]), they should show as active and without errors.
By going to the IP of the VM/server this stack was installed on, at port 5601 (e.g. 192.168.1.1:5601), you should see the Elastic login page. Now you can log in with the elastic user and the password generated during install.
If all works out and you log in, you should see a landing page like so:
The left-side menu expands outwards, revealing options like those below:
When the log data starts getting consumed, it will appear under the “Discover” link in the menu. This shows raw data. Nothing here is very usable on its own, but it’s a good way of quickly confirming that your ELK installation is working and Filebeat is shipping logs.
Installing Metricbeat
Metricbeat needs to be installed on the machines being monitored. Let’s say you have 10 QA machines (qa01–qa10); on each of them you would install Metricbeat. Keep the Metricbeat version the same as your ELK version – if you use ELK 8, follow the instructions for installing Metricbeat 8.
As your QA machines may run different OSes, I would refer the reader to the official install instructions. Choose “Self Managed” (as you are managing ELK on your own server, not their cloud), and choose the OS your QA machine runs (Debian, Red Hat, Mac, Windows). On step 2 (connect to the Elastic Stack), click “Self-Managed” and make sure to edit the indicated line in your metricbeat.yml file with the URL of your ELK server and the username/password used to access it.
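As a rough sketch, on a Debian-based QA machine the steps mirror the Filebeat install above (assuming the same 7.17.8 apt repository and the elastic credentials generated earlier):

curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
sudo apt update
sudo apt install metricbeat=7.17.8
sudo nano /etc/metricbeat/metricbeat.yml
# set the Kibana host, the output.elasticsearch hosts line, and the elastic username/password, just as in filebeat.yml
sudo metricbeat setup
sudo systemctl enable metricbeat.service
sudo systemctl start metricbeat.service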
Like filebeat, metricbeat has a variety of modules. You can list them with
metricbeat modules list
You can enable a module by running:
metricbeat modules enable [module name]
If you add a module, be sure to update the yml file in /etc/metricbeat/modules.d/ with the required path to log files.
Metric Modules I Use:
- linux
- system
My /etc/metricbeat/modules.d/linux.yml looks like this:
- module: linux
  period: 10s
  metricsets:
    - "pageinfo"
    - "memory"
    # - ksm
    # - conntrack
    # - iostat
    # - pressure
  enabled: true
My /etc/metricbeat/modules.d/system.yml looks like this:
- module: system
  period: 10s
  metricsets:
    - cpu
    - load
    - memory
    - network
    - process
    - process_summary
    - socket_summary
    #- entropy
    #- core
    #- diskio
    #- socket
    #- service
    #- users
  process.include_top_n:
    by_cpu: 5      # include top 5 processes by CPU
    by_memory: 5   # include top 5 processes by memory
  # Configure the mount point of the host's filesystem for use in monitoring a host from within a container
  #system.hostfs: "/hostfs"

- module: system
  period: 1m
  metricsets:
    - filesystem
    - fsstat
  processors:
    - drop_event.when.regexp:
        system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'

- module: system
  period: 15m
  metricsets:
    - uptime
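After editing the module files, restart Metricbeat; like Filebeat, it can also test its connection to the ELK server:

sudo systemctl restart metricbeat
sudo metricbeat test output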
If all worked out, you should see data populating the Discover link in ELK (left-side menu).
Building a Dashboard
From the left nav, clicking Dashboard will show a list of installed dashboards. Most of these will not work, as they are demo dashboards for software you likely don’t have installed. There’s a “Create Dashboard” button on the top right. Click that and you will have an empty dashboard. Click “Create Visualization”.
From the Create Visualization screen, the left side has a nav like the image to the left.
System or log data comes into ELK and is stored in the index as data objects. You can search for these data fields in the search box, or scroll through them. If you click a field it will show example data, which is useful for getting an idea of what a field represents.
Dragging a field into the empty area to the right will display that data against a default timeline graph.
So let’s populate some data. If you have the linux and system modules configured in Metricbeat, you should be able to find system.process.cpu.total.pct – if you have data on that field, drag it into the empty space to the right.
You should get a view of all the CPU load across all your machines.
On the far right are some options, like “Break down by”. If you drag agent.name onto that option, it will split the CPU usage by machine.
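If you only care about a subset of machines, the search bar at the top of the dashboard accepts KQL; a filter for a single (hypothetical) QA machine would look something like:

agent.name : "qa01"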
A second CPU graph can be created to break things down by process. Process fields can be funny: although I don’t have or use MongoDB, my per-process data lives under the field mongodb.status.process, which I use as the break-down field.
This takes care of CPU. The same can be done with Memory and Disk.
For memory, I use the field system.memory.free and break it down by agent.name; I usually use a line graph for memory.
As you fill this data out – the dashboard will start to come together.
My QA dashboards are somewhat complex. I capture data from NGINX as well, showing graphs of browser/OS usage, NGINX errors, database slow logs, IDS data, and a whole lot more.