In my first ELK post, we walked through how to set up an Elasticsearch/Kibana environment. If you went through that document, you are probably curious about how on earth we store data inside the Elasticsearch node and then query it back or display it from the Kibana interface. That is exactly what we are trying to achieve here, and beyond that, explore Logstash capabilities.
At its most basic, Logstash is a log shipping agent that supports various “input” and “output” plugins. An input is where the data is sourced from, and an output is where Logstash sends that data on to the configured destinations. The list of supported input and output plugins goes on and on, but a few important ones are worth noting.
Input plugins:
- a text file
- a network socket
- redis database
- log4j/log4net appenders
Output plugins:
- a text file
- a network socket
- redis database
- elasticsearch database
Did you see that? Logstash can send data to an Elasticsearch node. Enough discussion.. let’s get started.
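Just to get a feel for how an input and an output plugin fit together, here is a minimal throw-away pipeline (not part of the setup below) that reads lines from stdin and prints them back as structured events. You can run it in one go with the -e flag:

bin/logstash -e 'input { stdin {} } output { stdout { codec => rubydebug } }'

Type a line, and Logstash will echo it back as an event with fields such as message and @timestamp.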
Deploy your Logstash:
As in my previous ELK post, we will use the tarball sources for the installation.
01. Download the source archive
wget https://artifacts.elastic.co/downloads/logstash/logstash-6.3.0.tar.gz
02. Move the downloaded tarball to the required location, change into that directory, and extract it
mv logstash-6.3.0.tar.gz /opt/ && cd /opt/ && tar -xvf logstash-6.3.0.tar.gz
03. Create two more directories inside the extracted path
cd /opt/logstash-6.3.0/ && mkdir logs && mkdir conf.d
04. Further, a file with a .conf extension needs to exist under /opt/logstash-6.3.0/conf.d/, so let’s create that as well.
cd /opt/logstash-6.3.0/conf.d/ && touch main.conf
05. While starting up, Logstash checks every *.conf configuration file that lives in /opt/logstash-6.3.0/conf.d/, and each of them needs at least the following content inside.
input {
  file {
    path => "/var/log/apache.log"
    tags => ["tag-name"]
  }
}
filter {}
output {
  elasticsearch {
    hosts => [ "192.168.0.1:9200" ]
    index => "testindex-%{+YYYY.MM.dd}"
  }
}
To briefly explain what happens when Logstash starts with this configuration: whenever a new line appears in /var/log/apache.log, Logstash fetches that line and sends it over to the Elasticsearch node located at 192.168.0.1. The index => "testindex-%{+YYYY.MM.dd}" setting instructs Logstash to create a new index (the Elasticsearch equivalent of a database) called “testindex-*” and store the input data inside it. The {+YYYY.MM.dd} part instructs Logstash to segregate the indices on a daily basis, for example testindex-2019.01.01, testindex-2019.01.02, and so on.
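Before wiring this into a service, it is worth sanity-checking the configuration syntax. Logstash ships a flag for exactly that (the path below assumes the main.conf we created earlier):

/opt/logstash-6.3.0/bin/logstash -f /opt/logstash-6.3.0/conf.d/main.conf --config.test_and_exit

If the syntax is valid, it reports the configuration as OK and exits without starting the pipeline.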
To start the service up, let’s create a SystemD unit file:
vim /etc/systemd/system/logstash.service
[Unit]
Description=logstash

[Service]
Type=simple
User=elastic
Group=elastic
Environment="JAVA_HOME=/opt/elastic/jdk1.8.0_131"
ExecStart=/opt/logstash-6.3.0/bin/logstash "--path.settings" "/opt/logstash-6.3.0/config" "--path.logs" "/opt/logstash-6.3.0/logs" "--path.config" "/opt/logstash-6.3.0/conf.d" "--config.reload.automatic"
Restart=always
WorkingDirectory=/opt/logstash-6.3.0/
Nice=19
LimitNOFILE=16384

[Install]
WantedBy=multi-user.target
Note that, as in our previous post, Oracle Java is required, so be mindful of the directory path where the JDK lives.
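If you are not sure where your JDK lives, a quick check against the path used in the unit file above is:

/opt/elastic/jdk1.8.0_131/bin/java -version

Adjust the JAVA_HOME line in the unit file to match whatever that path is on your machine.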
Reload the SystemD unit database
systemctl daemon-reload
To invoke the service
systemctl start logstash
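If you also want Logstash to come up automatically after a reboot, the usual SystemD steps apply:

systemctl enable logstash
systemctl status logstash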
To verify whether the service started up, you can check Logstash’s own logs
tail -f /opt/logstash-6.3.0/logs/logstash-plain.log
If you see a message along the lines of “all the pipelines are started”, well done.. you did it.
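To double-check that events are actually landing in Elasticsearch, you can list the daily indices on the node we configured above (192.168.0.1 in our example):

curl -s "http://192.168.0.1:9200/_cat/indices/testindex-*?v"

Once a line has been written to /var/log/apache.log, a testindex-YYYY.MM.dd entry should appear here.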
Logstash Filters:
In my first ELK post, we covered a bit of background about Elasticsearch. But did you know that Elasticsearch is a JSON store? In other words, Elasticsearch always stores any message as a JSON document, regardless of the incoming message type, which means it always prefers a JSON body for any incoming data.
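For example, a single line shipped from /var/log/apache.log ends up stored roughly like the document below. The field names (@timestamp, message, host, path, tags) are the standard ones a file input produces; the exact set and values can vary:

{
  "@timestamp": "2019-01-02T10:15:30.000Z",
  "message": "192.168.0.10 - - [02/Jan/2019:10:15:30 +0000] \"GET / HTTP/1.1\" 200 1024",
  "path": "/var/log/apache.log",
  "host": "web-01",
  "tags": ["tag-name"]
}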
Ok, now you will probably ask: what happens if we want to deal with data that is in anything other than JSON format? Cool, a Logstash FILTER has your back. So where do we get to work with these FILTERs? Just move back and go through the configuration file we discussed above. In that same file you will find a section called “filter” in between the input and output sections.
I think you already sensed it. INPUT takes the data in, FILTER does the transformation on whatever was input, and finally OUTPUT sends the structured data to the destination. Yes, absolutely.
Like input/output plugins, there are tons of FILTER plugins available for message transformation. Here are a few; a short example follows the list.
- grok => can be used to convert lines of text into a structured JSON document
- date => helps parse a timestamp out of a text message and produce a time/date value that is compatible with Elasticsearch
- mutate => can be helpful to remove/add certain key-value pairs within a JSON document
- geoip => Adds geographical information about an IP address
- dns => Performs a standard or reverse DNS lookup
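As a taste of what goes into the filter {} section, here is a rough sketch that would parse the Apache access log lines from our input above. COMBINEDAPACHELOG is one of the grok patterns that ships with Logstash, and the date filter then lifts the request timestamp into the event’s @timestamp:

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

Treat this as a starting point rather than a finished recipe.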
“Message transformation via Logstash FILTERs is another interesting area. We hope to cover it in detail in our upcoming ELK posts. In the meantime, you can comment on this page to get it prioritized.”