Centralized Logging — EFFK — Elastic Search + Fluent Bit + Fluentd + Kibana
Introduction
The EFK Stack is an open-source centralized logging solution based on Elastic Search for collecting, parsing and storing logs. Elastic Search, Fluentd and Kibana, when used together, form an end-to-end stack (the EFK Stack) that provides real-time data analytics for any type of structured or unstructured data.
Fluentd is a Cloud Native Computing Foundation (CNCF) graduated project, which makes it an obvious choice in the industry.
Fluentd can act as both a Log Collector and a Log Aggregator. Fluent Bit, however, is lightweight compared to Fluentd. So instead of running the heavier Fluentd for log collection, it makes sense to use Fluent Bit as the Log Collector and Fluentd as the Log Aggregator.
In this article, we will explore how to enable Fluent Bit along with Fluentd for centralized logging. We will store the logs in Elastic Search and make use of Kibana for visualization.
The below diagram depicts the flow/setup.
Prerequisites
- Cluster Environment created with EKS or KOPS (Minikube for local setup)
- Ensure that you have enough storage in the Nodes
- Kubectl installed
Note:
# The YAML files required for this setup are available in the GitHub repository
# Please check and update the Docker image versions for Fluentd, Fluent Bit, Elastic Search and Kibana in the YAML files while implementing
Implementation
STEP 1: Verify Basic Environment Setup
Ensure that kubectl is enabled in your environment and that you are able to run kubectl commands. Make sure that you have downloaded the YAML files from GitHub.
Note: You need to run the kubectl commands from within the folder where the YAML files are present. Otherwise, provide the appropriate folder path while running the kubectl commands.
STEP 2: Create Namespace
Create a new Namespace. The Namespace we will be creating is “efk-fluentbit”.
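For reference, the Namespace definition is only a few lines; a minimal sketch of what the namespace YAML roughly contains (the file in the repository may differ slightly):
apiVersion: v1
kind: Namespace
metadata:
  name: efk-fluentbit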
kubectl apply -f namespace.yaml
Verify that Namespace is created properly by executing the following command
kubectl get namespaces
STEP 3: Enable Service and Role
Next, we need to create the Service Account followed by Role and Role Binding.
As you are aware, a Role always sets permissions within a particular namespace; when we create a Role, we have to specify the namespace it belongs in, and we then create a corresponding RoleBinding for that Role.
Depending on whether we need to enable a role across the whole Cluster or only for a specific namespace, we need to create and apply the appropriate YAML definitions. In our case we will be using a ClusterRole and a ClusterRoleBinding.
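As a rough idea of what these three files contain, the log collector only needs read access to Pods and Namespaces. The resource names below (fluent-bit, fluent-bit-read) are assumptions for illustration and the repository files may differ:
# service-account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: efk-fluentbit
---
# cluster-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-read
rules:
- apiGroups: [""]
  resources: ["pods", "namespaces"]
  verbs: ["get", "list", "watch"]
---
# cluster-role-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-read
subjects:
- kind: ServiceAccount
  name: fluent-bit
  namespace: efk-fluentbit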
Run the following commands:
kubectl apply -f service-account.yaml
kubectl apply -f cluster-role.yaml
kubectl apply -f cluster-role-binding.yaml
STEP 4: Create Persistent Storage To Store Log Data
As highlighted, we will be storing the logs in Elastic Search. We will make use of a Persistent Volume to store the log data.
For demonstration purposes we could use emptyDir volumes, but keep in mind that emptyDir volumes are deleted when a Pod is deleted. It is better to use a Persistent Volume for logs. Kubernetes Persistent Volumes remain available outside of the Pod lifecycle, which means the volume remains even after the Pod is deleted; it is available for another Pod to claim if required, and the data is retained.
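For illustration, the Persistent Volume and Persistent Volume Claim could be shaped along the following lines; the hostPath location and storage size are assumptions, while the claim name fluentbit-volumeclaim matches what the Elastic Search StatefulSet references in STEP 5:
# persistence-volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: fluentbit-volume              # name assumed for illustration
spec:
  capacity:
    storage: 5Gi                      # size assumed
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data/elasticsearch     # path assumed; use proper storage in real clusters
---
# persistence-volume-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fluentbit-volumeclaim
  namespace: efk-fluentbit
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi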
Run the following commands to create the Persistent Volume and the Persistent Volume Claim. Be aware that a Persistent Volume Claim is scoped to a particular Namespace.
kubectl apply -f persistence-volume.yaml
kubectl apply -f persistence-volume-claim.yaml
Execute the following commands to verify that the Persistent Volume and the Persistent Volume Claim have been created properly.
kubectl get pv
kubectl get pvc -n efk-fluentbit
STEP 5: Enable Elastic Search
PURPOSE
* Install Elastic Search
* Persistent Volume created in STEP 4 will be used by Elastic Search
Now we have created the necessary base for our setup, and we can start deploying the required Docker images. We will begin with Elastic Search, which we will deploy as a StatefulSet.
In the volumes section, we will reference the Persistent Volume Claim which we created in the earlier step.
volumes:
  - name: data
    persistentVolumeClaim:
      claimName: fluentbit-volumeclaim
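Inside the container spec of the same StatefulSet, the claim is mounted at the Elastic Search data directory. A minimal fragment as a sketch; the container name and image tag are assumptions, so use the version referenced in the repository YAML:
containers:
- name: elasticsearch                                           # name assumed
  image: docker.elastic.co/elasticsearch/elasticsearch:7.17.9   # pick a current version
  volumeMounts:
  - name: data
    mountPath: /usr/share/elasticsearch/data                    # default data path of the official image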
Run the following command to create Elastic Search, and then verify that the Pod has been created properly.
kubectl apply -f stateful-elasticsearch.yaml
To check the functionality of Elastic Search, do port forwarding.
kubectl port-forward es-cluster-0 9200:9200 --namespace=efk-fluentbit
Run curl command to ensure that Elastic Search is set properly and working fine.
curl http://localhost:9200/_cluster/state?pretty
STEP 6: Enable Kibana UI
PURPOSE
* Install Kibana
* Kibana will point to the Elastic Search created in STEP 5 to get the log details in the UI
Now we will enable Kibana, which we need in order to visualize the logs. Run the Kibana YAML file to create the Deployment and Service for Kibana. In the Deployment we point to Elastic Search, which is our data store for logs. Kibana will pick up the details from Elastic Search and we will be able to visualize them in the Kibana UI.
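As a rough sketch, the Kibana Deployment points at the Elastic Search Service through an environment variable; the Service name elasticsearch and the image tag below are assumptions, so check the YAML in the repository:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: efk-fluentbit
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:7.17.9   # pick a current version
        env:
        - name: ELASTICSEARCH_HOSTS                     # standard setting of the Kibana image
          value: http://elasticsearch:9200              # Elastic Search Service name assumed
        ports:
        - containerPort: 5601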
kubectl apply -f kibana.yaml
To access the Kibana UI we need to expose the Kibana Service as a LoadBalancer. Execute the following command to get the URL/External-IP for Kibana.
kubectl patch service kibana --patch '{"spec":{"type":"LoadBalancer"}}' -n efk-fluentbit
Use the External-IP to access the Kibana application in a browser. Please ensure that you include port 5601, on which the Kibana service is exposed, in the URL.
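If your environment does not support LoadBalancer Services (for example a local Minikube setup), you can reach the Kibana UI through port forwarding instead; the Service name kibana is an assumption:
kubectl port-forward service/kibana 5601:5601 -n efk-fluentbit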
STEP 7: Deploy Fluent Bit
PURPOSE
* Fluent Bit deployment to collect the log data from Cluster environment
* Use a configuration file to specify
# From where to collect the logs
# And where to send them. In our case we need to pass on the logs to Fluentd, which is our Log Aggregator
Now we have enabled Elastic Search for log storage and Kibana to view the log details. We will go ahead and implement Fluent Bit, which will act as our Log Collector.
We will be deploying Fluent Bit as a DaemonSet so that a Fluent Bit Pod runs on every available Node and collects the logs from all Pods running on that Node.
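A trimmed sketch of what the DaemonSet spec might look like is shown below; the ServiceAccount and ConfigMap names are assumptions, and /var/log is where container logs are typically available on a Node:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: efk-fluentbit
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit           # created in STEP 3 (name assumed)
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:latest        # pin a specific version in practice
        volumeMounts:
        - name: varlog
          mountPath: /var/log                  # read the container logs on the Node
        - name: config
          mountPath: /fluent-bit/etc/          # default config location of the image
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: config
        configMap:
          name: fluent-bit-config              # ConfigMap name assumed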
We will be using a ConfigMap for Fluent Bit; its main sections are described below, followed by a minimal sketch.
[SERVICE] — global settings for the Fluent Bit service, including which parser definitions we will be using. The parsing logic can be found in the appropriate entry under the [PARSER] section of the ConfigMap file.
[INPUT] — Location from where we need to fetch the log files.
Note: You need to modify this based on your application which you have deployed in the Cluster environment
[FILTER] — Filter the logs to get the desired data. In our case we are reading ALL (*) data
[OUTPUT] — Destination where we need to send the logs. In our case we will be sending it to Fluentd which will be our Log Aggregator.
Note:- You need to modify this entry based on your application which you have deployed in the Cluster environment
[PARSER] — How we need to parse the log data which we have got.
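Putting these sections together, one possible shape of the ConfigMap data is sketched below. The log path, tag, filter and parser are assumptions to be adapted to your application, while the Host value fluentd must match the Fluentd Service name created in STEP 8:
# fluent-bit.conf (one key of the ConfigMap)
[SERVICE]
    Flush         5
    Log_Level     info
    Parsers_File  parsers.conf

[INPUT]
    Name          tail
    Path          /var/log/containers/hello-world*.log   # adjust to your application's log files
    Parser        docker
    Tag           hello-world.*

[FILTER]
    Name          grep
    Match         *
    Regex         log .*        # keep all records; tighten the regex to filter the desired data

[OUTPUT]
    Name          forward
    Match         *
    Host          fluentd       # Fluentd Service name (STEP 8)
    Port          24224

# parsers.conf (a second key of the same ConfigMap, referenced above by Parsers_File)
[PARSER]
    Name          docker
    Format        json
    Time_Key      time
    Time_Format   %Y-%m-%dT%H:%M:%S.%L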
Execute the following commands to enable Fluent Bit.
kubectl apply -f fluent-bit-configmap.yaml
kubectl apply -f fluent-bit-daemonset.yaml
Execute the following command to verify that the Fluent Bit Pods are created and healthy.
kubectl get pod -n efk-fluentbit
Note: When the Fluent Bit service first runs we may see some errors in the Fluent Bit logs. This is because Fluentd is not yet set up, while Fluent Bit is already trying to forward logs to it.
STEP 8: Enable Fluentd
PURPOSE
* Install Fluentd
* Make use of a configuration file to
# Mention from where we are going to get the data
# Denote what kind of log data we are going to fetch and where it will be stored. In our case, we will send it to Elastic Search
We have now enabled Fluent Bit to collect the logs from the Nodes (the Pods within each Node) across the cluster and transport them to Fluentd. Now it is time to enable Fluentd, which will be our Log Aggregator and is responsible for storing the data in Elastic Search. In the Fluentd Deployment we will specify the details of the Elastic Search instance where the data needs to be stored in the env section. We will make use of the configuration file fluentdconfig, which we reference in the volumeMounts and volumes sections of the Deployment.
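The relevant fragment of the Fluentd Deployment mounts the fluentdconfig ConfigMap at the location where the Fluentd image reads its configuration. A sketch, in which the container name and image tag are assumptions:
containers:
- name: fluentd
  image: fluent/fluentd-kubernetes-daemonset:v1.16-debian-elasticsearch7-1  # image/tag assumed; use the one in the repository YAML
  ports:
  - containerPort: 24224              # forward port that Fluent Bit sends to
  volumeMounts:
  - name: fluentd-config
    mountPath: /fluentd/etc           # default config directory of the Fluentd image
volumes:
- name: fluentd-config
  configMap:
    name: fluentdconfig               # the ConfigMap described below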
The following are the primary tags we need to use in the ConfigMap.
The source block declares that the logs are forwarded from another resource, in our case Fluent Bit. In the match block, we read the logs which match the rule we set and pass them on to Elastic Search with the required prefix.
In the file we have mentioned two matches to read logs for two different applications, i.e. hello-world and hello-istio.
Note: You need to modify this based on your application which you have deployed in the Cluster environment
Please note that the Fluentd Service name is what we have provided as the Host value in the [OUTPUT] section of the Fluent Bit config file.
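A minimal sketch of what the fluentdconfig ConfigMap data could contain is shown below; the tag patterns must correspond to the tags assigned on the Fluent Bit side, and the Elastic Search host name elasticsearch is an assumption:
<source>
  @type forward                  # receive logs forwarded by Fluent Bit
  port 24224
  bind 0.0.0.0
</source>

<match hello-world.**>
  @type elasticsearch
  host elasticsearch             # Elastic Search Service name assumed
  port 9200
  logstash_format true
  logstash_prefix hello-world    # becomes the index prefix used in Kibana
</match>

<match hello-istio.**>
  @type elasticsearch
  host elasticsearch
  port 9200
  logstash_format true
  logstash_prefix hello-istio
</match>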
Run the following commands to implement Fluentd.
kubectl apply -f fluentd-configmap.yaml
kubectl apply -f fluentd-deployment.yaml
kubectl apply -f fluentd-service.yaml
Ensure that all required Deployments and Services are working fine
kubectl get all -n efk-fluentbit
If we don’t see any errors, then we are all set with the setup.
STEP 9: Access Logs In The Kibana UI
Access the Kibana UI to check whether you are able to get the desired logs.
Please make sure that you create a proper index pattern in the Kibana UI console. You will be able to create the index pattern based on the logstash_prefix which we have mentioned in our Fluentd ConfigMap configuration file. In our example the index pattern will be based on “hello-world*”.
Note: There could be a slight delay before the log data appears in the Kibana console. Please allow some time and refresh the screen to see the latest logs.
Conclusion
We have successfully enabled the EFFK stack in a Cluster environment. I trust this article has given you a good understanding of how to enable Fluent Bit along with Fluentd as a centralized logging mechanism. You will need to add the necessary Roles, security features and a proper storage mechanism while enabling this in higher environments.