Left alone, a log file grows without limit. We use logrotate to prevent that, because running cat and other tools over SSH on huge files consumes memory and time, and makes it hard to do what we want. logrotate also compresses old files. Here are the ways to join/merge multiple log files for big data analysis, store them in OpenStack-based cloud storage, and delete old files.
Join/Merge Multiple Log Files For Big Data Analysis
To test this method, go to /var/log/ and then to your web server's directory, such as /var/log/nginx or /var/log/apache2. There will be access.log files. To list the filenames sorted by modification time, newest first, run this command:
ls -t
Normally we can run cat to concatenate files in this fashion:
cat access.log access.log.1 >> all_access.log
But the files older than access.log.1 are compressed, which we can see by running this command:
ls -t | grep access
It will give this kind of output:
access.log
access.log.1
access.log.2.gz
access.log.3.gz
access.log.4.gz
access.log.5.gz
access.log.6.gz
access.log.7.gz
access.log.8.gz
access.log.9.gz
access.log.10.gz
access.log.11.gz
access.log.12.gz
access.log.13.gz
access.log.14.gz
other_vhosts_access.log
To make it practical, we will create a directory named access_combo:
mkdir /var/log/apache2/access_combo
Then move all the files to that directory:
mv /var/log/apache2/access.log* /var/log/apache2/access_combo
Copy the current access.log back, set its ownership, empty it, and restart the web server:
cp /var/log/apache2/access_combo/access.log /var/log/apache2/access.log
chown www-data:www-data /var/log/apache2/access.log
: > /var/log/apache2/access.log
systemctl restart apache2
The : > /var/log/apache2/access.log redirection truncates the file to zero bytes.
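The emptying step can be wrapped and checked. This is a minimal sketch, and empty_file is a hypothetical helper name, not something from the steps above. It relies on the fact that redirecting the no-op command : into a file truncates it, whereas echo " " would leave a space and a newline behind.

```shell
#!/bin/bash
# empty_file FILE: truncate FILE to zero bytes.
# The file is not recreated, so its owner and permissions stay as they were.
# (Hypothetical helper name, used here only for illustration.)
empty_file() {
    : > "$1"
}
```

After running empty_file on a log, test -s on that file should fail, since -s is true only for files larger than zero bytes.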
For the compressed files, we can use zcat instead of cat, like this:
zcat access.log.2.gz access.log.3.gz >> all_access.log
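When plain and compressed logs sit side by side, one command can read both. The sketch below assumes GNU gzip, where zcat is equivalent to gzip -cd and the -f flag passes non-gzip input through unchanged. The function name cat_logs is made up for this example.

```shell
#!/bin/bash
# cat_logs FILE...: print every given log to stdout, compressed or not.
# Assumes GNU gzip: with -c and -f, input that is not in gzip format
# is copied to stdout unchanged.
cat_logs() {
    gzip -cdf -- "$@"
}
```

For example, cat_logs access.log access.log.1 access.log.2.gz >> all_access.log would append all three files without compressing the first two beforehand.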
As only two files are uncompressed in our case, we can compress them with gzip, which replaces each original with a .gz file (tar would instead wrap the log in an archive, which zcat cannot turn back into plain log text):
cd /var/log/apache2/access_combo/
gzip access.log
gzip access.log.1
Now if you run ls -t, you'll get a sorted list of files that all share the same format:
cd /var/log/apache2/access_combo/
ls -t
# output
access.log.1.gz  access.log.4.gz  access.log.8.gz   access.log.12.gz
access.log.gz    access.log.5.gz  access.log.9.gz   access.log.13.gz
access.log.2.gz  access.log.6.gz  access.log.10.gz  access.log.14.gz
access.log.3.gz  access.log.7.gz  access.log.11.gz
We can simply merge them into a file with a date stamp:
cd /var/log/apache2/access_combo/
zcat access.log.* > merged_access_$(date +"%d-%m-%Y").log
The command will take some time. However, I suggest doing this step manually, as you may need to check the dates of each file: access.log.1.gz is not the newest, and access.log.14.gz is the oldest in our case. Use > for the first file and >> for the rest, so that each file is appended instead of overwriting the previous one:
cd /var/log/apache2/access_combo/
zcat access.log.gz > merged_access.log
zcat access.log.1.gz >> merged_access.log
zcat access.log.2.gz >> merged_access.log
...
...
zcat access.log.14.gz >> merged_access.log
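The manual sequence can also be sketched as a loop that orders the files by rotation number rather than by name, since a plain glob would place access.log.10.gz before access.log.2.gz. This is only an illustration under assumptions: merge_rotated is a hypothetical function name, the files are assumed to follow the access.log, access.log.N, access.log.N.gz naming shown earlier, and gzip -cdf is used so that uncompressed files are read too.

```shell
#!/bin/bash
# merge_rotated DIR OUT: concatenate DIR/access.log* into OUT in order of
# rotation number (access.log first, then .1, .2, ... .14), the same order
# as the manual commands. Hypothetical helper for illustration only.
merge_rotated() {
    local dir=$1 out=$2 f n tab
    tab=$(printf '\t')
    : > "$out"                      # start from an empty output file
    for f in "$dir"/access.log*; do
        [ -e "$f" ] || continue     # skip the literal pattern if nothing matches
        n=${f##*/access.log}        # "", ".gz", ".1", ".2.gz", ...
        n=${n%.gz}                  # drop a trailing .gz
        n=${n#.}                    # drop the leading dot
        printf '%s\t%s\n' "${n:-0}" "$f"   # access.log itself counts as 0
    done | sort -t "$tab" -k1,1n | cut -f2- |
    while IFS= read -r f; do
        gzip -cdf -- "$f" >> "$out" # decompress, or copy as-is if plain
    done
}
```

Calling merge_rotated /var/log/apache2/access_combo merged_access.log would write the merged file to the current directory. Changing -k1,1n to -k1,1nr reverses the order, so the oldest entries come first.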
We can rename it with a date stamp:
mv merged_access.log merged_access_$(date +"%d-%m-%Y").log
Now we do not need the old compressed files:
cd /var/log/apache2/access_combo/
rm access.log.*
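Because the rm step is irreversible, it may be worth confirming that nothing was lost in the merge before running it. The following sketch compares line counts. verify_merge is a hypothetical name, and the check assumes the merged file was built from exactly the access.log* files still present in the directory.

```shell
#!/bin/bash
# verify_merge DIR MERGED: succeed only if MERGED has as many lines as all
# DIR/access.log* files combined. Hypothetical helper for illustration only.
verify_merge() {
    local dir=$1 merged=$2 expected actual
    # Total lines across all rotated logs, compressed or plain
    expected=$(gzip -cdf -- "$dir"/access.log* | wc -l)
    # Lines in the merged result
    actual=$(wc -l < "$merged")
    [ "$expected" -eq "$actual" ]
}
```

Used as verify_merge . merged_access_01-01-2020.log && rm access.log.* (the date in the filename is just an example), the deletion only runs when the counts match.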
All of this looks like a lot of commands, but we can create a small bash script to automate everything except the last step, the merge:
#!/bin/bash
# Move the rotated logs aside
mkdir /var/log/apache2/access_combo
mv /var/log/apache2/access.log* /var/log/apache2/access_combo
# Recreate an empty access.log for the web server
cp /var/log/apache2/access_combo/access.log /var/log/apache2/access.log
chown www-data:www-data /var/log/apache2/access.log
: > /var/log/apache2/access.log
systemctl restart apache2
# Compress the two plain files; gzip replaces each original with a .gz file
gzip /var/log/apache2/access_combo/access.log
gzip /var/log/apache2/access_combo/access.log.1
echo "Ready to merge the files:"
ls -t /var/log/apache2/access_combo