Remote Logging With Rocket Powered Rsyslog
A problematic scenario, sometimes caused as a result of an application’s logging misconfiguration, is when the “/var/log/messages” file fills up, meaning that the “/var” partition becomes full. Due to how systems work there are almost always times when your system’s logging causes unexpected issues. It’s therefore key that you understand and have the ability to control how your logging works and where your logs are saved.
Over the years when using Unix-like systems I’ve been exposed to three or four different versions of the Operating System’s default logging tool, known as “Syslog”. We will look at the logging daemon called “rsyslog” (http://www.rsyslog.com) a superfast Syslog product.
Before looking at the package itself we’ll start by exploring the configuration of your system logging locally. Then we will use that knowledge to go one step further and configure a remote Syslog server. We will also explore certain aspects which might cause you problems. With some background information in addition we will also improve your troubleshooting.
As mentioned we will look at rsyslog. We will focus on a long-established variety of Syslog, dating back to 2004, which has grown to become a favourite amongst Linux distributions. As a result rsyslog is now bundled as the default Syslog daemon on a number of the popular Unix-like flavours. It’s a substantial piece of software which is super fast, extensible and reportedly available on around ten Unix-like distributions. First an introduction into how your Linux system thinks of its logging at a basic level.
For years Sysadmins have debated at what level of detail to log their system data. There are a number of settings to affect the amount of logging detail which your server generates. We will look at how to configure the varying levels shortly. It’s a tradeoff between using up disk space too quickly versus not having enough information in your logs.
Let’s look at some of the detail settings which we can choose between. In Listing One we can see how the kernel ranks its errors, from zero to seven.
#define KERN_EMERG "<0>" /* system is unusable */
#define KERN_ALERT "<1>" /* action must be taken immediately */
#define KERN_CRIT "<2>" /* critical conditions */
#define KERN_ERR "<3>" /* error conditions */
#define KERN_WARNING "<4>" /* warning conditions */
#define KERN_NOTICE "<5>" /* normal but significant condition */
#define KERN_INFO "<6>" /* informational */
#define KERN_DEBUG "<7>" /* debug-level messages */
Listing One: How the system file “kernel.h” defines logging levels in some cases
The Rsyslog Basics
When first looking at a new package it is usually a good indication of how tricky that package might be to pick up from the syntax of its configuration. The following syntax is the type that “sysklogd” used in the past and the modern rsyslog uses too:
mail.err @ server.chrisbinnie.tld
Thankfully it is very simple once you understand the basics. The function of this logging example is also known as forwarding. The “info” logging mentioned (or in other words “informational errors”) is simply dropped locally into the file “/var/log/mail.log”. Whereas the “err”, or error condition logs, are sent off somewhere else to another server. In this case that server is called “server.chrisbinnie.tld”. Incidentally the “INFO” and “info” settings are not case-sensitive and its case is frequently interchanged for clarity within Syslog config.
The writers of the powerful Syslog software, rsyslog, (a gentleman called “Rainer Gerhards” apparently being the main author) refer to it as “The Rocket-fast System For Log Processing”.
This high performance package can seemingly manage a million messages per second (when asked to drop local logging events to disk if “limited processing” is applied)! That’s impressive by any measure.
The home page of the rsyslog package's website (http://www.rsyslog.com) includes a few of its features:
- TCP, SSL, TLS, RELP
- MySQL, PostgreSQL, Oracle and more
- Filter any part of syslog message
- Fully configurable output format
- Suitable for enterprise-class relay chains
That is exactly the functionality that you might look for in a best-of-breed Syslog daemon. Let usspend a moment now looking at the main config file.
Main Config File
Something to note, in case it causes you issues in the future, is that although the entries found in our main config file (“/etc/rsyslog.conf”) are read from the top down the order in which they are presented does in fact make a difference.
Run this command inside the “/etc” directory:
# ls rsys*
The main config file is called “rsyslog.conf” whereas the “rsyslog.d/” is the directory where you save your other configuration files. Looking in the “rsyslog.conf” file we use a type of “$IncludeConfig” statement to pick up any files with the “.conf” extension that reside in that directory as follows:# Include all config files in /etc/rsyslog.d/
These user-defined configs might include remote logging to a rsyslog server.
You can only include an individual file too or indeed a whole directory (no matter the file extensions), as follows:
What does our main config file look like inside though? Listing Two shows that the file includes lots of useful comments and the heart of our rsyslog config in addition. Bear in mind that this file defines local logging in most cases. However if you’re turning your local server into a recipient Syslog server too then you also add config to set that live there too. Note that after each config change you will need to restart the daemon as we will see shortly.# /etc/rsyslog.conf Configuration file for rsyslog.
# For more information see
#### MODULES ####
$ModLoad imuxsock # provides support for local system logging
$ModLoad imklog # provides kernel logging support
#$ModLoad immark # provides --MARK-- message capability
# provides UDP syslog reception
# provides TCP syslog reception
#### GLOBAL DIRECTIVES ####
# Use traditional timestamp format.
# To enable high precision timestamps, comment out the following line.
# Set the default permissions for all log files.
# Where to place spool and state files
# Include all config files in /etc/rsyslog.d/
#### RULES ####
# First some standard log files. Log by facility.
# Logging for the mail system. Split it up so that
# it is easy to write scripts to parse these files.
# Logging for INN news system.
# Some "catch-all" log files.
# Emergencies are sent to everybody logged in.
# I like to have messages displayed on the console, but only on a virtual
# console I usually leave idle.
# *.=notice;*.=warn /dev/tty8
# The named pipe /dev/xconsole is for the `xconsole' utility. To use it,
# you must invoke `xconsole' with the `-file' option:
# $ xconsole -file /dev/xconsole [...]
# NOTE: adjust the list below, or you'll go crazy if you have a reasonably
# busy site..
Listing Two:The default Debian “rsyslog.conf” is shown; other flavours vary, some more than others
Careful consideration has been programmed into this excellent software. It’s worth noting that there might be subtle config differences between versions so employ lookup differences online to compare notes between the older or newer syntax.
We’ll start by separating our view of rsyslog’s config into three parts. These parts would be any imported modules first of all, our config second and finally our rules.
First of all we will look at using modules. There’s a number of different types of modules but here’s an example to get you started. Simply think of the Input Modules as a way of collecting information from different sources. Output Modules are essentially how the logs are written, whether that be to files or a network socket. Another module type which can be used to filter the received message’s content is called a Parser Module.
As we can see from the top of our config file in Listing Two we are able to load up our default modules as so:
$ModLoad imuxsock # provides support for local system logging
$ModLoad imklog # provides kernel logging support
The comments are hopefully self-explanatory. The first module allows the dropping of logs to our local disks and if I’m reading the docs correctly the second module picks up events and drops them to “dmesg” after a system boot has completed and kernel logging has been taken over by the Syslog daemon.
The following commented-out line is for the “immark” module. This module can be very useful in some circumstances:
#$ModLoad immark # provides --MARK-- message capability
For example I’ve used it frequently when I’m filling the “/var/log/messages” file up with several entries a second whilst testing something. In addition to using the functionality in scripts I like to be able to type a Bash Alias super-quickly in the file “~/.bashrc” during my testing:
alias mes=’/usr/bin/logger xxxxxxxxx’
If you add that alias then you can simply type “mes” at the command prompt, as your user, to add a separator in the “messages” file. If you haven’t altered your “.bashrc” file in the past then after changing it you need to do this to refresh it.
# cd ~
# . .bashrc
I’m not sure but I suspect that the “--MARK--” separators, alluded to in the comment after the module’s config entry, were first introduced to add a line to a log file to show you that Syslog was still running if there has been no logging entries present for a little while.
You could add the markers to your logs every 20 minutes for example if your logs are quiet (using this entry “$MarkMessagePeriod 1200”).
I imagine too that it might be useful functionality if you have rotated your logs in the middle of the night and later on need to see that Syslog was still paying attention to the task in hand shortly after that point in time.
We can see the other modules are commented out. We’ll briefly mention modules again a little later but let’s continue on through our config file in the meantime.
The Global Directives section in Listing Two isn’t too alien hopefully. Have a look at this for example, the top entry:
# Use default timestamp format
Directives start with a dollar sign, as a variable, and then have an associated property. From that entry can see that we’re still wearing 1970s flared trousers and opting to go traditional with the format of the our of logging.
The “permissions” entries in amongst that config probably isn’t too tricky to translate either:
When rsyslog runs we can alter who owns what and which file-creation mask is used. The working directory and “$IncludeConfig” entries are hopefully easy enough to follow so let’s keep moving forwards.
I'm A Lumberjack
Now for the juicy stuff as we get our hands a bit dirtier with some logfile rules. Listing Three shows us the rules included by default with rsyslog on my Debbie-and-Ian machine:
Listing Three: The standard rules included on the Debian Jessie Operating System
Having covered the syntax earlier hopefully there’s no nasty surprises in Listing Three. If you wanted to add lots of content to one log file in particular (the following example is from a Red Hat box) then you would separate entries as so:
As you can see we’re throwing a fair amount at the “messages” log file in the example above. Each entry, let’s use “mail.none” as our example, follows a “facility.priority” format.
In other words in the Red Hat example above for the “mail” facility the config “mail.none” speaks volumes whereas conversely in order to capture “all” mail logs would be “mail.*” as seen in Listing Three.
The “none” may be merrily be replaced with any of the 0-7 error codes shown in Listing One, such as “INFO”.
The docs talk about both the “facility” and the “priority” being case-insensitive and how they can also receive decimal numbers for arguments. Take note from the manual however, that’s generally a bad idea: “but don’t do that, you have been warned”.
And, news just in (not really), is that the documentation is explicit about the “priority” settings “error”, “warn” and “panic” no longer being used as they are deprecated. Note that this is not visible in other docs that I have read so is likely to apply to newer versions I suspect.
One final point would be on the way that rsyslog deals with its error levels (a reminder of what we saw in Listing One and also to take heed that some of those are now deprecated in newer versions). The manual is typically very helpful in its order of “priority” and discusses them as is displayed in Listing Four.emerg (panic may also be used)
error (err may also be used)
warn (warning may also be used)
Listing Four: Preferred rsyslog error levels with those now deprecated struck-through (version v8-stable as of writing)
Onwards we cheerily go. From a “facility” perspective you can use the options as displayed in Listing Five.auth
security (equivalent to “auth”)
local0 ... local7
Listing Five: Available options for the “facility” setting, abbreviated and missing “local1” to “local6”
With your newfound knowledge I’m sure that it goes without saying that if you see any asterisks mentioned then it simply means that “all” of the available “facility” options or all of the “priority” options are included.
Note the configurable “local” settings from zero to seven missing from the abbreviated content in Listing Five. This brings us nicely onto our next section, namely how to configure a remote rsyslog server
Ladies & Gentlemen
Hopefully you’ll agree that the above configs are all relatively easy to follow. What about setting your logs live so that they are recorded onto a remote rsyslog server? If you’re sitting comfortably, here’s how to do just that.
Firstly let’s think about a few things.
Consider how busy your logs are. If you’re simply pushing a few errors (because of an attack or a full drive) over to your remote syslog server then your network isn’t going to be under much pressure. Imagine however a very busy Web server and you’re going to want to analyse the hits that it receives, using something like the logging analysis tool “Kibana”, via “Elasticsearch”, for example. That busy server might be pushing any number of uncompressed Gigabytes of data across your Local Area Network (LAN) and it’s important to bear in mind on a popular website that these hits are 24/7.
In such a scenario it is clearly key that your logs are all received without fail, to ensure the integrity of your log analysis. The challenge is that the logs grow continually, unremittingly and are generated every second of every day as visitors move around your website.
There’s also a pretty serious gotcha in relation to the rotation of logs (which there may well be a way of circumventing that am I yet to discover on the version (v5.8.10) of Syslog I was using). When you’re creating hillock-sized logs, the sizes can grow so large that you feel like you might begin to encroach on your nearest landfill site. As a result at some point your disks start to creak at the seams (no matter how big they are) and you have to slice up your logs and preferably compress them in addition.
One of the most popular tools for rotating logs is the truly excellent “logrotate” (http://www.linux-magazine.com/Issues/2015/172/Logrotate) of which I’m a big fan. The clever “logrotate” is well-built, feature filled and most importantly highly reliable (logs are valuable commodities after all; especially for forensic analysis following an attack or after an expensive Web infrastructure investment to ensure that the bang-for-buck ratio is satisfactory).
The gotcha, which I referred to a moment ago, surfaces in a fairly simple guise. When a log is rotated the usually-reliable rsyslog stops logging at the remote server end, even though the local logs continue to grow. It looks like some people have had problems on other distributions: https://bugs.launchpad.net/ubuntu/+source/rsyslog/+bug/940030
When faced with such a pickle, from what I could see at least, there simply weren’t config options to provide a workaround (even having tried different “Polling” configs and “$InputFilePersistStateInterval” workarounds; these might make more sense shortly). However, and I hold my hands up, it’s quite possible that I may have missed something. In my defence I was stuck with an older version that couldn’t be upgraded (it’s a long story) and possibly that made a difference. Before we see the solution which I opted for let’s look at how to create the remote logging config.
Remember the directory which we looked at in addition to the config file? I’m referring to the “/etc/rsyslog.d” directory.
Well, that’s where we insert our remote rsyslog server config. We dutifully create a file called something like “www-syslog.chrisbinnie.tld.conf” to refer to our logging server’s hostname, appending a “.conf” on the end and a “www-” for the service being in question which is logged. I’m using the hostname as an example in case your sanity is truly questionable and you want to push different application logs off to various servers. This naming convention should serve you well if so.
Have a look at Listing Six which shows the entirety of our file “/etc/rsyslog.d/www-rsyslog.chrisbinnie.tld.conf”.
Listing Six: Contents of our remote application’s config using the trusty “local3” within the file “/etc/rsyslog.d/www-rsyslog.chrisbinnie.tld.conf”
The key lines to focus on in Listing Six, starting from the top, are as follows.
We need to load up a module using “$ModLoad”. Step forward the outstanding “imfile” module which has the magical ability to convert any normal text content into a rsyslog message. The manual says that it will gratefully consume any printable characters which have a Line Feed (LF) at the end of each line to break up the otherwise monotonous content. Pretty clever I’m sure you’ll agree.
The next important line is obvious. The line starting “$InputFileName” tells rsyslog which log file that you’re interested in sending off to your remote logging server. The following line helps classify the log type with a “Tag” (which if you have multiple servers of the same application type sending logs to one remote server you might alter slightly per server to “apache-apache-ww1:” etc). Ignore the “$InputFileStateFile” log file for now and skim through the remaining lines.
We are collecting an “info” level of logging detail and pushing that out to the user-configurable “local3” facility and onto the IP Address “10.10.1.3”. The emoji that you can see, the two “@” signs, stands for “TCP”, only one “@” sign would signify transfers via the “UDP” networking protocol.
Come In Tokyo
What about our remote rsyslog server’s config?
Have a quick look at the two lines below which will sit in your remote rsyslog server’s “/etc/rsyslog.conf” file. For the sake of clarity this is the recipient rsyslog server, the one that’ll receive all the logs from multiple places. Unless you’ve lost control of your senses these two lines are easy to follow:
Incidentally if an IP Address is omitted (it can be explictly stated using something like “$TCPServerAddress 10.10.10.10”) then rsyslog will attempt to open up all IP Addresses on the port in question.
You might be pleasantly surprised at how easy it is to finish off the remote rsyslog server config. We use something called “templates” in rsyslog. They are powerful, extensible and worth a read about in more detail: http://www.rsyslog.com/doc/v8-stable/configuration/templates.html
At the foot of our “/etc/rsyslog.conf” file we simply add these lines:
$template ChrisBinnie, "/var/log/%HOSTNAME%/%PROGRAMNAME%.log"
I’ve just used an arbitrary template reference in this case, my name, for ease of distinction. You will need to restart the service after making these changes.
We can see that we’re going to drop our logs into a directory off of “/var/log” which has a hostname and then the application name. Appended to the end “.log” makes sense of the resulting filename. You can see which facilities and priorities are being added to this log file. In this case “all” and “all” thanks to the asterisks.
TCP or Not to TCP?
You might need to spend a second thinking back over some of the remote logging considerations I mentioned, such as network congestion if you’re pushing lots of logging data across your LAN. I mention this because (as we’ve seen with the emoji above) the clever rsyslog can use both TCP and UDP for pushing your logs around. TCP being the option which is best suited to most scenarios thanks to its ability to error check against network failures. It also doesn’t require an additional plugin to be loaded up as it’s a built into rsyslog and the reverse is true for the “UDP” protocol.
There’s two minor connection points to note however. Firstly, avoid using hostnames via DNS. Use an IP Address for greater reliability (CNAMEs being somewhere in the middle if you change machines around every now and again). Secondly, as with all things that might need debugged at some point, you should try and use explicit port numbers on both server and client ends so that there’s no ambiguity introduced. Incidentally without a port being explicitly specified both protocols default to “514”.
Little Dutch Boy
A reminder, if you’re using properly configured systems, that you might need to punch a hole in the dyke. Looking at the example below, clearly you need to alter the port number after “--dport” to your port number of choice. You can then save the setting to make it persistent with something like “/sbin/service iptables save” or whatever your distribution prefers.
If you need to allow access using the default TCP option and you’re using “iptables” you can use this command:
# iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 514 -j ACCEPT
And, for good, old UDP:
# iptables -A INPUT -p udp -m state --state NEW -m udp --dport 514 -j ACCEPT
Back to the dreaded gotcha. So that we’re clear: it’s on the client or sender machine we make a little fix to our log rotation bug/problem. Not the remote rsyslog server.
At the risk of repeating myself there may well be better ways of fixing the log rotation problem (with a version upgrade for example) however read on to see what fixed it for me. A reminder of the issue: after “logrotate” had run the remote logging stopped. The solution was a simple script set to run via a cron job after “logrotate” had run in the middle of the night.
There is some method to the madness of my script. By running it we effectively insist that after a log rotation has taken place the faithful rsyslog pays attention to its “state” file. Let’s peek into an example of such a file now:
Listing Seven: Our rsyslog “state” file example
We shouldn’t profess to understanding what’s going on in Listing Seven, I would hazard a guess that rsyslog is counting lines it’s processed amongst other things to keep it operating correctly. Let’s promptly move onto Listing Eight. This is the cron job and script solution that fixed the issue for me.#!/bin/bash
# Deletes stale rsyslog “state” files, appends a timestamp to the new filename in /tmp & restarts rsyslog.
# After rsyslog restart the remote logging node should catch up any missed logs fairly quickly.
# Declare var
# Delete all “state” files in /tmp
/bin/rm -f /tmp/apache_state_file*
# Edit rsyslog file which sends data remotely in order to show newly named “state” file
/bin/sed -i.bak 's/apache_state_file-[0-9]*/apache_state_file-'"$timestamp"'/g' /etc/rsyslog.d/www-syslog.chrisbinnie.tld.conf
# Apply changes to “state” file in use
/sbin/service rsyslog restart
Listing Eight: Our quick-fix script to get log rotations to speak to rsyslog satisfactorily after “logrotate” has finished doing its business
If you read the comments at the top of the script then all going well the scripts raison d’etre should make sense. This script is run sometime around 5am after the excellent “logrotate” has finished its sometimes lengthy business. There’s no point in running the script during or before “logrotate” has finished its run (during testing my remote logging still failed).
The small script works like this (you probably need to run it twice initially to clean up “/tmp” filenames which should be harmless if you do it manually). It deletes the old “state” file upon which rsyslog relies, works out the current time and appends that time to the end of the “state” filename.
As a result the “/tmp” directory ends up having one or two files that look like this “apache_state_file-1321009871”. It then backups the existing remote logging config file and changes the “state” file that it references. Finally a super-quick service restart means that the remote logging data starts up again and the other end (the remote rsyslog server) catches up with any missed logs in a whizz-pop-bang if there’s lots of data.
My experience is that if you tail the recipient log after running this script (or just performing a restart) you’ll see the catch-up taking place super-speedily.
In case you’re wondering I found that sometimes a service restart didn’t pick up the pieces properly but altering the “state” file it referenced was successful without fail. Your mileage might vary of course.
As mentioned I’m a little suspicious of an old version issue that I needed to fix with this script. In the current version you can see there is some information about the current version’s “state” file here: http://www.rsyslog.com/doc/v8-stable/configuration/modules/imfile.html#state-files
Hopefully my solution might give you some pointers and help you out if you encounter a similar scenario however.
I Can't Hear You
If your, already exceptionally noisy, log file “/var/log/messages” begins to fill up with your application’s logs too then here’s another little life-saver. The workaround is simple, you just need a service restart, after applying this “;local3.none“ addition to the relevant destination log file line in “/etc/rsyslog.conf”:
No doubt you get the idea, this disables “local3” logging locally.
End Of Watch
I should have probably put more emphasis on the how configurable and extensible rsyslog is at the start. It would be remiss not to point you at the extensive, well-maintained, documentation which is all linked to from the application’s main website (http://www.rsyslog.com). There’s modules and templates galore for you to explore in high levels of detail. And, as well as user examples on the main site there’s an excellent Wiki with some trickier configuration examples: http://wiki.rsyslog.com/index.php/Configuration_Samples. Also if you’re eager for even more then there’s a list of modules:
Now that you’ve been presented with the solution to a working remote rsyslog server, even with the challenges that log rotation throws into the mix, I hope you’ll think more about your logging infrastructure. I had originally used a Syslog server, back in the day, to pick up salient events from Cisco routers. So universal are the logging formats supported by rsyslog that you can also connect them to all sorts of devices such as Load Balancers and other proprietary devices. You are far from being limited to picking up logs from other Unix-type devices but instead are blessed with the ability to collect logging event information from all over the place.
I hope you enjoy putting your new logging knowledge to creative and productive use.
If you've enjoyed reading the technical content on this site then please have a look at my Linux books which were both published in 2016 and some of my articles in Linux Magazine and Admin Magazine are available on their respective websites.