Logs are everywhere. We need them to understand what a computer system is doing, to investigate the root cause of issues, or to perform forensics. There is a multitude of log types: event logs, debugging logs, transaction logs, audit logs, anti-virus logs, network logs, and many others.
logs: automatically produced and timestamped, human-readable documentation of events relevant to a particular computer system
Pretty much every component of a computer system produces logs -- and in many cases, the logs are made human readable so that they are easier to use. The old-school way to work with logs was to log in to the server and watch the incoming logs with a tool like tail, which could be time consuming. The more common approach today is to use a central logging solution that collects data from multiple systems and provides a dashboard view, often a more comprehensive one, of what's happening across the environment.
Human readability != machine friendliness
While human readability does make logs easier to use from the end-user perspective, it becomes a limitation when performing deeper analysis, such as aggregating on certain log properties or drilling down on a particular topic. A human-readable format is generally not machine friendly and therefore cannot be analyzed quickly.
Of course, you can simply index the logs, which enables lookups and views showing the frequency of log keywords. Indexing can provide interesting feedback, but sifting through the data to fully understand what's going on in the system can be cumbersome.
There are two solutions to that problem. The first is to use a structured log format. An example is the Microsoft Windows Event Log, where each log has a well-documented ID and a set of common keys.
The second is to provide a log parser that converts the human-readable part into a machine-friendly one, broken down into keys and values. Parsers exist for many popular formats (such as the Common Log Format), but there is still a long tail of less popular services (and custom applications) for which no parsers are available. It's possible to write one, though doing so is frequently a complex and time-consuming process.
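To make the idea of such a parser concrete, here is a minimal sketch of one for the Common Log Format, using a regular expression to break each human-readable line into key-values. The field names are illustrative and not tied to any particular product:

```python
import re

# Regex for the Common Log Format (CLF); group names are illustrative.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def parse_clf(line):
    """Convert one human-readable CLF line into machine-friendly key-values."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None

line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
print(parse_clf(line)['status'])  # '200'
```

A parser like this works well for one fixed format -- the difficulty is that every service in the long tail needs its own hand-crafted expression.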
This is the problem we are trying to solve at LogSense. Our team has a significant amount of experience in Natural Language Processing and Machine Learning -- and we've each had our fair share of frustrations with the log management solutions available today. We spent the last year working on ways to apply Natural Language Processing and Machine Learning to the data and came up with a novel, currently patent-pending approach to automatic, fast log parsing for unstructured data.
Understand how the logs are created
The first step to understanding our solution is to look at how computer logs are generated. Perhaps the most common way is this: when a developer sees the need to log an event, they create a template that describes the event, parametrized with the event's properties.
For example, in the Linux kernel logs below, the template describes that some process was blocked for more than a certain duration. It is parametrized with the process name, process ID, and the duration.
INFO: task bonnie++:28623 blocked for more than 120 seconds.
INFO: task nfsd:8232 blocked for more than 120 seconds.
INFO: task a.out:15872 blocked for more than 120 seconds.
INFO: task gcc:6341 blocked for more than 120 seconds.
As the examples show, the template keeps repeating and only the parameters change. This observation holds for many types of logs. Would it be possible to determine the template automatically by looking at log examples alone? With LogSense, the answer is yes.
Automatic pattern discovery
When logs are sent to LogSense, the system breaks each one into tokens and counts occurrences of individual tokens and token sequences. When sequences of tokens frequently appear together, they form a new pattern. The pattern is stored internally in a specialized trie structure, which allows very fast matching when processing logs. There are several caveats to this process; for example, certain types of tokens (such as timestamps or IP addresses) need to be detected automatically so the system can process them effectively. For the log from our example, the automatically discovered pattern might be the following:
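The core intuition can be illustrated with a deliberately simplified sketch: treat token positions that stay constant across examples as template text and varying positions as parameters. This is only a toy version under strong assumptions (whitespace tokenization, equal token counts per line) -- LogSense's actual trie-based, frequency-driven algorithm is more sophisticated:

```python
def discover_pattern(lines):
    """Toy pattern discovery: constant columns become template text,
    varying columns become numbered parameter slots."""
    token_lists = [line.split() for line in lines]
    columns = zip(*token_lists)  # assumes equal token counts, for simplicity
    pattern, param_no = [], 0
    for column in columns:
        if len(set(column)) == 1:      # token is constant -> template text
            pattern.append(column[0])
        else:                          # token varies -> parameter slot
            param_no += 1
            pattern.append(f"<PARAM {param_no}>")
    return " ".join(pattern)

logs = [
    "INFO: task bonnie++:28623 blocked for more than 120 seconds.",
    "INFO: task nfsd:8232 blocked for more than 120 seconds.",
    "INFO: task a.out:15872 blocked for more than 120 seconds.",
]
print(discover_pattern(logs))
# INFO: task <PARAM 1> blocked for more than 120 seconds.
```

Note that this toy version treats `process:pid` as a single parameter and leaves the duration as literal text, since it never varies in the samples; splitting compound tokens and recognizing special types like timestamps is exactly where the real system does more work.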
INFO: task <PARAM 1>:<PARAM 2> blocked for more than <PARAM 3> seconds.
Now, each time such a log is encountered, it is assigned the pattern_id, and its parameters are extracted as key-values, making it easy to build charts or tables from them or use them in drill-downs.
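Conceptually, the matching step behaves like the following sketch, where an incoming log is tested against a discovered pattern and its parameters come out as key-values. The pattern_id value and the param_N names here are purely illustrative:

```python
import re

PATTERN_ID = 42  # hypothetical internal id for the discovered pattern

# Regex equivalent of the discovered pattern, for illustration only.
PATTERN_RE = re.compile(
    r"INFO: task (?P<param_1>\S+):(?P<param_2>\d+) "
    r"blocked for more than (?P<param_3>\d+) seconds\."
)

def match_log(line):
    """Return the pattern_id plus extracted parameters, or None on no match."""
    m = PATTERN_RE.match(line)
    if m is None:
        return None
    return {"pattern_id": PATTERN_ID, **m.groupdict()}

print(match_log("INFO: task gcc:6341 blocked for more than 120 seconds."))
# {'pattern_id': 42, 'param_1': 'gcc', 'param_2': '6341', 'param_3': '120'}
```

Once a log is reduced to a pattern_id and a small dictionary of values, charting a parameter or drilling down on it becomes an ordinary structured-data query.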
Furthermore, an occurrence of a pattern_id can be interpreted as an occurrence of a previously seen, known event. The frequency of the event, or the mere fact of its occurrence, might indicate a problem or an anomaly.
Similarly, each parameter within the scope of a pattern_id describes a certain property of the event (e.g. duration when the process was blocked, name of the process, etc.). Both the pattern_id and the parameters provide a great input for analytics and anomaly detection.
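As a hedged illustration of how pattern_id frequencies can feed anomaly detection, the sketch below flags any pattern whose count in the current window far exceeds its historical average. The threshold factor and baseline values are arbitrary examples, not how any particular product scores anomalies:

```python
from collections import Counter

def unusual_patterns(window_ids, baseline_avg, factor=3.0):
    """Flag pattern_ids whose count in this window exceeds
    factor * their historical per-window average."""
    counts = Counter(window_ids)
    return {pid for pid, n in counts.items()
            if n > factor * baseline_avg.get(pid, 0.0)}

baseline = {42: 2.0, 7: 10.0}        # average occurrences per window (example)
window = [42] * 9 + [7] * 11         # pattern_ids seen in the current window
print(unusual_patterns(window, baseline))  # {42}
```

Here pattern 42 normally appears about twice per window, so nine occurrences get flagged, while eleven occurrences of pattern 7 are well within its usual rate.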
The generic names (param_1, param_2, etc.) can be replaced with user-friendly ones using the pattern editor. Also, since some parameters might contain a varying number of tokens (e.g. a name could be just a first name, or a first, middle, and last name), the pattern editor allows marking a parameter as a multi-token one.
Using the patterns
As a quick example of how automatically discovered patterns let us use logs to their full extent, consider the following Java Garbage Collection log:
The system discovered the pattern automatically, and that discovery allows us to create a chart on any of the parameters. In this example, we select the parameter responsible for total user time. With just one click, we get the following view:
This example required the user to define no parsing rules of any sort. The structure was found using our patent-pending pattern discovery process. The ability to automatically convert unstructured data into structured data is extremely helpful in building anomaly detection models.
If you're interested in seeing how automated pattern discovery can benefit your organization, we'd love to show you. Register for a demo or free trial today.