Once you’ve containerized your application and distributed it across multiple containers and compute nodes, you realize you have a problem: how do you monitor your application logs in a reasonable way? Let’s take a look at logging services and how they can help.
There’s a lot of competition in this space. In the past we’ve always used either Splunk or Logstash for log management. Both are on-premises applications, which is great if you have a high volume (GBs) of logs being generated by devices and applications in your own data center.
The case we’re looking at today is exactly the opposite. Our app is in the cloud, and it generates a relatively small amount of logging information (tens of MB per day versus tens of GB per day). Our case needs something cheap (free?) and easy that lets a few developers see what’s going on with our application.
They will have three primary use cases:
- Notification of an error (like a stack trace) through email or some other notification service like Slack.
- Live debugging where we want to see live updates from all application logs (Live Tail).
- Forensic debugging where we search logs for past errors or application usage.
We also want it to be easy to collect information from our containers. More on this later.
We’re going to take a look at three services: Loggly, Logentries, and Sumo Logic (if you have others to recommend, please leave a comment). Looking at their pricing pages, they all have free options with enough daily volume and retention to meet our requirements. We only have a few developers in our example use case, so individual logins aren’t really important (although as team size grows this will change).
Here’s a feature matrix breakdown of the use cases for the free versions of the three products.
Since Sumo has all the features to support our three use cases, we’re going to dive deeper into it.
Getting data into the logging service turned out to be pretty interesting. When we started, we thought that sending logs directly from the application would be the way to go. It turned out that all three companies provide a container that reads from Docker’s /var/run/docker.sock, and because our app is made up of four different containers, this was a great option.
We’re currently using docker-compose to run our containers, so adding a container to the docker-compose.yaml file was all we needed to do. Here’s an article on how to set up the Logentries Docker container using docker-compose. For Sumo this looks like the following:
```yaml
sumologic:
  image: 'sumologic/collector:latest'
  restart: always
  environment:
    - SUMO_ACCESS_ID=MYACCESSID
    - SUMO_ACCESS_KEY=MYACCESSKEY
  volumes:
    - '/var/run/docker.sock:/var/run/docker.sock'
```
The Docker Hub repo for the Sumo collector has more configuration options.
On the service side, this automatically creates a collector configuration at startup and starts feeding logs from all running containers on the host to Sumo. The default setup includes Docker stats, which don’t seem very useful in the UI format that Sumo provides, so we deleted that data source. We also enabled multi-line processing, since our example app produces multi-line stack traces.
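If you’d rather not make those tweaks by hand in the UI, Sumo collectors can also read source definitions from a local JSON file. The sketch below is based on Sumo’s documented sources.json format, but the exact field names and how you point the containerized collector at the file can vary by collector version, so treat it as an illustration rather than a drop-in config:

```json
{
  "api.version": "v1",
  "sources": [
    {
      "sourceType": "DockerLog",
      "name": "docker-container-logs",
      "uri": "unix:///var/run/docker.sock",
      "allContainers": true,
      "multilineProcessingEnabled": true,
      "useAutolineMatching": true
    }
  ]
}
```

Defining the source this way keeps the multi-line setting in version control alongside the docker-compose.yaml, instead of living only in the service UI.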
We were surprised at how many bumps in the road there were getting basic container monitoring set up. Log management has always been a pain, but the UIs of all three products should be much more mature given their age. Logentries’ visualization of the data coming out of its container was a bunch of JSON junk, with the actual log line buried as a value inside the JSON data. Sumo does a better job in the UI, but the container setup required some manual configuration in the UI that might be better done in the container’s runtime configuration (multi-line setup and stats collection control). We didn’t get around to setting up Loggly, since it doesn’t meet the three use cases and doesn’t seem to support multi-line logs (their system is built around syslog, yuck).
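To give a sense of the JSON-junk problem: each log line arrives wrapped in container metadata, and the actual message is just one value inside it. The field names below ("name", "log", "stream") are illustrative, not Logentries' exact schema, but unwrapping it yourself looks roughly like this:

```python
import json

# A wrapped log event as it might come out of a container log collector.
# Field names here are assumptions for illustration, not a real schema.
raw = '{"name": "webapp_1", "log": "ERROR: connection refused", "stream": "stderr"}'

event = json.loads(raw)

# Pull the human-readable line back out of the wrapper.
line = f'{event["name"]}: {event["log"]}'
print(line)
# prints: webapp_1: ERROR: connection refused
```

This is exactly the kind of unwrapping you want the service's UI to do for you, which is why the raw-JSON presentation felt so unfinished.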
Anyway, for now Sumo wins and we'll keep moving forward with it unless we run into problems.
That's it for now. In another post we'll set up notifications from Sumo to Slack (which looks harder than it should be).