These type checks were not doing anything, which was surfaced by a `go vet` run on the more recent
Go version. errors.As with a second argument of type *error returns true for any non-nil error.
I played a little bit with trying to get a working typecheck in this test, but it didn't really work
because of issues with the fork github.com/Clever/syslogparser. Since that fork never updated its
internal references to point to itself instead of the upstream, the errors it returns are already
inconsistent - sometimes they are Clever/syslogparser.ParseError and sometimes
jeromer/syslogparser.ParseError. Fixing the underlying problem is out of scope for right now.
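For illustration, here is a minimal sketch of the difference between a meaningful errors.As check and the ineffective one. ParseError and otherError are hypothetical stand-ins defined here, not the syslogparser types, and the interface{} indirection exists only to keep the demonstration from tripping the same vet check.

```go
package main

import (
	"errors"
	"fmt"
)

// ParseError is a hypothetical stand-in for a parser's error type.
type ParseError struct{ Msg string }

func (e *ParseError) Error() string { return e.Msg }

// otherError is an unrelated error type, used for contrast.
type otherError struct{}

func (otherError) Error() string { return "some other failure" }

// IsParseError reports whether err is, or wraps, a *ParseError.
func IsParseError(err error) bool {
	var pe *ParseError
	return errors.As(err, &pe)
}

// MatchesAnyError mimics the ineffective check: the target is a *error,
// which every non-nil error can be assigned to. The pointer is passed
// through an interface{} variable only for the sake of this demonstration.
func MatchesAnyError(err error) bool {
	var target error
	var asAny interface{} = &target
	return errors.As(err, asAny)
}

func main() {
	wrapped := fmt.Errorf("parsing line: %w", &ParseError{Msg: "bad priority"})
	fmt.Println(IsParseError(wrapped), IsParseError(otherError{}))
	fmt.Println(MatchesAnyError(wrapped), MatchesAnyError(otherError{}))
}
```

The typed target only matches errors of that exact type, which is also why a check against one fork's ParseError would miss the other fork's.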
Adds a lowest-common-denominator function, KPLDeaggregate, for
handling records that might be KPL aggregated. Also adds a function,
DeaggregateAndSplitIfNecessary, to wrap the existing functionality of
SplitMessageIfNecessary with KPL deaggregation.
These functions are handy for non-KCL consumers, like Lambda
functions. KCL automatically applies deaggregation for you.
This change is backwards compatible - the previously exposed function
SplitMessageIfNecessary still does the same things.
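As a rough sketch of what "handling records that might be KPL aggregated" can look like: KPL aggregated records begin with a 4-byte magic number and end with a 16-byte MD5 checksum, so a consumer can detect them and pass anything else through untouched. The function names and the injected decode callback below are hypothetical and are not the signatures of KPLDeaggregate or DeaggregateAndSplitIfNecessary.

```go
package main

import (
	"bytes"
	"fmt"
)

// kplMagic is the 4-byte prefix that marks a KPL aggregated record.
var kplMagic = []byte{0xF3, 0x89, 0x9A, 0xC2}

// md5Size is the length of the checksum that trails an aggregated record.
const md5Size = 16

// looksKPLAggregated is a hypothetical helper: it only checks the framing
// (magic prefix plus room for a checksum), not the protobuf payload.
func looksKPLAggregated(data []byte) bool {
	return len(data) > len(kplMagic)+md5Size && bytes.HasPrefix(data, kplMagic)
}

// deaggregateOrPassThrough is a hypothetical wrapper: records that are not
// aggregated come back as a single-element slice, everything else is handed
// to the supplied decode function.
func deaggregateOrPassThrough(data []byte, decode func([]byte) ([][]byte, error)) ([][]byte, error) {
	if !looksKPLAggregated(data) {
		return [][]byte{data}, nil
	}
	return decode(data)
}

func main() {
	// A stub decoder standing in for real protobuf decoding.
	stub := func(b []byte) ([][]byte, error) { return [][]byte{b}, nil }

	out, err := deaggregateOrPassThrough([]byte("a plain record"), stub)
	fmt.Println(len(out), err)
}
```

A Lambda consumer would call something of this shape on each Kinesis record before splitting, since nothing deaggregates on its behalf.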
Being in the batchconsumer package means it will work for anything
using KCL, but lambdas that subscribe to these log streams do not use
batchconsumer at all; instead they invoke the splitter package
directly. As such, if we want this functionality to be available to
lambda log consumers, it can't be in batchconsumer.
There are no functionality changes here, just moving code from an
unexported method in one place to an exported function in another
place. The tests also get moved along with it.
ParseAndEnhance used to be:
- Try to parse line as a syslog, extracting the log itself and other
fields from syslog format
- If that succeeds, try to parse the log as either a Kayvee log or
an RDS slow query log.
- Combine all these fields, and add on some "derived"
fields (container_task|env|app).
- Not a syslog => error
Now it will be:
- Try to parse line as a syslog, same as before, including the
Kayvee/RDS part
- If syslog parsing failed, try to parse as a Fluent log and extract
some fields from the Fluent format (the log, timestamp, etc)
- If that succeeds, try to parse the log itself as a Kayvee log.
- Combine Kayvee fields (if found) and derived fields
- If BOTH formats fail, it is an error.
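The new control flow can be sketched roughly as follows. This is not the actual ParseAndEnhance code: the parser type, parseWithFallback, and the two stub parsers are hypothetical, and the stubs only look at the first character of the line.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// fields is a simplified stand-in for the decoded log fields.
type fields map[string]string

// parser is a hypothetical signature shared by both format parsers.
type parser func(line string) (fields, error)

// parseWithFallback tries syslog first, then Fluent, and only returns an
// error when both formats fail.
func parseWithFallback(line string, syslog, fluent parser) (fields, error) {
	out, syslogErr := syslog(line)
	if syslogErr == nil {
		return out, nil
	}
	out, fluentErr := fluent(line)
	if fluentErr == nil {
		return out, nil
	}
	return nil, fmt.Errorf("not syslog (%v) and not fluent (%v)", syslogErr, fluentErr)
}

// stubSyslog is a toy parser that accepts lines starting with "<".
func stubSyslog(line string) (fields, error) {
	if !strings.HasPrefix(line, "<") {
		return nil, errors.New("missing priority")
	}
	return fields{"format": "syslog"}, nil
}

// stubFluent is a toy parser that accepts lines starting with "{".
func stubFluent(line string) (fields, error) {
	if !strings.HasPrefix(line, "{") {
		return nil, errors.New("not a JSON object")
	}
	return fields{"format": "fluent"}, nil
}

func main() {
	for _, line := range []string{"<13>hello", `{"log":"hi"}`, "plain text"} {
		out, err := parseWithFallback(line, stubSyslog, stubFluent)
		fmt.Println(out, err)
	}
}
```

Keeping both underlying errors in the final message makes it easier to see why a given line matched neither format.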
The decoding makes a lot of assumptions:
- The names of the log field and timestamp field (even though,
theoretically, they are customizable in the fluentbit config).
- The timestamp format (again)
- The format of the Task Definition name (or at least part of it)
- All fluentbit logs should have hostname set to `aws-fargate`.
Perhaps these can be relaxed if necessary. They could probably be
replaced by some kind of config. As there is currently no config I
wanted to keep things as simple as possible. If the need arises (for
example if we start getting JSON logs that don't want to use the
same handling for container_task|env|app) we can re-evaluate.
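If these assumptions ever do move into config, one possible shape is sketched below. Everything here is hypothetical: the struct, the function names, the field names "log" and "fluent_ts", and the RFC 3339 timestamp format are placeholders rather than what the decoder actually uses. Only the `aws-fargate` hostname comes from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// fluentConfig collects the assumptions that are currently hard-coded.
type fluentConfig struct {
	LogField        string // JSON key holding the raw log line (assumed)
	TimestampField  string // JSON key holding the timestamp (assumed)
	TimestampFormat string // layout passed to time.Parse (assumed)
	Hostname        string // hostname applied to every fluentbit log
}

// defaultFluentConfig returns placeholder defaults for this sketch.
func defaultFluentConfig() fluentConfig {
	return fluentConfig{
		LogField:        "log",
		TimestampField:  "fluent_ts",
		TimestampFormat: time.RFC3339,
		Hostname:        "aws-fargate",
	}
}

// fluentFields is the subset of fields this sketch extracts.
type fluentFields struct {
	Log       string
	Timestamp time.Time
	Hostname  string
}

// decodeFluent extracts the log and timestamp using the configured names.
func decodeFluent(line string, cfg fluentConfig) (fluentFields, error) {
	var raw map[string]interface{}
	if err := json.Unmarshal([]byte(line), &raw); err != nil {
		return fluentFields{}, fmt.Errorf("not a JSON object: %w", err)
	}
	msg, ok := raw[cfg.LogField].(string)
	if !ok {
		return fluentFields{}, fmt.Errorf("missing string field %q", cfg.LogField)
	}
	tsStr, ok := raw[cfg.TimestampField].(string)
	if !ok {
		return fluentFields{}, fmt.Errorf("missing string field %q", cfg.TimestampField)
	}
	ts, err := time.Parse(cfg.TimestampFormat, tsStr)
	if err != nil {
		return fluentFields{}, fmt.Errorf("bad timestamp: %w", err)
	}
	return fluentFields{Log: msg, Timestamp: ts, Hostname: cfg.Hostname}, nil
}

func main() {
	line := `{"log":"hello","fluent_ts":"2020-01-02T03:04:05Z"}`
	out, err := decodeFluent(line, defaultFluentConfig())
	fmt.Println(out.Log, out.Hostname, out.Timestamp.Year(), err)
}
```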