Golang library for consuming Kinesis stream data

Golang Kinesis Consumer

A library for building Kinesis consumer applications in Go. It is intended to be a lightweight wrapper around the Kinesis API that reads records, saves checkpoints (with swappable backends), and recovers gracefully from service timeouts and errors.

Alternate serverless options:

Installation

Get the package source:

$ go get github.com/harlow/kinesis-consumer

Overview

The consumer leverages a handler func that accepts a Kinesis record. The Scan method will consume all shards concurrently and call the callback func as it receives records from the stream.

Important: The default Log, Counter, and Checkpoint are no-ops, which means no logs, counts, or checkpoints are emitted when scanning the stream. See the options below to override these defaults.

import (
	"context"
	"flag"
	"fmt"
	"log"

	consumer "github.com/harlow/kinesis-consumer"
)

func main() {
	var stream = flag.String("stream", "", "Stream name")
	flag.Parse()

	// consumer
	c, err := consumer.New(*stream)
	if err != nil {
		log.Fatalf("consumer error: %v", err)
	}

	// start
	err = c.Scan(context.TODO(), func(r *consumer.Record) bool {
		fmt.Println(string(r.Data))
		return true // continue scanning
	})
	if err != nil {
		log.Fatalf("scan error: %v", err)
	}

	// Note: If you need to aggregate based on a specific shard, use the
	// `ScanShard` method instead.
}
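
The callback contract can be illustrated without a live stream. The following is a minimal sketch using only the standard library; `Record` and `scan` here are hypothetical stand-ins, not the library's types, and it assumes that returning false from the handler stops the scan (the example above only states that true continues it).

```go
package main

import "fmt"

// Record is a hypothetical stand-in for a stream record; only the payload
// matters for this sketch.
type Record struct {
	Data []byte
}

// scan hands records to fn one at a time and stops as soon as fn returns
// false. It returns how many records were handed to fn.
func scan(records []Record, fn func(*Record) bool) int {
	delivered := 0
	for i := range records {
		delivered++
		if !fn(&records[i]) {
			break
		}
	}
	return delivered
}

func main() {
	records := []Record{
		{Data: []byte("a")},
		{Data: []byte("b")},
		{Data: []byte("c")},
	}

	// Ask the scan to stop after the second record by returning false.
	seen := 0
	n := scan(records, func(r *Record) bool {
		fmt.Println(string(r.Data))
		seen++
		return seen < 2
	})
	fmt.Println("delivered:", n)
}
```

A handler that always returns true sees every record; one that returns false ends the loop early, which is handy for draining a bounded number of records in tests.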

Checkpoint

To record the progress of the consumer in the stream we use a checkpoint to store the last sequence number the consumer has read from a particular shard.

This will allow consumers to re-launch and pick up at the position in the stream where they left off.

The unique identifier for a consumer is [appName, streamName, shardID].
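
To make the keying scheme concrete, here is a minimal in-memory sketch using only the standard library. `memCheckpoint`, its methods, and the `:` separator are assumptions made for illustration; they are not the library's checkpoint types, and nothing is persisted across restarts.

```go
package main

import (
	"fmt"
	"sync"
)

// memCheckpoint is a hypothetical in-memory checkpoint store. It only
// illustrates how [appName, streamName, shardID] identifies a position.
type memCheckpoint struct {
	appName string
	mu      sync.Mutex
	seqs    map[string]string
}

func newMemCheckpoint(appName string) *memCheckpoint {
	return &memCheckpoint{appName: appName, seqs: make(map[string]string)}
}

// key joins the three parts that identify a consumer's position in a shard.
func (c *memCheckpoint) key(streamName, shardID string) string {
	return fmt.Sprintf("%s:%s:%s", c.appName, streamName, shardID)
}

// Get returns the last stored sequence number, or "" if none was stored.
func (c *memCheckpoint) Get(streamName, shardID string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.seqs[c.key(streamName, shardID)]
}

// Set records the last sequence number read from a shard.
func (c *memCheckpoint) Set(streamName, shardID, sequenceNumber string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.seqs[c.key(streamName, shardID)] = sequenceNumber
}

func main() {
	ck := newMemCheckpoint("my-app")
	ck.Set("events", "shardId-000000000000", "100")

	fmt.Println(ck.Get("events", "shardId-000000000000"))       // stored position
	fmt.Println(ck.Get("events", "shardId-000000000001") == "") // nothing stored yet
}
```

Because the shard ID is part of the key, each shard keeps its own position, so concurrent shard readers do not overwrite one another's progress.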


Note: The default checkpoint is a no-op, which means the scan will not persist any state and the consumer will start from the beginning of the stream each time it is restarted.

To persist scan progress choose one of the following checkpoints:

Redis Checkpoint

The Redis checkpoint requires App Name and Stream Name:

import checkpoint "github.com/harlow/kinesis-consumer/checkpoint/redis"

// redis checkpoint
ck, err := checkpoint.New(appName)
if err != nil {
	log.Fatalf("new checkpoint error: %v", err)
}

DynamoDB Checkpoint

The DynamoDB checkpoint requires Table Name, App Name, and Stream Name:

import checkpoint "github.com/harlow/kinesis-consumer/checkpoint/ddb"

// ddb checkpoint
ck, err := checkpoint.New(appName, tableName)
if err != nil {
	log.Fatalf("new checkpoint error: %v", err)
}

// Override the default DynamoDB client if the session needs special
// configuration (e.g. assuming a role)
myDynamoDbClient := dynamodb.New(session.New(aws.NewConfig()))

// With AWS SDK versions that pick up the config correctly, setting the
// region as in this example should work:
//    myDynamoDbClient := dynamodb.New(session.New(aws.NewConfig()), &aws.Config{
//        Region: aws.String("us-west-2"),
//    })

// ck and err were declared above, so assign with = rather than :=
ck, err = checkpoint.New(appName, tableName, checkpoint.WithDynamoClient(myDynamoDbClient))
if err != nil {
	log.Fatalf("new checkpoint error: %v", err)
}

To leverage the DDB checkpoint we'll also need to create a table:

Partition key: namespace
Sort key: shard_id

Options

The consumer allows the following optional overrides.

Client

Override the Kinesis client if there is any special config needed:

// client
client := kinesis.New(session.New(aws.NewConfig()))

// consumer
c, err := consumer.New(streamName, consumer.WithClient(client))

Metrics

Add an optional counter to expose counts of checkpoints and records processed:

// counter
counter := expvar.NewMap("counters")

// consumer
c, err := consumer.New(streamName, consumer.WithCounter(counter))

The expvar package will display consumer counts:

"counters": {
    "checkpoints": 3,
    "records": 13005
},
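
To see how such a map accumulates values, here is a small standalone sketch using only the standard library. `newCounters`, `recordBatch`, and `count` are hypothetical helpers written for this example; when and how the consumer itself increments its counters is not shown here and is assumed.

```go
package main

import (
	"expvar"
	"fmt"
)

// newCounters builds an unpublished expvar.Map, which avoids the panic
// expvar.NewMap raises when the same name is registered twice.
func newCounters() *expvar.Map {
	return new(expvar.Map).Init()
}

// recordBatch is a hypothetical helper that bumps the counters the way a
// consumer might after handling a batch and saving one checkpoint.
func recordBatch(m *expvar.Map, records int64) {
	m.Add("records", records)
	m.Add("checkpoints", 1)
}

// count reads an integer counter back, returning 0 if the key is unset.
func count(m *expvar.Map, key string) int64 {
	if v, ok := m.Get(key).(*expvar.Int); ok {
		return v.Value()
	}
	return 0
}

func main() {
	m := newCounters()
	recordBatch(m, 500)
	recordBatch(m, 250)

	fmt.Println("records:", count(m, "records"))
	fmt.Println("checkpoints:", count(m, "checkpoints"))
	fmt.Println(m.String()) // JSON object, the same shape served under /debug/vars
}
```

In a real service, a map created with `expvar.NewMap("counters")` is published automatically and appears at `/debug/vars` once an HTTP server is running on the default mux.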

Logging

The package defaults to ioutil.Discard, which swallows all logs. This can be customized with your preferred logging strategy:

// logger
logger := log.New(os.Stdout, "consumer-example: ", log.LstdFlags)

// consumer
c, err := consumer.New(streamName, consumer.WithLogger(logger))
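
The difference between a discarding logger and a prefixed one can be checked with the standard library alone. This is a sketch; `render` is a hypothetical helper, and the timestamp flags are set to 0 only so the output is deterministic.

```go
package main

import (
	"bytes"
	"fmt"
	"io/ioutil"
	"log"
)

// render writes msg through a logger with the given prefix and no timestamp
// flags, then returns exactly what the logger produced.
func render(prefix, msg string) string {
	var buf bytes.Buffer
	logger := log.New(&buf, prefix, 0)
	logger.Println(msg)
	return buf.String()
}

func main() {
	// A logger backed by ioutil.Discard drops everything written to it.
	silent := log.New(ioutil.Discard, "", log.LstdFlags)
	silent.Println("this line is dropped")

	// A logger with a prefix makes consumer output easy to pick out.
	fmt.Print(render("consumer-example: ", "scan started"))
}
```

Swapping `&buf` for `os.Stdout` and the flags for `log.LstdFlags` gives the logger shown in the example above.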

Contributing

Please see CONTRIBUTING.md for more information. Thank you, contributors!

License

Copyright (c) 2015 Harlow Ward. It is free software, and may be redistributed under the terms specified in the MIT-LICENSE file.

www.hward.com  ·  GitHub @harlow  ·  Twitter @harlow_ward