* Release shard lease after shutdown
Currently, only the locally cached shard info is removed when a worker
loses the lease. The info inside the checkpointer (DynamoDB) is not
removed. As a result, the lease is held until it expires, and it might
take too long for the shard to become available for another worker to
grab. This change releases the lease in the checkpointer immediately.
The user needs to ensure appropriate checkpointing before returning
from the Shutdown callback.
Test:
Updated the unit test and integration test to ensure that only the
shard owner is wiped out and the checkpoint information is left intact.
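The release behavior described above can be sketched as follows. This is a minimal illustration, not the library's code: the `lease` record, the in-memory `leaseTable` standing in for the DynamoDB table, and the `releaseOwner` name are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// lease is a hypothetical, simplified lease record; the real table
// schema is not described in this commit.
type lease struct {
	Owner      string // worker currently holding the shard, "" if free
	Checkpoint string // last checkpointed sequence number
}

// leaseTable is an in-memory stand-in for the DynamoDB lease table.
type leaseTable struct {
	mu     sync.Mutex
	leases map[string]lease
}

func newLeaseTable() *leaseTable {
	return &leaseTable{leases: make(map[string]lease)}
}

// put stores or overwrites the lease for a shard.
func (t *leaseTable) put(shardID string, l lease) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.leases[shardID] = l
}

// get returns the lease for a shard and whether it exists.
func (t *leaseTable) get(shardID string) (lease, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	l, ok := t.leases[shardID]
	return l, ok
}

// releaseOwner clears only the owner so another worker can claim the
// shard right away; the checkpoint is left untouched.
func (t *leaseTable) releaseOwner(shardID string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if l, ok := t.leases[shardID]; ok {
		l.Owner = ""
		t.leases[shardID] = l
	}
}

func main() {
	table := newLeaseTable()
	table.put("shard-1", lease{Owner: "worker-a", Checkpoint: "100"})
	table.releaseOwner("shard-1")
	l, _ := table.get("shard-1")
	fmt.Printf("owner=%q checkpoint=%q\n", l.Owner, l.Checkpoint)
}
```

Clearing the owner while keeping the checkpoint lets the next worker resume from the last checkpoint without waiting for the lease to expire.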
Signed-off-by: Tao Jiang <taoj@vmware.com>
* Add code coverage reporting
Add code coverage reporting for unit tests.
Signed-off-by: Tao Jiang <taoj@vmware.com>
* Update worker to let it inject checkpointer and kinesis
Add two functions to inject the checkpointer and Kinesis for custom
implementations or for adding mocks in unit tests.
This change also removes worker_custom.go since it is no longer
needed.
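The injection pattern might look like the sketch below. The `Checkpointer` interface is reduced to a single method for illustration, and `WithCheckpointer`, `NewWorker`, and both implementations are assumed names for the example, not confirmed signatures from this change.

```go
package main

import (
	"errors"
	"fmt"
)

// Checkpointer is a hypothetical, reduced interface for illustration.
type Checkpointer interface {
	GetCheckpoint(shardID string) (string, error)
}

// defaultCheckpointer stands in for the DynamoDB-backed implementation.
type defaultCheckpointer struct{}

func (defaultCheckpointer) GetCheckpoint(string) (string, error) {
	return "", errors.New("no backend configured in this sketch")
}

// mockCheckpointer returns a canned sequence number for unit tests.
type mockCheckpointer struct{ seq string }

func (m mockCheckpointer) GetCheckpoint(string) (string, error) {
	return m.seq, nil
}

// Worker holds the checkpointer it was built or injected with.
type Worker struct {
	chk Checkpointer
}

// NewWorker wires in the default checkpointer.
func NewWorker() *Worker {
	return &Worker{chk: defaultCheckpointer{}}
}

// WithCheckpointer replaces the default and returns the worker so
// calls can be chained.
func (w *Worker) WithCheckpointer(c Checkpointer) *Worker {
	w.chk = c
	return w
}

func main() {
	w := NewWorker().WithCheckpointer(mockCheckpointer{seq: "7"})
	seq, err := w.chk.GetCheckpoint("shard-1")
	fmt.Println(seq, err)
}
```

Returning the worker from the setter keeps construction on one line and leaves the default path unchanged for callers that inject nothing.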
Test:
Updated the integration tests to cover the newly added functions.
Signed-off-by: Tao Jiang <taoj@vmware.com>
* Fix typo in the test function
Signed-off-by: Tao Jiang <taoj@vmware.com>
Update the unit test and move the integration test under the test folder.
Update retry logic by switching to AWS's default retry.
Signed-off-by: Tao Jiang <taoj@vmware.com>
* Add credential configuration for resources
Add credentials for Kinesis, DynamoDB and CloudWatch. See worker_test.go
for how to use them.
Signed-off-by: Tao Jiang <taoj@vmware.com>
* Add support for providing custom checkpointer
Provide a new constructor for adding a checkpointer instead of always
using the default DynamoDB checkpointer.
The next step is to abstract Kinesis into a generic stream API; this
will be a bigger change and will be addressed in a different PR.
Test:
Use the new constructor to inject the DynamoDB checkpointer and run the
existing tests.
Signed-off-by: Tao Jiang <taoj@vmware.com>
* Add support for providing custom checkpointer
Provide a new constructor for adding a checkpointer instead of always
using the default DynamoDB checkpointer.
The next step is to abstract Kinesis into a generic stream API; this
will be a bigger change and will be addressed in a different PR.
Fix checkfmt error.
Test:
Use the new constructor to inject the DynamoDB checkpointer and run the
existing tests.
Signed-off-by: Tao Jiang <taoj@vmware.com>
1. No functional change; just upgrade to Go 1.11.
2. Add go mod support.
3. Make a vendored copy of dependencies.
Test:
1. hmake
2. Run worker_test.go in the GoLand IDE
Update the README and contributing doc before publishing
to the GitHub repo.
https://github.com/vmware/vmware-go-kcl
Jira CNA-2036
Change-Id: Idd8cfd8c89d3202613ff1d3018a584945ad30e4a
Currently, KCL doesn't release the shard when returning on error,
so the worker cannot get any shard because it already holds
the maximum number of shards. This change makes sure the shard
is released on return.
Update the log message.
Test:
Integration test that forces an error on reading a shard to
simulate a Kinesis internal error and makes sure the KCL
does not stop processing.
Jira CNA-1995
Change-Id: Iac91579634a5023ab5ed73c6af89e4ff1a9af564
A few days after shard splitting, the parent shard is deleted
by the Kinesis system. KCL should ignore the error caused
by the deleted parent shard and move on.
Test:
Manually split a shard on the kcl-test stream in the photon-infra
account. Currently, shard 3 is the parent shard of shards 4 and 5.
Shard 3 has a parent, shard 0, which has already been deleted.
Verified the test can run and does not get stuck waiting for the
parent shard.
Jira CNA-2089
Change-Id: I15ed0db70ff9836313c22ccabf934a2a69379248
gas is now gosec. Need to update the security scan and fix
security issues as needed.
No functional change.
Jira CNA-2022
Change-Id: I36f2a204114f3f13e2ed05579c04a9c89f528f9a
All source should be prepared in a manner that reflects
comments that VMware would be comfortable sharing with
the public.
Documentation only. No functional change.
Update the license to MIT to be consistent with approved
OSSTP product tracking ticket:
https://osstp.vmware.com/oss/#/upstreamcontrib/project/1101391
Jira CNA-1117
Change-Id: I3fe31f10db954887481e3b21ccd20ec8e39c5996
Kinesis processing gets stuck after splitting a shard. The
reason is that the app doesn't do the mandatory checkpoint.
The KCL documentation states:
// When the value of {@link ShutdownInput#getShutdownReason()} is
// {@link com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShutdownReason#TERMINATE} it is required that you
// checkpoint. Failure to do so will result in an IllegalArgumentException, and the KCL no longer making progress.
Also, fix shard lease to prevent one host from taking more shards
than its configuration allows.
Jira CNA-1701
Change-Id: Icbdacaf347c7a67b5793647ad05ff93cca629741
There might be various reasons for a shard iterator to
expire, such as not enough data in the shard or processing
that takes more than 5 minutes, so that the shard
iterator is not refreshed often enough.
This change removes the log.Fatal call, which exits the
process and, like a panic inside a goroutine, brings down
the whole app. Therefore, just log the error and exit the
goroutine instead.
Jira ID: CNA-1072
Change-Id: I34a8d9af7258f3ea75465e2245bbc25c2fafee35
cascade-kinesis-client will be used as a submodule of other projects,
so it should not have "src/vmware.com/cascade-kinesis-client" in
its path. To build this project locally, please manually create
the parent folders.
Change-Id: I8844e6a0e32aae65b28496915d8507e9fb1058c6
Need to remove the lease entry in the DynamoDB table when a shard has
been removed by Kinesis. This happens after shard splitting, when the
parent shard is removed by Kinesis after its retention period
(normally after 24 hours).
Change-Id: I70a5836436ac0698110085d46d9438fcaf539cd2
Organize the folder structure to support being imported as a
submodule by other services.
Jira CNA-701
Change-Id: I1dda27934642bb8a7755df07dc4a5048449afc86
This change fixes CloudWatch metrics publishing by adding a
long-running goroutine to periodically publish CloudWatch metrics.
Also, shut down metrics publishing when KCL is shut down.
Test:
Ran hmake test and verified that CloudWatch metrics were
published via the AWS CloudWatch console.
Jira CNA-702
Change-Id: I78b347cd12939447b0daf93f51acf620d18e2f49
This change enables metrics reporting and fixes a few bugs in metrics
reporting. The current metrics reporting is quite limited. Will add
more metrics in the next CR.
Tested with both Prometheus and CloudWatch.
Jira CNA-702
Change-Id: I678b3f8a372d83f7b8adc419133c14cd10884f61
Go style discourages all-caps constant names. Since this KCL is mainly
derived from Amazon's KCL, we'd like the constants to have exactly the
same names as in Amazon's KCL. Therefore, skip the lint check.
Change-Id: Ib8a2f52a8f4b44d814eda264f62fdcd53cccc2a7
Add support for handling child/parent shards. A child shard
has to wait until its parent shard has finished before it
can be processed.
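The parent-before-child rule can be expressed as a small readiness check. This sketch assumes a parent counts as finished when its checkpoint equals a `SHARD_END` marker; the `shard` type, the `readyToProcess` name, and the checkpoint map are illustrative only.

```go
package main

import "fmt"

// shardEnd is the assumed marker for a fully consumed shard.
const shardEnd = "SHARD_END"

// shard is a hypothetical, reduced view of a stream shard.
type shard struct {
	ID       string
	ParentID string // empty when the shard has no parent
}

// readyToProcess reports whether s may start: either it has no parent,
// or its parent's checkpoint shows the parent was consumed to the end.
func readyToProcess(s shard, checkpoints map[string]string) bool {
	if s.ParentID == "" {
		return true
	}
	return checkpoints[s.ParentID] == shardEnd
}

func main() {
	checkpoints := map[string]string{"shard-3": "12345"}
	child := shard{ID: "shard-4", ParentID: "shard-3"}

	fmt.Println(readyToProcess(child, checkpoints)) // parent still in progress

	checkpoints["shard-3"] = shardEnd
	fmt.Println(readyToProcess(child, checkpoints)) // parent finished
}
```

A worker would poll this check and hold off on the child until it returns true.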
Change-Id: I8bbf104c22ae93409d856be9c6829988c1b2d7eb
This change fixes the bug of not finding the checkpoint when the
process restarts. It also adds the missing call to the record processor
that notifies it of the shard info and checkpoint when the application
first starts.
Test:
Run hmake test and verify the log.
Change-Id: I4bdf21ac10c5ee988a0860c140991f7d05975541
This is the core part of KCL: the worker implementation.
It has exactly the same interface as Amazon's KCL. Internally,
it uses code from GoKini in order to get the library
functional quickly.
This is a working version. The test code worker_test.go
shows how to use this library.
The dynamic resharding feature is out of scope for M4.
Test:
1. A Kinesis stream named "kcl-test" has been created under the
photon-infra account.
2. Download your AWS credentials from the IAM user page.
3. Modify worker_test.go to fill in your AWS credentials.
4. hmake test
Jira CNA-637
Change-Id: I886d255bab9adaf7a13bca11bfda51bedaacaaed
This is the first part of implementing shard leases for the Kinesis
Client Library. It creates a DynamoDB table for managing
Kinesis stream shard leases.
https://jira.eng.vmware.com/browse/CNA-636
Adjust error code value range.
Change-Id: I16565fa15332843101235fb14545ee69c2599f2f
This creates the configuration and client interface in order to give
users an overview of how the Kinesis client library works.
In order not to reinvent the wheel, the API is designed to align
closely with the Amazon Kinesis Client Library in Java.
Add errors.
Remove @throws and use @error instead.
https://jira.eng.vmware.com/browse/CNA-614
Change-Id: I78a269b328c14df37f878eccef192ff022a669cc