
Comments (9)

dvassallo commented on July 30, 2024

The "enhancement" issue here is that the CFN template should start the consumer application as a service, so that it comes back up automatically after a host reboot. This already happens for Elasticsearch and Nginx, but not for the KCL application.

@matthewaaronthacker - Note that the consumer application will restart from where it left off[1], so if it has been offline for several hours it might still be catching up. You may want to check if data is being ingested around the time when your EC2 instance restarted.

A few other things to check:

  • First make sure that the subscription consumer process is running:
ps -ef | grep "ElasticsearchConnector"
  • And if it is, you may want to ensure that there are no errors in /var/log/cloudwatch-logs-subscription-consumer.log. Healthy activity should look something like this:
2015-08-31 19:12:18,300 INFO  Worker - Current stream shard assignments: shardId-000000000000
2015-08-31 19:12:18,300 INFO  Worker - Sleeping ...
2015-08-31 19:12:44,981 INFO  ElasticsearchEmitter - Emitted 128 records to Elasticsearch
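
A quick way to see whether the log contains any of those "Emitted" lines is to add up the record counts. This is only a rough sketch: the function name emitted_records is made up for illustration, and it assumes the log lines keep the exact format shown above.

```shell
# Sum the record counts from "Emitted N records" lines in a consumer log.
# Prints 0 when the log has no such lines (for example, only "Sleeping ...").
emitted_records() {
  awk '/ElasticsearchEmitter - Emitted [0-9]+ records/ {
         for (i = 1; i < NF; i++) if ($i == "Emitted") total += $(i + 1)
       }
       END { print total + 0 }' "$1"
}
```

For example, emitted_records /var/log/cloudwatch-logs-subscription-consumer.log would print the total number of records the emitter reported for that file. A result of 0 on a consumer that has been running for a while suggests nothing is reaching Elasticsearch.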

[1] The Kinesis buffer length is 24 hours.

from cloudwatch-logs-subscription-consumer.

matthewaaronthacker commented on July 30, 2024

It's definitely running. Here's what the log looks like - just seems to be sleeping and nothing else. I hope none of this stuff needs to be redacted.

2015-08-31 19:32:39,585 INFO LeaseTaker - Worker 7c6c6e8cc7fccafe:158d4b2e:14f852bcfcf:-8000 saw 1 total leases, 1 available leases, 1 workers. Target is 1 leases, I have 0 leases, I will take 1 leases
2015-08-31 19:32:40,645 INFO LeaseTaker - Worker 7c6c6e8cc7fccafe:158d4b2e:14f852bcfcf:-8000 successfully took 1 leases: shardId-000000000000
2015-08-31 19:32:47,049 INFO LeaseCoordinator - With failover time 5000ms and epsilon 25ms, LeaseCoordinator will renew leases every 1641ms and take leases every 10050ms
2015-08-31 19:32:47,052 INFO KinesisConnectorExecutorBase - ElasticsearchConnector worker created
2015-08-31 19:32:47,052 INFO KinesisConnectorExecutorBase - Starting worker in ElasticsearchConnector
2015-08-31 19:32:47,052 INFO Worker - Initialization attempt 1
2015-08-31 19:32:47,053 INFO Worker - Initializing LeaseCoordinator
2015-08-31 19:32:47,845 INFO LeaseManager - Table Elasticsearch-cloudwatch already exists.
2015-08-31 19:32:47,882 INFO Worker - Syncing Kinesis shard info
2015-08-31 19:32:48,135 INFO Worker - Starting LeaseCoordinator
2015-08-31 19:32:58,156 INFO Worker - Initialization complete. Starting worker loop.
2015-08-31 19:32:58,255 INFO LeaseTaker - Worker 0ff2051dce615aaa:-19eac647:14f853f5b9c:-8000 saw 1 total leases, 1 available leases, 1 workers. Target is 1 leases, I have 0 leases, I will take 1 leases
2015-08-31 19:32:58,281 INFO LeaseTaker - Worker 0ff2051dce615aaa:-19eac647:14f853f5b9c:-8000 successfully took 1 leases: shardId-000000000000
2015-08-31 19:32:58,392 INFO ElasticsearchEmitter - ElasticsearchEmitter using elasticsearch endpoint 127.0.0.1:9300
2015-08-31 19:32:58,483 INFO plugins - [Ajax] loaded [], sites []
2015-08-31 19:32:59,319 INFO Worker - Created new shardConsumer for : ShardInfo [shardId=shardId-000000000000, concurrencyToken=dfd20417-2ba3-44c0-aef2-43dd728217af, parentShardIds=[]]
2015-08-31 19:32:59,321 INFO BlockOnParentShardTask - No need to block on parents [] of shard shardId-000000000000
2015-08-31 19:32:59,543 INFO KinesisDataFetcher - Initializing shard shardId-000000000000 with 49553910973532647634987963264196455354009713534901944322
2015-08-31 19:33:47,774 INFO Worker - Current stream shard assignments: shardId-000000000000
2015-08-31 19:33:47,774 INFO Worker - Sleeping ...
2015-08-31 19:35:09,900 INFO Worker - Current stream shard assignments: shardId-000000000000
2015-08-31 19:35:09,900 INFO Worker - Sleeping ...
2015-08-31 19:36:32,986 INFO Worker - Current stream shard assignments: shardId-000000000000
2015-08-31 19:36:32,986 INFO Worker - Sleeping ...
2015-08-31 19:37:46,571 INFO Worker - Current stream shard assignments: shardId-000000000000
2015-08-31 19:37:46,571 INFO Worker - Sleeping ...
2015-08-31 19:38:49,378 INFO Worker - Current stream shard assignments: shardId-000000000000
2015-08-31 19:38:49,378 INFO Worker - Sleeping ...
2015-08-31 19:39:10,305 INFO LeaseRenewer - getCurrentlyHeldLease not returning lease with key shardId-000000000000 because it is expired
2015-08-31 19:41:20,183 INFO Worker - Current stream shard assignments: shardId-000000000000
2015-08-31 19:41:20,183 INFO Worker - Sleeping ...
2015-08-31 19:43:08,053 INFO Worker - Current stream shard assignments: shardId-000000000000
2015-08-31 19:43:08,053 INFO Worker - Sleeping ...
2015-08-31 19:44:33,462 INFO Worker - Current stream shard assignments: shardId-000000000000
2015-08-31 19:44:33,462 INFO Worker - Sleeping ...
2015-08-31 19:46:08,135 INFO Worker - Current stream shard assignments: shardId-000000000000
2015-08-31 19:46:08,135 INFO Worker - Sleeping ...

matthewaaronthacker commented on July 30, 2024

If I leave it running long enough I eventually get this:
2015-08-31 20:14:03,776 INFO ProcessTask - ShardId shardId-000000000000: getRecords threw ExpiredIteratorException - restarting after greatest seqNum passed to customer
com.amazonaws.services.kinesis.model.ExpiredIteratorException: Iterator expired. The iterator was created at time Mon Aug 31 20:08:43 UTC 2015 while right now it is Mon Aug 31 20:14:04 UTC 2015 which is further in the future than the tolerated delay of 300000 milliseconds. (Service: AmazonKinesis; Status Code: 400; Error Code: ExpiredIteratorException; Request ID: d37e15b3-501c-11e5-93e2-c7eee25b0007)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1160)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:748)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:467)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:302)
at com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2472)
at com.amazonaws.services.kinesis.AmazonKinesisClient.getRecords(AmazonKinesisClient.java:1126)
at com.amazonaws.services.kinesis.clientlibrary.proxies.KinesisProxy.get(KinesisProxy.java:149)
at com.amazonaws.services.kinesis.clientlibrary.proxies.MetricsCollectingKinesisProxyDecorator.get(MetricsCollectingKinesisProxyDecorator.java:72)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisDataFetcher.getRecords(KinesisDataFetcher.java:67)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask.getRecords(ProcessTask.java:240)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask.call(ProcessTask.java:117)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:48)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:23)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

dvassallo commented on July 30, 2024

Hmm... It looks like everything is working fine on the consumer side, but it seems that it is finding nothing to read from Kinesis.

You may want to check that log data is still flowing to CloudWatch Logs and to the Kinesis stream. You can ssh to your EC2 instance and run the following commands with the AWS CLI (which should already be installed on your EC2 instance):

aws logs describe-subscription-filters \
       --log-group-name <log-group-name> --region <aws-region>

aws logs describe-log-streams \
       --log-group-name <log-group-name> --order-by LastEventTime \
       --descending --region <aws-region>

The <log-group-name> should be the CloudWatch Logs log group name that you're currently ingesting into Elasticsearch. The <aws-region> should be the AWS region (in "us-east-1" format) where your CloudWatch Logs log group resides.

The result of the describe-subscription-filters command should tell you whether there is still a subscription between the log group and your Kinesis stream. The presence of a subscription filter should be enough evidence that the link still exists.

The result of the describe-log-streams command should tell you if there is data still flowing into your CloudWatch Logs log group. The first log stream in the result set should be the most recently active. You can pick its lastEventTimestamp value and convert it to a readable form with the date command. Note that you have to remove the last three digits since lastEventTimestamp is returned in milliseconds:

date -d @1437978781
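
If you'd rather not trim the digits by hand, a small helper can do the division for you. This is a sketch that assumes GNU date (as on the EC2 instance); the name ms_to_utc is just for illustration.

```shell
# Convert a millisecond epoch timestamp (as returned in lastEventTimestamp)
# to a readable UTC date. Integer division drops the last three digits.
ms_to_utc() {
  date -u -d "@$(( $1 / 1000 ))" '+%Y-%m-%d %H:%M:%S UTC'
}
```

For instance, ms_to_utc 1441053097000 prints 2015-08-31 20:31:37 UTC.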

If the date is recent, then there is still data flowing to your CloudWatch Logs log group. If that's the case the problem might be with the configuration of the subscription filter.

One other thing to check would be the Kinesis metrics from the AWS Management Console. You may want to confirm that there is still "Write Throughput" activity. You can see these metrics by picking "Kinesis" from the "Services" toolbar and clicking on the Kinesis stream that was created by the CFN template.

dvassallo commented on July 30, 2024

The ExpiredIteratorException is not concerning. Kinesis shard iterators expire after 5 minutes[1], but the Kinesis Client Library should be handling those properly by grabbing a new shard iterator.


[1] http://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetShardIterator.html
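
To make the numbers in the exception message concrete: the iterator was created at 20:08:43 and used at 20:14:04, which is 321 seconds later and therefore past the 300000 ms tolerance. The sketch below redoes that arithmetic. It assumes GNU date, and the helper names elapsed_ms and iterator_expired are made up for illustration; they are not part of the KCL.

```shell
# Milliseconds elapsed between two timestamps that GNU date can parse.
elapsed_ms() {
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( (end - start) * 1000 ))
}

# Succeeds when an iterator of the given age (ms) is past the tolerated delay.
# The default of 300000 ms is the value quoted in the exception message.
iterator_expired() {
  [ "$1" -gt "${2:-300000}" ]
}
```

With the timestamps from the stack trace, elapsed_ms "2015-08-31 20:08:43 UTC" "2015-08-31 20:14:04 UTC" prints 321000, and iterator_expired 321000 succeeds.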

matthewaaronthacker commented on July 30, 2024

That all seems to be normal. There is write activity in the Kinesis stream.

(redacted)
{
    "subscriptionFilters": [
        {
            "filterPattern": "[version, account_id, interface_id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log_status]",
            "filterName": "cwl-cfn-es-Elasticsearch-cloudwatch-KinesisSubscriptionStream-",
            "roleArn": "arn:aws:iam::**:role/Elasticsearch-cloudwatch-CloudWatchLogsKinesisRole-**",
            "creationTime": 1440685573566,
            "logGroupName": "CloudTrail/FlowLogGroup",
            "destinationArn": "arn:aws:kinesis:us-east-1:**_:stream/Elasticsearch-cloudwatch-KinesisSubscriptionStream-_**"
        }
    ]
}

"logStreams": [
    {
        "firstEventTimestamp": 1434050750000,
        "lastEventTimestamp": 1441053097000,

date -d @1441053097
Mon Aug 31 20:31:37 UTC 2015

matthew

matthewaaronthacker commented on July 30, 2024

Maybe I did something wrong in the command to start the consumer? This is what I ran (redacted but the stream matches my kinesis instance).

{ nohup java -DkinesisInputStream=Elasticsearch-cloudwatch-KinesisSubscriptionStream-****** -DregionName=us-east-1 -DappName=Elasticsearch-cloudwatch -Dlog4j.configuration=log4j-prod.properties -DelasticsearchClusterName=elasticsearch -cp /root/cloudwatch-logs-subscription-consumer-1.2.0/cloudwatch-logs-subscription-consumer-1.2.0.jar com.amazonaws.services.logs.connectors.samples.elasticsearch.ElasticsearchConnector > /dev/null 2>&1 & } && disown -h %1

matthew

matthewaaronthacker commented on July 30, 2024

Daniel,
You need any other logs or anything from me on this? If not I'm going to go ahead and kill the stack and recreate it. I'm not deep enough in to be losing anything of value and it's easy enough to start over.
Thanks for all your help!
matthew

dvassallo commented on July 30, 2024

@matthewaaronthacker - Unfortunately it is still not clear why your consumer app is not finding anything in Kinesis. Feel free to tear down the cluster and bring it back up. Let us know if you run into the same issue again.

I would leave this issue open as an enhancement request because the CFN template should set up the consumer to restart automatically after an EC2 instance reboot.
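
Until the template handles this, one possible stopgap is a small wrapper script registered with cron's @reboot. This is only a sketch and not what the CFN template does: the function names and the STREAM_NAME placeholder are hypothetical, the remaining values mirror the start command posted earlier in this thread, and it assumes the instance's cron supports @reboot entries.

```shell
# Hypothetical wrapper. STREAM_NAME is a placeholder that must be set to the
# Kinesis stream created by the stack; the other values come from this thread.
STREAM_NAME="${STREAM_NAME:-REPLACE_WITH_STREAM_NAME}"
REGION="${REGION:-us-east-1}"
APP_NAME="${APP_NAME:-Elasticsearch-cloudwatch}"
JAR="${JAR:-/root/cloudwatch-logs-subscription-consumer-1.2.0/cloudwatch-logs-subscription-consumer-1.2.0.jar}"
MAIN_CLASS="com.amazonaws.services.logs.connectors.samples.elasticsearch.ElasticsearchConnector"

# Launch the consumer in the background, detached from the terminal.
start_consumer() {
  nohup java \
    "-DkinesisInputStream=${STREAM_NAME}" \
    "-DregionName=${REGION}" \
    "-DappName=${APP_NAME}" \
    -Dlog4j.configuration=log4j-prod.properties \
    -DelasticsearchClusterName=elasticsearch \
    -cp "${JAR}" "${MAIN_CLASS}" > /dev/null 2>&1 &
}

# Print a crontab line that runs the given script once at boot.
reboot_cron_line() {
  printf '@reboot %s\n' "$1"
}
```

Saved as a script that ends with a call to start_consumer, it could be registered by adding the output of reboot_cron_line with the script's path (for example a hypothetical /root/start-consumer.sh) to root's crontab. A proper init service would be the cleaner long-term fix.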
