
cloudwatch-logs-subscription-consumer's Introduction

CloudWatch Logs Subscription Consumer

The CloudWatch Logs Subscription Consumer is a specialized Amazon Kinesis stream reader (based on the Amazon Kinesis Connector Library) that can help you deliver data from Amazon CloudWatch Logs to any other system in near real-time using a CloudWatch Logs Subscription Filter.

The current version of the CloudWatch Logs Subscription Consumer comes with built-in connectors for Elasticsearch and Amazon S3, but it can easily be extended to support other destinations using the Amazon Kinesis Connector Library framework.

One-Click Setup: CloudWatch Logs + Elasticsearch + Kibana

This project includes a sample CloudFormation template that can quickly bring up an Elasticsearch cluster on Amazon EC2, fed with real-time data from any CloudWatch Logs log group. The CloudFormation template will also install Kibana 3 and Kibana 4.1, and it comes bundled with a few sample Kibana 3 dashboards for the following sources of AWS log data:

  • Amazon VPC Flow Logs
  • AWS Lambda
  • AWS CloudTrail

You can also connect your Elasticsearch cluster to any other custom CloudWatch Logs log group and then use Kibana to interactively analyze your log data with ad-hoc visualizations and custom dashboards.

If you already have an active CloudWatch Logs log group, you can launch a CloudWatch Logs + Elasticsearch + Kibana stack right now with this launch button:

Launch your Elasticsearch stack fed by CloudWatch Logs data

You can find the CloudFormation template in: configuration/cloudformation/cwl-elasticsearch.template

NOTE: This template creates one or more Amazon EC2 instances, an Amazon Kinesis stream and an Elastic Load Balancer. You will be billed for the AWS resources used if you create a stack from this template.

Your CloudFormation stack may take about 10 minutes to create. Once its status transitions to CREATE_COMPLETE, you can navigate to the Outputs tab to get the important URLs for your stack.

CloudFormation Output

Sample Kibana 3 Dashboards (Click to Expand)

The following are snapshots of the sample Kibana 3 dashboards that come built-in with the provided CloudFormation stack. Click on any of the screenshots below to expand to a full view.

Amazon VPC Flow Logs

VPC Flow Logs Sample Dashboard

AWS Lambda

Lambda Sample Dashboard

AWS CloudTrail

CloudTrail Sample Dashboard

Setting up Kibana 4 for CloudWatch Logs

The CloudFormation template sets up Kibana 3 with the correct Elasticsearch index patterns for this application, but Kibana 4 needs to be configured manually. When you visit the Kibana 4 URL for the first time, you will be prompted to configure an index pattern, where you need to:

  • Turn on "Index contains time-based events"
  • Turn on "Use event times to create index names"
  • Pick "Daily" for the "Index pattern interval" field
  • Enter [cwl-]YYYY.MM.DD for the "Index name or pattern" field
  • Choose @timestamp for the "Time-field name"

Then you can go ahead and create the index pattern and start using Kibana 4 with data from CloudWatch Logs. Once the index pattern is configured, you can use the Discover, Visualize and Dashboards sections to interact with your CloudWatch Logs data.

Kibana 4 Discover section with VPC Flow Logs

Kibana 4 Discover

Kibana 4 Dashboard section with VPC Flow Logs

Kibana 4 Dashboard

Elasticsearch Administration

The CloudFormation template also installs the kopf plugin, which allows you to monitor and manage your Elasticsearch cluster from a web interface.

Kopf Web Interface

Getting CloudWatch Logs data indexed in Elasticsearch

JSON Data

The CloudWatch Logs Subscription Consumer automatically indexes log event messages that are valid JSON as Object fields in Elasticsearch. Elasticsearch understands these Object types and their inner hierarchies, providing query support for all the inner fields. You do not have to specify anything beyond the source log group in the CloudFormation input parameters to have JSON data indexed in Elasticsearch.
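For example, a log event whose entire message is the following JSON (illustrative field names, matching the nested example later in this section) would be indexed with payloadSize and responseCode as individually queryable fields:

{ "payloadSize": 100, "responseCode": "HTTP 200 OK" }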

Fixed-Column Data

Other log events that have a fixed-column format (such as traditional web server access logs) can be indexed easily in Elasticsearch by defining the field names in the CloudWatch Logs subscription filter pattern, using the Filter Pattern Syntax. For example, if you had log data in this format:

127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif" 200 2326

... then you can use the following subscription filter pattern:

[ip, user_identifier, user, timestamp, request, status_code, response_size]

... and its fields would be automatically indexed in Elasticsearch using the specified column names.
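For the sample log line above, the extracted fields in the resulting Elasticsearch document would look roughly like this (values shown as strings for illustration; the consumer also adds metadata fields such as @timestamp, @id and @log_group, omitted here):

{
  "ip": "127.0.0.1",
  "user_identifier": "user-identifier",
  "user": "frank",
  "timestamp": "10/Oct/2000:13:55:36 -0700",
  "request": "GET /apache_pb.gif",
  "status_code": "200",
  "response_size": "2326"
}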

Filter patterns can also be used to restrict what flows from CloudWatch Logs to Elasticsearch. You can set conditions on any of the fields. Here are a few examples that would match the sample log event from above:

  1. Equals condition on one field:

[ip, user_identifier, user, timestamp, request, status_code = 200, response_size]

  2. Prefix match on one field:

[ip, user_identifier, user, timestamp, request, status_code = 2*, response_size]

  3. OR condition on one field:

[ip, user_identifier, user, timestamp, request, status_code = 200 || status_code = 400, response_size]

  4. BETWEEN condition on one field:

[ip, user_identifier, user, timestamp, request, status_code >= 200 && status_code <= 204, response_size]

  5. Conditions on multiple fields:

[ip != 10.0.0.1, user_identifier, user, timestamp, request, status_code = 200, response_size]

  6. Compound conditions:

[(ip != 10.* && ip != 192.*) || ip = 127.*, user_identifier, user, timestamp, request, status_code, response_size]

Other Less-Structured Data

The last field in a subscription filter pattern is always greedy: if the log event message has more fields than the filter specifies, all of the additional data is assigned to the last field. For example, the following filter pattern:

[timestamp, request_id, event]

... would match this log event message:

2015-07-08T01:42:25.679Z 8bd492bcaede Decoded payload: Hello World

... and the indexed fields in Elasticsearch would be:

{
  "timestamp": "2015-07-08T01:42:25.679Z",
  "request_id": "8bd492bcaede",
  "event": "Decoded payload: Hello World"
}

If one of the fields were to contain a valid JSON string, it would be indexed as an Object field in Elasticsearch rather than as an escaped JSON string. For example, using the [timestamp, request_id, event] filter pattern against the following log event:

2015-07-08T01:42:25.679Z 8bd492bcaede { "payloadSize": 100, "responseCode": "HTTP 200 OK" }

... would result in the following Elasticsearch document:

{
  "timestamp": "2015-07-08T01:42:25.679Z",
  "request_id": "8bd492bcaede",
  "event": {
    "payloadSize": 100,
    "responseCode": "HTTP 200 OK"
  }
}

Indexing Amazon VPC Flow Logs

The sample Kibana dashboard for Amazon VPC Flow Logs that comes built-in with the provided CloudFormation stack expects a CloudWatch Logs subscription with the following filter pattern:

[version, account_id, interface_id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log_status]

If you choose "Amazon VPC Flow Logs" in the LogFormat parameter of the CloudFormation template, the subscription filter will get created with the above filter pattern automatically.

If you prefer to only have a subset of the VPC Flow Logs going to Elasticsearch, you can choose "Custom" for the LogFormat parameter, and then specify the above filter pattern with conditions on some of the fields. For example, if you are only interested in analyzing rejected traffic, you can add the condition action = REJECT on the action field.
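Such a custom filter pattern would look like this:

[version, account_id, interface_id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action = REJECT, log_status]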

Indexing AWS Lambda Logs

The sample Kibana dashboard for AWS Lambda that comes built-in with the provided CloudFormation stack expects a CloudWatch Logs subscription with the following filter pattern:

[timestamp=*Z, request_id="*-*", event]

If you choose "AWS Lambda" in the LogFormat parameter of the CloudFormation template, the subscription filter will get created with the above filter pattern automatically.

In your JavaScript Lambda functions, you can use the JSON.stringify method to log structured data that gets automatically indexed in Elasticsearch. For example, the following is a slightly modified example from the Kinesis Process Record Lambda template that makes two JSON.stringify logging calls: the first logs the entire Kinesis record, and the second logs some statistics on the function's activity and performance:

exports.handler = function(event, context) {
    var start = new Date().getTime();
    var bytesRead = 0;

    event.Records.forEach(function(record) {
        // Kinesis data is base64 encoded so decode here
        var payload = new Buffer(record.kinesis.data, 'base64').toString('ascii');
        bytesRead += payload.length;

        // log each record
        console.log(JSON.stringify(record, null, 2));
    });

    // collect statistics on the function's activity and performance
    console.log(JSON.stringify({ 
        "recordsProcessed": event.Records.length,
        "processTime": new Date().getTime() - start,
        "bytesRead": bytesRead,
    }, null, 2));
    
    context.succeed("Successfully processed " + event.Records.length + " records.");
};
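With the "AWS Lambda" subscription filter pattern shown earlier, the statistics line would be indexed with the JSON object under the event field, roughly like this (illustrative timestamp, request ID, and values):

{
  "timestamp": "2015-07-08T01:42:25.679Z",
  "request_id": "4f6a3a2e-0d2e-11e5-a6c8-2e8b2e4e1f9a",
  "event": {
    "recordsProcessed": 100,
    "processTime": 35,
    "bytesRead": 12800
  }
}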

The following snapshot shows how recordsProcessed and bytesRead got indexed in Elasticsearch and rendered as a graph in Kibana. Click on the image to expand to a full view of the sample dashboard for AWS Lambda that comes built-in with the provided CloudFormation stack:

Lambda Sample Dashboard Detail

Indexing AWS CloudTrail Logs

The sample Kibana dashboard for AWS CloudTrail that comes built-in with the provided CloudFormation stack does not require a subscription filter pattern because CloudTrail data is always valid JSON. If you prefer to only have a subset of the CloudTrail Logs going to Elasticsearch, you can choose "Custom" for the LogFormat parameter of the CloudFormation template, and then specify a JSON filter using the Filter Pattern Syntax. Here's an example of a valid JSON filter pattern that would filter the subscription feed to events that were made by Root accounts against the Autoscaling service:

{$.userIdentity.type = Root && $.eventSource = autoscaling*}

Elasticsearch Access Control

The current version of the CloudFormation template allows you to configure two basic methods for controlling access to the Elasticsearch API and the Kibana UI:

  • Restricting access to a whitelisted IP address range
  • HTTP Basic Authentication (username and password) enforced by the nginx proxy

This is generally considered an insufficient level of access control for clusters holding confidential data. It is highly recommended that you put additional security measures and access control mechanisms in place before you use this stack with production data.

The nginx setup can be easily modified to enable other security and access control features, such as:

  • Adding HTTPS support to authenticate the endpoint and protect the Basic Auth credentials.
  • Adding HTTPS Client Authentication to restrict access to authorized users only.

You can find the nginx configuration used by the CloudFormation template in: configuration/nginx/nginx.conf
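For example, once Basic Auth is enabled you can do a quick check of the Elasticsearch API from the command line (substitute your own stack endpoint from the Outputs tab and your own credentials):

curl -u <username>:<password> "http://<your-elasticsearch-endpoint>/_cluster/health?pretty"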

Building from source

Once you check out the code from GitHub, you can build it using Maven. To disable the GPG-signing in the build, use:

mvn clean install -Dgpg.skip=true

Running locally

After building from source, you can run the application locally using any of these three Maven profiles: Stdout, Elasticsearch, or S3. For example:

mvn exec:java -P Elasticsearch

The Maven profile determines which connector destination is used.

You can configure your application by updating the relevant properties file for the Maven profile you choose. You can also override any setting in the properties file using JVM system properties as in the following example:

mvn exec:java -P Stdout -DkinesisInputStream=application-log-stream -DregionName=us-west-2

Related Resources


cloudwatch-logs-subscription-consumer's Issues

Cloudformation doesn't start sample template

Hi, I am a beginner AWS operator.

I want to set up Kinesis, Elasticsearch, and Kibana, and I found this sample through a Google search. I would really like to get this CloudFormation sample running, but the sample template fails with an error when creating the stack.

19:12:55 UTC+0900 ROLLBACK_IN_PROGRESS AWS::CloudFormation::Stack fff The following resource(s) failed to create: [WaitCondition]. . Rollback requested by user.
19:12:53 UTC+0900 CREATE_FAILED AWS::CloudFormation::WaitCondition WaitCondition WaitCondition received failed message: 'failed to run cfn-init' for uniqueId: i-c227a060
Physical ID:arn:aws:cloudformation:ap-northeast-1:432113265161:stack/fff/f9925e20-5090-11e5-911f-5001aba75438/WaitHandle
19:09:45 UTC+0900 CREATE_COMPLETE AWS::AutoScaling::AutoScalingGroup ElasticsearchServerGroup
19:07:57 UTC+0900 CREATE_IN_PROGRESS AWS::AutoScaling::AutoScalingGroup ElasticsearchServerGroup Resource creation Initiated
19:07:56 UTC+0900 CREATE_IN_PROGRESS AWS::CloudFormation::WaitCondition WaitCondition Resource creation Initiated

Please help me.

CloudTrail to CloudWatch Issues

Hi,

I'm setting this up manually and trying to point it at our existing ES cluster (not using Amazon ES). I have CloudTrail logging to CloudWatch and a subscription from Kinesis, and all of that is working great; the consumer connects to Kinesis fine too, so that's not the issue.

The issue I'm seeing is this:

I get these WARN and ERROR messages from the logs about the nested JSON array that is returned with the requestParameters field:

2015-10-30 19:12:46,022 WARN  ElasticsearchEmitter - Returning 1 records as failed
2015-10-30 19:12:56,028 ERROR ElasticsearchEmitter - Record failed with message: MapperParsingException[failed to parse [requestParameters.iamInstanceProfile]]; nested: ElasticsearchIllegalArgumentException[unknown property [arn]];

In Elasticsearch, I do get data, but it looks like it's coming in as bulk, and the JSON isn't being parsed properly (I've cut this output down a lot for readability):

failed to execute bulk item (index) index {[cwl-2015.10.30][CloudTrail [3225203955xxxxxxxxxxxxxxxxxxxx1130203408060579867], source[{"eventID":"e95e5.....
.....06xx","@owner":"xxxxxxxxxxxxxx","@id":"322xxxxxxxxxxxxxxxxxxxxxxx071130203408060579867"}]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [requestParameters.iamInstanceProfile]
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:411)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeObject(ObjectMapper.java:554)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:487)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeObject(ObjectMapper.java:554)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:487)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:544)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:466)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:418)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:148)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.ElasticsearchIllegalArgumentException: unknown property [arn]
at org.elasticsearch.index.mapper.core.StringFieldMapper.parseCreateFieldForString(StringFieldMapper.java:331)
at org.elasticsearch.index.mapper.core.StringFieldMapper.parseCreateField(StringFieldMapper.java:277)
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:401)
... 15 more

I'm not a Java developer, so I hesitate to jump into the code.

Thanks!

Can't be used with Amazon Elasticsearch service

This connector can't be used in conjunction with the Amazon Elasticsearch service because it requires the ES transport protocol, which Amazon ES doesn't expose. Would it be possible for this connector to use the REST protocol?

keeps getting roll_back

I don't know if anyone else has tried this, but I keep getting rollback failures. The rollback itself also eventually failed due to dependencies.

Support for multiple log groups

I have multiple log groups that I want to stream into this cloud formation stack. So far I've only been able to stream one of my log groups. Is there a way to support multiple log groups?

Kopf-Elasticsearch Version Mismatch

We were able to successfully deploy this CF template but Kopf is complaining that:

This version of kopf is not compatible with your ES version

Trying to set this up.

I have CloudWatch Logs and an Elasticsearch cluster with Kibana, and I'm trying to set up the cloudwatch-logs-subscription-consumer from this repo.

Unfortunately, the documentation doesn't really explain HOW to set this up manually. I know there is a CloudFormation template, but that template is not an option for us.
Do you have any other instructions that can help?

Errors using consumer with ES 2.0

Hi - I'm attempting to use the consumer with our ES 2.0 cluster and am receiving errors. I know there were some breaking changes with the Java API in ES 2.0, so I'm assuming that is the issue here, but I'm hoping to give as much info as possible in order to get this resolved.

The consumer was compiled as instructed in the Readme with default configuration other than the ES cluster details and TRACE logging.

The error I am receiving on the consumer side is:

2015-11-13 19:53:47,418 INFO  transport - [Blockbuster] failed to get local cluster state for [#transport#-1][localhost][inet[elasticsearch-int.x.com/10.xx.xx.xx:9300]], disconnecting...
org.elasticsearch.transport.RemoteTransportException: Failed to deserialize exception response from stream
Caused by: org.elasticsearch.transport.TransportSerializationException: Failed to deserialize exception response from stream
    at org.elasticsearch.transport.netty.MessageChannelHandler.handlerResponseError(MessageChannelHandler.java:178)
    at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:130)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.StreamCorruptedException: Unsupported version: 1
    at org.elasticsearch.common.io.ThrowableObjectInputStream.readStreamHeader(ThrowableObjectInputStream.java:46)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
    at org.elasticsearch.common.io.ThrowableObjectInputStream.<init>(ThrowableObjectInputStream.java:38)
    at org.elasticsearch.transport.netty.MessageChannelHandler.handlerResponseError(MessageChannelHandler.java:175)
    ... 23 more

The error that I am receiving on the Elasticsearch side is:

[2015-11-13 19:56:03,117][WARN ][transport.netty          ] [Mister X] exception caught on transport layer [[id: 0x45211071, /10.xx.xx.xx:26266 => /10.xx.xx.xx:9300]], closing connection
java.lang.IllegalStateException: Message not fully read (request) for requestId [24], action [cluster/state], readerIndex [34] vs expected [49]; resetting
        at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:120)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:75)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Both the ES cluster and the consumer are using Java 1.7.0_91.

Please let me know if there is anything else I can add.

Thanks.

not consuming existing flowlogs

I have about 2 weeks of flow logs that I've collected into a log group, and I am able to spin this stack up successfully (thank you, great work on this!).

But it only indexes and visualizes data that arrived in the log group after I spun up the stack. How do I get this system to pick up everything in the log group for X number of days back?

Setup in an existing VPC

I intend to set this up in an existing VPC. I already have a VPC, subnets, an IGW, a route table, and a network access list.
It would be nice to have a template that one can use, along with a step-by-step guide to get it done.

@dvassallo, is there any workaround for this?

Fails to send JSON with a string prefix and no extracted fields to elasticsearch

There was a change here (9eef3cd) to support adding JSON extracted fields with a non-JSON prefix. However, it misses the case where there are no extracted fields but the message is still a non-JSON string prefix followed by JSON:
https://github.com/awslabs/cloudwatch-logs-subscription-consumer/blob/master/src/main/java/com/amazonaws/services/logs/connectors/elasticsearch/CloudWatchLogsElasticsearchDocument.java#L100

For example, putting a log event with a message:

test json: {"hello": "world"}

The consumer fails with the following exception in /var/log/cloudwatch-logs-subscription-consumer.log:

2015-12-23 18:47:39,825 ERROR KinesisConnectorRecordProcessor - Failed to transform record com.amazonaws.services.logs.subscriptions.CloudWatchLogsEvent@732d486e to output type
java.io.IOException: Error serializing the Elasticsearch document to JSON: A JSONObject text must begin with '{' at 1 [character 2 line 1]
        at com.amazonaws.services.logs.connectors.elasticsearch.ElasticsearchTransformer.fromClass(ElasticsearchTransformer.java:65)
        at com.amazonaws.services.logs.connectors.elasticsearch.ElasticsearchTransformer.fromClass(ElasticsearchTransformer.java:33)
        at com.amazonaws.services.kinesis.connectors.KinesisConnectorRecordProcessor.transformToOutput(KinesisConnectorRecordProcessor.java:150)
        at com.amazonaws.services.kinesis.connectors.KinesisConnectorRecordProcessor.processRecords(KinesisConnectorRecordProcessor.java:135)
        at com.amazonaws.services.kinesis.clientlibrary.lib.worker.V1ToV2RecordProcessorAdapter.processRecords(V1ToV2RecordProcessorAdapter.java:42)
        at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask.call(ProcessTask.java:172)
        at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:48)
        at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:23)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.amazonaws.util.json.JSONException: A JSONObject text must begin with '{' at 1 [character 2 line 1]
        at com.amazonaws.util.json.JSONTokener.syntaxError(JSONTokener.java:422)
        at com.amazonaws.util.json.JSONObject.<init>(JSONObject.java:196)
        at com.amazonaws.util.json.JSONObject.<init>(JSONObject.java:323)
        at com.amazonaws.services.logs.connectors.elasticsearch.CloudWatchLogsElasticsearchDocument.getFields(CloudWatchLogsElasticsearchDocument.java:101)
        at com.amazonaws.services.logs.connectors.elasticsearch.CloudWatchLogsElasticsearchDocument.<init>(CloudWatchLogsElasticsearchDocument.java:43)
        at com.amazonaws.services.logs.connectors.elasticsearch.ElasticsearchTransformer.fromClass(ElasticsearchTransformer.java:47)
        ... 11 more

Cloudformation template failure

Hi,

My colleague and I have been trying to set up a test environment to see if this will be useful enough for our daily work. We have come across an error when initially deploying the supplied CloudFormation template, /configuration/cloudformation/cwl-elasticsearch.template. Is this something you have come across before? We've logged into the created instance and cfn-init is indeed accessible on it. The error is detailed below:

16:27:58 UTC+0100 CREATE_FAILED AWS::CloudFormation::WaitCondition WaitCondition WaitCondition received failed message: 'failed to run cfn-init' for uniqueId: i-c8939d0a
Physical ID:arn:aws:cloudformation:us-west-1:532578262755:stack/test-stack-cf/ade64480-3abc-11e5-82fb-500c335a70e0/WaitHandle

How to map a geo_point in indices that are created automatically by elasticsearch-kopf?

I would really appreciate any help with the following issue:

Take a look at the mapping for the following index:

{
  "_default_": {
    "properties": {
      "@timestamp": { "format": "dateOptionalTime", "doc_values": true, "type": "date" },
      "@message": { "type": "string" },
      "@id": { "type": "string" }
    },
    "_all": { "enabled": false }
  },
  "development": {
    "properties": {
      "@timestamp": { "format": "dateOptionalTime", "doc_values": true, "type": "date" },
      "@log_stream": { "type": "string" },
      "@message": { "type": "string" },
      "Context": {
        "properties": {
          "LocationId": { "type": "string" },
          "SubCategoryId": { "type": "string" },
          "HttpServerName": { "type": "string" },
          "HttpRequestUri": { "type": "string" },
          "CategoryId": { "type": "string" },
          "RequestId": { "type": "string" },
          "Coordinate": { "type": "string" },
          "ServiceId": { "type": "string" },
          "UserId": { "type": "string" },
          "HttpMethod": { "type": "string" }
        }
      },
      "Message": { "type": "string" },
      "@id": { "type": "string" },
      "Thread": {
        "properties": {
          "Name": { "type": "string" },
          "Id": { "type": "long" },
          "Priority": { "type": "long" }
        }
      },
      "Timestamp": { "format": "dateOptionalTime", "type": "date" },
      "Marker": { "type": "string" },
      "@log_group": { "type": "string" },
      "@owner": { "type": "string" }
    },
    "_all": { "enabled": false }
  }
}

From the mapping above, you can see that the Coordinate property is mapped as a string type, but it would be nice if I could find a way to ensure that this property is of type geo_point.

Keep in mind that if I manually change the mapping for Coordinate to geo_point, it will work and Kibana will recognize it as a geo_point type. However, when kopf automatically creates another daily index, it will map Coordinate as a string type and Kibana will get a mapping conflict.

https://cdn.discourse.org/elastic/uploads/default/optimized/2X/9/99d5ed421e580787948626382ab547ec17bbbd56_1_690x339.png

Record failed with message: UnavailableShardsException

It seems I'm not able to index logs from cloudwatch:

Record failed: {"index":"cwl-2016.12.19","type":"logging","source":"{\"timestamp\":\"10.0.0.107\",\"@log_stream\":\"production.10.0.1.87.cows.http_access\",\"@timestamp\":1482185012343,\"@message\":\"10.0.0.107 - - [19/Dec/2016:22:03:32 +0000] \\\"HEAD /health HTTP/1.1\\\" 200 - \\\"-\\\" \\\"lua-resty-http/0.08 (Lua) ngx_lua/10005\\\"\",\"request_id\":\"-\",\"event\":\"- [19/Dec/2016:22:03:32 +0000] \\\"HEAD /health HTTP/1.1\\\" 200 - \\\"-\\\" \\\"lua-resty-http/0.08 (Lua) ngx_lua/10005\\\"\",\"@id\":\"33053830297342209648292451756418042022894691916625608704\",\"@log_group\":\"logging\",\"@owner\":\"412642013128\"}","id":"33053830297342209648292451756418042022894691916625608704","version":null,"ttl":null,"create":true}
2016-12-19 22:01:36,404 ERROR ElasticsearchEmitter - Record failed with message: UnavailableShardsException[[cwl-2016.12.19][7] Not enough active copies to meet write consistency of [QUORUM] (have 1, needed 2). Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@7ac0d0eb]
2016-12-19 22:01:36,404 INFO  ElasticsearchEmitter - Emitted 0 records to Elasticsearch
2016-12-19 22:01:36,405 WARN  ElasticsearchEmitter - Cluster health is YELLOW.
2016-12-19 22:01:36,405 WARN  ElasticsearchEmitter - Returning 86 records as failed

I suspect this is simply me not understanding the proper configuration for Kibana 4, ES, and the CloudWatch log group. Here are my params:
[Screenshot: CloudFormation stack parameters]

Consumer doesn't restart after instance reboot

The Elasticsearch instance got (accidentally) rebooted, and afterwards no data was being put into Elasticsearch. I checked Kinesis on the AWS Console and it showed 0 get requests. I emailed Daniel, who got back to me super quick (thanks Daniel!) and suggested I rerun this portion of the CF script: https://github.com/awslabs/cloudwatch-logs-subscription-consumer/blob/master/configuration/cloudformation/cwl-elasticsearch.template#L675-L682

Dug out the variables from the CF resources and ran that script. Looks like something is happening but I'm still not getting any data in ES. Daniel said this was probably worthy of an issue. So here it is.

I'll gladly send any logs that you need, but I don't even know what to send at this point.

Thanks,
matthew
