
logstash-input-elasticsearch's People

Contributors

7lima, andrewvc, andsel, colinsurprenant, davidecavestro, dedemorton, edmocosta, electrical, ericamick, geekpete, jakelandis, jordansissel, jsvd, kaisecheng, karenzone, kares, lucabelluccini, magnusbaeck, mashhurs, mjpowersjr, ph, rgmz, robbavey, spacewrangler, suyograo, talevy, untergeek, xelibrion, yaauie, ycombinator


logstash-input-elasticsearch's Issues

Proxy path config as in logstash-output-elasticsearch

Hi!

Is it possible to use this input plugin with an Elasticsearch instance that sits behind a proxy which requires a path prefix in the HTTP URL?

https://user:passwd@elasticsearch.bitergia.com/eclipse

In logstash-output-elasticsearch you have the "path" option for this:

https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-path

I have tried to use:

input {
  elasticsearch {
   hosts => ["elasticsearch.bitergia.com/eclipse:443"]
   ssl => true
   # path => "eclipse"
   user => "user"
   password => "passwd"
   index => "gerrit_git.eclipse.org"
  }
}

which connects correctly, but uses ":443" as the index (not gerrit_git.eclipse.org).

acs@nintendo:~/devel$ logstash-2.2.2/bin/logstash  -f copy-index
A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::Elasticsearch index=>"gerrit_git.eclipse.org", hosts=>["elasticsearch.bitergia.com/eclipse:443"], ssl=>true, user=>"bitergia", password=><password>, codec=><LogStash::Codecs::JSON charset=>"UTF-8">, query=>"{\"query\": { \"match_all\": {} } }", scan=>true, size=>1000, scroll=>"1m", docinfo=>false, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"]>
  Error: [404] {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":":443","index":":443"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":":443","index":":443"},"status":404} {:level=>:error}

Restart behavior seems faulty - Logstash 6.0.0

I posted a topic on the Elastic discussion forum for this, but I'm creating an issue here because I believe this is a bug.

You can see the topic here:
https://discuss.elastic.co/t/elasticsearch-input-plugin-for-logstash-6-0-0-issue-indexing-data/110054

In brief, while the input plugin is trying to retrieve data from an index that doesn't exist, it keeps logging a 404 error. Once data exists in that index, the error stops, but the plugin neither fetches the data nor does anything else.
Only after I restart Logstash, with the data already present in the ES index, does it fetch the data and transform it into whatever I want it to be, so I believe the plugin does not restart properly.

Deprecate `size` setting

The size setting controls how many records are returned per scroll request, but it is easily confused with "how many records to return in total", which is not what it does.

In practice, users should not care about this setting, and I find no reason why a user would want to set it.

The default (1000) is probably fine for every use case.
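
To illustrate what the setting actually controls, here is a rough sketch using the elasticsearch-ruby client directly (host and index name are placeholders): size only sets the page size of each scroll round trip, while the loop still walks the entire result set.

require 'elasticsearch'

client = Elasticsearch::Client.new(hosts: ['localhost:9200'])

# `size` is the per-request page size, not a cap on the total number of documents.
response = client.search(index: 'logstash-*', scroll: '1m', size: 1000,
                         body: { query: { match_all: {} } })
total = 0
loop do
  hits = response['hits']['hits']
  break if hits.empty?
  total += hits.length
  response = client.scroll(scroll_id: response['_scroll_id'], scroll: '1m')
end
puts "scrolled #{total} documents, 1000 per request"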

[feature] Support configurations so it behaves like JDBC input

Currently, the ES input plugin has no way to pick up from where it left off the last time it executed, the way the JDBC input does with its sql_last_value value.

I propose a set of configuration options that make the ES input plugin query from a last known @timestamp (by default).
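
A rough sketch of the idea, analogous to sql_last_value (everything here is hypothetical, including the state-file location):

require 'time'

STATE_FILE = '/var/lib/logstash/es_input_last_value'  # hypothetical location

# Load the last @timestamp we saw; fall back to the epoch on the first run.
last_value = File.exist?(STATE_FILE) ? File.read(STATE_FILE).strip : '1970-01-01T00:00:00Z'

# Only ask Elasticsearch for documents newer than the last run.
query = {
  query: { range: { '@timestamp' => { gt: last_value } } },
  sort:  [{ '@timestamp' => 'asc' }]
}

# ...run the scroll with `query`, push events, then persist the newest
# @timestamp seen so the next run resumes from there:
# File.write(STATE_FILE, newest_timestamp.iso8601)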

"each host can be either IP, HOST," statement not correct for hosts array scenario

Documentation on hosts setting for the input is not clear:

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-elasticsearch.html#plugins-inputs-elasticsearch-hosts

It says

List of elasticsearch hosts to use for querying. each host can be either IP, HOST, IP:port or HOST:port port defaults to 9200

But if you have multiple hosts specified as an array, it doesn't know what to default the port to.

So you end up with something like the following for the case of ssl=>true.

[2017-05-01T18:12:36,315][ERROR][logstash.agent           ] Pipeline aborted due to error {:exception=>#<URI::InvalidURIError: the scheme https does not accept registry part: 127.0.0.1:https (or bad hostname?)>,

In other words, the statement "each host can be either IP, HOST," is not true if multiple hosts are specified in an array.

port config is ignored

Although the doc lists a "port" option, it looks like it is ignored entirely by the input.
Using logstash 1.5.0 RC2

Can suffix the hosts config with :port but only works for http, not https

For example
hosts => ["myhost:9243"]
port => 9243
ssl => true

The error reported is:
the scheme https does not accept registry part: my-host:9243:9200 (or bad hostname?)

Support for client certificate authentication

I wanted to create a pipeline from one cluster to another with Logstash. Both clusters are protected behind reverse proxies, which only speak SSL and require client certificates.

The logstash-output-elasticsearch plugin supports that nicely (keystore/truststore etc. properties), while the input only provides the ssl boolean option and the ca_file PEM file. I did not find a way to configure proper client certificate authentication.

Could this feature be added or is there a fundamental issue which prevents that?

  • Version: 5.4.0
  • Operating System: Linux

Out of Memory: Plugin retains too much memory per scan request

Because the run method keeps reusing the variable r, JRuby cannot free the previous response data efficiently.

Just by moving the request handling into the narrower, temporary scope of a separate function, memory usage is reduced radically:

# The whole response (r) now lives only inside this method, so it can be
# garbage-collected as soon as the method returns.
def run_next(output_queue, scroll_id)
  r = scroll_request(scroll_id)
  r['hits']['hits'].each do |hit|
    event = LogStash::Event.new(hit['_source'])
    decorate(event)

    if @docinfo
      event[@docinfo_target] ||= {}

      unless event[@docinfo_target].is_a?(Hash)
        @logger.error("Elasticsearch Input: Incompatible Event, incompatible type for the `@metadata` field in the `_source` document, expected a hash got:", :metadata_type => event[@docinfo_target].class)
        raise Exception.new("Elasticsearch input: incompatible event")
      end

      @docinfo_fields.each do |field|
        event[@docinfo_target][field] = hit[field]
      end
    end
    output_queue << event
  end

  # Return whether there was another page, plus the scroll id for the next call.
  [r['hits']['hits'].any?, r['_scroll_id']]
end

This function is hacky, but it paints a picture in terms of memory performance.

Before this change, runtime memory usage of a specific config reached 1457 MB. After the refactor, the same config's memory usage peaked at 642 MB on my local machine.

re: https://github.com/logstash-plugins/logstash-input-elasticsearch/blob/master/lib/logstash/inputs/elasticsearch.rb#L152
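
With this shape, the caller only ever holds the two small return values between iterations, never a full response. A rough sketch of such a caller loop (hypothetical, assuming the [more_hits, scroll_id] return value above):

has_hits, scroll_id = run_next(output_queue, initial_scroll_id)
while has_hits
  has_hits, scroll_id = run_next(output_queue, scroll_id)
end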

Add HTTPS and Basic auth

Two options:

a) Keep FTW and implement these features in it. For SSL, FTW requires a change because it's not possible to pass a .pem file to FTW::Agent
b) Rewrite the plugin to use es-ruby and implement the new features in the selected transport.

Addresses elastic/logstash#1996
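
For reference, option (b) could look roughly like the following sketch (assuming the elasticsearch-ruby client and Faraday transport options; host, credentials and paths are placeholders):

require 'elasticsearch'

client = Elasticsearch::Client.new(
  hosts: ['https://es.example.com:9200'],
  user: 'logstash',
  password: 'secret',
  # TLS options are handed down to the Faraday transport.
  transport_options: { ssl: { ca_file: '/etc/logstash/ca.pem' } }
)
client.info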

ca_file should accept an array for certificate chains.

For all general issues, please provide the following details for fast resolution:

  • Version: 5.2
  • Operating System: Any
  • Steps to Reproduce:

Based on the documentation:

ca_file should be a valid filesystem path. This needs to become an array accepting paths to multiple files where necessary for certificate chains.

Inconsistencies with ES output plugin

In 6.0.1 you can include the https:// prefix in the hosts field of the elasticsearch output, but if you do the same in the hosts field of the elasticsearch input plugin you get a very cryptic error message.

Since these plugins are all pieces of the same stack, the way options are expressed in pipeline configuration files should be consistent across the ES input, filter, and output plugins.

Support Aggregations

Not sure if this would be better supported by a totally different plugin, but I think it would make sense to support aggregation results.

As an example, say we wanted to use a sum aggregation, by terms.

GET myserverlogs*/_search
{

   "size": 0,
   "aggs": {
      "types": {
         "terms": {
            "field": "hostname"
         },
         "aggs": {
            "connections": {
               "sum": {
                  "field": "connections"
               }
            }
         }
      }
   }
}

Results:

  "aggregations": {
      "types": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "hostA",
               "doc_count": 90310,
               "total_connections": {
                  "value": 2344
               }
            },
            {
               "key": "hostB",
               "doc_count": 485,
               "total_connections": {
                  "value": 233
               }
            },
            {
               "key": "hostC",
               "doc_count": 485,
               "total_connections": {
                  "value": 123
               }
            }
         ]
      }
   }

Then in Logstash we could say something like:

input {
  elasticsearch {
    ...
    results_mode => "aggregation"
    source_array => "types.buckets"
  }
}

This could even be used to create aggregation based summary indices on a regular interval. I remember @polyfractal was working on an ES plugin that did something similar last year.
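
Internally, the plugin could then walk the configured bucket array and emit one event per bucket. A rough sketch of that (hypothetical; results_mode and source_array do not exist today, and `response` is the parsed search response shown above):

buckets = response['aggregations']['types']['buckets'] || []

buckets.each do |bucket|
  # One event per terms bucket, carrying the key and the nested sum.
  event = LogStash::Event.new(
    'key'               => bucket['key'],
    'doc_count'         => bucket['doc_count'],
    'total_connections' => bucket['total_connections']['value']
  )
  decorate(event)
  output_queue << event
end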

ES 1.X Support

This is a question, but it has been asked in the forums and gone unanswered, and there is no concrete answer anywhere. Does this plugin support reading input from an ES 1.x index? The earliest version mentioned is 2.2, and when I attempt to hit a 1.x index, I get the following error: ElasticsearchIllegalArgumentException[Failed to decode scrollId]. If indices that old are explicitly unsupported, that is fine, but please say so somewhere in the documentation. If they are supported, please explain how to resolve this error.

Plugin fails during testing and restarts again and again

Please post all product and debugging questions on our forum. Your questions will reach our wider community members there, and if we confirm that there is a bug, then we can open a new issue here.

For all general issues, please provide the following details for fast resolution:

  • Version: 5.4.0
  • Operating System: rhel6.5_64
  • Config File (if you have sensitive info, please remove it):
    input {
      elasticsearch {
        hosts => "192.168.144.65"
        index => "soatest"
        query => '{"query": {"match": {"flowno": 1}}}'
        docinfo => false
        enable_metric => false
      }
    }
    output {
      stdout { codec => rubydebug }
    }
  • Sample Data: {"flowno":"1","name":"qcc"}
  • Steps to Reproduce:
    I just use the basic functionality, but Logstash repeats this again and again:

[2017-10-26T14:27:27,679][ERROR][logstash.pipeline ] A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::Elasticsearch hosts=>["192.168.144.65"], index=>"soatest", query=>"{"query": {"match": {"flowno": 1}}}", docinfo=>false, enable_metric=>false, id=>"48ac806bfab03e422319b3e4bbbfe7de704b74e3-1", codec=><LogStash::Codecs::JSON id=>"json_ace8aefe-1f24-4733-a557-aaeb32005eea", enable_metric=>true, charset=>"UTF-8">, size=>1000, scroll=>"1m", docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
Error: [400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Failed to parse request body"}],"type":"illegal_argument_exception","reason":"Failed to parse request body","caused_by":{"type":"json_parse_exception","reason":"Unrecognized token 'DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAASOFklGT3M2S3U1U0pXMlA1S21ST1RTTlEAAAAAAAAEhhZVcjVncWhhMVNpNm5ITnRwaG9HLWRRAAAAAAAABIcWVXI1Z3FoYTFTaTZuSE50cGhvRy1kUQAAAAAAAASNFklGT3M2S3U1U0pXMlA1S21ST1RTTlEAAAAAAAAEjxZJRk9zNkt1NVNKVzJQNUttUk9UU05R': was expecting ('true', 'false' or 'null')\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@4e3f57c0; line: 1, column: 457]"}},"status":400}
{
"@timestamp" => 2017-10-26T06:27:27.669Z,
"name" => "qcc",
"@Version" => "1",
"flowno" => "1"
}
[2017-10-26T14:27:28,933][ERROR][logstash.pipeline ] A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::Elasticsearch hosts=>["192.168.144.65"], index=>"soatest", query=>"{"query": {"match": {"flowno": 1}}}", docinfo=>false, enable_metric=>false, id=>"48ac806bfab03e422319b3e4bbbfe7de704b74e3-1", codec=><LogStash::Codecs::JSON id=>"json_ace8aefe-1f24-4733-a557-aaeb32005eea", enable_metric=>true, charset=>"UTF-8">, size=>1000, scroll=>"1m", docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
Error: [400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Failed to parse request body"}],"type":"illegal_argument_exception","reason":"Failed to parse request body","caused_by":{"type":"json_parse_exception","reason":"Unrecognized token 'DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAASIFlVyNWdxaGExU2k2bkhOdHBob0ctZFEAAAAAAAAEkRZJRk9zNkt1NVNKVzJQNUttUk9UU05RAAAAAAAABJAWSUZPczZLdTVTSlcyUDVLbVJPVFNOUQAAAAAAAASJFlVyNWdxaGExU2k2bkhOdHBob0ctZFEAAAAAAAAEihZVcjVncWhhMVNpNm5ITnRwaG9HLWRR': was expecting ('true', 'false' or 'null')\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@7392d1bc; line: 1, column: 457]"}},"status":400}
{
"@timestamp" => 2017-10-26T06:27:28.924Z,
"name" => "qcc",
"@Version" => "1",
"flowno" => "1"
}
^C[2017-10-26T14:27:29,699][WARN ][logstash.runner ] SIGINT received. Shutting down the agent.
[2017-10-26T14:27:29,711][WARN ][logstash.agent ] stopping pipeline {:id=>"main"}
{
"@timestamp" => 2017-10-26T06:27:30.018Z,
"name" => "qcc",
"@Version" => "1",
"flowno" => "1"
}

JRuby 9K failures

document info to the @metadata field
     Failure/Error: (output_func(event) || []).each do |output|
     
     NoMethodError:
       undefined method `each' for #<Queue:0x23c72128>
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:400:in `block in output_batch'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/util/wrapped_synchronous_queue.rb:224:in `block in each'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/util/wrapped_synchronous_queue.rb:223:in `each'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:394:in `output_batch'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:352:in `worker_loop'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:317:in `block in start_workers'
  4) LogStash::Inputs::Elasticsearch with Elasticsearch document information when defining docinfo merges the values if the `docinfo_target` already exist in the `_source` document
     Failure/Error: (output_func(event) || []).each do |output|
     
     NoMethodError:
       undefined method `each' for #<Queue:0x6eb00cbe>
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:400:in `block in output_batch'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/util/wrapped_synchronous_queue.rb:224:in `block in each'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/util/wrapped_synchronous_queue.rb:223:in `each'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:394:in `output_batch'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:352:in `worker_loop'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:317:in `block in start_workers'
  5) LogStash::Inputs::Elasticsearch with Elasticsearch document information when defining docinfo should move the document information to the specified field
     Failure/Error: (output_func(event) || []).each do |output|
     
     NoMethodError:
       undefined method `each' for #<Queue:0xc30f26d>
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:400:in `block in output_batch'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/util/wrapped_synchronous_queue.rb:224:in `block in each'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/util/wrapped_synchronous_queue.rb:223:in `each'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:394:in `output_batch'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:352:in `worker_loop'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:317:in `block in start_workers'
  6) LogStash::Inputs::Elasticsearch with Elasticsearch document information when defining docinfo should allow to specify which fields from the document info to save to the @metadata field
     Failure/Error: (output_func(event) || []).each do |output|
     
     NoMethodError:
       undefined method `each' for #<Queue:0x16944b58>
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:400:in `block in output_batch'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/util/wrapped_synchronous_queue.rb:224:in `block in each'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/util/wrapped_synchronous_queue.rb:223:in `each'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:394:in `output_batch'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:352:in `worker_loop'
     # /home/travis/.rvm/gems/jruby-9.1.9.0/gems/logstash-core-5.4.0-java/lib/logstash/pipeline.rb:317:in `block in start_workers'

basic config not working in 1.5.0?

Not sure if I'm missing anything here; from the docs this should work.

config

abonuccelli@w530 /opt/elk/TEST/logstash-1.5.0 $ cat config/test.conf 
input { 
    elasticsearch {  
            hosts => "w530" 
            port => "9203"
            index => "test"
    } 
}
filter {
}
output {
    stdout{ codec=>rubydebug }
}

run

abonuccelli@w530 /opt/elk/TEST/logstash-1.5.0 $ ./bin/logstash -f config/9661.conf  --debug
Reading config file {:file=>"logstash/agent.rb", :level=>:debug, :line=>"293", :method=>"local_config"}
Compiled pipeline code:
        @inputs = []
        @filters = []
        @outputs = []
        @periodic_flushers = []
        @shutdown_flushers = []

          @input_elasticsearch_1 = plugin("input", "elasticsearch", LogStash::Util.hash_merge_many({ "hosts" => ("w530") }, { "port" => ("9203") }))

          @inputs << @input_elasticsearch_1


          @output_stdout_2 = plugin("output", "stdout", LogStash::Util.hash_merge_many({ "codec" => ("rubydebug") }))

          @outputs << @output_stdout_2

  def filter_func(event)
    events = [event]
    @logger.debug? && @logger.debug("filter received", :event => event.to_hash)

    events
  end
  def output_func(event)
    @logger.debug? && @logger.debug("output received", :event => event.to_hash)
    @output_stdout_2.handle(event)

  end {:level=>:debug, :file=>"logstash/pipeline.rb", :line=>"28", :method=>"initialize"}
Plugin not defined in namespace, checking for plugin file {:type=>"input", :name=>"elasticsearch", :path=>"logstash/inputs/elasticsearch", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
Using version 0.1.x input plugin 'elasticsearch'. This plugin isn't well supported by the community and likely has no maintainer. {:level=>:info, :file=>"logstash/config/mixin.rb", :line=>"223", :method=>"print_version_notice"}
Plugin not defined in namespace, checking for plugin file {:type=>"codec", :name=>"json", :path=>"logstash/codecs/json", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
Using version 0.1.x codec plugin 'json'. This plugin isn't well supported by the community and likely has no maintainer. {:level=>:info, :file=>"logstash/config/mixin.rb", :line=>"223", :method=>"print_version_notice"}
config LogStash::Codecs::JSON/@charset = "UTF-8" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@hosts = ["w530"] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@port = 9203 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@debug = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@codec = <LogStash::Codecs::JSON charset=>"UTF-8"> {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@add_field = {} {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@index = "logstash-*" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@query = "{\"query\": { \"match_all\": {} } }" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@scan = true {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@size = 1000 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@scroll = "1m" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@docinfo = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@docinfo_target = "@metadata" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@docinfo_fields = ["_index", "_type", "_id"] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@ssl = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
Plugin not defined in namespace, checking for plugin file {:type=>"output", :name=>"stdout", :path=>"logstash/outputs/stdout", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
Using version 0.1.x output plugin 'stdout'. This plugin isn't well supported by the community and likely has no maintainer. {:level=>:info, :file=>"logstash/config/mixin.rb", :line=>"223", :method=>"print_version_notice"}
Plugin not defined in namespace, checking for plugin file {:type=>"codec", :name=>"rubydebug", :path=>"logstash/codecs/rubydebug", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
Using version 0.1.x codec plugin 'rubydebug'. This plugin isn't well supported by the community and likely has no maintainer. {:level=>:info, :file=>"logstash/config/mixin.rb", :line=>"223", :method=>"print_version_notice"}
config LogStash::Codecs::RubyDebug/@metadata = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::Stdout/@codec = <LogStash::Codecs::RubyDebug > {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::Stdout/@type = "" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::Stdout/@tags = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::Stdout/@exclude_tags = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::Stdout/@workers = 1 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
Pipeline started {:level=>:info, :file=>"logstash/pipeline.rb", :line=>"86", :method=>"run"}
Logstash startup completed
A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::Elasticsearch hosts=>[{:host=>"w530", :port=>9200, :protocol=>"http"}], index=>"logstash-*", query=>"{\"query\": { \"match_all\": {} } }", scroll=>"1m", docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"]>
  Error: Connection reset by peer - Connection reset by peer
  Exception: Faraday::ConnectionFailed
  Stack: org/jruby/RubyIO.java:2858:in `read_nonblock'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/protocol.rb:141:in `rbuf_fill'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/protocol.rb:122:in `readuntil'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/protocol.rb:132:in `readline'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:2571:in `read_status_line'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:2560:in `read_new'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:1328:in `transport_request'
org/jruby/RubyKernel.java:1270:in `catch'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:1325:in `transport_request'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:1302:in `request'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:1295:in `request'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:746:in `start'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:1293:in `request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:82:in `perform_request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:40:in `call'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:87:in `with_net_http_connection'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:32:in `call'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/rack_builder.rb:139:in `build_response'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/connection.rb:377:in `run_request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/http/faraday.rb:24:in `perform_request'
org/jruby/RubyProc.java:271:in `call'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/base.rb:187:in `perform_request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/client.rb:115:in `perform_request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/elasticsearch-api-1.0.7/lib/elasticsearch/api/actions/search.rb:159:in `search'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-input-elasticsearch-0.1.5/lib/logstash/inputs/elasticsearch.rb:146:in `run'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0-java/lib/logstash/pipeline.rb:177:in `inputworker'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0-java/lib/logstash/pipeline.rb:171:in `start_input' {:level=>:error, :file=>"logstash/pipeline.rb", :line=>"182", :method=>"inputworker"}
org/jruby/RubyIO.java:2858:in `read_nonblock'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/protocol.rb:141:in `rbuf_fill'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/protocol.rb:122:in `readuntil'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/protocol.rb:132:in `readline'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:2571:in `read_status_line'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:2560:in `read_new'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:1328:in `transport_request'
org/jruby/RubyKernel.java:1270:in `catch'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:1325:in `transport_request'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:1302:in `request'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:1295:in `request'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:746:in `start'
/opt/elk/TEST/logstash-1.5.0/vendor/jruby/lib/ruby/1.9/net/http.rb:1293:in `request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:82:in `perform_request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:40:in `call'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:87:in `with_net_http_connection'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:32:in `call'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/rack_builder.rb:139:in `build_response'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/connection.rb:377:in `run_request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/http/faraday.rb:24:in `perform_request'
org/jruby/RubyProc.java:271:in `call'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/base.rb:187:in `perform_request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/client.rb:115:in `perform_request'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/elasticsearch-api-1.0.7/lib/elasticsearch/api/actions/search.rb:159:in `search'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-input-elasticsearch-0.1.5/lib/logstash/inputs/elasticsearch.rb:146:in `run'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0-java/lib/logstash/pipeline.rb:177:in `inputworker'
/opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0-java/lib/logstash/pipeline.rb:171:in `start_input'
Plugin is finished {:plugin=><LogStash::Inputs::Elasticsearch hosts=>[{:host=>"w530", :port=>9200, :protocol=>"http"}], index=>"logstash-*", query=>"{\"query\": { \"match_all\": {} } }", scroll=>"1m", docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"]>, :level=>:info, :file=>"logstash/plugin.rb", :line=>"61", :method=>"finished"}
^CSIGINT received. Shutting down the pipeline. {:level=>:warn, :file=>"logstash/agent.rb", :line=>"116", :method=>"execute"}
Sending shutdown signal to input thread {:thread=>#<Thread:0xd0578df sleep>, :level=>:info, :file=>"logstash/pipeline.rb", :line=>"258", :method=>"shutdown"}
Plugin is finished {:plugin=><LogStash::Outputs::Stdout >, :level=>:info, :file=>"logstash/plugin.rb", :line=>"61", :method=>"finished"}
Pipeline shutdown complete. {:level=>:info, :file=>"logstash/pipeline.rb", :line=>"100", :method=>"run"}
Logstash shutdown completed
LogStash::ShutdownSignal: LogStash::ShutdownSignal
        sleep at org/jruby/RubyKernel.java:834
  inputworker at /opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0-java/lib/logstash/pipeline.rb:195
  start_input at /opt/elk/TEST/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0-java/lib/logstash/pipeline.rb:171
abonuccelli@w530 /opt/elk/TEST/logstash-1.5.0 $ 

The same config (with the older host parameter name instead of hosts) works in 1.4.2 in the same setup.

config

abonuccelli@w530 /opt/elk/TEST/logstash-1.4.2 $ cat config/9661.conf 
input { 
    elasticsearch {  
            host => "w530" 
            index => "test"
            port => "9203"
    } 
}

filter {

}
output {
    stdout{ codec=>rubydebug }
}

run

abonuccelli@w530 /opt/elk/TEST/logstash-1.4.2 $ ./bin/logstash -f config/9661.conf  --debug
Reading config file {:file=>"logstash/agent.rb", :level=>:debug, :line=>"301"}
Compiled pipeline code:
@inputs = []
@filters = []
@outputs = []
@input_elasticsearch_1 = plugin("input", "elasticsearch", LogStash::Util.hash_merge_many({ "host" => ("w530".force_encoding("UTF-8")) }, { "index" => ("test".force_encoding("UTF-8")) }, { "port" => ("9203".force_encoding("UTF-8")) }))

@inputs << @input_elasticsearch_1

@output_stdout_3 = plugin("output", "stdout", LogStash::Util.hash_merge_many({ "codec" => ("rubydebug".force_encoding("UTF-8")) }))

@outputs << @output_stdout_3
  @filter_func = lambda do |event, &block|
    extra_events = []
    @logger.debug? && @logger.debug("filter received", :event => event.to_hash)
    newevents = []
    extra_events.each do |event|
      @filter_mutate_2.filter(event) do |newevent|
        newevents << newevent
      end
    end
    extra_events += newevents
    @filter_mutate_2.filter(event) do |newevent|
      extra_events << newevent
    end
    if event.cancelled?
      extra_events.each(&block)
      return
    end

    extra_events.each(&block)
  end
  @output_func = lambda do |event, &block|
    @logger.debug? && @logger.debug("output received", :event => event.to_hash)
    @output_stdout_3.handle(event)

  end {:level=>:debug, :file=>"logstash/pipeline.rb", :line=>"26"}
Using milestone 1 input plugin 'elasticsearch'. This plugin should work, but would benefit from use by folks like you. Please let us know if you find bugs or have suggestions on how to improve this plugin.  For more information on plugin milestones, see http://logstash.net/docs/1.4.2/plugin-milestones {:level=>:warn, :file=>"logstash/config/mixin.rb", :line=>"209"}
config LogStash::Codecs::JSON/@charset = "UTF-8" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Inputs::Elasticsearch/@host = "w530" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Inputs::Elasticsearch/@index = "test" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Inputs::Elasticsearch/@port = 9203 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Inputs::Elasticsearch/@debug = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Inputs::Elasticsearch/@codec = <LogStash::Codecs::JSON charset=>"UTF-8"> {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Inputs::Elasticsearch/@add_field = {} {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Inputs::Elasticsearch/@query = "*" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Inputs::Elasticsearch/@scan = true {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Inputs::Elasticsearch/@size = 1000 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Inputs::Elasticsearch/@scroll = "1m" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Filters::Mutate/@add_field = {"_id"=>"%{[@metadata][_id]}"} {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Filters::Mutate/@type = "" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Filters::Mutate/@tags = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Filters::Mutate/@exclude_tags = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Filters::Mutate/@add_tag = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Filters::Mutate/@remove_tag = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Filters::Mutate/@remove_field = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Outputs::Stdout/@codec = <LogStash::Codecs::RubyDebug > {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Outputs::Stdout/@type = "" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Outputs::Stdout/@tags = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Outputs::Stdout/@exclude_tags = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
config LogStash::Outputs::Stdout/@workers = 1 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"105"}
Pipeline started {:level=>:info, :file=>"logstash/pipeline.rb", :line=>"78"}
filter received {:event=>{"field"=>"value", "@version"=>"1", "@timestamp"=>"2015-05-19T12:34:02.304Z"}, :level=>:debug, :file=>"(eval)", :line=>"15"}
filters/LogStash::Filters::Mutate: adding value to field {:field=>"_id", :value=>["%{[@metadata][_id]}"], :level=>:debug, :file=>"logstash/filters/base.rb", :line=>"168"}
output received {:event=>{"field"=>"value", "@version"=>"1", "@timestamp"=>"2015-05-19T12:34:02.304Z", "@metadata"=>{}, "_id"=>"%{[@metadata][_id]}"}, :level=>:debug, :file=>"(eval)", :line=>"34"}
Plugin is finished {:plugin=><LogStash::Inputs::Elasticsearch host=>"w530", index=>"test", query=>"*", scroll=>"1m">, :level=>:info, :file=>"logstash/plugin.rb", :line=>"59"}
Plugin is finished {:plugin=><LogStash::Filters::Mutate add_field=>{"_id"=>"%{[@metadata][_id]}"}>, :level=>:info, :file=>"logstash/plugin.rb", :line=>"59"}{
         "field" => "value",
      "@version" => "1",
    "@timestamp" => "2015-05-19T12:34:02.304Z",
}

Plugin is finished {:plugin=><LogStash::Outputs::Stdout >, :level=>:info, :file=>"logstash/plugin.rb", :line=>"59"}
Pipeline shutdown complete. {:level=>:info, :file=>"logstash/pipeline.rb", :line=>"89"}

Should we raise an exception upon an invalid @docinfo_target?

Per discussion with @guyboertje in PR #35 (comment):

It seems that raising an exception in this context is rather extreme? See:

unless docinfo_target.is_a?(Hash)
  @logger.error("Elasticsearch Input: Incompatible Event, incompatible type for the docinfo_target=#{@docinfo_target} field in the `_source` document, expected a hash got:", :docinfo_target_type => docinfo_target.class, :event => event)
  # TODO: (colin) I am not sure raising is a good strategy here?
  raise Exception.new("Elasticsearch input: incompatible event")
end

Elasticsearch Input filter needs to be able to limit queries to only the latest index

(This issue was originally filed by @elvarb at elastic/logstash#2912)


It seems that the Elasticsearch Input is designed to pull all data from one cluster.

Being able to search only the last x minutes/hours/days from now would open up the possibility of using Logstash to query log data stored in Elasticsearch and either alert on it or create a summary.

Using the Kibana index config you can specify an index pattern like "[logstash-]YYYY.MM.DD". The input could similarly be configured to run every x and look back y (in most cases the same value). For example: run every 10 minutes and search from "now-10m" to "now" with the query "level:error".

Then, using the count of returned documents and a conditional, you could send an email when that count exceeds some threshold.

Another use case for this would be to create daily reports.
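
For what it's worth, the time restriction itself can already be expressed in the query body with standard Elasticsearch date math. A sketch of the 10-minute error query described above, written as a Ruby hash (field names assumed):

query = {
  query: {
    bool: {
      must:   { query_string: { query: 'level:error' } },
      filter: { range: { '@timestamp' => { gte: 'now-10m', lt: 'now' } } }
    }
  }
}
# The missing pieces this issue asks for are the scheduling ("run every 10
# minutes") and limiting the search to only the latest index.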

results of the first scroll request aren't sent to the filter queue

The run method (after pr/14 is merged) looks something like:

    # get first wave of data
    r = @client.search(@options)
    has_hits = r['hits']['hits'].any?

    # since 'scan' doesn't return data on the search call, do an extra scroll
    if @scan
      r = run_next(output_queue, r['_scroll_id'])
      has_hits = r['has_hits']
    end

    while has_hits do
      r = run_next(output_queue, r['_scroll_id'])
      has_hits = r['has_hits']
    end

The results of the first search request are never sent to the queue, because that happens in run_next, which by itself starts a new scroll request.
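
One way to fix this is to push the hits of every response, including the first one, before asking for the next scroll page. A rough sketch (assuming a hypothetical process_hits(response, output_queue) helper that pushes each hit and returns whether the page had any hits; the @scan special case is left out for brevity):

r = @client.search(@options)
has_hits = process_hits(r, output_queue)   # the first page is no longer dropped

while has_hits
  r = scroll_request(r['_scroll_id'])
  has_hits = process_hits(r, output_queue)
end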

spec suite helper doesn't properly inject output_func

This suite relies on a helper that injects a custom output function into the pipeline to extract all events pushed by the input plugin. This doesn't work because the pipeline never uses the @output_func that this helper defines.

Interval is needed as config option

Hi there,

I want to query periodically using a date range like (now-1h, now).

From a quick look at the code, this plugin executes the query only once.

Is there a plan to add an interval option?

Add the ability to use the new Sliced Scroll feature in 5.0

Add the ability to use the new Sliced Scroll feature in Elasticsearch 5.0:
https://www.elastic.co/guide/en/elasticsearch/reference/5.0/search-request-scroll.html#sliced-scroll

This should result in performance improvements when scrolling out of a 5.0 ES endpoint.

If multiple Logstash instances are being used then each host could use a configuration that is pointing at a different slice of the same scroll, though coordination of this would be manual.
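
As a sketch of how two Logstash hosts could split one scroll, each instance would run with a different slice id (this is plain Elasticsearch 5.0 API, not an existing plugin option):

slice_id = 0   # the second instance would run with slice_id = 1

response = @client.search(
  index:  'logstash-*',
  scroll: '1m',
  size:   1000,
  body: {
    slice: { id: slice_id, max: 2 },
    query: { match_all: {} }
  }
)
# ...then scroll each slice independently, exactly as with a normal scroll.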

'add_field' and 'tags' can't reference metadata fields

The plugin's push_hit method basically looks like this:

def push_hit(hit, output_queue)
    event = LogStash::Event.new(hit['_source'])
    decorate(event)

    # populate metadata fields if docinfo is enabled

    output_queue << event
  end

Since we're calling decorate(event) prior to the population of the metadata fields any add_field or tags options can't reference the metadata fields. Consequently, this configuration won't work:

input {
  elasticsearch {
    ...
    docinfo => true
    add_field => {
      'some-field-name' => "%{[@metadata][_index]}"
    }
  }
}

Postponing the decorate(event) call until after the metadata population should fix this.
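
A sketch of the reordered method described above (using the newer event.set API; otherwise the same shape as push_hit today):

def push_hit(hit, output_queue)
  event = LogStash::Event.new(hit['_source'])

  # Populate the docinfo fields first...
  if @docinfo
    @docinfo_fields.each do |field|
      event.set("[#{@docinfo_target}][#{field}]", hit[field])
    end
  end

  # ...so that add_field / tags templates can reference them when we decorate.
  decorate(event)

  output_queue << event
end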

Enable ignore_unavailable for elasticsearch input plugin

From @tom-christie elastic/logstash#4696


The REST API provides a parameter called ignore_unavailable which ignores unavailable (including closed) indices, i.e. ignore_unavailable=true.

It would be great if this parameter was available for the Elasticsearch Input plugin for Logstash! It is supported by the elasticsearch-js API. See for example here.

Right now I get an error like this when I try to Input from an Elasticsearch instance with closed indices:

Error: [403] {"error":"IndexClosedException[[MY_INDEX] closed]","status":403}

Typo in example s/host/hosts

There is a typo in the Example section, but the source looks correct, so the documentation probably was not regenerated.

input {
  # Read all documents from Elasticsearch matching the given query
  elasticsearch {
    host => "localhost"
    query => '{ "query": { "match": { "statuscode": 200 } } }'
  }
}

host should be changed to hosts

Shuts down immediately

This config causes Logstash to shut down immediately after writing to stdout. I cannot determine why, but the Elasticsearch input seems to be the cause. Thoughts?

input {
    elasticsearch {
        hosts           => "localhost"
        query           => '{"query" : { "match" : {"message" : "test"}}}'
        docinfo         => false
    }

}

filter {
}

output {
    stdout {
        codec           => "json"
    }
}

Additional note:
I originally had two outputs, the second writing to an S3 bucket. The s3 output plugin would write to its temporary file, but Logstash would shut down before the temporary file could be uploaded to the bucket.

I tried the version packaged with Logstash 1.5.0 and I tried rebuilding the plugin from source.

Connection Issues

I've recently been attempting to use the newer version of this plugin for the metadata support for reindexing. When I run Logstash it runs for a while before displaying a "connection refused" Faraday error; I'm assuming it's retrying/timing out. It could be related to my ES setup, but I can run Logstash 1.4.2 with its older version of the plugin fine. I also tried backporting the current plugin version to 1.4.2 and got the same error.

Setup
Single host with both Logstash and ES
Elasticsearch 1.4.4
Logstash 1.5rc2

config

input {
  elasticsearch {
    hosts => ["localhost"]
    docinfo => true
    index => "alerts-"
    query => ""
  }
}

filter {
}

output {
  stdout {
    codec => "rubydebug"
  }
}

Error from verbose output (-vv)
A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::Elasticsearch hosts=>[{:host=>"localhost", :port=>9200, :protocol=>"http"}], index=>"alerts-", query=>"", scroll=>"1m", docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"]>
Error: Connection refused - Connection refused
Exception: Faraday::ConnectionFailed
Stack: org/jruby/ext/socket/RubyTCPSocket.java:126:in `initialize'
org/jruby/RubyIO.java:1177:in `open'
/home/admin/logstash1.5/vendor/jruby/lib/ruby/1.9/net/http.rb:763:in `connect'
org/jruby/ext/timeout/Timeout.java:104:in `timeout'
/home/admin/logstash1.5/vendor/jruby/lib/ruby/1.9/net/http.rb:763:in `connect'
/home/admin/logstash1.5/vendor/jruby/lib/ruby/1.9/net/http.rb:756:in `do_start'
/home/admin/logstash1.5/vendor/jruby/lib/ruby/1.9/net/http.rb:745:in `start'
/home/admin/logstash1.5/vendor/jruby/lib/ruby/1.9/net/http.rb:1293:in `request'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:82:in `perform_request'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:40:in `call'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:87:in `with_net_http_connection'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/adapter/net_http.rb:32:in `call'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/rack_builder.rb:139:in `build_response'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/faraday-0.9.1/lib/faraday/connection.rb:377:in `run_request'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/http/faraday.rb:24:in `perform_request'
org/jruby/RubyProc.java:271:in `call'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/base.rb:187:in `perform_request'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/client.rb:115:in `perform_request'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/elasticsearch-api-1.0.7/lib/elasticsearch/api/actions/search.rb:159:in `search'
/home/admin/logstash1.5/vendor/bundle/jruby/1.9/gems/logstash-input-elasticsearch-0.1.4/lib/logstash/inputs/elasticsearch.rb:146:in `run'
/home/admin/logstash1.5/lib/logstash/pipeline.rb:174:in `inputworker'
/home/admin/logstash1.5/lib/logstash/pipeline.rb:168:in `start_input' {:level=>:error, :file=>"logstash/pipeline.rb", :line=>"179", :method=>"inputworker"}

switch from scan to scroll

  • Version: v5.0.0-rc1
  • Operating System: Mac 10.12

Using a simple config:

input { elasticsearch { index => "citibike-trips" } }
filter {}
output { stdout { codec => "rubydebug" } }

The following error is thrown:

[2016-10-12T14:52:39,634][ERROR][logstash.pipeline        ] A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::Elasticsearch index=>"citibike-trips", id=>"ca1129a233dae70a186b30f08aa829bac64b30e0-1", enable_metric=>true, codec=><LogStash::Codecs::JSON id=>"json_1d4cf6cb-7ca6-4404-a9d7-96fde965269b", enable_metric=>true, charset=>"UTF-8">, query=>"{\"query\": { \"match_all\": {} } }", scan=>true, size=>1000, scroll=>"1m", docinfo=>false, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
  Error: [400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"No search type for [scan]"}],"type":"illegal_argument_exception","reason":"No search type for [scan]"},"status":400}

This appears to happen because this plugin uses the scan search type, which was removed in Elasticsearch v5.

Add index pattern / date math support to the index => setting

Currently the index => setting does not seem to support index patterns like logstash-%{+YYYY.MM.dd} as the ES output does.
Index pattern or date math support could be useful in some cases, e.g. when Logstash is run for batch processing.

  • Config File (if you have sensitive info, please remove it):
input {
        elasticsearch {
                hosts => "localhost"
                index => "logstash-%{+YYYY.MM.dd}"
                query => '{ "_source": ["@timestamp"],
                "query": { "match_all": {} }
            }'
        }
}

output {
    stdout { codec => rubydebug }
}

Running with this config results in:

Plugin: <LogStash::Inputs::Elasticsearch hosts=>["localhost"], index=>"logstash-%{+YYYY.MM.dd}" ....

Handle connection issues better

Migrated from elastic/logstash#1634

I have ES running on a different port than the default (9200) but have not defined a port parameter explicitly in the configuration; the result is the following:

  Error: undefined method `bytesize' for nil:NilClass
  Exception: NoMethodError
  Stack: /Users/<user_name>/ELK/logstash-1.4.1/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/http/message.rb:65:in `body='
/Users/<user_name>/ELK/logstash-1.4.1/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:298:in `request'
/Users/<user_name>/ELK/logstash-1.4.1/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:217:in `post!'
/Users/<user_name>/ELK/logstash-1.4.1/lib/logstash/inputs/elasticsearch.rb:95:in `run'
/Users/<user_name>/ELK/logstash-1.4.1/lib/logstash/pipeline.rb:163:in `inputworker'
/Users/<user_name>/ELK/logstash-1.4.1/lib/logstash/pipeline.rb:157:in `start_input' {:level=>:error, :file=>"logstash/pipeline.rb", :line=>"168"}

This is a confusing error. It can certainly be addressed by specifying a port parameter for the non-standard port, but it would be nice for the plugin to throw a more intuitive error message. :)

@jordansissel confirmed this in the original issue.

Documentation around "query" parameter wrong? Bad escaping/parsing?

  • Version: 2.3.x
  • Operating System: Linux

Hey team,

It looks like the documentation around the query parameter is incorrect. It states:

Default value is "{\"query\": { \"match_all\": {} } }"

However, if I just copy and paste that in as the value for the query parameter, Elasticsearch throws a parsing exception. For example, with this input configuration:

input {
  # We read from the "old" index
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "public_toilets"
    query => "{\"query\": { \"match_all\": {} } }"
    size => 500
    scroll => "5m"
    docinfo => true
  }
}

Logstash throws the following exception:

A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::Elasticsearch hosts=>["http://localhost:9200"], index=>"public_toilets", query=>"{ \\\"query\\\": { \\\"type\\\": { \\\"value\\\": \\\"toilets\\\" } } }", size=>500, scroll=>"5m", docinfo=>true, codec=><LogStash::Codecs::JSON charset=>"UTF-8">, scan=>true, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
  Error: [500] {"error":{"root_cause":[{"type":"json_parse_exception","reason":"Unexpected character ('\\' (code 92)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name\n at [Source: [B@2fee7767; line: 1, column: 4]"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"init_scan","grouped":true,"failed_shards":[{"shard":0,"index":"public_toilets","node":"FQEp7OpWTruaD_eOjiqLYg","reason":{"type":"json_parse_exception","reason":"Unexpected character ('\\' (code 92)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name\n at [Source: [B@2fee7767; line: 1, column: 4]"}}]},"status":500} {:level=>:error}

And I see the following exception in the Elasticsearch logs:

es_1 | [2016-07-21 01:19:06,334][WARN ][rest.suppressed          ] path: /public_toilets/_search, params: {size=500, scroll=5m, index=public_toilets, search_type=scan}
es_1 | Failed to execute phase [init_scan], all shards failed; shardFailures {[FQEp7OpWTruaD_eOjiqLYg][public_toilets][0]: RemoteTransportException[[escluster1_es_1][172.18.0.2:9300][indices:data/read/search[phase/scan]]]; nested: SearchParseException[failed to parse search source [{\"query\": { \"match_all\": {} } }]]; nested: JsonParseException[Unexpected character ('\' (code 92)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name
es_1 |  at [Source: [B@5f22c259; line: 1, column: 3]]; }
es_1 |  at org.elasticsearch.action.search.AbstractSearchAsyncAction.onFirstPhaseResult(AbstractSearchAsyncAction.java:206)
es_1 |  at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:152)
es_1 |  at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:46)
es_1 |  at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:855)
es_1 |  at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:833)
es_1 |  at org.elasticsearch.transport.DelegatingTransportChannel.sendResponse(DelegatingTransportChannel.java:68)
es_1 |  at org.elasticsearch.transport.RequestHandlerRegistry$TransportChannelWrapper.sendResponse(RequestHandlerRegistry.java:146)
es_1 |  at org.elasticsearch.shield.transport.ShieldServerTransportService$ProfileSecuredRequestHandler.messageReceived(ShieldServerTransportService.java:182)
es_1 |  at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
es_1 |  at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
es_1 |  at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
es_1 |  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
es_1 |  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
es_1 |  at java.lang.Thread.run(Thread.java:745)
es_1 | Caused by: [Unexpected character ('\' (code 92)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name
es_1 |  at [Source: [B@5f22c259; line: 1, column: 3]]; nested: JsonParseException[Unexpected character ('\' (code 92)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name
es_1 |  at [Source: [B@5f22c259; line: 1, column: 3]];
es_1 |  at org.elasticsearch.ElasticsearchException.guessRootCauses(ElasticsearchException.java:386)
es_1 |  at org.elasticsearch.action.search.SearchPhaseExecutionException.guessRootCauses(SearchPhaseExecutionException.java:152)
es_1 |  at org.elasticsearch.action.search.SearchPhaseExecutionException.getCause(SearchPhaseExecutionException.java:99)
es_1 |  at java.lang.Throwable.printStackTrace(Throwable.java:665)
es_1 |  at java.lang.Throwable.printStackTrace(Throwable.java:721)
es_1 |  at org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:60)
es_1 |  at org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)
es_1 |  at org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)
es_1 |  at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313)
es_1 |  at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
es_1 |  at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
es_1 |  at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
es_1 |  at org.apache.log4j.Category.callAppenders(Category.java:206)
es_1 |  at org.apache.log4j.Category.forcedLog(Category.java:391)
es_1 |  at org.apache.log4j.Category.log(Category.java:856)
es_1 |  at org.elasticsearch.common.logging.log4j.Log4jESLogger.internalWarn(Log4jESLogger.java:135)
es_1 |  at org.elasticsearch.common.logging.support.AbstractESLogger.warn(AbstractESLogger.java:109)
es_1 |  at org.elasticsearch.rest.BytesRestResponse.convert(BytesRestResponse.java:134)
es_1 |  at org.elasticsearch.rest.BytesRestResponse.<init>(BytesRestResponse.java:96)
es_1 |  at org.elasticsearch.rest.BytesRestResponse.<init>(BytesRestResponse.java:87)
es_1 |  at org.elasticsearch.rest.action.support.RestActionListener.onFailure(RestActionListener.java:60)
es_1 |  at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:95)
es_1 |  at org.elasticsearch.shield.action.ShieldActionFilter$SigningListener.onFailure(ShieldActionFilter.java:203)
es_1 |  at org.elasticsearch.action.support.TransportAction$FilteredActionListener.onFailure(TransportAction.java:242)
es_1 |  at org.elasticsearch.action.search.AbstractSearchAsyncAction.raiseEarlyFailure(AbstractSearchAsyncAction.java:294)
es_1 |  ... 14 more
es_1 | Caused by: com.fasterxml.jackson.core.JsonParseException: Unexpected character ('\' (code 92)): was expecting either valid name character (for unquoted name) or double-quote (for quoted) to start field name
es_1 |  at [Source: [B@5f22c259; line: 1, column: 3]
es_1 |  at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1581)
es_1 |  at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:533)
es_1 |  at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:462)
es_1 |  at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._handleOddName(UTF8StreamJsonParser.java:2012)
es_1 |  at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._parseName(UTF8StreamJsonParser.java:1650)
es_1 |  at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:740)
es_1 |  at org.elasticsearch.common.xcontent.json.JsonXContentParser.nextToken(JsonXContentParser.java:53)
es_1 |  at org.elasticsearch.search.SearchService.parseSource(SearchService.java:830)
es_1 |  at org.elasticsearch.search.SearchService.createContext(SearchService.java:654)
es_1 |  at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:620)
es_1 |  at org.elasticsearch.search.SearchService.executeScan(SearchService.java:281)
es_1 |  at org.elasticsearch.search.action.SearchServiceTransportAction$SearchScanTransportHandler.messageReceived(SearchServiceTransportAction.java:425)
es_1 |  at org.elasticsearch.search.action.SearchServiceTransportAction$SearchScanTransportHandler.messageReceived(SearchServiceTransportAction.java:421)
es_1 |  at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33)
es_1 |  at org.elasticsearch.shield.transport.ShieldServerTransportService$ProfileSecuredRequestHandler.messageReceived(ShieldServerTransportService.java:180)
es_1 |  ... 6 more

So it looks like too much escaping is happening somewhere?

If I just use single quotes around the query parameter value and remove the backslashes, i.e.:

    query => '{"query": { "match_all": {} } }'

No exceptions are thrown.

Is this a bug in the docs or a bug in the plugin?
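
To illustrate the theory (assuming the backslashes survive into the request body unchanged, which is what the Elasticsearch-side error suggests), here is a minimal Ruby check — not plugin code, just showing what does and does not parse as JSON:

require 'json'

# Body with the backslashes kept literally: not valid JSON, the parser
# trips over the '\' right where the Elasticsearch error reports it.
escaped = '{\"query\": { \"match_all\": {} } }'
begin
  JSON.parse(escaped)
rescue JSON::ParserError => e
  puts "invalid JSON: #{e.message}"
end

# The single-quoted form from above produces a body that parses cleanly.
p JSON.parse('{"query": { "match_all": {} } }')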

Proxy support

Hello,
The plugin doesn't seem to support a proxy, does it?

This conf runs like a charm (confirming that proxy settings are taken into account for the output to elasticsearch):
input { stdin { } } output { elasticsearch { hosts => "localhost" index => "myindex" } }

But this one gives a proxy error:
input { elasticsearch { hosts => "localhost" index => "newindex" } } output { elasticsearch { hosts => "localhost" index => "myindex" } }

Error (proxy-specific, I guess):

[2017-05-12T16:47:23,958][ERROR][logstash.pipeline        ] A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::Elasticsearch hosts=>["localhost"], index=>"newindex", id=>"3b3dd485e50c4cfd54b23afe815aa314b0a5eb36-1", enable_metric=>true, codec=><LogStash::Codecs::JSON id=>"json_a13f235f-9f8d-42ed-b68b-cf21e69eeeb5", enable_metric=>true, charset=>"UTF-8">, query=>"{ \"sort\": [ \"_doc\" ] }", size=>1000, scroll=>"1m", docinfo=>false, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
  Error: [411] <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>ERROR: The requested URL could not be retrieved</TITLE>
<STYLE type="text/css"><!--BODY{background-color:#ffffff;font-family:verdana,sans-serif}PRE{font-family:sans-serif}--></STYLE>
</HEAD><BODY>
<H1>ERROR</H1>
<H2>The requested URL could not be retrieved</H2>
<HR noshade size="1px">
<P>
While trying to process the request:
<PRE>
GET /newindex/_search?scroll=1m&amp;size=1000 HTTP/1.1
Content-Type: application/json
User-Agent: Faraday v0.9.2
Accept: */*
Connection: close
Host: localhost:9200
Content-Length: 22

</PRE>
<P>
The following error was encountered:
<UL>
<LI>
<STRONG>
Invalid Request
</STRONG>
</UL>

Running Logstash 5.3.2 with the logstash-input-elasticsearch plugin.
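
If the plugin were to expose a proxy option, it would presumably just need to hand the setting down to the underlying elasticsearch-ruby/Faraday client. A rough sketch of what that client-level call looks like ('http://myproxy:3128' is a placeholder, and this is not something the current plugin config can do):

require 'elasticsearch'

# Sketch: pointing the Ruby client at an explicit proxy via transport_options,
# which are forwarded to the Faraday connection.
client = Elasticsearch::Client.new(
  hosts: ['localhost:9200'],
  transport_options: { proxy: 'http://myproxy:3128' }
)

puts client.info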

plugin output error ES 5.4

I just upgraded to 5.4 and noticed that the json, csv, ... outputs in Logstash now produce errors; they worked perfectly on ES 5.3.

Example conf file:

input {
  elasticsearch {
	hosts => "http://localhost:9200"
	index => "sirene"
	query => '{"query": {"query_string" : {"query": "((DEPET:54 AND provider:sp_mairie) OR (DEPET:55 AND provider:sp_mairie) OR (DEPET:60 AND provider:sp_mairie))"}}}'
  }
}
output {
  csv {
	fields => ["nom", "street", "cp","nomcommune","nom_maire","siteweb","email"]
	path => "/home/data-prospection/public_html/jsondata/57cae745456ab85d4ff76a83ef3f1f0e/dt_0d83a4d79454b856ab249ca6496b17b1.csv"
  }
}

Errors

19:57:49.856 [[main]<elasticsearch] ERROR logstash.pipeline - A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::Elasticsearch hosts=>["http://localhost:9200"], index=>"sirene", query=>"{\"query\": {\"query_string\" : {\"query\": \"((DEPET:54 AND provider:sp_mairie) OR (DEPET:55 AND provider:sp_mairie) OR (DEPET:60 AND provider:sp_mairie))\"}}}", id=>"ed057865a1d953e8c84455e0e115b774f2045393-1", enable_metric=>true, codec=><LogStash::Codecs::JSON id=>"json_2b455f6f-f285-4213-bfe7-3ca311b027b0", enable_metric=>true, charset=>"UTF-8">, size=>1000, scroll=>"1m", docinfo=>false, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
  Error: [400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Failed to parse request body"}],"type":"illegal_argument_exception","reason":"Failed to parse request body","caused_by":{"type":"json_parse_exception","reason":"Unrecognized token 'DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAAC6Fm84Mm84SFpzU3VHSzdWMHdWQ3N3NGcAAAAAAAAAvRZvODJvOEhac1N1R0s3VjB3VkNzdzRnAAAAAAAAALsWbzgybzhIWnNTdUdLN1Ywd1ZDc3c0ZwAAAAAAAAC8Fm84Mm84SFpzU3VHSzdWMHdWQ3N3NGcAAAAAAAAAvhZvODJvOEhac1N1R0s3VjB3VkNzdzRn': was expecting ('true', 'false' or 'null')\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@5039e667; line: 1, column: 457]"}},"status":400}

I run Logstash 5.4.0, and I did this after the upgrade:

bin/logstash-plugin install logstash-input-elasticsearch

NoMethodError exception raised if search hits lack source document

As noted in https://discuss.elastic.co/t/error-undefined-method-for-nil-nilclass-in-logstash/54234, the elasticsearch input assumes that the hits in the search results contain the source document via the _source key, but this isn't true with a query like this:

{"fields": ["clientIP"], "query": { "match_all": {} } }

Such a query results in an "Error: undefined method `[]' for nil:NilClass" exception. I suppose the same might happen if the index's mappings are configured to not store the original document.

How should we deal with this situation? Use the available fields in lieu of the complete document?
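
One possible direction (a sketch only, under the assumption that fields-only hits should still produce events; hit_to_event_data is a made-up helper name, not plugin code):

# Sketch: build the event payload from a search hit, falling back to the
# "fields" section when "_source" is not present (e.g. when the query uses
# a fields restriction, or the mapping does not store the source document).
def hit_to_event_data(hit)
  if hit['_source']
    hit['_source']
  elsif hit['fields']
    # Elasticsearch returns each requested field as an array of values;
    # unwrap single-element arrays so the event looks like a flat document.
    hit['fields'].each_with_object({}) do |(name, values), data|
      data[name] = values.is_a?(Array) && values.size == 1 ? values.first : values
    end
  else
    {} # nothing usable in the hit; produce an empty event rather than crash
  end
end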

failing specs under ng-pipeline

@andrewvc these spec failures seem to be related to ng-pipeline under master.

Using Accessor#strict_set for specs
Run options: exclude {:redis=>true, :socket=>true, :performance=>true, :couchdb=>true, :elasticsearch=>true, :elasticsearch_secure=>true, :export_cypher=>true, :integration=>true, :windows=>true}
FFFFF.FF.

Failures:

  1) LogStash::Inputs::Elasticsearch should retrieve json event from elasticseach with scan
     Failure/Error: Unable to find matching line from backtrace
     NoMethodError:
       undefined method `each' for nil:NilClass
     # ./logstash-core/lib/logstash/pipeline.rb:258:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:255:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:200:in `worker_loop'
     # ./logstash-core/lib/logstash/pipeline.rb:169:in `start_workers'

  2) LogStash::Inputs::Elasticsearch should retrieve json event from elasticseach
     Failure/Error: Unable to find matching line from backtrace
     NoMethodError:
       undefined method `each' for nil:NilClass
     # ./logstash-core/lib/logstash/pipeline.rb:258:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:255:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:200:in `worker_loop'
     # ./logstash-core/lib/logstash/pipeline.rb:169:in `start_workers'

  3) LogStash::Inputs::Elasticsearch with Elasticsearch document information when not defining the docinfo should keep the document information in the root of the event
     Failure/Error: Unable to find matching line from backtrace
     NoMethodError:
       undefined method `each' for nil:NilClass
     # ./logstash-core/lib/logstash/pipeline.rb:258:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:255:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:200:in `worker_loop'
     # ./logstash-core/lib/logstash/pipeline.rb:169:in `start_workers'

  4) LogStash::Inputs::Elasticsearch with Elasticsearch document information when defining docinfo merges the values if the `docinfo_target` already exist in the `_source` document
     Failure/Error: Unable to find matching line from backtrace
     NoMethodError:
       undefined method `each' for nil:NilClass
     # ./logstash-core/lib/logstash/pipeline.rb:258:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:255:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:200:in `worker_loop'
     # ./logstash-core/lib/logstash/pipeline.rb:169:in `start_workers'

  5) LogStash::Inputs::Elasticsearch with Elasticsearch document information when defining docinfo should move the document information to the specified field
     Failure/Error: Unable to find matching line from backtrace
     NoMethodError:
       undefined method `each' for nil:NilClass
     # ./logstash-core/lib/logstash/pipeline.rb:258:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:255:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:200:in `worker_loop'
     # ./logstash-core/lib/logstash/pipeline.rb:169:in `start_workers'

  6) LogStash::Inputs::Elasticsearch with Elasticsearch document information when defining docinfo should move the document info to the @metadata field
     Failure/Error: Unable to find matching line from backtrace
     NoMethodError:
       undefined method `each' for nil:NilClass
     # ./logstash-core/lib/logstash/pipeline.rb:258:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:255:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:200:in `worker_loop'
     # ./logstash-core/lib/logstash/pipeline.rb:169:in `start_workers'

  7) LogStash::Inputs::Elasticsearch with Elasticsearch document information when defining docinfo should allow to specify which fields from the document info to save to the @metadata field
     Failure/Error: Unable to find matching line from backtrace
     NoMethodError:
       undefined method `each' for nil:NilClass
     # ./logstash-core/lib/logstash/pipeline.rb:258:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:255:in `output_batch'
     # ./logstash-core/lib/logstash/pipeline.rb:200:in `worker_loop'
     # ./logstash-core/lib/logstash/pipeline.rb:169:in `start_workers'

Finished in 1.45 seconds (files took 1.35 seconds to load)
9 examples, 7 failures

Failed examples:

rspec /Users/colin/dev/src/elasticsearch/logstash-plugins/logstash-input-elasticsearch/spec/inputs/elasticsearch_spec.rb:77 # LogStash::Inputs::Elasticsearch should retrieve json event from elasticseach with scan
rspec /Users/colin/dev/src/elasticsearch/logstash-plugins/logstash-input-elasticsearch/spec/inputs/elasticsearch_spec.rb:26 # LogStash::Inputs::Elasticsearch should retrieve json event from elasticseach
rspec /Users/colin/dev/src/elasticsearch/logstash-plugins/logstash-input-elasticsearch/spec/inputs/elasticsearch_spec.rb:297 # LogStash::Inputs::Elasticsearch with Elasticsearch document information when not defining the docinfo should keep the document information in the root of the event
rspec /Users/colin/dev/src/elasticsearch/logstash-plugins/logstash-input-elasticsearch/spec/inputs/elasticsearch_spec.rb:192 # LogStash::Inputs::Elasticsearch with Elasticsearch document information when defining docinfo merges the values if the `docinfo_target` already exist in the `_source` document
rspec /Users/colin/dev/src/elasticsearch/logstash-plugins/logstash-input-elasticsearch/spec/inputs/elasticsearch_spec.rb:251 # LogStash::Inputs::Elasticsearch with Elasticsearch document information when defining docinfo should move the document information to the specified field
rspec /Users/colin/dev/src/elasticsearch/logstash-plugins/logstash-input-elasticsearch/spec/inputs/elasticsearch_spec.rb:241 # LogStash::Inputs::Elasticsearch with Elasticsearch document information when defining docinfo should move the document info to the @metadata field
rspec /Users/colin/dev/src/elasticsearch/logstash-plugins/logstash-input-elasticsearch/spec/inputs/elasticsearch_spec.rb:272 # LogStash::Inputs::Elasticsearch with Elasticsearch document information when defining docinfo should allow to specify which fields from the document info to save to the @metadata field

Randomized with seed 59991

This plugin doesn't work with elasticsearch 5.3.x since version 4.0.1

Using Logstash 5.3.x or 5.2.x, which contain version >= 4.0.1 of this plugin, generates an exception:

logstash-5.3.0 % bin/logstash -f cfg
Sending Logstash's logs to /tmp/logstash-5.3.0/logs which is now configured via log4j2.properties
[2017-04-06T13:23:01,551][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-04-06T13:23:01,821][INFO ][logstash.pipeline        ] Pipeline main started
[2017-04-06T13:23:01,881][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2017-04-06T13:23:02,183][ERROR][logstash.pipeline        ] A plugin had an unrecoverable error. Will restart this plugin.
  Plugin: <LogStash::Inputs::Elasticsearch hosts=>["localhost:9200"], index=>"twitter", id=>"e31d6014f6900363351c4bbb23b8f068af3681e0-1", enable_metric=>true, codec=><LogStash::Codecs::JSON id=>"json_11a1c469-5a4d-41ef-8d43-1756f46d48cd", enable_metric=>true, charset=>"UTF-8">, query=>"{ \"sort\": [ \"_doc\" ] }", size=>1000, scroll=>"1m", docinfo=>false, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>
  Error: [400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Failed to parse request body"}],"type":"illegal_argument_exception","reason":"Failed to parse request body","caused_by":{"type":"json_parse_exception","reason":"Unrecognized token 'DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAAABFnZteTVRcGVuVEthSjJ5UzEwUDc5SWcAAAAAAAAAAhZ2bXk1UXBlblRLYUoyeVMxMFA3OUlnAAAAAAAAAAMWdm15NVFwZW5US2FKMnlTMTBQNzlJZwAAAAAAAAAEFnZteTVRcGVuVEthSjJ5UzEwUDc5SWcAAAAAAAAABRZ2bXk1UXBlblRLYUoyeVMxMFA3OUln': was expecting ('true', 'false' or 'null')\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@86c3826; line: 1, column: 457]"}},"status":400}

Version 4.0.1 started using the full-blown elasticsearch-ruby client version 5.x. Since all requests are now forced to use the JSON content type, scrolling is broken because this plugin passes the scroll id in the deprecated form body instead of in a JSON body.

This can be triggered easily with Logstash 5.2.x or 5.3.x against Elasticsearch 5.3.x, with a simple config:

input {
  elasticsearch { index => "twitter" }
}
output { 
  stdout { codec => json  }
}
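
For reference, the difference boils down to how the scroll id is handed to the elasticsearch-ruby client. A rough sketch against the config above (the index name twitter comes from that config; this illustrates the two call styles, it is not the plugin's exact code):

require 'elasticsearch'

client    = Elasticsearch::Client.new(hosts: ['localhost:9200'])
result    = client.search(index: 'twitter', scroll: '1m', size: 1000,
                          body: { sort: ['_doc'] })
scroll_id = result['_scroll_id']

# Deprecated style: the raw scroll id string becomes the request body.
# Once every request is sent with Content-Type: application/json, the server
# tries to parse that bare string as JSON and fails (the
# "Unrecognized token 'DnF1ZXJ5..." error above).
# client.scroll(body: scroll_id, scroll: '1m')

# JSON-body style: the scroll id travels inside a proper JSON document.
client.scroll(body: { scroll_id: scroll_id, scroll: '1m' })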

Filter / mutate / convert

Hi, I use ES 5.4 and LS 5.4, and the input plugin is up to date.

I have this conf file where the field "population" is a float; I am trying to convert it to a string in order to preserve its decimal format.

input {
  elasticsearch {
	hosts => "http://localhost:9200"
	index => "sirene"
	query => '{"query": {"query_string" : {"query": "((DEPET:01 AND provider:sp_mairie AND _exists_:street))"}}}'
	
  }
}
filter {
  mutate {
    convert => { "population" => "string" }
  }
}
output {
  csv {
	fields => ["nom","nom_maire","CODGEO", "street", "cp","nomcommune","departement","DEPET","region","email","siteweb","tel","fax","population","location[lat]","location[lon]"]
	path => "/home/data/public_html/jsondata/57cae745456ab85d4ff76a83ef3f1f0e/dp_577b539512dfa520df2c0a18ffb8482f.csv"

  }
}

And an example of an entry:

      {
        "_index" : "sirene",
        "_type" : "logs",
        "_id" : "mairie-49317-01",
        "_score" : 1.8746464,
        "_source" : {
          "DEPET" : "49",
          "siteweb" : "http://www.saint-remy-la-varenne.fr",
          "identifiant" : "mairie-49317-01",
          "dept" : "49",
          "nom" : "Mairie de Saint-Rémy-la-Varenne",
          "cp" : "49250",
          "path" : "/home/data-prospection/public_html/datas/updates/1494436364.csv",
          "@timestamp" : "2017-05-10T17:14:36.701Z",
          "horaires" : "Du vendredi de 09h00 à 12h30 et de 14h00 à 17h00 - Du lundi de 09h00 à 12h30 - Du mercredi de 09h00 à 12h30 et de 14h00 à 17h00 - Du samedi de 09h00 à 12h00",
          "CODGEO" : "49317",
          "nomcommune" : "Saint-Rémy-la-Varenne",
          "provider" : "sp_mairie",
          "street" : "4 rue de la Mairie",
          "@version" : "1",
          "miseajour" : "all_20170508",
          "tel" : "0241570394",
          "location" : {
            "lon" : -0.317674010992,
            "lat" : 47.3968009949
          },
          "email" : "[email protected]",
          "num_region" : "52",
          "departement" : "Maine et Loire",
          "region" : "Pays-de-la-Loire",
          "nom_maire" : "Evelyne FARIBAULT",
          "population" : 975.0
        }

But the filter doesn't seem to be applied. I tried long, integer, and string without any success... The result for population looks like this: 0.975E3
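
The 0.975E3 form looks like the default string representation of a BigDecimal rather than a plain Float, which would explain why the mutate convert does not give the expected text. A small Ruby sketch of the effect and of formatting the value explicitly (an illustration of one possible workaround, e.g. from a ruby filter before the csv output, not a confirmed fix):

require 'bigdecimal'

value = BigDecimal('975.0')

value.to_s             # => something like "0.975E3" (scientific notation by default)
value.to_s('F')        # => "975.0" (conventional floating-point notation)
format('%.1f', value)  # => "975.0"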

Plugin Restart Behavior and Search Scroll

I would like to ask about the design behavior of this plugin.

  1. What should be the behavior of the plugin when it encounters an error and performs a restart?
  2. How should it handle the scroll?

I hit an OOM issue while trying to reindex my docs, and I discovered that the search scroll context is not handled properly. When the plugin restarts, the search scroll context is left hanging, which can be seen by querying the stats:
GET /_nodes/stats/indices/search?pretty
If the behavior is to continue after a restart, I am not seeing it, as the doc count exceeds the original count. If the behavior is to restart the query, then the search scroll should be explicitly closed before restarting.

Not too big an issue if the interval setting is small, but it can have a significant impact if the interval is long or resources are limited.
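
For the "explicitly closed" part, here is a sketch of what releasing the scroll context could look like with the Ruby client (close_scroll and the scroll_id argument are placeholders, not existing plugin code):

require 'elasticsearch'

# Sketch: release a server-side scroll context before restarting the query,
# so it does not linger until the scroll timeout expires.
def close_scroll(client, scroll_id)
  client.clear_scroll(body: { scroll_id: [scroll_id] })
rescue => e
  # The context may already have expired on its own.
  warn("could not clear scroll: #{e.message}")
end

client = Elasticsearch::Client.new(hosts: ['localhost:9200'])
# close_scroll(client, scroll_id_from_the_aborted_run)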

Support for script_fields

Right now, this plugin only supports retrieving data from the _source field. Since the query parameter supports the full search DSL, it would be great to also be able to enrich the event with data coming from script_fields.

For instance, it should be possible to configure the elasticsearch input like this:

input {
   elasticsearch {
     hosts => ["localhost:9200"]
     index => "twitter"
     query => '{"script_fields": {"ts": {"script": "doc._timestamp.value"}}, "_source":["*"]}'
     script_fields => true
  }
}

And the resulting events should look like this:

{
       "message" => "My Tweet",
          "user" => 5672323424242,
            "ts" => 1496806671021,
      "@version" => "1",
    "@timestamp" => "2017-06-07T05:40:14.233Z"
}
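
Since Elasticsearch returns script_fields under each hit's fields key (with every value wrapped in an array), producing that event would roughly amount to merging fields into the _source data. A sketch using the example above (illustration only, not existing plugin behavior):

hit = {
  '_source' => { 'message' => 'My Tweet', 'user' => 5672323424242 },
  'fields'  => { 'ts' => [1496806671021] }
}

fields = hit.fetch('fields', {})
event_data = hit['_source'].merge(
  Hash[fields.map { |name, values| [name, values.length == 1 ? values.first : values] }]
)
# => {"message"=>"My Tweet", "user"=>5672323424242, "ts"=>1496806671021}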
