
Comments (7)

jszwedko commented on September 28, 2024

Thanks for filing this, @csongpaxos. Normally there is another ERROR-level log before the "service call failed" logs that indicates the error that is not being retried. It sounds like you aren't seeing that, though?


csongpaxos commented on September 28, 2024

Hi @jszwedko, no. We have tried the "debug" and "trace" log levels, enabled internal metrics / internal logs, and tweaked the buffering/batching settings, but we haven't been able to get any additional errors indicating the "real" error. We're just stuck now on how to proceed / debug further without anything to work with.

Not sure if you have seen this before with the Splunk sink, specifically when log volume is high?
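For anyone hitting the same wall: one way to make sure Vector's own error logs aren't lost is to wire the internal_logs source to a dedicated file sink and inspect it around the time of the drops. A minimal sketch (component ids and the file path are illustrative, not from this thread):

sources:
  vector_internal:
    type: internal_logs

sinks:
  vector_internal_file:
    type: file
    inputs:
      - vector_internal
    path: /var/log/vector/internal-%Y-%m-%d.log   # illustrative path
    encoding:
      codec: json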


metalbreeze commented on September 28, 2024

I have the same error with nginx log -> vector -> vector -> clickhouse,
but the other scenarios work fine:

  1. nginx log -> vector -> central vector -> output.file1
  2. nginx log -> vector -> output.file2
  3. nginx log -> vector -> clickhouse

  PS: the contents of output.file1 and output.file2 match.

Here is the error message:

2024-04-21T03:33:37.216528Z ERROR sink{component_kind="sink" component_id=clickhouse_2 component_type=clickhouse}:request{request_id=1}: vector::sinks::util::retries: Not retriable; dropping the request. reason="response status: 404 Not Found" internal_log_rate_limit=true
2024-04-21T03:33:37.216581Z ERROR sink{component_kind="sink" component_id=clickhouse_2 component_type=clickhouse}:request{request_id=1}: vector_common::internal_event::service: Service call failed. No retries or retries exhausted. error=None request_id=1 error_type="request_failed" stage="sending" internal_log_rate_limit=true
2024-04-21T03:33:37.216609Z ERROR sink{component_kind="sink" component_id=clickhouse_2 component_type=clickhouse}:request{request_id=1}: vector_common::internal_event::component_events_dropped: Events dropped intentional=false count=1 reason="Service call failed. No retries or retries exhausted." internal_log_rate_limit=true

Here is the vector1 config:

data_dir: "/var/lib/vector"

api:
  enabled: true
  address: "0.0.0.0:8686"

sources:
  nginx_logs:
    type: "file"
    include:
      - "/var/log/nginx/*.log" # supports globbing
    ignore_older_secs: 86400     # 1 day

transforms:
  nginx_parser:
    inputs:
      - "nginx_logs"
    type: "remap"
    source: |
      .message=parse_nginx_log!(.message,"combined")
      .body=.message
  vc_parser:
    inputs:
      - nginx_parser
    type: "remap"
    source: |
      .body.src="vc"
  ck_parser:
    inputs:
      - nginx_parser
    type: "remap"
    source: |
      .body.src="ck"
  file_parser:
    inputs:
      - nginx_parser
    type: "remap"
    source: |
      .body.src="file"

sinks:
  my_vector:
    type: vector
    inputs:
      - vc_parser
    address: 192.168.111.25:6000
  clickhouse:
    type: "clickhouse"
    database : "signoz_logs"
    table : "access_logs_2"
    inputs:
      - ck_parser
    skip_unknown_fields : true
    endpoint:
      "http://testvector2:8123"
  my_file:
    type: file
    inputs:
      - file_parser
    path: /opt/vector/logs/vector-%Y-%m-%d.log
    encoding:
      codec: logfmt

and here is the vector2 config:

sources:
  my_source_id:
    type: "vector"
    address: "0.0.0.0:6000"
    version: "2"
sinks:
  clickhouse_2:
    type: "clickhouse"
    database : "signoz_logs"
    table : "access_log_2"
    inputs:
      - my_source_id
       #format: json_each_row
    skip_unknown_fields : true
    endpoint:
      "http://192.168.111.25:8123"
  my_sink_id:
    type: file
    inputs:
       - my_source_id
    path: /opt/vector/logs/vector-%Y-%m-%d.log
    encoding:
      codec: logfmt
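
One detail worth double-checking in the two configs above: the vector1 clickhouse sink writes to table "access_logs_2", while the vector2 clickhouse_2 sink targets "access_log_2". ClickHouse can answer a request for a missing table or database with exactly the "404 Not Found" shown in the error, so, assuming "access_logs_2" is the table that actually exists, the vector2 sink would look like this (a sketch, not a confirmed fix):

  clickhouse_2:
    type: "clickhouse"
    database: "signoz_logs"
    table: "access_logs_2"   # assumed: match the table name used on the vector1 side
    inputs:
      - my_source_id
    skip_unknown_fields: true
    endpoint: "http://192.168.111.25:8123"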


metalbreeze commented on September 28, 2024

Fixed by adding the following:

batch:
  timeout_secs: 30

  clickhouse_2:
    type: "clickhouse"
    database : "signoz_logs"
    table : "access_logs_2"
    inputs:
      - my_source_id
       #format: json_each_row
    skip_unknown_fields : true
    batch:
        timeout_secs: 30
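
For context, batch.timeout_secs controls how long the sink waits before flushing a partial batch; the related size limits can be set alongside it. A minimal sketch with illustrative values (check the clickhouse sink docs for which batch keys your Vector version accepts):

    batch:
      timeout_secs: 30      # flush a partial batch after 30 s
      max_events: 10000     # illustrative: flush once this many events are buffered
      max_bytes: 10000000   # illustrative: flush once the encoded batch reaches ~10 MB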


andywatts commented on September 28, 2024

@jszwedko, the error message has the field "error=None"..?

error=None request_id=1 error_type="request_failed" stage="sending"

Here's a link to the error in trace logs


csongpaxos commented on September 28, 2024

Adding the

    batch:
        timeout_secs: 30

to my Splunk sink does not appear to make a difference; I'm still seeing the same "service call failed / retries exhausted" error with no additional errors.

2024-04-22T17:55:32.314290Z DEBUG hyper::client::pool: pooling idle connection for ("https", http-inputs-XXX.splunkcloud.com)
2024-04-22T17:55:32.314359Z DEBUG vector::sinks::splunk_hec::common::acknowledgements: Stored ack id. ack_id=118
2024-04-22T17:55:32.314400Z ERROR sink{component_kind="sink" component_id=splunk component_type=splunk_hec_logs}:request{request_id=555}: vector_common::internal_event::service: Internal log [Service call failed. No retries or retries exhausted.] has been suppressed 12 times.
2024-04-22T17:55:32.314413Z ERROR sink{component_kind="sink" component_id=splunk component_type=splunk_hec_logs}:request{request_id=555}: vector_common::internal_event::service: Service call failed. No retries or retries exhausted. error=None request_id=555 error_type="request_failed" stage="sending" internal_log_rate_limit=true
2024-04-22T17:55:32.314437Z ERROR sink{component_kind="sink" component_id=splunk component_type=splunk_hec_logs}:request{request_id=555}: vector_common::internal_event::component_events_dropped: Internal log [Events dropped] has been suppressed 12 times.
2024-04-22T17:55:32.314443Z ERROR sink{component_kind="sink" component_id=splunk component_type=splunk_hec_logs}:request{request_id=555}: vector_common::internal_event::component_events_dropped: Events dropped intentional=false count=626 reason="Service call failed. No retries or retries exhausted." internal_log_rate_limit=true


Cbeck527 commented on September 28, 2024

Adding more info in case it's helpful: I was seeing this error with our setup, which is almost identical to the OP's... k8s pods → DataDog agent → Vector → Splunk HEC. Some events were flowing into Splunk, but I wasn't able to figure out any kind of pattern to the errors.

While playing around with the settings, the error disappeared when I disabled acknowledgements on the sink:

...
splunk_eks:
  type: splunk_hec_logs
  endpoint: "${SPLUNK_CLOUD_HTTP_ENDPOINT}"
  default_token: "${SPLUNK_CLOUD_TOKEN}"
  acknowledgements:
    enabled: false
    indexer_acknowledgements_enabled: false
...

Perhaps the service call failure is related to the acknowledgement piece? Best I can tell, our volume of events is the same whether ACKs are on or off... so either the events were always getting there (and the error is about the ACK), or they were never getting there.

To add another data point, we're only seeing this error on 1 of our 7 clusters. Vector is set up identically everywhere, with the only difference being the default_token used.
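
If disabling acknowledgements entirely is not acceptable, the splunk_hec_logs sink also exposes tuning knobs for indexer acknowledgements. A hedged sketch (option names as remembered from the Vector docs; worth verifying against your Vector version before relying on them):

splunk_eks:
  type: splunk_hec_logs
  endpoint: "${SPLUNK_CLOUD_HTTP_ENDPOINT}"
  default_token: "${SPLUNK_CLOUD_TOKEN}"
  acknowledgements:
    indexer_acknowledgements_enabled: true
    query_interval: 10         # seconds between ack status queries (assumed default)
    retry_limit: 30            # how many times to query an ack before giving up (assumed default)
    max_pending_acks: 1000000  # cap on outstanding ack ids (assumed default)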
