etcd-client

An etcd v3 API client for Rust. It provides an asynchronous client backed by tokio and tonic.

Features

  • etcd API v3
  • asynchronous

Supported APIs

  • KV
  • Watch
  • Lease
  • Auth
  • Maintenance
  • Cluster
  • Lock
  • Election
  • Namespace

Usage

Add this to your Cargo.toml:

[dependencies]
etcd-client = "0.14"
tokio = { version = "1.0", features = ["full"] }

To get started using etcd-client:

use etcd_client::{Client, Error};

#[tokio::main]
async fn main() -> Result<(), Error> {
    let mut client = Client::connect(["localhost:2379"], None).await?;
    // put kv
    client.put("foo", "bar", None).await?;
    // get kv
    let resp = client.get("foo", None).await?;
    if let Some(kv) = resp.kvs().first() {
        println!("Get kv: {{{}: {}}}", kv.key_str()?, kv.value_str()?);
    }

    Ok(())
}

Examples

Examples can be found in the examples directory.

Feature Flags

  • tls: Enables the rustls-based TLS connection. Not enabled by default.
  • tls-roots: Adds system trust roots to rustls-based TLS connection using the rustls-native-certs crate. Not enabled by default.
  • pub-response-field: Exposes structs used to create regular etcd-client responses including internal protobuf representations. Useful for mocking. Not enabled by default.
  • tls-openssl: Enables openssl-based TLS connections. This makes your binary dynamically link to libssl.
  • tls-openssl-vendored: Like tls-openssl, but compiles OpenSSL from source and statically links to it.
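
As a hedged example, enabling the optional rustls-based TLS support in Cargo.toml could look like this (feature names as listed above; versions are illustrative):

[dependencies]
etcd-client = { version = "0.14", features = ["tls", "tls-roots"] }
tokio = { version = "1.0", features = ["full"] }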

Test

We test this library with etcd 3.5.

Note that we use a fixed etcd server URI (localhost:2379) to connect to the etcd server.

Rust version requirements

The minimum supported version is 1.70. The current etcd-client version is not guaranteed to build on Rust versions earlier than the minimum supported version.

License

Dual-licensed to be compatible with the Rust project.

Licensed under the Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0 or the MIT license http://opensource.org/licenses/MIT, at your option. This file may not be copied, modified, or distributed except according to those terms.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in etcd-client by you, shall be licensed as Apache-2.0 and MIT, without any additional terms or conditions.

etcd-client's Issues

Is it possible to enable gRPC traces while running an etcd-client program?

The gRPC implementation supports enabling traces as per
https://github.com/grpc/grpc/blob/master/TROUBLESHOOTING.md

# Print info from 3 different tracers, including tracing logs with log level DEBUG
GRPC_VERBOSITY=debug GRPC_TRACE=tcp,http,api ./helloworld_application_using_grpc

However, I'm not able to see these traces when running Rust code based on etcd-client. Any pointers for this, or is it supported when running a Rust program?

How to avoid the need for copying to obtain a mutable client due to sharing immutable data in a multi-tasking program (or what is the best practice for using this library in a concurrent sharing scenario)?

In our program we share an Arc with a client in it. Some of the client's APIs are mutable, so we always clone() the client whenever we need to call an API that requires &mut self. Is there a better solution?
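
One pattern that avoids a lock around &mut Client entirely is to give each task its own clone; a minimal sketch, assuming (as the examples in this repository suggest) that Client is cheap to clone because it shares the underlying gRPC channel:

use etcd_client::{Client, Error};

#[tokio::main]
async fn main() -> Result<(), Error> {
    let client = Client::connect(["localhost:2379"], None).await?;

    let mut handles = Vec::new();
    for i in 0..4 {
        // each task owns its own clone, so it can freely call APIs that take &mut self
        let mut c = client.clone();
        handles.push(tokio::spawn(async move {
            c.put(format!("task-{}", i), "value", None).await
        }));
    }
    for h in handles {
        h.await.expect("task panicked")?;
    }
    Ok(())
}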

The inner representation of `KeyValue` couldn't be exported by `pub-response-field`

Hi! I ran into some trouble when trying to access the internal prost field of KeyValue:

/// Key-value pair.
#[cfg_attr(feature = "pub-field", visible::StructFields(pub))]
#[derive(Debug, Clone)]
#[repr(transparent)]
pub struct KeyValue(PbKeyValue);

It seems that the field was meant to be exported by #[cfg_attr(feature = "pub-field", visible::StructFields(pub))], which (I guess) was missed in the process of #23 (comment). May I submit a PR to fix it?

How to implement a CAS operation?

If I want to implement an ID generator with etcd, it seems that I need the operations below in a single transaction:

let t = etcd_client::new_txn();
let old_id = t.get("id_key");
let new_id = old_id + 1;
t.set("id_key", new_id);
t.commit();

or a CAS (compare-and-swap) operation, but I cannot find a suitable method for this in the project. Are there any suggestions? Thanks.
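
For reference, etcd's transaction API can express this compare-and-swap directly. A minimal sketch using the Txn/Compare/TxnOp types from this crate, assuming the counter key already exists and is stored as a decimal string:

use etcd_client::{Client, Compare, CompareOp, Error, Txn, TxnOp};

async fn increment_id(client: &mut Client) -> Result<i64, Error> {
    loop {
        // read the current value and remember it for the compare
        let resp = client.get("id_key", None).await?;
        let old = resp
            .kvs()
            .first()
            .map(|kv| kv.value_str().unwrap_or("0").to_string())
            .unwrap_or_else(|| "0".to_string());
        let new = old.parse::<i64>().unwrap_or(0) + 1;

        // succeed only if nobody changed "id_key" since we read it
        // (if the key may not exist yet, compare its create_revision instead; omitted here)
        let txn = Txn::new()
            .when([Compare::value("id_key", CompareOp::Equal, old)])
            .and_then([TxnOp::put("id_key", new.to_string(), None)]);
        if client.txn(txn).await?.succeeded() {
            return Ok(new);
        }
        // another writer won the race; retry
    }
}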

How to get the relationship between a watcher and its watch_id when there are multiple watchers on one key?

//! Watch example

use etcd_client::*;

#[tokio::main]
async fn main() -> Result<(), Error> {
    let mut client = Client::connect(["localhost:2379"], None).await?;
    let (mut watcher, mut stream) = client.watch("foo", None).await?;

    watcher.watch("foo", None).await?;
    watcher.watch("foo", None).await?;

    while let Some(resp) = stream.message().await? {
        println!("[{}] receive watch response", resp.watch_id());
        for event in resp.events() {
            println!("event type: {:?}", event.event_type());
            if let Some(kv) = event.kv() {
               println!("kv: {{{}: {}}}", kv.key_str()?, kv.value_str()?);
            }
            if EventType::Delete == event.event_type() {
                watcher.cancel_by_id(resp.watch_id()).await?;
            }
        }
        println!();
    }

    Ok(())
}

I want to register multiple watchers for one key for different scenes, so I want to get the watch_id for every registered watcher and build a mapping between scenes and watch_ids, because I need to cancel a watcher according to its scene.

According to the example code, I can only get the watch_id from stream.message().

So how can I get the relationship between a watcher and its watch_id when there are multiple watchers on one key?
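
One workaround, sketched below with an important caveat: it assumes that the server acknowledges each watch request with a response where resp.created() is true, and that these acknowledgements arrive in the order the requests were sent. Record resp.watch_id() from those responses and key your scene bookkeeping off the ids:

use std::collections::HashMap;
use etcd_client::{Client, Error};

#[tokio::main]
async fn main() -> Result<(), Error> {
    let mut client = Client::connect(["localhost:2379"], None).await?;
    let (mut watcher, mut stream) = client.watch("foo", None).await?;
    watcher.watch("foo", None).await?; // second watch on the same key, e.g. for another "scene"

    let mut scene_by_watch_id: HashMap<i64, &'static str> = HashMap::new();
    let mut pending_scenes = vec!["scene-a", "scene-b"]; // order matches the watch requests above

    while let Some(resp) = stream.message().await? {
        if resp.created() && !pending_scenes.is_empty() {
            // assumption: "created" acknowledgements arrive in request order
            scene_by_watch_id.insert(resp.watch_id(), pending_scenes.remove(0));
        }
        if let Some(scene) = scene_by_watch_id.get(&resp.watch_id()) {
            println!("[{}] events for scene {}", resp.watch_id(), scene);
        }
    }
    Ok(())
}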

Maybe expose the channel config `keep_alive_while_idle`?

For now, we always set keep_alive_while_idle to true. However, for both grpc-go and grpc core, if the client sends too many pings while there are no in-flight requests, the server side may close the connection. Exposing this option would allow users to avoid this kind of connection failure.

etcd-client/src/client.rs

Lines 155 to 157 in 3c315d7

endpoint = endpoint
.keep_alive_while_idle(true)
.http2_keep_alive_interval(interval)

Some references:

https://grpc.github.io/grpc/core/md_doc_keepalive.html

Why am I receiving a GOAWAY with error code ENHANCE_YOUR_CALM?
A server sends a GOAWAY with ENHANCE_YOUR_CALM if the client sends too many misbehaving pings as described in A8-client-side-keepalive.md. Some scenarios where this can happen are -

  • if a server has GRPC_ARG_KEEPALIVE_PERMIT_WITHOUT_CALLS set to false while the client has set this to true resulting in keepalive pings being sent even when there is no call in flight.
  • if the client's GRPC_ARG_KEEPALIVE_TIME_MS setting is lower than the server's GRPC_ARG_HTTP2_MIN_RECV_PING_INTERVAL_WITHOUT_DATA_MS.

tikv/pd#3230

PD follower logs show errors like "Got too many pings from the client". Unlike tikv and tidb, which are responsible for setting up their own gRPC servers, PD uses the gRPC server provided by etcd, so those parameters cannot be set. This error may occur when the client and server disagree on the PermitWithoutStream parameter.
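
If the option were exposed on ConnectOptions, usage might look like the sketch below; with_keep_alive already exists, while with_keep_alive_while_idle is the hypothetical setter this issue proposes:

use std::time::Duration;
use etcd_client::{Client, ConnectOptions, Error};

async fn connect_without_idle_pings() -> Result<Client, Error> {
    let options = ConnectOptions::new()
        .with_keep_alive(Duration::from_secs(30), Duration::from_secs(10))
        // hypothetical: stop sending keepalive pings while no call is in flight
        .with_keep_alive_while_idle(false);
    Client::connect(["localhost:2379"], Some(options)).await
}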

[Feature Request] Pass-Through Transaction Read/WriteSet

One feature of the Typescript etcd3 client is a Transaction class that offers the put*/get* API.

It does this by capturing every object (or at least its revision) in Read and Write sets, and passes them to conditional checks in the transaction execution itself.

This makes for a much more natural DB interaction experience.

e.g. in Rust it'd be something like:

let txn = client.txn()?; 

let foo = txn.get("bar/foo").await?; 
let baz = txn.get("bar/baz").await?;

let froboz = MyThing::merge(&[&foo, &baz]);
txn.put("bar/froboz", froboz.serialize()?)?;
txn.commit(); 

As mentioned above, the TypeScript implementation does this by capturing every object and the database's mod_revision. It then finds the lowest mod_revision and uses Compare::mod_revision to ensure that the objects being written haven't been updated. This can be overly restrictive, but since it doesn't know the actual details of the updates, I think it's the best it can do.

We've been using this approach in hundreds of different service handlers across dozens of services and it has been working quite well.

We've not yet had to bypass this mechanism to create a Transaction object with the traditional fetch/compare/set operation.

Here are some pointers to the actual implementation:

WriteSet: https://github.com/microsoft/etcd3/blob/master/src/stm.ts#L100
ReadSet: https://github.com/microsoft/etcd3/blob/master/src/stm.ts#L100-L148
Txn Execution: https://github.com/microsoft/etcd3/blob/master/src/stm.ts#L497-L527
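
A rough sketch of the underlying pattern with the current API, without the automatic read/write-set capture: remember the mod_revision of every key read, then guard the commit with Compare::mod_revision. The example assumes both keys already exist and stands in a string concatenation for MyThing::merge:

use etcd_client::{Client, Compare, CompareOp, Error, Txn, TxnOp};

async fn merge_froboz(client: &mut Client) -> Result<bool, Error> {
    // read phase: remember each key's mod_revision
    let foo = client.get("bar/foo", None).await?;
    let baz = client.get("bar/baz", None).await?;
    let foo_kv = foo.kvs().first().expect("bar/foo missing");
    let baz_kv = baz.kvs().first().expect("bar/baz missing");

    let merged = format!("{}+{}", foo_kv.value_str()?, baz_kv.value_str()?);

    // write phase: commit only if neither key changed since we read it
    let txn = Txn::new()
        .when([
            Compare::mod_revision("bar/foo", CompareOp::Equal, foo_kv.mod_revision()),
            Compare::mod_revision("bar/baz", CompareOp::Equal, baz_kv.mod_revision()),
        ])
        .and_then([TxnOp::put("bar/froboz", merged, None)]);
    Ok(client.txn(txn).await?.succeeded())
}

Returning false signals that a key was modified in between and the caller should retry, which is roughly what the TypeScript STM does automatically.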

Default feature set of tonic

etcd-client depends on tonic using its default features:

[dependencies]
tonic = "0.9.2"

The default features of tonic pull in A LOT of transitive dependencies; in particular, a lot of them come in through hyper. Would it be possible to put any of that behind a feature gate in etcd-client? I don't know enough about how tonic is used here to be able to say.

Etcd client connect returns an error when TLS is enabled

The following code panics with the error TransportError(tonic::transport::Error(Transport, hyper::Error(Connect, InvalidDNSNameError))):

The config file is like this:

[pd]
endpoints = ["172.16.5.32:2379"]

[security]
ca-path = "~/ca/ca.cert.pem"
cert-path = "~/ca/client.cert.pem"
key-path = "~/client.key.pem"

let mut option = ConnectOptions::new();
if !config.security.ca_path.is_empty() {
    let (ca, cert, key) = config.security.load_certs().unwrap();
    option = option.with_tls(
        TlsOptions::new()
            .ca_certificate(Certificate::from_pem(ca))
            .identity(Identity::from_pem(cert, key)),
    );
}
let mut etcd_client = etcd_client::Client::connect(&config.pd.endpoints, Some(option))
    .await
    .unwrap();

The client cannot work in a three-node etcd cluster if one node is stopped

In a three-node etcd cluster where only two nodes are alive, the client instance gets an error:

status: Unknown, message: "Service was not ready: transport error: buffered service failed: load balancer discovery error: error trying to connect: tcp connect error: Connection refused (os error 111)", details: [], metadata: MetadataMap { headers: {} }

Are you considering implementing a health check to automatically add and remove unavailable nodes?

Support namespace API

It's convenient to put/get keys under a "namespace" to separate different usages; it's even required in order to smoothly write concurrent tests that share the same etcd cluster.

In ZooKeeper this concept is called chroot. I can't find it in etcd, but it should be a client-side feature. I'm creating this issue to discuss whether it's desirable to implement it in this crate.

The API should affect only the constructors, which would accept a chroot parameter/option:

Client::connect(endpoints, chroot /* String */, options);
Client::connect(endpoints, options); // chroot in options

... and all keys passed to this client would be prefixed with the "chroot".
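
Until such an option exists, a thin client-side wrapper that prepends the prefix is a possible workaround. A minimal sketch (the Chroot type below is hypothetical, not part of this crate):

use etcd_client::{Client, Error, GetOptions, GetResponse, PutOptions, PutResponse};

// hypothetical wrapper: prepends a fixed prefix to every key it touches
struct Chroot {
    client: Client,
    prefix: String,
}

impl Chroot {
    fn key(&self, key: &str) -> String {
        format!("{}{}", self.prefix, key)
    }

    async fn put(
        &mut self,
        key: &str,
        value: impl Into<Vec<u8>>,
        options: Option<PutOptions>,
    ) -> Result<PutResponse, Error> {
        self.client.put(self.key(key), value, options).await
    }

    async fn get(&mut self, key: &str, options: Option<GetOptions>) -> Result<GetResponse, Error> {
        self.client.get(self.key(key), options).await
    }
}

A real implementation would also have to rewrite range ends for prefix queries and strip the prefix from returned keys, which is why doing it inside the crate is attractive.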

Adding more keys to a watch

The documentation makes it seem like I can watch additional keys on the same watcher, but I don't see where or how that would work. I don't want to create multiple watch streams, so I'd prefer to have just one set of channels to and from etcd. Is that possible?
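
For what it's worth, the Watcher half returned by client.watch() can issue further watch requests on the same underlying stream, so a single stream can carry several keys; a minimal sketch:

use etcd_client::{Client, Error};

async fn watch_two_keys(client: &mut Client) -> Result<(), Error> {
    // one gRPC watch stream, multiple watched keys
    let (mut watcher, mut stream) = client.watch("foo", None).await?;
    watcher.watch("bar", None).await?; // add another key to the same stream

    while let Some(resp) = stream.message().await? {
        println!("watch_id {} delivered {} event(s)", resp.watch_id(), resp.events().len());
    }
    Ok(())
}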

Tolerate (partial) connection failures in endpoints in the Balancer Client

Motivation:

We connect to a quorum of etcd servers across regions (not the recommended architecture, but it works quite well)

For various reasons, a small subset of the nodes might be unavailable. The client should instead tolerate these failures and adjust the pool accordingly, if that is the desire of the consumer of the API.

This functionality lives in the tower::balancer and tonic::transport::service behavior. The discovery mechanism in balancer_channel connects "lazily" upon receiving requests. It appears to connect to all endpoints, but if one fails, the entire operation fails.

It seems like the only option here is to work with the Tower team to provide a partial-success route. This is preferred not only because it is the right thing for the initial connection, but also because it should provide the proper behavior on an ongoing basis.

I will continue to pursue this approach, but I'd like to leave this ticket open because there will likely be some (hopefully non-breaking) changes to the etcd client to optionally utilize the partial-success behavior.

Get as Mutable methods

Hi,

It seems that get/put/delete are all mutable methods (taking &mut self); was this intended, even for get?

Thanks

Enhancement: Allow external libraries to instantiate *Response types for mocking

Hi,

I would like to use etcd-client in a project of mine. The project should also come with some unit tests where the (wrapped) interaction with etcd-client is mocked via mockall. However, I am facing some issues with return types such as GetResponse or PutResponse, because I am not able to create an instance of those responses (which should be returned by the mock object) on my own.

It would be great to optionally expose the related internal protobuf representations and to make the constructor functions such as GetResponse::new public to allow such use cases.

I have already prepared the first part here. Now only the public constructor functions are missing. I imagine some more conditional compilation leveraging your visible crate, which avoids large modifications to the code base, like this:

// inside src/rpc/kv.rs
impl PutResponse {
    /// Create a new `PutResponse` from pb put response.
    #[cfg_attr(feature = "protobuf-response-structs", visible::Function(pub))]
    #[inline]
    const fn new(resp: PbPutResponse) -> Self {
        Self(resp)
    }
...
}

@davidli2010 What do you think about the approach, do you support the idea?

Determine loss of watch connectivity

I'm having an issue I need to figure out. My app starts a watch on startup, but when testing I found that if I kill etcd, the watch just keeps waiting for the next message even though the connection is dead. I would have expected it to exit with an error.

while let Some(msg) = stream.message().await.map_err(econv)? {
    for event in msg.events() {
        println!("Got event of {:?}", event.event_type());
        if let Some(kv) = event.kv() {
            println!(
                "  key: {}  val: {}",
                kv.key_str().map_err(econv)?,
                kv.value_str().map_err(econv)?
            );
        }
    }
}
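
One workaround, sketched below: ask the server for periodic progress notifications and wrap stream.message() in a timeout, treating a long silence as a lost connection that warrants re-creating the watch. This is a sketch built on existing tokio/etcd-client APIs, not a built-in recovery mechanism, and the 30-second threshold is arbitrary:

use std::time::Duration;
use etcd_client::{Client, Error, WatchOptions};
use tokio::time::timeout;

async fn watch_with_liveness(client: &mut Client) -> Result<(), Error> {
    // ask the server to send periodic empty responses so silence becomes meaningful
    let opts = WatchOptions::new().with_progress_notify();
    let (_watcher, mut stream) = client.watch("foo", Some(opts)).await?;

    loop {
        match timeout(Duration::from_secs(30), stream.message()).await {
            // nothing (not even a progress notification) for 30s: assume the connection is gone
            Err(_elapsed) => break,
            Ok(Err(e)) => return Err(e),
            Ok(Ok(None)) => break, // server closed the stream
            Ok(Ok(Some(resp))) => {
                for event in resp.events() {
                    println!("Got event of {:?}", event.event_type());
                }
            }
        }
    }
    // the caller should re-connect and re-establish the watch here
    Ok(())
}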

Unable to run examples

I have my local etcd container running, but almost all the examples and tests fail:


$ cargo run --example lock
   Compiling etcd-client v0.6.5 (/home/plan/projects/qingcloud/etcd-client)
    Finished dev [unoptimized + debuginfo] target(s) in 5.18s
     Running `target/debug/examples/lock`
try to lock with name 'lock-test'
Error: GRpcStatus(Status { code: Unimplemented, message: "unknown service v3lockpb.Lock", metadata: MetadataMap { headers: {"content-type": "application/grpc"} } })

Unexpected behaviour during continuous master election

Hi,

I am trying to write code for a cluster where only one node periodically downloads a file and stores it for everyone else. I don't want to put the file via promote because I want to know who the leader is first, so that only that node does the work. I plan to write the file to another key and let everyone observe it. I could probably still use promote, but the file should remain readable even if it is stale, and I'd probably have to put an empty value during the initial campaign call.

Anyway, it seems like I cannot implement a continuous master election. I am getting spurious wake-ups from client.campaign(...) for both leaders and non-leaders. This is due to lease expiry during the campaign call.

Here is the code that reproduces the issue:

use etcd_client::*;
use tracing::info;

async fn elect_and_work(
    idx: i32,
    mut etcd_client: Client,
    etcd_path: String,
) -> anyhow::Result<()> {
    loop {
        let resp = etcd_client.lease_grant(10, None).await?;
        let lease_id = resp.id();
        // info!("grant ttl:{:?}, id:{:?}", resp.ttl(), resp.id());
        let election_path = format!("{}/test_election", etcd_path);

        // campaign
        let resp = etcd_client
            .campaign(election_path.as_str(), "123", lease_id)
            .await?;
        let leader = resp.leader().unwrap();
        info!(
            "Won election name: idx={}, {:?}, leaseId:{:?}, my_lease: {:?}",
            idx,
            leader.name_str(),
            leader.lease(),
            lease_id,
        );
        let (mut keeper, mut keeper_responses) = etcd_client.lease_keep_alive(lease_id).await?;
        for _ in 0..5 {
            info!("Keeping alive (idx={})...", idx);
            keeper.keep_alive().await?;
            if let Some(resp) = keeper_responses.message().await? {
                info!(
                    "Lease on idx={}, {:?} kept alive, new ttl {:?}",
                    idx,
                    resp.id(),
                    resp.ttl()
                );
            }
            tokio::time::sleep(tokio::time::Duration::from_secs(3)).await;
        }
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    ::std::env::set_var("RUST_LOG", "election_repro=info");
    let endpoints = vec!["http://127.0.0.1:2379"];
    tracing_subscriber::fmt::init();
    let etcd_client = etcd_client::Client::connect(endpoints, None).await?;

    let mut join_handles = vec![];

    for idx in 0..5 {
        let client = etcd_client.clone();
        let join = tokio::spawn(async move {
            elect_and_work(idx, client, "/election_repro".into())
                .await
                .unwrap();
        });
        join_handles.push(join);
    }
    futures::future::join_all(join_handles).await;

    Ok(())
}

The log:

Dec 28 16:55:42.986  INFO election_repro: Won election name: idx=0, Ok("/election_repro/test_election"), leaseId:7587851380158607485, my_lease: 7587851380158607485
Dec 28 16:55:42.987  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:55:42.988  INFO election_repro: Lease on idx=0, 7587851380158607485 kept alive, new ttl 10
Dec 28 16:55:45.989  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:55:45.991  INFO election_repro: Lease on idx=0, 7587851380158607485 kept alive, new ttl 10
Dec 28 16:55:48.992  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:55:48.993  INFO election_repro: Lease on idx=0, 7587851380158607485 kept alive, new ttl 10
Dec 28 16:55:51.994  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:55:51.995  INFO election_repro: Lease on idx=0, 7587851380158607485 kept alive, new ttl 10
Dec 28 16:55:54.996  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:55:54.998  INFO election_repro: Lease on idx=0, 7587851380158607485 kept alive, new ttl 10
Dec 28 16:56:05.031  INFO election_repro: Won election name: idx=4, Ok("/election_repro/test_election"), leaseId:7587851380158607493, my_lease: 7587851380158607493
Dec 28 16:56:05.031  INFO election_repro: Won election name: idx=1, Ok("/election_repro/test_election"), leaseId:7587851380158607487, my_lease: 7587851380158607487
Dec 28 16:56:05.032  INFO election_repro: Won election name: idx=0, Ok("/election_repro/test_election"), leaseId:7587851380158607512, my_lease: 7587851380158607512
Dec 28 16:56:05.032  INFO election_repro: Won election name: idx=2, Ok("/election_repro/test_election"), leaseId:7587851380158607489, my_lease: 7587851380158607489
Dec 28 16:56:05.032  INFO election_repro: Won election name: idx=3, Ok("/election_repro/test_election"), leaseId:7587851380158607491, my_lease: 7587851380158607491
Dec 28 16:56:05.033  INFO election_repro: Keeping alive (idx=1)...
Dec 28 16:56:05.034  INFO election_repro: Keeping alive (idx=4)...
Dec 28 16:56:05.034  INFO election_repro: Keeping alive (idx=3)...
Dec 28 16:56:05.034  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:56:05.034  INFO election_repro: Keeping alive (idx=2)...
Dec 28 16:56:05.035  INFO election_repro: Lease on idx=3, 7587851380158607491 kept alive, new ttl 0
Dec 28 16:56:05.035  INFO election_repro: Lease on idx=1, 7587851380158607487 kept alive, new ttl 0
Dec 28 16:56:05.035  INFO election_repro: Lease on idx=4, 7587851380158607493 kept alive, new ttl 0
Dec 28 16:56:05.035  INFO election_repro: Lease on idx=2, 7587851380158607489 kept alive, new ttl 0
Dec 28 16:56:05.035  INFO election_repro: Lease on idx=0, 7587851380158607512 kept alive, new ttl 10
Dec 28 16:56:08.037  INFO election_repro: Keeping alive (idx=2)...
Dec 28 16:56:08.037  INFO election_repro: Keeping alive (idx=4)...
Dec 28 16:56:08.037  INFO election_repro: Keeping alive (idx=1)...
Dec 28 16:56:08.037  INFO election_repro: Keeping alive (idx=3)...
Dec 28 16:56:08.037  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:56:08.038  INFO election_repro: Lease on idx=0, 7587851380158607512 kept alive, new ttl 10
Dec 28 16:56:08.038  INFO election_repro: Lease on idx=2, 7587851380158607489 kept alive, new ttl 0
Dec 28 16:56:08.038  INFO election_repro: Lease on idx=1, 7587851380158607487 kept alive, new ttl 0
Dec 28 16:56:08.038  INFO election_repro: Lease on idx=3, 7587851380158607491 kept alive, new ttl 0
Dec 28 16:56:08.038  INFO election_repro: Lease on idx=4, 7587851380158607493 kept alive, new ttl 0
Dec 28 16:56:11.039  INFO election_repro: Keeping alive (idx=4)...
Dec 28 16:56:11.039  INFO election_repro: Keeping alive (idx=2)...
Dec 28 16:56:11.039  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:56:11.039  INFO election_repro: Keeping alive (idx=3)...
Dec 28 16:56:11.039  INFO election_repro: Keeping alive (idx=1)...
Dec 28 16:56:11.040  INFO election_repro: Lease on idx=3, 7587851380158607491 kept alive, new ttl 0
Dec 28 16:56:11.040  INFO election_repro: Lease on idx=0, 7587851380158607512 kept alive, new ttl 10
Dec 28 16:56:11.040  INFO election_repro: Lease on idx=2, 7587851380158607489 kept alive, new ttl 0
Dec 28 16:56:11.040  INFO election_repro: Lease on idx=1, 7587851380158607487 kept alive, new ttl 0
Dec 28 16:56:11.040  INFO election_repro: Lease on idx=4, 7587851380158607493 kept alive, new ttl 0
Dec 28 16:56:14.042  INFO election_repro: Keeping alive (idx=3)...
Dec 28 16:56:14.042  INFO election_repro: Keeping alive (idx=1)...
Dec 28 16:56:14.042  INFO election_repro: Keeping alive (idx=2)...
Dec 28 16:56:14.042  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:56:14.042  INFO election_repro: Keeping alive (idx=4)...
Dec 28 16:56:14.043  INFO election_repro: Lease on idx=0, 7587851380158607512 kept alive, new ttl 10
Dec 28 16:56:14.043  INFO election_repro: Lease on idx=2, 7587851380158607489 kept alive, new ttl 0
Dec 28 16:56:14.043  INFO election_repro: Lease on idx=3, 7587851380158607491 kept alive, new ttl 0
Dec 28 16:56:14.043  INFO election_repro: Lease on idx=4, 7587851380158607493 kept alive, new ttl 0
Dec 28 16:56:14.043  INFO election_repro: Lease on idx=1, 7587851380158607487 kept alive, new ttl 0
Dec 28 16:56:17.044  INFO election_repro: Keeping alive (idx=2)...
Dec 28 16:56:17.044  INFO election_repro: Keeping alive (idx=4)...
Dec 28 16:56:17.044  INFO election_repro: Keeping alive (idx=3)...
Dec 28 16:56:17.044  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:56:17.045  INFO election_repro: Keeping alive (idx=1)...
Dec 28 16:56:17.046  INFO election_repro: Lease on idx=0, 7587851380158607512 kept alive, new ttl 10
Dec 28 16:56:17.046  INFO election_repro: Lease on idx=4, 7587851380158607493 kept alive, new ttl 0
Dec 28 16:56:17.046  INFO election_repro: Lease on idx=2, 7587851380158607489 kept alive, new ttl 0
Dec 28 16:56:17.046  INFO election_repro: Lease on idx=3, 7587851380158607491 kept alive, new ttl 0
Dec 28 16:56:17.046  INFO election_repro: Lease on idx=1, 7587851380158607487 kept alive, new ttl 0
Dec 28 16:56:27.537  INFO election_repro: Won election name: idx=0, Ok("/election_repro/test_election"), leaseId:7587851380158607519, my_lease: 7587851380158607519
Dec 28 16:56:27.539  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:56:27.539  INFO election_repro: Lease on idx=0, 7587851380158607519 kept alive, new ttl 10
Dec 28 16:56:30.541  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:56:30.542  INFO election_repro: Lease on idx=0, 7587851380158607519 kept alive, new ttl 10
Dec 28 16:56:33.543  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:56:33.544  INFO election_repro: Lease on idx=0, 7587851380158607519 kept alive, new ttl 10
Dec 28 16:56:36.546  INFO election_repro: Keeping alive (idx=0)...
Dec 28 16:56:36.547  INFO election_repro: Lease on idx=0, 7587851380158607519 kept alive, new ttl 10

Notice how the first election runs correctly. Only one client gets to win the election; everyone else waits.

The leader sends keep-alives as expected. To simulate failure, it stops sending keep-alives after 5 iterations. Now the interesting part begins.

A new election runs. Someone wins, but everyone's campaign call completes. I'd expect an error to be returned at this point, but instead everyone gets a LeaderKey with their own lease_id. Did everyone just win and lose instantly? I am not sure, but it does not look like it. Only one client can successfully keep its lease alive, and that is idx=0, because that client got a chance to refresh the lease by completing the loop. Still, I don't really know who the leader is.

Eventually they all complete this strange loop, refresh their grants and participate in the correct election again.

What is most surprising is that none of this ever returns Result=Err. Is there anything wrong in my code and should I check something else? Should Err be returned on a wrong lease? Both?

Thanks,
Igor.

etcd server connection issues in client

Hi,

I'm having trouble with etcd server connections in the client. Since the server is self-hosted and used for cross-region and cross-continent access, network connectivity is very unstable. Sometimes the server needs to be restarted and the client cannot reconnect.

I tried using with_keep_alive and with_connect_timeout, but it didn't help.

The problem mainly occurs in lease_keep_alive. When the connection is interrupted or etcd is restarted, the client cannot automatically reconnect. Sometimes it reports that the connection is refused and never retries.

What are some ways to handle the client in this situation? How can I implement a reasonable keep-alive?
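
Since the crate does not re-establish a broken keep-alive stream by itself, one common pattern is an outer retry loop that re-grants the lease and re-creates the keeper whenever the stream errors out. A minimal sketch with arbitrary TTL and retry/refresh intervals (note that keys attached to an expired lease would also need to be re-put, which is omitted here):

use std::time::Duration;
use etcd_client::Client;

async fn keep_lease_alive_forever(client: &mut Client) {
    loop {
        // (re-)grant a lease and open a fresh keep-alive stream
        let lease = match client.lease_grant(10, None).await {
            Ok(resp) => resp.id(),
            Err(_) => {
                tokio::time::sleep(Duration::from_secs(1)).await;
                continue;
            }
        };
        let (mut keeper, mut responses) = match client.lease_keep_alive(lease).await {
            Ok(pair) => pair,
            Err(_) => {
                tokio::time::sleep(Duration::from_secs(1)).await;
                continue;
            }
        };

        // refresh until the stream breaks or the lease expires, then start over
        loop {
            if keeper.keep_alive().await.is_err() {
                break;
            }
            match responses.message().await {
                Ok(Some(resp)) if resp.ttl() > 0 => {}
                _ => break, // error, closed stream, or lease already expired
            }
            tokio::time::sleep(Duration::from_secs(3)).await;
        }
    }
}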

feature request: provide methods for updating authentication tokens or automatically update the tokens for long-lived clients

Currently there is no interface for updating an expired token held by a Client, which is a problem when using a long-lived client. We always start failing after the token's TTL expires and have to create another Client on authentication errors. That's not easy for shared services, especially in Rust.

So I wonder if we could provide a method for updating the authentication token of a Client, or simply update it automatically inside the client whenever an authentication error is encountered.

I would prefer the second solution. I think AuthService::call would be a good place for this. How does this sound to you?

Looking forward to your comments.
