Comments (14)

xyfleet commented on August 25, 2024

It looks like the script failed to create the tables in the DB. https://github.com/milvus-io/milvus/blob/master/tests/scripts/values/mysql.yaml
Does the script above only work with the internal MySQL DB?

@NicoYuan1986 could you please help take a look at this one?

from milvus-helm.

LoveEachDay commented on August 25, 2024

@xyfleet You need to manually provision a MySQL database with the tables first.
You can use the SQL statements from https://github.com/milvus-io/milvus/blob/master/tests/scripts/values/mysql.yaml.

xyfleet commented on August 25, 2024

@LoveEachDay Thank you so much. I added these tables manually. Then I ran into another issue.

In the rootcoord pod:

[2023/02/14 06:23:10.893 +00:00] [ERROR] [grpcclient/client.go:149] ["failed to get client address"] [error="find no available querycoord, check querycoord state"] ...
[2023/02/14 06:23:10.893 +00:00] [ERROR] [grpcclient/client.go:305] ["ClientBase ReCall grpc second call get error"] [role=querycoord] [error="err: find no available querycoord, check querycoord state
[2023/02/14 06:23:10.893 +00:00] [WARN] [rootcoord/quota_center.go:129] ["quotaCenter sync metrics failed"] [error="quotaCenter get Data cluster failed, err = DataCoord 171 is not ready"]

In the DataCoord pod:

[2023/02/14 06:25:52.889 +00:00] [WARN] [datacoord/services.go:849] ["DataCoord.GetMetrics failed"] [traceID=4cdb1156d9ffea02] [nodeID=171] [req="{\"metric_type\":\"system_info\"}"] [error="DataCoord 171 is not ready"]

In the querycoord, proxy, and datanode pods:

[2023/02/14 06:22:27.550 +00:00] [WARN] [retry/retry.go:39] ["retry func failed"] ["retry time"=0] [error="WaitForComponentStates, not meet, DataCoord current state: StandBy"]

In the querycoord pod:

[2023/02/14 06:25:26.150 +00:00] [ERROR] [querynode/query_node.go:271] ["QueryNode init vector storage failed"] [error="Endpoint url cannot have fully qualified paths."] [stack="github.com/milvus-io/milvus/internal/querynode.(*QueryNode).Init.func1\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/query_node.go:271\nsync.(*Once).doSlow\n\t/usr/local/go/src/sync/once.go:68\nsync.(*Once).Do\n\t/usr/local/go/src/sync/once.go:59\ngithub.com/milvus-io/milvus/internal/querynode.(*QueryNode).Init\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/query_node.go:249\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).init\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:133\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:213\ngithub.com/milvus-io/milvus/cmd/components.(*QueryNode).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/query_node.go:54\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:104"]
[2023/02/14 06:25:26.150 +00:00] [ERROR] [querynode/service.go:134] ["QueryNode init error: "] [error="Endpoint url cannot have fully qualified paths."] [stack="github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).init\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:134\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:213\ngithub.com/milvus-io/milvus/cmd/components.(*QueryNode).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/query_node.go:54\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:104"]
panic: Endpoint url cannot have fully qualified paths.

At the same time, I see some errors from the etcd and zookeeper pods.

Errors from etcd:

{"level":"warn","ts":"2023-02-14T05:59:31.783Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"8.867µs","request":"header:<ID:5789524982615469658 > lease_revoke:<id:5058864c270ae1ce>","response":"size:27","error":"lease not found"}

From one zookeeper pod:

2023-02-13 19:00:58,522 [myid:2] - ERROR [LeaderConnector-milvus-db-zookeeper-2.milvus-db-zookeeper-headless.milvus.svc.cluster.local:2888:Learner$LeaderConnector@389] - Failed connect to milvus-db-zookeeper-2.milvus-db-zookeeper-headless.milvus.svc.cluster.local:2888
java.net.UnknownHostException: milvus-db-zookeeper-2.milvus-db-zookeeper-headless.milvus.svc.cluster.local

I think something is wrong with my etcd and kafka-zookeeper settings. Could this be why the milvus pods are still failing?

Do you have any ideas about the etcd and zookeeper errors? Also, please check whether there is an issue with my external S3 and kafka settings:

minio:
  enabled: false

etcd:
  persistence:
    storageClass: ebs-sc
    accessMode: ReadWriteOnce
    size: 10Gi

pulsar:
  enabled: false

kafka:
  enabled: true
  persistence:
    enabled: true
    storageClass: ebs-sc
    accessMode: ReadWriteOnce
    size: 300Gi
  metrics:
    ## Prometheus Kafka exporter: exposes complimentary metrics to JMX exporter
    kafka:
      enabled: true
    jmx:
      enabled: true
    serviceMonitor:
      enabled: true


externalS3:
  enabled: true
  host: "xxxx"
  port: "80"
  accessKey: "s3_access_key"
  secretkey: "s3_secret_key"
  bucketName: "s3_bucket_name"

LoveEachDay commented on August 25, 2024

@xyfleet Could you use this script to export logs for all components?

xyfleet commented on August 25, 2024

@LoveEachDay Today I reconfigured kafka, zookeeper, and etcd, but I still get the same error. Logs are attached. Thanks.

milvus-log.tar.gz

LoveEachDay commented on August 25, 2024

@xyfleet From the provided log:

[2023/02/14 20:03:08.123 +00:00] [ERROR] [datacoord/server.go:406] ["chunk manager init failed"] [error="Endpoint url cannot have fully qualified paths."] [stack="github.com/milvus-io/milvus/internal/datacoord.(*Server).newChunkManagerFactory\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:406\ngithub.com/milvus-io/milvus/internal/datacoord.(*Server).initDataCoord\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:293\ngithub.com/milvus-io/milvus/internal/datacoord.(*Server).Init.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:270\ngithub.com/milvus-io/milvus/internal/util/sessionutil.(*Session).ProcessActiveStandBy\n\t/go/src/github.com/milvus-io/milvus/internal/util/sessionutil/session_util.go:811\ngithub.com/milvus-io/milvus/internal/datacoord.(*Server).Register\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:224\ngithub.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).start\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/service.go:195\ngithub.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/service.go:246\ngithub.com/milvus-io/milvus/cmd/components.(*DataCoord).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/data_coord.go:49\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:104"]
[2023/02/14 20:03:08.123 +00:00] [ERROR] [datacoord/server.go:271] ["DataCoord init failed"] [error="Endpoint url cannot have fully qualified paths."] [stack="github.com/milvus-io/milvus/internal/datacoord.(*Server).Init.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:271\ngithub.com/milvus-io/milvus/internal/util/sessionutil.(*Session).ProcessActiveStandBy\n\t/go/src/github.com/milvus-io/milvus/internal/util/sessionutil/session_util.go:811\ngithub.com/milvus-io/milvus/internal/datacoord.(*Server).Register\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:224\ngithub.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).start\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/service.go:195\ngithub.com/milvus-io/milvus/internal/distributed/datacoord.(*Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/service.go:246\ngithub.com/milvus-io/milvus/cmd/components.(*DataCoord).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/data_coord.go:49\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:104"]
[2023/02/14 20:03:08.123 +00:00] [WARN] [datacoord/service.go:197] ["DataCoord register service failed"] [error="Endpoint url cannot have fully qualified paths."]

The externalS3.host value is invalid; it seems your host includes a path. Provide only the hostname instead.
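As a rough illustration (this is not Milvus code), a check like the one below flags host values that carry a scheme or a path. The function name `host_problems` and the sample values are made up for this sketch; it only tests whether the value is a bare hostname and makes no claim about how Milvus itself validates the setting.

```python
from urllib.parse import urlsplit


def host_problems(host: str) -> list[str]:
    """Return reasons why `host` is not a bare hostname (empty list if it looks fine)."""
    problems = []
    # Prefix "//" when no scheme is present so urlsplit treats the value
    # as a network location instead of a relative path.
    parts = urlsplit(host if "://" in host else "//" + host)
    if parts.scheme:
        problems.append(f"contains a scheme ({parts.scheme}://)")
    # A lone trailing slash is tolerated here; anything longer is a path.
    if parts.path not in ("", "/"):
        problems.append(f"contains a path ({parts.path})")
    return problems


if __name__ == "__main__":
    # Hypothetical sample values for illustration only.
    for candidate in (
        "s3.us-east-1.amazonaws.com",
        "s3.us-east-1.amazonaws.com/my-s3-bucket",
        "https://s3.us-east-1.amazonaws.com/my-s3-bucket",
    ):
        print(candidate, "->", host_problems(candidate) or "looks like a bare host")
```

Running such a check on the value before `helm install` or `helm upgrade` would catch this class of mistake earlier than the pod logs do.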

xyfleet commented on August 25, 2024

@LoveEachDay Thanks a lot. By host, you mean something like "my_bucket.s3.region-code.amazonaws.com", right?

haorenfsa commented on August 25, 2024

@xyfleet Also set externalS3.port to 443 and externalS3.useSSL to true.

LoveEachDay commented on August 25, 2024

Yes

xyfleet commented on August 25, 2024

@LoveEachDay I tried several times and still get this error:

[2023/02/15 05:23:18.973 +00:00] [WARN] [storage/minio_chunk_manager.go:106] ["failed to check blob bucket exist"] [bucket=my-bucket-name] [error="Access Denied."]

I tried two different S3 accounts but got the same error. I think my s3-user has the right permissions. Do you have any idea?

s3-user permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:PutObjectACL",
                "s3:PutObject",
                "s3:ListBucket",
                "s3:GetObjectVersion",
                "s3:GetObjectACL",
                "s3:GetObject",
                "s3:DeleteObjectVersion",
                "s3:DeleteObject"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::my-s3-bucket/*",
                "arn:aws:s3:::my-s3-bucket"
            ],
            "Sid": ""
        }
    ]
}

externalS3 config:

externalS3:
  enabled: true
  host: "s3-bucket.s3.us-east-1.amazonaws.com"
  port: "443"
  accessKey: "externa_s3_access_key"
  secretkey: ""externa_s3_secret_key"
  bucketName: "my-s3-bucket"
  useSSL: true
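
One thing that can be checked offline is whether the bucket named in the policy ARNs is the same one given in bucketName. The following is a minimal sketch, assuming the policy JSON is available as a string; the helper name `buckets_in_policy` is made up for illustration. A match here only rules out a naming mismatch and says nothing about other possible causes of "Access Denied".

```python
import json


def buckets_in_policy(policy_json: str) -> set[str]:
    """Collect the bucket names that appear in S3 ARNs of Allow statements."""
    statements = json.loads(policy_json)["Statement"]
    # IAM allows "Statement" to be a single object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]
    prefix = "arn:aws:s3:::"
    buckets = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        resources = statement.get("Resource", [])
        # "Resource" may also be a single string.
        if isinstance(resources, str):
            resources = [resources]
        for arn in resources:
            if arn.startswith(prefix):
                # Drop any object-key suffix such as "/*" to keep only the bucket.
                buckets.add(arn[len(prefix):].split("/", 1)[0])
    return buckets


if __name__ == "__main__":
    # Trimmed copy of the policy above, with the same placeholder bucket name.
    policy = """
    {"Version": "2012-10-17",
     "Statement": [{"Effect": "Allow",
                    "Action": ["s3:ListBucket", "s3:GetObject"],
                    "Resource": ["arn:aws:s3:::my-s3-bucket/*",
                                 "arn:aws:s3:::my-s3-bucket"]}]}
    """
    configured_bucket = "my-s3-bucket"  # value of externalS3.bucketName
    print(configured_bucket in buckets_in_policy(policy))
```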

haorenfsa commented on August 25, 2024

@xyfleet Sorry, my mistake. Milvus addresses the bucket through the URL path; it doesn't support addressing it through the host. Change your externalS3.host to s3.us-east-1.amazonaws.com.
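
To make the difference concrete, here is a small sketch of the two S3 addressing styles. It only builds URL strings and is not how Milvus issues requests; the function names are made up for illustration, and the values are the placeholders used in this thread.

```python
def _scheme_and_netloc(host: str, port: int, use_ssl: bool) -> tuple[str, str]:
    """Pick the scheme and omit the port when it is the default for that scheme."""
    scheme = "https" if use_ssl else "http"
    default_port = 443 if use_ssl else 80
    netloc = host if port == default_port else f"{host}:{port}"
    return scheme, netloc


def path_style_url(host: str, port: int, use_ssl: bool, bucket: str) -> str:
    """Path-style: the bucket is the first segment of the URL path."""
    scheme, netloc = _scheme_and_netloc(host, port, use_ssl)
    return f"{scheme}://{netloc}/{bucket}"


def virtual_hosted_url(host: str, port: int, use_ssl: bool, bucket: str) -> str:
    """Virtual-hosted-style: the bucket is a prefix of the hostname."""
    scheme, netloc = _scheme_and_netloc(host, port, use_ssl)
    return f"{scheme}://{bucket}.{netloc}/"


if __name__ == "__main__":
    # With a plain regional endpoint as host, the bucket appears once, in the path.
    print(path_style_url("s3.us-east-1.amazonaws.com", 443, True, "my-s3-bucket"))
    # With a bucket-prefixed host, path-style addressing names a bucket twice.
    print(path_style_url("s3-bucket.s3.us-east-1.amazonaws.com", 443, True, "my-s3-bucket"))
```

With path-style addressing, a host that already starts with the bucket name ends up referring to a bucket in both the hostname and the path, which is why the host should be the plain regional endpoint.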

xyfleet commented on August 25, 2024

@haorenfsa Thanks for your update. I updated my config and still get the error. Really weird.

externalS3:
  enabled: true
  host: s3.us-east-1.amazonaws.com
  port: "443"
  accessKey: "externa_s3_access_key"
  secretkey: ""externa_s3_secret_key"
  bucketName: "my-s3-bucket"
  useSSL: true

[2023/02/15 05:50:16.219 +00:00] [WARN] [storage/minio_chunk_manager.go:106] ["failed to check blob bucket exist"] [bucket=my-s3-bucket] [error="Access Denied."]

I found this Python code: https://github.com/milvus-io/milvus/blob/master/tests/benchmark/milvus_benchmark/update.py
I'm not sure whether it works as expected.

values_dict['minio']['enabled'] = True
# values_dict["externalS3"]["enabled"] = True
values_dict["externalS3"]["enabled"] = False
values_dict["externalS3"]["host"] = config.MINIO_HOST
values_dict["externalS3"]["port"] = config.MINIO_PORT
values_dict["externalS3"]["accessKey"] = config.MINIO_ACCESS_KEY
values_dict["externalS3"]["secretKey"] = config.MINIO_SECRET_KEY
values_dict["externalS3"]["bucketName"] = config.MINIO_BUCKET_NAME
logging.debug(values_dict["externalS3"])

xyfleet commented on August 25, 2024

@haorenfsa Do you think we can have a quick Zoom meeting to troubleshoot this one?

LoveEachDay commented on August 25, 2024

@xyfleet Could you join the Slack channel?
