
beegfs-csi-driver's Introduction

BeeGFS CSI Driver

Overview

The BeeGFS Container Storage Interface (CSI) driver provides high-performing and scalable storage for workloads running in container orchestrators like Kubernetes. The driver allows containers to access existing datasets or request on-demand ephemeral or persistent high-speed storage backed by BeeGFS parallel file systems.

The driver can be easily deployed using the provided Kubernetes manifests. Optionally, the BeeGFS CSI Driver Operator can be used to automate day-1 (install/configure) and day-2 (reconfigure/update) tasks for the driver. This especially simplifies discovery and installation from Operator Lifecycle Manager (OLM) enabled clusters. Multi-arch images supporting amd64 and arm64 Kubernetes nodes are provided for both the BeeGFS CSI driver and the operator.

Notable Features

  • Integration of Storage Classes in Kubernetes with storage pools in BeeGFS, allowing different tiers of storage within the same file system to be exposed to end users.
  • Management of global and node-specific BeeGFS client configuration applied to Kubernetes nodes, simplifying use in large environments.
  • Specification of permissions in BeeGFS from Storage Classes in Kubernetes, simplifying integration with BeeGFS quotas and providing visibility and control over user consumption of the shared file system.
  • Setting of striping parameters in BeeGFS from Storage Classes in Kubernetes to optimize for diverse workloads sharing the same file system (a Storage Class sketch follows this list).
  • Support for ReadWriteOnce, ReadOnlyMany, and ReadWriteMany access modes in Kubernetes, allowing workloads distributed across multiple Kubernetes nodes to share access to the same working directories and enabling multi-user/application access to common datasets.
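
As a rough illustration of how these features surface to administrators, the following is a minimal Storage Class sketch combining a storage pool, striping settings, and permissions. The parameter names follow the driver's example manifests, but the host address, paths, and values here are placeholders; verify against the usage documentation for your driver version.

  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: csi-beegfs-tiered            # hypothetical name
  provisioner: beegfs.csi.netapp.com
  parameters:
    sysMgmtdHost: 10.0.0.10            # placeholder BeeGFS management host
    volDirBasePath: k8s/scratch        # parent directory for provisioned volumes
    stripePattern/storagePoolID: "2"   # expose a specific storage pool as a tier
    stripePattern/chunkSize: 512k      # striping tuned for the expected workload
    stripePattern/numTargets: "4"
    permissions/uid: "1000"            # ownership and mode applied to new directories,
    permissions/gid: "1000"            # useful when integrating with BeeGFS quotas
    permissions/mode: "0755"
  reclaimPolicy: Delete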

Compatibility

The BeeGFS CSI driver must interact with both Kubernetes and a BeeGFS file system. To ensure compatibility with relevant versions of these key software components, regular testing is done throughout each release cycle. The following table lists the versions of each component used in testing each release of the BeeGFS CSI driver. These configurations should be considered compatible and supported.

BeeGFS CSI Driver | K8s Versions | BeeGFS Client Versions | CSI Version
v1.6.0 | 1.25.16, 1.26.14, 1.27.11, 1.28.7 | 7.3.4, 7.4.2 | v1.8.0
v1.5.0 | 1.23.17, 1.24.15, 1.25.11, 1.26.3, 1.27.3 | 7.3.4, 7.4.0 | v1.7.0
v1.4.0 | 1.22.6, 1.23.5, 1.24.1, 1.25.2 | 7.3.2, 7.2.8 | v1.7.0
v1.3.0 | 1.21.4, 1.22.3, 1.23.1, 1.24.1 | 7.3.1, 7.2.7 | v1.6.0
v1.2.2 | 1.20.11, 1.21.4, 1.22.3, 1.23.1 | 7.3.0, 7.2.6 [1] | v1.5.0
v1.2.1 | 1.19.15, 1.20.11, 1.21.4, 1.22.3 | 7.2.5 [1] | v1.5.0
v1.2.0 | 1.18, 1.19, 1.20, 1.21 | 7.2.4 [1] | v1.5.0
v1.1.0 | 1.18, 1.19, 1.20 | 7.2.1 [1] | v1.3.0
v1.0.0 | 1.19 | 7.2 [1] | v1.3.0

Additional notes:

  • Starting with v1.6.0, official multi-arch container images are provided for both amd64 and arm64.
  • The BeeGFS CSI driver offers experimental support for HashiCorp Nomad.
  • As of v1.5.0, the BeeGFS CSI driver is no longer tested with Red Hat OpenShift.

See the compatibility guide for more details on expectations of compatibility for the BeeGFS CSI driver.

Known Incompatibilities

BeeGFS CSI Driver compatibility with BeeGFS 7.2.7+ and 7.3.1+

Versions of the BeeGFS CSI driver prior to v1.3.0 are known to have issues initializing the driver when used with BeeGFS clients 7.2.7 or 7.3.1. These issues stem from changes in those BeeGFS versions that require connection authentication configuration to be set. The v1.3.0 release of the driver resolves these issues and maintains compatibility with the prior BeeGFS versions (7.2.6 and 7.3.0). Therefore, when upgrading an existing installation from BeeGFS 7.2.6 or 7.3.0 to 7.2.7 or 7.3.1, upgrade the BeeGFS CSI driver to v1.3.0 or later before upgrading the BeeGFS clients.


Support

Support for the BeeGFS CSI driver is "best effort". The following policy is in no way binding and may change without notice.

Only the latest version of the BeeGFS CSI driver is supported. Bugs or vulnerabilities found in this version may be fixed in a patch release or in a new minor version. If they are fixed in a new minor version, upgrading to that version may be required to obtain the fix.

Fixes should not be expected for older minor versions of the driver, nor should backports to older minor releases. The maintainers may choose to release a fix in a patch for an older release at their discretion.

Support for the BeeGFS CSI driver can be expected when the driver is used with the components listed in the compatibility table. The ability to provide support for issues involving components outside the compatibility matrix depends on the details of the issue.

If you have any questions, feature requests, or would like to report an issue please submit them at https://github.com/ThinkParQ/beegfs-csi-driver/issues.


Getting Started

Prerequisite(s)

  • Deploying the driver requires access to a terminal with kubectl.
  • The BeeGFS DKMS client must be preinstalled on each Kubernetes node that needs BeeGFS access.
    • As part of this setup, the beegfs-helperd and beegfs-utils packages must be installed, and the beegfs-helperd service must be started and enabled.
    • For BeeGFS versions 7.3.1+ or 7.2.7+, the beegfs-helperd service must be configured with connDisableAuthentication = true or connAuthFile = <path to a connAuthFile shared by all file systems>. See BeeGFS Helperd Configuration for other options or more details; a configuration sketch follows this list.
    • Experimental support for OpenShift environments with Red Hat CoreOS nodes negates this requirement.
  • Each BeeGFS mount point uses an ephemeral UDP port. On Linux, the selected ephemeral port is constrained by the kernel's ephemeral port range settings (e.g. net.ipv4.ip_local_port_range). Ensure that firewalls allow UDP traffic between BeeGFS management/metadata/storage nodes and ephemeral ports on Kubernetes nodes.
  • One or more existing BeeGFS file systems should be available to the Kubernetes nodes over a TCP/IP and/or RDMA (InfiniBand/RoCE) capable network (not required to deploy the driver).
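
For reference, a minimal sketch of the beegfs-helperd configuration mentioned above. Choose one option or the other; the file path and available options may vary by BeeGFS version, so treat this as an assumption to verify against the BeeGFS documentation.

  # /etc/beegfs/beegfs-helperd.conf
  # Option 1: for file systems that do not use connection authentication.
  connDisableAuthentication = true
  # Option 2: point at a connAuthFile shared by all file systems.
  # connAuthFile = /etc/beegfs/connauthfile

Restart the beegfs-helperd service after changing this file.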

Quick Start

The steps in this section allow you to get the driver up and running quickly. See them in action on NetApp TV. For production use cases or air-gapped environments, it is recommended to read through the full kubectl deployment guide or operator deployment guide.

  1. On a machine with kubectl and access to the Kubernetes cluster where you want to deploy the BeeGFS CSI driver clone this repository: git clone https://github.com/ThinkParQ/beegfs-csi-driver.git.
  2. Change to the BeeGFS CSI driver directory (cd beegfs-csi-driver).
  3. In BeeGFS versions 7.3.1+ or 7.2.7+, explicit connAuth configuration is required. Do one of the following or see ConnAuth Configuration for more details.
    • Set connDisableAuthentication to true in csi-beegfs-config.yaml if your existing file system does not use connection authentication.
      config:
        beegfsClientConf:
          connDisableAuthentication: true
    • Provide connAuth details in csi-beegfs-connauth.yaml if your existing file system does use connection authentication.
      - sysMgmtdHost: <sysMgmtdHost>
        connAuth: <connAuthSecret>
        encoding: <encodingType> # raw or base64
  4. Run kubectl apply -k deploy/k8s/overlays/default. Note that by default the beegfs-csi-driver image is pulled from Docker Hub.
  5. Verify all components are installed and operational: kubectl get pods -n beegfs-csi.

Provided all Pods are running, the driver is ready for use. See the following sections for how to get started using the driver.


Basic Use

This section provides a quick summary of basic driver use and functionality. Please see the full usage documentation for a complete overview of all available functionality. The driver was designed to support both dynamic and static storage provisioning and allows directories in BeeGFS to be used as Persistent Volumes (PVs) in Kubernetes. Pods with Persistent Volume Claims (PVCs) are only able to see/access the specified directory (and any subdirectories), providing isolation between multiple applications and users using the same BeeGFS file system when desired.

Dynamic Storage Provisioning:

Administrators create a Storage Class in Kubernetes referencing at minimum a specific BeeGFS file system and parent directory within that file system. Users can then submit PVCs against the Storage Class, and are provided isolated access to new directories under the parent specified in the Storage Class. See the process in action on NetApp TV.
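
For illustration, a minimal sketch of a PVC submitted against such a Storage Class. The class name refers to the hypothetical csi-beegfs-tiered Storage Class sketched under Notable Features, and the requested capacity is effectively nominal for a directory-backed volume.

  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: dyn-workspace                  # hypothetical name
  spec:
    accessModes:
      - ReadWriteMany
    storageClassName: csi-beegfs-tiered  # hypothetical Storage Class from the earlier sketch
    resources:
      requests:
        storage: 1Gi                     # nominal; the driver provisions a directory

Each such PVC results in a new isolated directory under the Storage Class's parent directory.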

Static Provisioning:

Administrators create a PV and PVC representing an existing directory in a BeeGFS file system. This is useful for exposing some existing dataset or shared directory to Kubernetes users and applications. See the process in action on NetApp TV.
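
A minimal sketch of what this can look like, assuming an existing directory (here /datasets/shared, a placeholder) on a file system whose management host is 10.0.0.10. The beegfs://<sysMgmtdHost>/<path> volumeHandle format follows the driver's usage documentation; verify it against your driver version.

  apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: shared-dataset-pv
  spec:
    accessModes:
      - ReadWriteMany
    capacity:
      storage: 100Gi                 # nominal for a directory-backed volume
    persistentVolumeReclaimPolicy: Retain
    csi:
      driver: beegfs.csi.netapp.com
      volumeHandle: beegfs://10.0.0.10/datasets/shared
  ---
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: shared-dataset-pvc
  spec:
    accessModes:
      - ReadWriteMany
    storageClassName: ""             # bind to the pre-created PV, not a Storage Class
    volumeName: shared-dataset-pv
    resources:
      requests:
        storage: 100Gi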

Examples

Example Kubernetes manifests showing how to use the driver are provided. These are meant to be repurposed to simplify creating objects related to the driver, including Storage Classes, Persistent Volumes, and Persistent Volume Claims, in your environment.


Contributing to the Project

The BeeGFS CSI Driver maintainers welcome improvements from the BeeGFS and open source community! Please see CONTRIBUTING.md for how to get started.


Releases

The goal is to release a new driver version three to four times per year (roughly quarterly). Releases may be major, minor, or patch at the discretion of the maintainers in accordance with needs of the community (i.e. large features, small features, or miscellaneous bug fixes).


Versioning

The BeeGFS CSI driver versioning is based on the semantic versioning scheme outlined at semver.org. According to this scheme, given a version number MAJOR.MINOR.PATCH, we increment the:

  • MAJOR version when:
    • We make significant code changes beyond just a new feature.
    • Backwards incompatible changes are made.
  • MINOR version when:
    • New driver features are added.
    • New versions of Kubernetes or BeeGFS are supported.
  • PATCH version when:
    • Small bug or security fixes are needed in a more timely manner.

When upgrading the driver using the default Kustomize (kubectl) deployment option, it is recommended to reference the upgrade notes for your particular upgrade path. While the driver itself may be backwards compatible, if you used a non-standard file layout or customized the Kubernetes manifests used to deploy the driver, you may need to make adjustments to your manifests.


License

Apache License 2.0


Footnotes

  [1] Support for the BeeGFS 7.1.5 file system is provided when the BeeGFS 7.2.x client is used; these configurations were tested in that manner.

beegfs-csi-driver's People

Contributors

austinmajor, ckrizek, ejweber, gmarks-ntap, iamjoemccormick, jparnell-ntap


beegfs-csi-driver's Issues

metrics endpoint will not be started because `metrics-address` was not specified

Hi,

when the csi-provisioner container inside the csi-beegfs-controller-0 pod starts, it logs a warning with the message:

W0617 12:47:35.586467       1 metrics.go:142] metrics endpoint will not be started because `metrics-address` was not specified.

Full log:

I0617 12:47:31.644718       1 csi-provisioner.go:121] Version: v2.0.2
I0617 12:47:31.644795       1 csi-provisioner.go:135] Building kube configs for running in cluster...
I0617 12:47:31.651298       1 connection.go:153] Connecting to unix:///csi/csi.sock
I0617 12:47:35.580545       1 common.go:111] Probing CSI driver for readiness
I0617 12:47:35.580566       1 connection.go:182] GRPC call: /csi.v1.Identity/Probe
I0617 12:47:35.580570       1 connection.go:183] GRPC request: {}
I0617 12:47:35.585663       1 connection.go:185] GRPC response: {}
I0617 12:47:35.585730       1 connection.go:186] GRPC error: <nil>
I0617 12:47:35.585742       1 connection.go:182] GRPC call: /csi.v1.Identity/GetPluginInfo
I0617 12:47:35.585745       1 connection.go:183] GRPC request: {}
I0617 12:47:35.586397       1 connection.go:185] GRPC response: {"name":"beegfs.csi.netapp.com","vendor_version":"v1.1.0-0-gc65b537"}
I0617 12:47:35.586447       1 connection.go:186] GRPC error: <nil>
I0617 12:47:35.586458       1 csi-provisioner.go:182] Detected CSI driver beegfs.csi.netapp.com
W0617 12:47:35.586467       1 metrics.go:142] metrics endpoint will not be started because `metrics-address` was not specified.
I0617 12:47:35.586477       1 connection.go:182] GRPC call: /csi.v1.Identity/GetPluginCapabilities
I0617 12:47:35.586481       1 connection.go:183] GRPC request: {}
I0617 12:47:35.587155       1 connection.go:185] GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}}]}
I0617 12:47:35.587274       1 connection.go:186] GRPC error: <nil>
I0617 12:47:35.587284       1 connection.go:182] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0617 12:47:35.587288       1 connection.go:183] GRPC request: {}
I0617 12:47:35.587831       1 connection.go:185] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I0617 12:47:35.588083       1 connection.go:186] GRPC error: <nil>
I0617 12:47:35.588233       1 csi-provisioner.go:210] CSI driver does not support PUBLISH_UNPUBLISH_VOLUME, not watching VolumeAttachments
I0617 12:47:35.588790       1 controller.go:735] Using saving PVs to API server in background
I0617 12:47:35.689094       1 volume_store.go:97] Starting save volume queue

Does this driver support Prometheus metrics? If it does, how can they be enabled? And is there any way for a Kubernetes admin to track how much storage is in use?

Plans for supporting K8s 1.23.x and 1.24.x

Hi,

I was wondering about plans to support newer versions of K8s. Are there plans to support BeeGFS 7.1.5 on K8s 1.23.x and other versions, since 1.24 is planned for April?

Failed to start controller: can't find valid default client configuration template file or file not JSON

Hi Team,

I deployed the beegfs-csi-driver using kubectl apply -k deploy/k8s/overlays/default,
then I updated the csi-beegfs-config configmap to add:

...
data:
  csi-beegfs-config.yaml: |
    config:
      connInterfaces:
      - ib0
      connRDMAInterfaces:
      - ib0
      beegfsClientConf:
        sysMgmtdHost: 172.19.204.1
        connClientPortUDP: 8004
        connHelperdPortTCP: 8006
        connMgmtdPortTCP: 8008
        connMgmtdPortUDP: 8008
        connPortShift: 0
        connCommRetrySecs: 600
        connFallbackExpirationSecs: 900
        connMaxInternodeNum: 12
        connMaxConcurrentAttempts: 0
        connUseRDMA: true
        connRDMABufNum: 70
        connRDMABufSize: 8192
        connRDMATypeOfService: 0
        logClientID: false
        logLevel: 3
        logType: helperd
        quotaEnabled: false
        sysCreateHardlinksAsSymlinks: false
        sysMountSanityCheckMS: 11000
        sysSessionCheckOnClose: false
        sysSyncOnClose: false
        sysTargetOfflineTimeoutSecs: 900
        sysUpdateTargetStatesSecs: 30
        sysXAttrsEnabled: false
        tuneFileCacheType: buffered
        tuneRemoteFSync: true
        tuneUseGlobalAppendLocks: false
        tuneUseGlobalFileLocks: false
...

but the csi-beegfs-controller-0 pod is still in error:

"msg"="Fatal: Failed to initialize driver" "error"="failed to get valid default client configuration template file" "fullError"="failed to get valid default client configuration template file\ngithub.com/netapp/beegfs-csi-driver/pkg/beegfs.newBeegfsDriver\n\t/var/lib/jenkins/workspace/beegfs-csi-driver_master@2/pkg/beegfs/beegfs.go:225\ngithub.com/netapp/beegfs-csi-driver/pkg/beegfs.NewBeegfsDriver\n\t/var/lib/jenkins/workspace/beegfs-csi-driver_master@2/pkg/beegfs/beegfs.go:160\nmain.handle\n\t/var/lib/jenkins/workspace/beegfs-csi-driver_master@2/cmd/beegfs-csi-driver/main.go:67\nmain.main\n\t/var/lib/jenkins/workspace/beegfs-csi-driver_master@2/cmd/beegfs-csi-driver/main.go:62\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:255\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581" "goroutine"="main"

What am I missing?

Add support for ARM64

Just wanted to report that it builds okay:

  • BeeGFS CSI driver v1.2.1-0-g316c1cd
  • Ubuntu 20.04.4 LTS (4.9.277-83) with go v1.17
$ make
./release-tools/verify-go-version.sh "go"

======================================================
                  WARNING

  This projects is tested with Go v1.15.
  Your current Go version is v1.17.
  This may or may not be close enough.

  In particular test-gofmt and test-vendor
  are known to be sensitive to the version of
  Go.
======================================================

mkdir -p bin
echo '' | tr ';' '\n' | while read -r os arch suffix; do \
	if ! (set -x; CGO_ENABLED=0 GOOS="$os" GOARCH="$arch" go build  -a -ldflags ' -X main.version=v1.2.1-0-g316c1cd  -extldflags "-static"' -o "./bin/beegfs-csi-driver$suffix" ./cmd/beegfs-csi-driver); then \
		echo "Building beegfs-csi-driver for GOOS=$os GOARCH=$arch failed, see error(s) above."; \
		exit 1; \
	fi; \
done
+ CGO_ENABLED=0 GOOS= GOARCH= go build -a -ldflags  -X main.version=v1.2.1-0-g316c1cd  -extldflags "-static" -o ./bin/beegfs-csi-driver ./cmd/beegfs-csi-driver

$ ./bin/beegfs-csi-driver -version
beegfs-csi-driver v1.2.1-0-g316c1cd

Request for feedback

A couple of months ago, ThinkParQ posted a survey in the BeeGFS newsletter and the BeeGFS user group asking for feedback on the current and future use of BeeGFS in containers and in the cloud (public, hybrid, private, or otherwise). This survey was created in collaboration with the BeeGFS CSI driver maintainers in the hopes of better understanding what features we should add to the 1.x version of the driver (and what a 2.x version of the driver might look like). Unfortunately, the survey didn't receive enough responses to be particularly useful. If you have not already taken the survey please consider doing so at the link below!

BeeGFS Cloud and Containerization Survey

Getting "failed to get valid default client" error

Hello, I tried configuring the driver but I am struggling with an issue with the client configuration. Can you please help me identify what is wrong with my installation? A screenshot of the error is attached:
[screenshot: image001]

Add ReadWriteOncePod?

I'm not sure if this would be useful to BeeGFS CSI users, but a new flavor of RWO called ReadWriteOncePod could be added if there's a use case for it.

It seems like an anti-feature for BeeGFS CSI... Maybe there's a use case for lowering risks of application-induced data corruption in RWO workloads?
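
For reference, such a claim would be an ordinary PVC carrying the new access mode. This is a generic Kubernetes sketch (the Storage Class name here is hypothetical), not something the driver supports today:

  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: single-pod-claim
  spec:
    accessModes:
      - ReadWriteOncePod                  # only a single pod cluster-wide may use the volume
    storageClassName: csi-beegfs-dyn-sc   # hypothetical Storage Class name
    resources:
      requests:
        storage: 1Gi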

cannot mount the same beegfs volume twice

I have an existing beegfs volume for which I manually created a PV and a PVC to make it available in a certain namespace. Now I would like to have it mounted in a different namespace, so I have created a PV and a new PVC, but it throws:

  ----     ------         ----  ----                         -------
  Warning  ClaimMisbound  38s   persistentvolume-controller  Two claims are bound to the same volume, this one is bound incorrectly

Is this allowed? We also have commercial support if needed.

Upgrade Operator SDK and Operator dependencies

Further updating various Operator dependencies breaks things in a number of places. While currently there aren't any pressing reasons to further update those dependencies, in the future we should first update the SDK version, as it is possible that will help smooth over the rest of the dependency updates.

Failed mount after kubernetes worker node upgrade from v1.23.15 to v1.24.9

Hi!

So after upgrading half of my worker nodes to the new Kubernetes version (v1.24.9), I noticed that some of the pods got stuck in a failed mount.

Warning  FailedMount  15s (x6 over 31s)  kubelet
MountVolume.MountDevice failed for volume "pvc-5bc91a74" : rpc error: code = 
Internal desc = stat /var/lib/kubelet/plugins/kubernetes.io/csi/beegfs.csi.netapp.com/874cf8f302b0da66de76a4edb4ca3f7e0c5f7a6f25ad368e8ce8fda969225eb5/globalmount: no such file or directory

To get them up and running again I forced them to use nodes with the old kubernetes version (v1.23.15) and that works.

Versions:

  • BeeGFS: v7.3.2
  • CSI Driver: v1.3.0

Regarding the CSI driver deployment, I am using the k8s one from the repo.
Config:

config:
  beegfsClientConf:
    connClientPortUDP: "8028"
    connDisableAuthentication: "true"
    logType: "helperd"

And the only modification I had to make was in csi-beegfs-node.yaml where I set the plugins-mount-dir to /var/lib/kubelet/plugins/kubernetes.io/csi/pv instead of /var/lib/kubelet/plugins/kubernetes.io/csi

The kubernetes 1.23.15 worker node directory structure of /var/lib/kubelet/plugins/kubernetes.io/csi

tree -L 4
.
└── pv
    ├── pvc-01ba9661
    │   ├── globalmount
    │   │   ├── beegfs-client.conf
    │   │   └── mount
    │   └── vol_data.json
    ├── pvc-03357f3e
    │   ├── globalmount
    │   │   ├── beegfs-client.conf
    │   │   └── mount
    │   └── vol_data.json
...

The kubernetes 1.24.9 worker node directory structure of /var/lib/kubelet/plugins/kubernetes.io/csi

tree -L 4
.
├── beegfs.csi.netapp.com
└── pv
    ├── pvc-090f23e1
    │   ├── globalmount
    │   │   ├── beegfs-client.conf
    │   │   └── mount
    │   └── vol_data.json
    ├── pvc-14ba4b44
    │   ├── globalmount
    │   │   ├── beegfs-client.conf
    │   │   └── mount
    │   └── vol_data.json
...

So for some reason the node with the newer kubernetes version has an empty beegfs.csi.netapp.com directory.
Why are the pods on the "new" nodes trying to mount this other location? Is the v1.3.0 version of the driver incompatible with kubernetes 1.24.9? Should I upgrade the driver to v1.4.0?

Please say if you need any more info.

Thanks in advance!

base64 for `ConnAuthConfig.ConnAuth`

Our conn_auth is random binary bytes, and I didn't find a way to put it in the csi-beegfs-connauth.yaml file. Is it possible to support base64 encoding for this field?

e.g.:

// ConnAuthConfig associates a ConnAuth with a SysMgmtdHost.
type ConnAuthConfig struct {
        SysMgmtdHost string `json:"sysMgmtdHost"`
        ConnAuth     string `json:"connAuth"`
        
        // Add new field for configuration
        Encoding string `json:"encoding"`
}

The Encoding field could be either raw or base64
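
For reference, a csi-beegfs-connauth.yaml entry under this proposal might look like the following (placeholder values; this matches the shape shown in the Quick Start section above):

  - sysMgmtdHost: 10.0.0.10        # placeholder management host
    connAuth: c2VjcmV0LWJ5dGVz     # base64 encoding of the raw binary secret
    encoding: base64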

Recreate all end-to-end tests in GitHub Actions

See the "End-to-End Test" stage and runIntegrationSuite() from the original Jenkinsfile. More documentation is also in the test/e2e/README.md. This includes both the driver deployed with and without the operator as there doesn't seem to be a separate test suite/command.

Unless we plan to continue using statically deployed clusters for these tests, much of what was there needs to be reworked. This should actually simplify things greatly as all of the pieces to clean up old tests no longer apply. We can just run the Ginkgo e2e tests once the driver is deployed in the existing e2e job that is already setup to run against a matrix of BeeGFS and Kubernetes versions.

To use the current e2e test setup on Minikube with the Ginkgo e2e tests, we will need to deploy a multi-node Minikube cluster. This shouldn't be too difficult since we no longer need to use the "none" driver to use BeeGFS with Minikube, so long as we deploy BeeGFS into K8s. Alternatively, we would need to deploy K8s clusters outside GitHub Actions (ideally managed through something like Rancher) as self-hosted runners.

Currently e2e tests are built on Ginkgo, and some tests are using deprecated Ginkgo features. Part of this issue is evaluating how we can update our current tests to use the latest version of Ginkgo. If it will take significant effort to migrate the current test suites, that should be done as a separate issue where we also reevaluate what else is out there for K8s/CSI e2e tests and whether it would be better to start from scratch in something else. Historically, many K8s e2e tests have used Ginkgo, so this seems unlikely, but it is worth exploring.

Test Issue

This test confirms issues can be created by the public.

Feedback on Nomad CSI

  • It seems the paths in driver arguments no longer work with 1.3.0 Beta-1. I'm still investigating, but basically the monolith container fails to start. (In 1.3.0 Nomad will start supplying the env var "--endpoint=${CSI_ENDPOINT}", so maybe that will simplify things somewhat.) This is what I get no matter how I try to change the paths. I'll try to take another look at this another day.
Apr 20 14:57:10 b5 nomad[6938]:     2022-04-20T14:57:10.809Z [ERROR] client.alloc_runner.task_runner.task_hook: killing task because plugin failed: alloc_id=e2f36449-9542-e65d-0769-6f8e15aa32c3 task=plugin error="CSI plugin failed probe: timeout while connecting to gRPC socket: failed to stat socket: stat /opt/nomad/data/client/csi/plugins/e2f36449-9542-e65d-0769-6f8e15aa32c3/csi.sock: no such file or directory"
  • node-id (seen here) seems to be nodeid in more recent config examples by HashiCorp staff.

  • It's not entirely clear from the instructions whether the driver has to be deployed on a host (worker) with a functioning BeeGFS client or not. For example, connAuth: in plugin.nomad: if the BeeGFS cluster isn't using connAuth, can that section be left out of the plugin job? Or will using the plugin break the BeeGFS client if it's left out (or if it's not left out)? I know I'll find out after I get there, but it should be clear beforehand, because many may be concerned about whether the plugin can interfere with their BeeGFS client.

  • beegfs-dkms-client uninstalls beegfs-client (v7.3.0), which is dangerous. If that package isn't necessary, let's not suggest installing it.

Consider deploying to dedicated namespace?

I used the "one liner" to deploy, it deployed to kube-system.

Perhaps we should document that this approach doesn't deploy to a dedicated namespace and suggest to create one?

Consider updating the module path and driver name

Currently the module path as defined by go.mod is "github.com/netapp/beegfs-csi-driver" even though the actual path is now "github.com/thinkparq/beegfs-csi-driver". While migrated GitHub repositories set up a redirect (so either go get github.com/netapp/beegfs-csi-driver or go get github.com/thinkparq/beegfs-csi-driver will work), it would be good to eventually update the path to reflect the new repository.

Because this is a CSI driver, most users will download and deploy it using a prebuilt container image and would not be affected by changing the module path. It is much less likely (though not impossible) that there are other projects or libraries importing these packages that would be broken. Ideally any change to the module path would result in a major version bump.

A change that would be much more impactful to users is changing the driver name from "beegfs.csi.netapp.com" to "beegfs.csi.thinkparq.com". As this would almost certainly require bumping the major version it makes sense to plan for these two changes to happen together.

As there is no pressing reason to make either of these changes, we'll put them on the back burner until there is a more compelling reason to ship a 2.0 version.

Add support for expanding volumes

I tried to use beegfs-csi to create a PVC, and then I tried to expand the capacity of this PVC. I found that it was not successful. Is PVC expansion supported now?

failed to read client configuration template file: open /host/etc/beegfs/beegfs-client.conf: no such file or directory

I used to have the controller running on a master node via an affinity rule (version 1.2.0), but now that I have upgraded to v1.2.1 I had to remove the affinity rule, otherwise the beegfs container would fail like this:

16:59 # kubectl logs -f -n kube-system csi-beegfs-controller-0 beegfs 
E0725 14:57:42.243667       1 main.go:70]  "msg"="Fatal: Failed to initialize driver" "error"="failed to read client configuration template file: open /host/etc/beegfs/beegfs-client.conf: no such file or directory" "fullError"="open /host/etc/beegfs/beegfs-client.conf: no such file or directory\nfailed to read client configuration template file" "goroutine"="main" 

My master nodes never had the BeeGFS requirements installed (everything that is required, like the client) and it always worked. So my question is whether there has been any change introduced that requires the controller to be scheduled on a BeeGFS-capable client node, and if not, what could be the issue?

Cheers.
