Hey Guys, Creating a disk in gwcli

Python3 - KeyError: 'pool' when 'refresh' is invoked about ceph-iscsi HOT 16 OPEN

ceph commented on June 30, 2024

Python3 - KeyError: 'pool' when 'refresh' is invoked

from ceph-iscsi.

Comments (16)

mikechristie commented on June 30, 2024

Yes, it's only needed because that package makes the dirs that rtslib uses. I'm not sure why rtslib has some dir names hardcoded but then relies on other apps to make them.

gvikram18 commented on June 30, 2024

Hi,

I am facing the same issue. I have downloaded ceph-iscsi-3.0 and tcmu-runner-1.4.0 from shaman.

rbd-target-api gives the following status message

dillaman commented on June 30, 2024

Can you provide a (sanitized) copy of your "gateway.conf" (rados -p rbd get gateway.conf -)? The "pool" attribute has been a part of the disk structure for a very long time.

gvikram18 commented on June 30, 2024

mikechristie commented on June 30, 2024

@gvikram18

This is a new install, right? You didn't start from an old 2.x ceph-iscsi-config or GitHub commit, did you?

If this is a new install, I think the bug is that the initial creation failed but did not fully clean itself up. The second creation reported success but did not fully set it up.

What version is your rtslib? And is it a distro rpm, or did you install the upstream one from GitHub?

Do you have targetcli installed, and if so, is that a distro one or an upstream one?

Could you start from a clean slate? Do the following:

Make a /etc/target and /etc/target/pr dir if you do not have it.

It looks like there is a bug in some rtslib versions: if targetcli has not created /etc/target (or /var/target), or the dir is not otherwise specified, then we get a failure when we try to create a device. This is because some rtslib code checks for that dir, and for the pr dir inside it, either there or in configfs.

Start from a clean slate. Delete the bad gateway.conf

rados -p rbd rm gateway.conf

Restart the gws. Either reboot the node or stop and start the rbd-target-api service.

mikechristie commented on June 30, 2024

The above comment is not correct. It looks like we fixed all the partial setup errors by 3.0.

Starting from a clean slate as described above, can you provide the /var/log/rbd-target-api/rbd-target-api.log from when you try to create the disk? I cannot replicate the issue here.

wwdillingham commented on June 30, 2024

I just encountered the above issue:
Running:

[root@cephigw002-v06c ~]# rpm -qa | grep -i -e rtslib -e iscsi -e tcmu
python-rtslib-2.1.fb68-1.noarch
libiscsi-1.9.0-7.el7.x86_64
tcmu-runner-1.4.0-106.gd17d24e.el7.x86_64
ceph-iscsi-3.0.1-1.el7.noarch
libtcmu-1.4.0-106.gd17d24e.el7.x86_64

python 2.7.5

The issue occurred when attempting to add a disk directly through the rbd-target-api (using curl), where the error received was:

disk create/update failed on vm1cephigw002. Unhandled exception: 'backstore_object_name'

When using 'gwcli -d' I got the same "KeyError: 'pool'" as in the original post above.

I fixed it by pulling down the configuration object and searching for the entry for the disk I had attempted to create. That disk differed from the other, working disks in that it had only the "created" key and lacked the others ("pool", "allocating_host", etc.). After removing the JSON section for this disk, re-uploading the object via rados put, and finally restarting rbd-target-api on all GWs, things were back to normal.

Thankfully this was on our DEV cluster, so I am not sure whether this would have been disruptive to client IO in a production cluster. Is this fixed in a later release? Thanks.
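
The check described above can be sketched in Python. This is only an illustration, not part of ceph-iscsi: it assumes the config object has already been fetched and parsed as JSON, that disk entries sit under a top-level "disks" mapping (an assumption about the layout), and that "pool" and "allocating_host" are enough to tell a complete entry from a partial one (those are simply the keys named in this comment). The function names and the sample data are made up.

```python
# Keys this thread names as present on a healthy disk entry besides "created".
# The real schema likely has more; this pair is only what the comment mentions.
EXPECTED_KEYS = ("pool", "allocating_host")


def find_incomplete_disks(config, expected=EXPECTED_KEYS):
    """Return {disk_name: [missing keys]} for entries lacking expected keys."""
    # Assumption: disk entries live under a top-level "disks" mapping.
    incomplete = {}
    for name, entry in config.get("disks", {}).items():
        missing = [key for key in expected if key not in entry]
        if missing:
            incomplete[name] = missing
    return incomplete


def drop_disks(config, names):
    """Return a shallow copy of the config without the named disk entries."""
    cleaned = dict(config)
    cleaned["disks"] = {
        name: entry
        for name, entry in config.get("disks", {}).items()
        if name not in names
    }
    return cleaned


# Made-up sample data, not a real gateway.conf.
sample = {
    "disks": {
        "rbd/working_disk": {
            "created": "placeholder",
            "pool": "rbd",
            "allocating_host": "gw1",
        },
        "rbd/plesk_test0": {"created": "placeholder"},
    }
}
broken = find_incomplete_disks(sample)
repaired = drop_disks(sample, set(broken))
```

The repaired structure would still have to be serialized and uploaded with rados put, and rbd-target-api restarted on all gateways, as the comment describes. Keeping a copy of the original object before changing anything seems prudent.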

mikechristie commented on June 30, 2024

@wwdillingham

Sorry for the late reply. I have been on PTO. It is not fixed yet. I am not able to replicate the problem and was waiting on logs in my last comment.

Can you:

Give me the curl command you used? Maybe we are parsing a specific string wrong, so if possible could you give me the exact values you used?
Does it happen every time you run the command?
Did gwcli disk creation work?
Could you give me the /var/log/rbd-target-api/rbd-target-api.log for when this happens?

mikechristie commented on June 30, 2024

Could you give me the /var/log/rbd-target-api/rbd-target-api.log for when this happens?

Oh yeah, since this was days ago now, the log info might be in the /var/log/rbd-target-api dir in one of the gzipped up files.
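
For anyone digging through rotated logs, a small Python sketch like the following could search both the current log and the gzipped ones in one pass. It is only an illustration: the function name is made up, and it assumes the rotated files end in ".gz" and that every other regular file in the directory is plain text.

```python
import gzip
from pathlib import Path


def grep_logs(log_dir, needle):
    """Yield (file name, line) for each line containing needle.

    Reads plain files with open() and files ending in .gz with gzip.open().
    """
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue
        opener = gzip.open if path.suffix == ".gz" else open
        # errors="replace" keeps the scan going past undecodable bytes.
        with opener(path, "rt", errors="replace") as handle:
            for line in handle:
                if needle in line:
                    yield path.name, line.rstrip("\n")
```

Usage would look like `for name, line in grep_logs("/var/log/rbd-target-api", "KeyError"): print(name, line)`, run as a user allowed to read that directory.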

wwdillingham commented on June 30, 2024

@mikechristie

I spoke incorrectly: the initial API call was made by an external client using a .NET framework against the exposed rbd-target-api. Also, the error message I initially gave you from the API ("disk create/update failed on vm1cephigw002. Unhandled exception: 'backstore_object_name'") was in fact from subsequent failures (including via curl, all made while the config object was in a broken state), not from the first attempt.
However, I can report that the request was made with a

PUT /disk/rbd/plesk_test0
body: "mode=create&size=256m&pool=rbd&create_image=true"
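
For reference, an equivalent request can be built (without being sent) in Python to see exactly which bytes go over the wire. This is a sketch only: the base URL, including host, port, and any path prefix, is a placeholder, authentication is left out, and the function name is made up. Only the method, the /disk/rbd/plesk_test0 path, and the body fields come from the request quoted above.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder base URL; the real host, port, and any path prefix will differ.
API_BASE = "http://igw.example.com:5000"


def build_disk_create_request(pool, image, size, base=API_BASE):
    """Build, but do not send, a form-encoded PUT for a disk create call."""
    body = urlencode(
        {"mode": "create", "size": size, "pool": pool, "create_image": "true"}
    )
    return Request(
        url=f"{base}/disk/{pool}/{image}",
        data=body.encode("ascii"),
        method="PUT",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_disk_create_request("rbd", "plesk_test0", "256m")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Comparing the printed body with what the .NET client actually sends is one way to check whether a particular value is being encoded differently.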

It does not happen every time we run the command. The same method that initially failed subsequently worked after restoring the config object and removing the rbd image via the rbd command.
I did not attempt a gwcli disk creation because I was unable to "enter" gwcli; gwcli would error out with: "KeyError: Pool"
I can get you all the logs that I have but would prefer to send them off GitHub. How can I best get them to you? I can also provide the contents of the config object in its errored state.

Further, I can say that I was able to quickly bounce the rbd-target-api service on our IGW01, but not on our IGW02 (which is the node listed in the rbd-target-api error above).

mikechristie commented on June 30, 2024

Ok, I see one way to hit it now, but am not sure if everyone is hitting the same thing.

@wwdillingham on the iscsi target systems:

Do you have targetcli installed on all of them? Is one of the systems missing it?
Do all the systems have a /etc/target or /var/target?

It seems some versions of rtslib require one of those dirs. If you install targetcli, they will get made. If you do not have the dirs, then we can end up partially creating the disk. We will then hit other bugs because the disk only got partially created and not fully set up, and we did not fully clean it up when it failed.

mikechristie commented on June 30, 2024

So no matter what, we need to fix the error handler so it fully cleans up partially created disks. That way, if we hit any failure, we do not end up in this state.
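
The cleanup idea can be illustrated with a small, generic Python sketch. This is not the ceph-iscsi error handler: an in-memory dict stands in for the config object, the "disks" layout and the `disk_entry` name are made up, and a real fix would also have to undo any LIO and rbd state created along the way.

```python
from contextlib import contextmanager


@contextmanager
def disk_entry(config, name):
    """Add a placeholder disk entry and remove it again if setup fails."""
    disks = config.setdefault("disks", {})
    disks[name] = {"created": "placeholder"}
    try:
        yield disks[name]
    except Exception:
        # Setup failed part-way: drop the half-written entry, then re-raise
        # so the caller still sees the original error.
        disks.pop(name, None)
        raise


config = {}
try:
    with disk_entry(config, "rbd/plesk_test0") as entry:
        entry["pool"] = "rbd"
        raise RuntimeError("simulated failure while creating the device")
except RuntimeError:
    pass

print(config)  # the partial entry is gone
```

The point of the pattern is that a failure at any step leaves no entry carrying only the "created" key, which is the state that later triggered the KeyError.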

We also need to fix rtslib/targetcli so rtslib creates the dirs it needs. As a temp hack we can just install targetcli and/or have ceph-iscsi make the dirs.

wwdillingham commented on June 30, 2024

@mikechristie no package matching "targetcli" on either of the IGW nodes. Also neither of those directories exist.

wwdillingham commented on June 30, 2024

So is the targetcli package needed ONLY for the purpose of creating those dirs (/etc/target and /var/target)? I think I can count myself lucky that I haven't encountered more problems.

mikechristie commented on June 30, 2024

Just one clarification. The targetcli rpm and/or, if your distro has it, the target-restore rpm makes the dirs.

If you are installing from the upstream repo tarball releases or from the GH repo source code, then you have to manually make the dirs as a temp workaround.
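
A minimal sketch of that manual workaround in Python, assuming /etc/target and /etc/target/pr are the directories wanted (the two named earlier in this thread). The function name and the `root` parameter are made up; `root` exists only so the function can be tried against a scratch directory first. On a real gateway it would need to run as root.

```python
import os


def ensure_target_dirs(root="/"):
    """Create etc/target and etc/target/pr under root if they are missing.

    Returns the list of directories that had to be created.
    """
    wanted = [
        os.path.join(root, "etc/target"),
        os.path.join(root, "etc/target/pr"),
    ]
    created = []
    for path in wanted:
        if not os.path.isdir(path):
            os.makedirs(path, exist_ok=True)
            created.append(path)
    return created
```

Calling `ensure_target_dirs()` with no arguments targets the real /etc/target. Calling it again afterwards returns an empty list, so it is safe to run on every gateway before starting rbd-target-api.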

wwdillingham commented on June 30, 2024

@mikechristie thanks for your help on this one. I have always been pulling my RPMs from shaman, perhaps why I overlooked targetcli. I was able to snatch targetcli 2.1.fb49-1.el7 from base centos repos. This created /etc/target but not /var/target.

I think the feature of cleaning up partially created disks or otherwise validating the config object as correct before committing would be great. Thanks again for the help.
