Comments (6)

chien-intel commented on July 24, 2024

This is not a bug. You need to use verbs;ofi_rxm, as shown in your fi_info output.

from libfabric.

jordialcaraz commented on July 24, 2024

Thanks Chien.

I had also tried verbs;ofi_rxm. Although fi_info works with it, fi_pingpong fails (at the end it looks for ofi_rxm instead of verbs;ofi_rxm):

$ FI_PROVIDER="verbs;ofi_rxm" FI_LOG_LEVEL=Debug fi_pingpong
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable perf_cntr=
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable hook=
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable hmem=
libfabric:4113849:1708031595::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_CUDA not supported
libfabric:4113849:1708031595::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_ROCR not supported
libfabric:4113849:1708031595::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_ZE not supported
libfabric:4113849:1708031595::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_NEURON not supported
libfabric:4113849:1708031595::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_SYNAPSEAI not supported
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable hmem_disable_p2p=
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable mr_cache_max_size=
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable mr_cache_max_count=
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable mr_cache_monitor=
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable mr_cuda_cache_monitor_enabled=
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable mr_rocr_cache_monitor_enabled=
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable mr_ze_cache_monitor_enabled=
libfabric:4113849:1708031595::core:mr:ofi_default_cache_size():79 default cache size=526983472
libfabric:4113849:1708031595::core:core:fi_param_get_():382 read string var provider=verbs;ofi_rxm
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable universe_size=
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable av_remove_cleanup=
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable offload_coll_provider=
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable provider_path=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable enable_passthru=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable buffer_size=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable tx_size=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable rx_size=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable msg_tx_size=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable msg_rx_size=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable cm_progress_interval=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable cq_eq_fairness=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable data_auto_progress=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable use_rndv_write=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable def_wait_obj=
libfabric:4113849:1708031595::ofi_rxm:core:fi_param_get_():373 variable def_tcp_wait_obj=
libfabric:4113849:1708031595::core:core:ofi_register_provider():506 registering provider: ofi_rxm (120.10)
libfabric:4113849:1708031595::core:core:ofi_register_provider():506 registering provider: verbs (120.10)
libfabric:4113849:1708031595::core:core:ofi_register_provider():506 registering provider: ofi_hook_perf (120.10)
libfabric:4113849:1708031595::core:core:ofi_register_provider():506 registering provider: ofi_hook_trace (120.10)
libfabric:4113849:1708031595::core:core:ofi_register_provider():506 registering provider: ofi_hook_debug (120.10)
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable hmem=
libfabric:4113849:1708031595::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_CUDA not supported
libfabric:4113849:1708031595::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_ROCR not supported
libfabric:4113849:1708031595::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_ZE not supported
libfabric:4113849:1708031595::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_NEURON not supported
libfabric:4113849:1708031595::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_SYNAPSEAI not supported
libfabric:4113849:1708031595::core:core:fi_param_get_():373 variable hmem_disable_p2p=
libfabric:4113849:1708031595::core:core:ofi_register_provider():506 registering provider: ofi_hook_hmem (120.10)
libfabric:4113849:1708031595::core:core:ofi_register_provider():506 registering provider: ofi_hook_dmabuf_peer_mem (120.10)
libfabric:4113849:1708031595::core:core:ofi_register_provider():506 registering provider: ofi_hook_noop (120.10)
libfabric:4113849:1708031595::core:core:ofi_register_provider():506 registering provider: off_coll (120.10)
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable tx_size=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable rx_size=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable tx_iov_limit=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable rx_iov_limit=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable inline_size=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable min_rnr_timer=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable use_odp=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable prefer_xrc=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable xrcd_filename=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable cqread_bunch_size=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable gid_idx=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable device_name=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable use_dmabuf=
libfabric:4113849:1708031595::verbs:core:vrb_read_params():720 dmabuf support is enabled
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable iface=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable dgram_use_name_server=
libfabric:4113849:1708031595::verbs:core:fi_param_get_():373 variable dgram_name_server_port=
libfabric:4113849:1708031595::verbs:fabric:verbs_devs_print():889 list of verbs devices found for FI_EP_MSG:
libfabric:4113849:1708031596::verbs:fabric:vrb_get_device_attrs():620 device mlx5_0: first found active port is 1
libfabric:4113849:1708031596::verbs:fabric:vrb_get_device_attrs():620 device mlx5_0: first found active port is 1
libfabric:4113849:1708031596::verbs:fabric:vrb_get_device_attrs():620 device mlx5_0: first found active port is 1
libfabric:4113849:1708031596::verbs:fabric:vrb_get_matching_info():1556 checking domain: #1 mlx5_0
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_MSG
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():691 Requested: FI_EP_DGRAM
libfabric:4113849:1708031596::verbs:fabric:vrb_get_matching_info():1556 checking domain: #2 mlx5_0
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_MSG
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():691 Requested: FI_EP_DGRAM
libfabric:4113849:1708031596::verbs:fabric:vrb_get_matching_info():1556 checking domain: #3 mlx5_0-xrc
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_MSG
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():691 Requested: FI_EP_DGRAM
libfabric:4113849:1708031596::verbs:fabric:vrb_get_matching_info():1556 checking domain: #4 mlx5_0-xrc
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_MSG
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():691 Requested: FI_EP_DGRAM
libfabric:4113849:1708031596::verbs:fabric:vrb_get_matching_info():1556 checking domain: #5 mlx5_0-dgram
libfabric:4113849:1708031596::verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0-dgram
libfabric:4113849:1708031596::verbs:fabric:vrb_get_matching_info():1556 checking domain: #6 mlx5_0-dgram
libfabric:4113849:1708031596::verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0-dgram
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Supported: FI_EP_RDM
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Requested: FI_EP_DGRAM
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Supported: FI_EP_RDM
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Requested: FI_EP_DGRAM
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Supported: FI_EP_RDM
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Requested: FI_EP_DGRAM
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Supported: FI_EP_RDM
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Requested: FI_EP_DGRAM
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Supported: FI_EP_RDM
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Requested: FI_EP_DGRAM
libfabric:4113849:1708031596::core:core:fi_getinfo_():1304 fi_getinfo: provider ofi_rxm returned -61 (No data available)
fi_getinfo(): util/pingpong.c:1489, ret=-61 (No data available)

Thank you.

ooststep commented on July 24, 2024

By default, fi_pingpong uses FI_EP_DGRAM. Try fi_pingpong -e rdm.
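
To spot this in a debug log without reading every line, the Supported/Requested pairs printed by ofi_check_ep_type() can be tallied. Below is a minimal Python sketch, assuming the log line layout quoted above. summarize_ep_mismatches is a hypothetical helper name, not part of libfabric, and the sample lines are copied from the first log in this thread.

```python
import re
from collections import Counter

# Matches the "Supported:" / "Requested:" lines from ofi_check_ep_type(), e.g.
#   libfabric:PID:TS::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_MSG
# The layout is assumed from the logs in this thread, not from a format spec.
LINE_RE = re.compile(
    r"^libfabric:\d+:\d+:(?P<layer>[^:]*):(?P<prov>[^:]+):[^:]+:"
    r"ofi_check_ep_type\(\):\d+ (?P<kind>Supported|Requested): (?P<ep>FI_EP_\w+)$"
)


def summarize_ep_mismatches(log_text):
    """Count (provider, supported, requested) endpoint-type mismatches."""
    counts = Counter()
    supported = None  # (provider, ep type) from the latest "Supported:" line
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        if m["kind"] == "Supported":
            supported = (m["prov"], m["ep"])
        elif supported is not None and supported[0] == m["prov"]:
            counts[(m["prov"], supported[1], m["ep"])] += 1
            supported = None
    return counts


SAMPLE = """\
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_MSG
libfabric:4113849:1708031596::verbs:core:ofi_check_ep_type():691 Requested: FI_EP_DGRAM
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Supported: FI_EP_RDM
libfabric:4113849:1708031596::ofi_rxm:core:ofi_check_ep_type():691 Requested: FI_EP_DGRAM
"""

for (prov, sup, req), n in summarize_ep_mismatches(SAMPLE).items():
    print(f"{prov}: supports {sup}, requested {req} ({n}x)")
```

If every mismatch shows Requested: FI_EP_DGRAM, the endpoint type was probably left at the fi_pingpong default.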

jordialcaraz commented on July 24, 2024

With fi_pingpong -e rdm, and also with -e rdm -p verbs, the output is:

libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable perf_cntr=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable hook=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable hmem=
libfabric:4117195:1708032872::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_CUDA not supported
libfabric:4117195:1708032872::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_ROCR not supported
libfabric:4117195:1708032872::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_ZE not supported
libfabric:4117195:1708032872::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_NEURON not supported
libfabric:4117195:1708032872::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_SYNAPSEAI not supported
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable hmem_disable_p2p=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable mr_cache_max_size=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable mr_cache_max_count=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable mr_cache_monitor=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable mr_cuda_cache_monitor_enabled=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable mr_rocr_cache_monitor_enabled=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable mr_ze_cache_monitor_enabled=
libfabric:4117195:1708032872::core:mr:ofi_default_cache_size():79 default cache size=526983472
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable provider=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable universe_size=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable av_remove_cleanup=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable offload_coll_provider=
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable provider_path=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable enable_passthru=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable buffer_size=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable tx_size=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable rx_size=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable msg_tx_size=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable msg_rx_size=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable cm_progress_interval=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable cq_eq_fairness=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable data_auto_progress=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable use_rndv_write=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable def_wait_obj=
libfabric:4117195:1708032872::ofi_rxm:core:fi_param_get_():373 variable def_tcp_wait_obj=
libfabric:4117195:1708032872::core:core:ofi_register_provider():506 registering provider: ofi_rxm (120.10)
libfabric:4117195:1708032872::core:core:ofi_register_provider():506 registering provider: verbs (120.10)
libfabric:4117195:1708032872::core:core:ofi_register_provider():506 registering provider: ofi_hook_perf (120.10)
libfabric:4117195:1708032872::core:core:ofi_register_provider():506 registering provider: ofi_hook_trace (120.10)
libfabric:4117195:1708032872::core:core:ofi_register_provider():506 registering provider: ofi_hook_debug (120.10)
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable hmem=
libfabric:4117195:1708032872::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_CUDA not supported
libfabric:4117195:1708032872::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_ROCR not supported
libfabric:4117195:1708032872::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_ZE not supported
libfabric:4117195:1708032872::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_NEURON not supported
libfabric:4117195:1708032872::core:core:ofi_hmem_init():607 Hmem iface FI_HMEM_SYNAPSEAI not supported
libfabric:4117195:1708032872::core:core:fi_param_get_():373 variable hmem_disable_p2p=
libfabric:4117195:1708032872::core:core:ofi_register_provider():506 registering provider: ofi_hook_hmem (120.10)
libfabric:4117195:1708032872::core:core:ofi_register_provider():506 registering provider: ofi_hook_dmabuf_peer_mem (120.10)
libfabric:4117195:1708032872::core:core:ofi_register_provider():506 registering provider: ofi_hook_noop (120.10)
libfabric:4117195:1708032872::core:core:ofi_register_provider():506 registering provider: off_coll (120.10)
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable tx_size=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable rx_size=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable tx_iov_limit=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable rx_iov_limit=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable inline_size=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable min_rnr_timer=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable use_odp=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable prefer_xrc=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable xrcd_filename=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable cqread_bunch_size=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable gid_idx=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable device_name=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable use_dmabuf=
libfabric:4117195:1708032872::verbs:core:vrb_read_params():720 dmabuf support is enabled
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable iface=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable dgram_use_name_server=
libfabric:4117195:1708032872::verbs:core:fi_param_get_():373 variable dgram_name_server_port=
libfabric:4117195:1708032872::verbs:fabric:verbs_devs_print():889 list of verbs devices found for FI_EP_MSG:
libfabric:4117195:1708032873::verbs:fabric:vrb_get_device_attrs():620 device mlx5_0: first found active port is 1
libfabric:4117195:1708032873::verbs:fabric:vrb_get_device_attrs():620 device mlx5_0: first found active port is 1
libfabric:4117195:1708032874::verbs:fabric:vrb_get_device_attrs():620 device mlx5_0: first found active port is 1
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #1 mlx5_0
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_MSG
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Requested: FI_EP_RDM
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #2 mlx5_0
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_MSG
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Requested: FI_EP_RDM
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #3 mlx5_0-xrc
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_MSG
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Requested: FI_EP_RDM
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #4 mlx5_0-xrc
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_MSG
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Requested: FI_EP_RDM
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #5 mlx5_0-dgram
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_DGRAM
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Requested: FI_EP_RDM
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #6 mlx5_0-dgram
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Supported: FI_EP_DGRAM
libfabric:4117195:1708032874::verbs:core:ofi_check_ep_type():691 Requested: FI_EP_RDM
libfabric:4117195:1708032874::core:core:fi_getinfo_():1304 fi_getinfo: provider verbs returned -61 (No data available)
libfabric:4117195:1708032874::ofi_rxm:core:fi_param_get_():373 variable use_srx=
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #1 mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #2 mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #3 mlx5_0-xrc
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_attr():775 Provider requires use of shared rx context
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #4 mlx5_0-xrc
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_attr():775 Provider requires use of shared rx context
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #5 mlx5_0-dgram
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Supported: FI_EP_DGRAM
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Requested: FI_EP_MSG
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #6 mlx5_0-dgram
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Supported: FI_EP_DGRAM
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Requested: FI_EP_MSG
libfabric:4117195:1708032874:ofi_rxm:core:core:ofi_layering_ok():1183 Provider ofi_rxm is excluded
libfabric:4117195:1708032874::ofi_rxm:core:ofi_check_fabric_attr():412 Requesting provider verbs, skipping tcp
libfabric:4117195:1708032874::ofi_rxm:core:ofi_check_fabric_attr():412 Requesting provider verbs, skipping tcp
libfabric:4117195:1708032874::ofi_rxm:core:fi_param_get_():373 variable use_srx=
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #1 mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #2 mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #3 mlx5_0-xrc
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_attr():775 Provider requires use of shared rx context
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #4 mlx5_0-xrc
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_attr():775 Provider requires use of shared rx context
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #5 mlx5_0-dgram
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Supported: FI_EP_DGRAM
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Requested: FI_EP_MSG
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #6 mlx5_0-dgram
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Supported: FI_EP_DGRAM
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Requested: FI_EP_MSG
libfabric:4117195:1708032874:ofi_rxm:core:core:ofi_layering_ok():1183 Provider ofi_rxm is excluded
libfabric:4117195:1708032874::ofi_rxm:core:fi_param_get_():373 variable use_srx=
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #1 mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #2 mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #3 mlx5_0-xrc
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_attr():775 Provider requires use of shared rx context
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #4 mlx5_0-xrc
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_attr():775 Provider requires use of shared rx context
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #5 mlx5_0-dgram
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Supported: FI_EP_DGRAM
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Requested: FI_EP_MSG
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #6 mlx5_0-dgram
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Supported: FI_EP_DGRAM
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Requested: FI_EP_MSG
libfabric:4117195:1708032874:ofi_rxm:core:core:ofi_layering_ok():1183 Provider ofi_rxm is excluded
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #1 mlx5_0
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #2 mlx5_0
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #3 mlx5_0-xrc
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1578 hints->ep_attr->rx_ctx_cnt != FI_SHARED_CONTEXT. Skipping XRC FI_EP_MSG endpoints
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #4 mlx5_0-xrc
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1578 hints->ep_attr->rx_ctx_cnt != FI_SHARED_CONTEXT. Skipping XRC FI_EP_MSG endpoints
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #5 mlx5_0-dgram
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0-dgram
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1556 checking domain: #6 mlx5_0-dgram
libfabric:4117195:1708032874::verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0-dgram
libfabric:4117195:1708032874::core:core:ofi_layering_ok():1183 Provider ofi_rxm is excluded
libfabric:4117195:1708032874::core:core:fi_fabric_():1504 Opened fabric: IB-0xfe80000000000000
libfabric:4117195:1708032874::ofi_rxm:core:ofi_check_fabric_attr():412 Requesting provider off_coll, skipping verbs
libfabric:4117195:1708032874::ofi_rxm:core:ofi_check_fabric_attr():412 Requesting provider off_coll, skipping tcp
libfabric:4117195:1708032874::ofi_rxm:core:ofi_check_fabric_attr():412 Requesting provider off_coll, skipping tcp
libfabric:4117195:1708032874::ofi_rxm:core:fi_param_get_():373 variable use_srx=
libfabric:4117195:1708032874:ofi_rxm:core:core:ofi_layering_ok():1183 Provider ofi_rxm is excluded
libfabric:4117195:1708032874:ofi_rxm:core:core:ofi_layering_ok():1194 Need core provider, skipping off_coll
libfabric:4117195:1708032874::ofi_rxm:core:fi_param_get_():373 variable use_srx=
libfabric:4117195:1708032874:ofi_rxm:core:core:ofi_layering_ok():1183 Provider ofi_rxm is excluded
libfabric:4117195:1708032874:ofi_rxm:core:core:ofi_layering_ok():1194 Need core provider, skipping off_coll
libfabric:4117195:1708032874::core:core:fi_getinfo_():1304 fi_getinfo: provider ofi_rxm returned -61 (No data available)
libfabric:4117195:1708032874::core:core:fi_fabric_():1504 Opened fabric: UTIL-COLL
libfabric:4117195:1708032874::core:core:fi_fabric_():1504 Opened fabric: IB-0xfe80000000000000
libfabric:4117195:1708032874::ofi_rxm:core:fi_param_get_():373 variable use_srx=
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #1 mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #2 mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1601 adding fi_info for domain: mlx5_0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #3 mlx5_0-xrc
libfabric:4117195:1708032874:ofi_rxm:verbs:core:vrb_check_hints():268 skipping device mlx5_0-xrc (want mlx5_0)
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #4 mlx5_0-xrc
libfabric:4117195:1708032874:ofi_rxm:verbs:core:vrb_check_hints():268 skipping device mlx5_0-xrc (want mlx5_0)
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #5 mlx5_0-dgram
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Supported: FI_EP_DGRAM
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Requested: FI_EP_MSG
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_matching_info():1556 checking domain: #6 mlx5_0-dgram
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():690 unsupported endpoint type
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Supported: FI_EP_DGRAM
libfabric:4117195:1708032874:ofi_rxm:verbs:core:ofi_check_ep_type():691 Requested: FI_EP_MSG
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_rai_id():301 rdma_resolve_addr: Invalid argument (22)
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_rai_id():303 src addr: fi_sockaddr_ib://[fe80::b83f:d203:2b:b478]:0xffff:0x13f:0x0
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_rai_id():305 dst addr: (null)
libfabric:4117195:1708032874:ofi_rxm:verbs:fabric:vrb_get_match_infos():1825 handling of the socket address fails - -22
libfabric:4117195:1708032874:ofi_rxm:verbs:core:vrb_get_match_infos():1845 Handling of the addresses fails, the getting infos is unsuccessful
libfabric:4117195:1708032874:ofi_rxm:core:core:fi_getinfo_():1304 fi_getinfo: provider verbs returned -61 (No data available)
libfabric:4117195:1708032874:ofi_rxm:core:core:ofi_layering_ok():1183 Provider ofi_rxm is excluded
fi_domain(): util/pingpong.c:1415, ret=-61 (No data available)

ooststep commented on July 24, 2024

verbs supports msg endpoints (you would need the -e msg argument).
verbs;ofi_rxm supports rdm endpoints (you would need the -e rdm argument).

You can run fi_info -v -p verbs to view the full set of supported capabilities and endpoint types.
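
The pairing in this comment can be written down as a small lookup. This is only a sketch: the table holds just the two pairings stated here and is not queried from libfabric, and pingpong_command is a hypothetical helper. It sets the provider through FI_PROVIDER, as in the command posted earlier in the thread.

```python
import shlex

# Endpoint type per provider string, taken only from this thread.
# For the authoritative list on a given system, use fi_info -v -p <provider>.
EP_BY_PROVIDER = {
    "verbs": "msg",
    "verbs;ofi_rxm": "rdm",
}


def pingpong_command(provider):
    """Return a shell command line that runs fi_pingpong with a matching -e."""
    ep = EP_BY_PROVIDER.get(provider)
    if ep is None:
        raise ValueError(f"no endpoint type recorded for {provider!r}")
    # shlex.quote protects the ';' in layered provider names from the shell.
    return f"FI_PROVIDER={shlex.quote(provider)} fi_pingpong -e {ep}"


for name in EP_BY_PROVIDER:
    print(pingpong_command(name))
```

The quoting matters because an unquoted semicolon would end the shell command after verbs.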

chien-intel commented on July 24, 2024

From your fi_info output and log, I'm guessing you do not have IPoIB set up. fi_pingpong requires either an IPv4 or an IPv6 address. Once that is configured, use verbs;ofi_rxm with -e rdm; that should work for you.
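
One way to check whether the IPoIB interface has an address is to look at the output of ip -o addr show. The sketch below only parses that text. The interface name ib0 and the addresses in the sample are made up for illustration, and addresses_by_interface is a hypothetical helper.

```python
import ipaddress


def addresses_by_interface(ip_o_addr_output):
    """Map interface name -> list of IP addresses from `ip -o addr show` text."""
    result = {}
    for line in ip_o_addr_output.splitlines():
        fields = line.split()
        # Expected shape: "<index>: <ifname> <inet|inet6> <addr>/<prefix> ..."
        if len(fields) < 4 or fields[2] not in ("inet", "inet6"):
            continue
        try:
            addr = ipaddress.ip_interface(fields[3]).ip
        except ValueError:
            continue  # not an address we can parse; skip the line
        result.setdefault(fields[1], []).append(addr)
    return result


# Hypothetical sample; on a real host, capture the output of: ip -o addr show
SAMPLE = """\
1: lo    inet 127.0.0.1/8 scope host lo
3: ib0    inet 192.0.2.10/24 brd 192.0.2.255 scope global ib0
"""

addrs = addresses_by_interface(SAMPLE)
ifname = "ib0"  # assumed IPoIB interface name
if addrs.get(ifname):
    print(ifname, "has", ", ".join(str(a) for a in addrs[ifname]))
else:
    print(ifname, "has no IPv4/IPv6 address; IPoIB may not be configured")
```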
