Environment:
XCP-NG 7.6 updated, fresh install
HP Proliant DL380p Gen8
Mellanox ConnectX-3 Pro card
When I enable SR-IOV on the Mellanox ConnectX-3 Pro card (HP Proliant DL380p Gen8), the XAPI process restarts constantly.
I tested with the stock drivers and with the newest Mellanox drivers, with the standard and the experimental kernel, and with xapi-core and xapi-xe from the updates_testing repository. No change.
When I disable the creation of virtual functions in the driver, everything works correctly again.
The xensource.log looks like this:
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |dbsync (update_env) R:9072bd1e6610|xapi] PCI 0000:03:01.4, Mellanox Technologies, MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] created
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |dbsync (update_env) R:9072bd1e6610|db_write] create_row PCI (OpaqueRef:2558881f-9933-4f4e-81ef-af3cc2ecdbcc) [(_ref,v),(uuid,v),(class_id,v),(class_name,v),(vendor_id,v),(vendor_name,v),(device_id,v),(device_name,v),(host,v),(pci_id,v),(functions,v),(physical_function,v),(dependencies,v),(other_config,v),(subsystem_vendor_id,v),(subsystem_vendor_name,v),(subsystem_device_id,v),(subsystem_device_name,v),(scheduled_to_be_attached_to,v),(driver_name,v)]
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |dbsync (update_env) R:9072bd1e6610|xapi] PCI 0000:03:01.5, Mellanox Technologies, MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] created
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |dbsync (update_env) R:9072bd1e6610|db_write] create_row PCI (OpaqueRef:2cc30358-ed61-4030-93fb-1dcbbf4c5919) [(_ref,v),(uuid,v),(class_id,v),(class_name,v),(vendor_id,v),(vendor_name,v),(device_id,v),(device_name,v),(host,v),(pci_id,v),(functions,v),(physical_function,v),(dependencies,v),(other_config,v),(subsystem_vendor_id,v),(subsystem_vendor_name,v),(subsystem_device_id,v),(subsystem_device_name,v),(scheduled_to_be_attached_to,v),(driver_name,v)]
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |dbsync (update_env) R:9072bd1e6610|xapi] PCI 0000:03:01.6, Mellanox Technologies, MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] created
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |dbsync (update_env) R:9072bd1e6610|redo_log] WriteField(task, OpaqueRef:9072bd1e-6610-4181-8ac5-c7387792280b, progress, 0, 1)
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |dbsync (update_env) R:9072bd1e6610|redo_log] WriteField(task, OpaqueRef:9072bd1e-6610-4181-8ac5-c7387792280b, error_info, (), ('INTERNAL_ERROR' 'Not_found'))
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |dbsync (update_env) R:9072bd1e6610|redo_log] WriteField(task, OpaqueRef:9072bd1e-6610-4181-8ac5-c7387792280b, backtrace, (), (((process"xapi @ Alpha")(filename list.ml)(line 214))((process"xapi @ Alpha")(filename ocaml/xapi/xapi_pci.ml)(line 218))((process"xapi @ Alpha")(filename list.ml)(line 82))((process"xapi @ Alpha")(filename ocaml/xapi/xapi_pci.ml)(line 216))((process"xapi @ Alpha")(filename ocaml/xapi/xapi_pci.ml)(line 227))((process"xapi @ Alpha")(filename ocaml/xapi/dbsync_slave.ml)(line 239))((process"xapi @ Alpha")(filename ocaml/xapi/dbsync_slave.ml)(line 305))((process"xapi @ Alpha")(filename ocaml/xapi/dbsync.ml)(line 63))((process"xapi @ Alpha")(filename ocaml/xapi/server_helpers.ml)(line 80))))
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |dbsync (update_env) R:9072bd1e6610|redo_log] WriteField(task, OpaqueRef:9072bd1e-6610-4181-8ac5-c7387792280b, finished, 19700101T00:00:00Z, 20190330T20:00:24Z)
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |dbsync (update_env) R:9072bd1e6610|redo_log] WriteField(task, OpaqueRef:9072bd1e-6610-4181-8ac5-c7387792280b, status, pending, failure)
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |dbsync (update_env) R:9072bd1e6610|db_write] delete_row task (OpaqueRef:9072bd1e-6610-4181-8ac5-c7387792280b)
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] dbsync (update_env) R:9072bd1e6610 failed with exception Not_found
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] Raised Not_found
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 1/15 xapi @ Alpha Raised at file list.ml, line 214
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 2/15 xapi @ Alpha Called from file ocaml/xapi/xapi_pci.ml, line 218
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 3/15 xapi @ Alpha Called from file list.ml, line 82
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 4/15 xapi @ Alpha Called from file ocaml/xapi/xapi_pci.ml, line 216
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 5/15 xapi @ Alpha Called from file ocaml/xapi/xapi_pci.ml, line 227
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 6/15 xapi @ Alpha Called from file ocaml/xapi/dbsync_slave.ml, line 239
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 7/15 xapi @ Alpha Called from file ocaml/xapi/dbsync_slave.ml, line 305
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 8/15 xapi @ Alpha Called from file ocaml/xapi/dbsync.ml, line 63
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 9/15 xapi @ Alpha Called from file ocaml/xapi/server_helpers.ml, line 80
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 10/15 xapi @ Alpha Called from file hashtbl.ml, line 194
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 11/15 xapi @ Alpha Called from file lib/debug.ml, line 92
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 12/15 xapi @ Alpha Called from file ocaml/xapi/server_helpers.ml, line 99
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 13/15 xapi @ Alpha Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 14/15 xapi @ Alpha Called from file hashtbl.ml, line 194
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace] 15/15 xapi @ Alpha Called from file lib/debug.ml, line 92
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |starting up database engine D:7d8677d911f0|backtrace]
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 |starting up database engine D:7d8677d911f0|dbsync] dbsync caught an exception: INTERNAL_ERROR: [ Not_found ]
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] starting up database engine D:7d8677d911f0 failed with exception Not_found
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] Raised Not_found
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 1/19 xapi @ Alpha Raised at file lib/debug.ml, line 240
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 2/19 xapi @ Alpha Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 3/19 xapi @ Alpha Called from file lib/backtrace.ml, line 114
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 4/19 xapi @ Alpha Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 35
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 5/19 xapi @ Alpha Called from file ocaml/xapi/dbsync.ml, line 75
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 6/19 xapi @ Alpha Called from file hashtbl.ml, line 194
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 7/19 xapi @ Alpha Called from file lib/debug.ml, line 92
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 8/19 xapi @ Alpha Called from file ocaml/xapi/dbsync.ml, line 80
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 9/19 xapi @ Alpha Called from file ocaml/xapi/xapi.ml, line 102
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 10/19 xapi @ Alpha Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 11/19 xapi @ Alpha Called from file lib/backtrace.ml, line 114
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 12/19 xapi @ Alpha Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 35
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 13/19 xapi @ Alpha Called from file ocaml/xapi/server_helpers.ml, line 80
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 14/19 xapi @ Alpha Called from file string.ml, line 118
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 15/19 xapi @ Alpha Called from file sexp.ml, line 112
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 16/19 xapi @ Alpha Called from file ocaml/xapi/server_helpers.ml, line 99
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 17/19 xapi @ Alpha Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 18/19 xapi @ Alpha Called from file hashtbl.ml, line 194
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace] 19/19 xapi @ Alpha Called from file lib/debug.ml, line 92
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 |server_init D:319630a43329|backtrace]
Mar 30 21:00:24 Alpha xapi: [ warn|Alpha|0 |server_init D:319630a43329|startup] task [starting up database engine] exception: Not_found
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 ||backtrace] server_init D:319630a43329 failed with exception Not_found
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 ||backtrace] Raised Not_found
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 ||backtrace] 1/1 xapi @ Alpha Raised at file (Thread 0 has no backtrace table. Was with_backtraces called?, line 0
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 ||backtrace]
Mar 30 21:00:24 Alpha xapi: [debug|Alpha|0 ||xapi] xapi top-level caught exception: INTERNAL_ERROR: [ Not_found ]
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 ||backtrace] Raised Not_found
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 ||backtrace] 1/1 xapi @ Alpha Raised at file (Thread 0 has no backtrace table. Was with_backtraces called?, line 0
Mar 30 21:00:24 Alpha xapi: [error|Alpha|0 ||backtrace]
It looks like the problem is in xapi_pci.ml, in the code that updates the PCI dependencies:
let update_dependencies pfs =
  let rec update = function
    | [] -> ()
    | (pref, prec, pci, _) :: remaining ->
      let dependencies = List.map
          (fun address ->
             let r, _, _, _ = List.find (fun (_, rc, _, _) -> rc.Db_actions.pCI_pci_id = address) pfs
             in r)
          pci.related
      in
      Db.PCI.set_dependencies ~__context ~self:pref ~value:dependencies;
      update remaining
  in
  update pfs
in
update_dependencies pfs;
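To illustrate what I suspect is going on, here is a minimal standalone sketch. It is not xapi code: the record types, the string refs, and the function names `strict_dependencies` and `tolerant_dependencies` are made up for the example, and the addresses are sample data, not my real device list. The backtrace points at `List.find` (list.ml, line 214) called from xapi_pci.ml line 218, and `List.find` raises `Not_found` when no element matches. So my guess is that one of the virtual functions lists a related address that is not present in `pfs`. The second function shows one possible way to tolerate that; whether silently skipping the entry is the right behaviour for xapi is an open question.

```ocaml
(* Hypothetical stand-ins for the xapi types used in the excerpt. *)
type pci_row = { pci_id : string }        (* mirrors rc.Db_actions.pCI_pci_id *)
type pci_dev = { related : string list }  (* mirrors pci.related *)

(* Same tuple shape as in update_dependencies: (ref, db record, device, extra). *)
type entry = string * pci_row * pci_dev * unit

(* Strict lookup, as in the excerpt: List.find raises Not_found on a miss. *)
let strict_dependencies (pfs : entry list) (pci : pci_dev) : string list =
  List.map
    (fun address ->
       let r, _, _, _ =
         List.find (fun (_, rc, _, _) -> rc.pci_id = address) pfs
       in
       r)
    pci.related

(* Tolerant lookup: related addresses missing from pfs are skipped. *)
let tolerant_dependencies (pfs : entry list) (pci : pci_dev) : string list =
  List.fold_right
    (fun address acc ->
       match List.find (fun (_, rc, _, _) -> rc.pci_id = address) pfs with
       | (r, _, _, _) -> r :: acc
       | exception Not_found -> acc)
    pci.related []

(* Sample data: the device refers to 0000:03:01.7, which is not in pfs. *)
let pfs : entry list =
  [ ("ref-pf",  { pci_id = "0000:03:00.0" }, { related = [] }, ());
    ("ref-vf4", { pci_id = "0000:03:01.4" }, { related = [] }, ()) ]

let vf : pci_dev = { related = [ "0000:03:00.0"; "0000:03:01.7" ] }

let () =
  (* The strict version fails the same way the log does. *)
  (match strict_dependencies pfs vf with
   | _ -> print_endline "strict: ok"
   | exception Not_found -> print_endline "strict: Not_found");
  (* The tolerant version keeps only the dependencies it could resolve. *)
  List.iter print_endline (tolerant_dependencies pfs vf)
```

If this guess is right, it would also explain why everything works once virtual function creation is disabled in the driver: without VFs there are no extra related addresses to look up.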
Any pointers on how to solve this?