Same speed on CPU and GPU on i7 7700 and 2080TI (deepface issue, closed)

Lingyun97 commented on May 29, 2024

Same speed on CPU and GPU on i7 7700 and 2080TI

from deepface.

Comments (10)

serengil commented on May 29, 2024

deepface does nothing with the GPU itself. Would you mind raising this issue in the TensorFlow repo?
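
Since device placement is left to TensorFlow, a first check is whether TensorFlow sees the GPU at all. The following is only a minimal sketch, assuming TensorFlow 2.x; `tf_gpu_report` is a hypothetical helper name, and the build-info keys may be missing on CPU-only builds.

```python
def tf_gpu_report():
    """Collect basic facts about the TensorFlow install and its GPU visibility."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return {"tensorflow": None}

    # get_build_info() is available in recent TF 2.x releases; guard for older ones.
    get_info = getattr(tf.sysconfig, "get_build_info", None)
    build = dict(get_info()) if get_info else {}

    return {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # An empty list means TensorFlow will run everything on the CPU.
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
        "cuda_version": build.get("cuda_version"),
        "cudnn_version": build.get("cudnn_version"),
    }


if __name__ == "__main__":
    for key, value in tf_gpu_report().items():
        print(f"{key}: {value}")
```

If `gpus` comes back empty, the same speed on CPU and GPU is expected, because everything runs on the CPU.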


Lingyun97 commented on May 29, 2024

Thanks for the reply. I also found that yolov8.pt is not a TensorFlow model.


serengil commented on May 29, 2024

there is an open PR about this - #1145

Once this is merged, then yolo can be used with GPU


Lingyun97 commented on May 29, 2024

> there is an open PR about this - #1145
>
> Once this is merged, then yolo can be used with GPU

Thanks. I am going to try it.
The use of tensorflow-gpu still confuses me. Do you know why computing the embeddings ("embedding = model.find_embeddings(img)") takes the same time, about 0.2 s, on GPU and CPU? It uses GhostFaceNet, which I thought was based on TensorFlow, so the GPU should speed this step up.
Thanks again for your work and help.
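
When comparing the 0.2 s figure across devices, it can help to time repeated calls after a few warm-up runs, since the first calls on a GPU often include one-time setup cost. The following is a minimal sketch using only the standard library; `median_call_seconds` and `fake_embed` are hypothetical names, and `fake_embed` merely stands in for the real embedding call.

```python
import time
from statistics import median


def median_call_seconds(fn, arg, warmup=3, repeats=10):
    """Return the median wall-clock seconds per call of fn(arg), after warm-up."""
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        fn(arg)

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        samples.append(time.perf_counter() - start)
    return median(samples)


def fake_embed(img):
    # Hypothetical stand-in for the embedding call; replace with the real one.
    return [sum(img) / len(img)]


if __name__ == "__main__":
    seconds = median_call_seconds(fake_embed, [0.5] * 10000)
    print(f"median per call: {seconds:.6f} s")
```

Running the same measurement in the CPU env and the GPU env gives numbers that are easier to compare than a single timed call.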


Lingyun97 commented on May 29, 2024

I guess the GPU problem comes from the input type: the image is a NumPy ndarray, which lives on the CPU and needs to be converted to a tensor for the GPU.


serengil commented on May 29, 2024

Is numpy-to-tensor conversion required for yolo?


Lingyun97 commented on May 29, 2024

> numpy to tensor is required for yolo?

I think the input format is the problem: using the GPU requires the tensor format.

I am quite a rookie at coding, so I am just going to try it.
I found that the GPU gives a 10x speedup for retinaface, which uses the tensor format.
I just installed PyTorch and called yolov8.to(device="cuda:0"), and got an error:

Call arguments received by layer "conv2d" " f"(type Conv2D):
• inputs=tf.Tensor(shape=(1, 112, 112, 3), dtype=float32)

np.ndarray lives on the CPU, which may be why ghostfacenet runs at the same speed under tensorflow-gpu: without the right input format it may still be using the CPU.


serengil commented on May 29, 2024

Unfortunately, I do not have a GPU and cannot run tests :/


Lingyun97 commented on May 29, 2024

> numpy to tensor is required for yolo?

Not required for yolo, same as fastmtcnn.
But I am hitting a problem in embedding_obj = representation.represent() when using ghostfacenet and am still trying to figure it out. I don't think it is a detector problem, because fastmtcnn produces the same error. Strangely, the PyCharm env has cuDNN 8.1.0.77, but the log says 8.0.5 is in use. I don't remember installing any 8.0.5; the only other version I have used is 8.9.2, in the base env:

2024-04-09 10:08:59.696442: E tensorflow/stream_executor/cuda/cuda_dnn.cc:377] Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2024-04-09 10:08:59.697755: W tensorflow/core/framework/op_kernel.cc:1780] OP_REQUIRES failed at conv_ops.cc:1134 : UNIMPLEMENTED: DNN library is not found.
Traceback (most recent call last):
  File "D:\Lingyun svn\trunk\facetrack\DeepFace_mod.py", line 582, in <module>
    main()
  File "D:\Lingyun svn\trunk\facetrack\DeepFace_mod.py", line 578, in main
    test_vid(vid,False,detector,model,dbase)
  File "D:\Lingyun svn\trunk\facetrack\DeepFace_mod.py", line 537, in test_vid
    img = perform_facial_recognition(
  File "D:\Lingyun svn\trunk\facetrack\DeepFace_mod.py", line 133, in perform_facial_recognition
    target_label, target_img = search_identity(
  File "D:\Lingyun svn\trunk\facetrack\DeepFace_mod.py", line 435, in search_identity
    dfs = DeepFace.find(
  File "g:\deepface_new0407\deepface\deepface\DeepFace.py", line 309, in find
    return recognition.find(
  File "g:\deepface_new0407\deepface\deepface\modules\recognition.py", line 180, in find
    representations += __find_bulk_embeddings(
  File "g:\deepface_new0407\deepface\deepface\modules\recognition.py", line 389, in __find_bulk_embeddings
    embedding_obj = representation.represent(
  File "g:\deepface_new0407\deepface\deepface\modules\representation.py", line 107, in represent
    embedding = model.forward(img)
  File "g:\deepface_new0407\deepface\deepface\models\FacialRecognition.py", line 29, in forward
    return self.model(img, training=False).numpy()[0].tolist()
  File "G:\conda_env\deepface_tf_torch_gpu\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "G:\conda_env\deepface_tf_torch_gpu\lib\site-packages\tensorflow\python\framework\ops.py", line 7209, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.UnimplementedError: Exception encountered when calling layer "conv2d" " f"(type Conv2D).

{{function_node _wrapped__Conv2D_device/job:localhost/replica:0/task:0/device:GPU:0}} DNN library is not found. [Op:Conv2D]

Call arguments received by layer "conv2d" " f"(type Conv2D):
• inputs=tf.Tensor(shape=(1, 112, 112, 3), dtype=float32)
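
The first line of the log states the rule that failed: the cuDNN library loaded at runtime must have the same major version as the one TensorFlow was compiled with, and an equal or higher minor version. As a small sketch of that rule only (the helper names are hypothetical, and this does not query any installed library):

```python
def parse_version(text):
    """Turn a version string such as '8.1.0.77' into a tuple of its first three numbers."""
    return tuple(int(part) for part in text.split(".")[:3])


def cudnn_runtime_ok(loaded, compiled):
    """Apply the rule from the log: same major version, equal or higher minor version."""
    loaded_v = parse_version(loaded)
    compiled_v = parse_version(compiled)
    return loaded_v[0] == compiled_v[0] and loaded_v[1] >= compiled_v[1]


if __name__ == "__main__":
    # Versions mentioned in this thread, checked against the compiled-with version 8.1.0.
    for loaded in ("8.0.5", "8.1.0.77", "8.9.2"):
        print(loaded, cudnn_runtime_ok(loaded, "8.1.0"))
```

By this rule the loaded 8.0.5 fails against 8.1.0, while 8.1.0.77 and 8.9.2 would pass, so the question becomes which cuDNN copy is found first at runtime.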


Lingyun97 commented on May 29, 2024

> unfortunately, i do not have a gpu and cannot make tests :/

My bad, it should not be about tensors. The error came from the cuDNN versions pulled in by TensorFlow and PyTorch. Installing cudatoolkit 11.3 and cuDNN 8.1 before them solves it.
Strangely, GhostFaceNet v1 has the same inference time on the GPU (I made sure TF runs on the GPU). I tried its ONNX model, which actually ran 10x faster on the GPU: GhostFaceNet v1 takes 0.2 s per inference in deepface, while ONNX takes 0.07 s and onnx-gpu takes 0.007 s.
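
For anyone repeating the ONNX comparison, the sketch below shows one way to prefer the CUDA execution provider in ONNX Runtime and fall back to the CPU. It assumes the `onnxruntime` (or `onnxruntime-gpu`) package is installed; `pick_providers`, `load_session`, and the model path are hypothetical and not part of deepface.

```python
def pick_providers(available):
    """Order the usable execution providers, preferring CUDA over CPU."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [name for name in preferred if name in available]


def load_session(model_path):
    """Create an ONNX Runtime session on the GPU when available, else on the CPU."""
    import onnxruntime as ort  # imported here so pick_providers works without it

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


if __name__ == "__main__":
    # Pure-Python demonstration; no model file or GPU is needed for this part.
    print(pick_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
    print(pick_providers(["CPUExecutionProvider"]))
    # session = load_session("ghostfacenet_v1.onnx")  # hypothetical file name
```

Calling `session.get_providers()` on the result shows which provider the session actually uses, which helps confirm that a GPU timing was really measured on the GPU.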
