Comments (7)
It looks like it works well once I add your code. I changed my code by adding:
r, rendered_frame, poses, bboxes = process_image_img2pose(frame, model, renderer, transform, threshold, threed_points)
for i, pose in enumerate(poses):
    bbox = bboxes[i]
    pitch, yaw, roll, _, _, scale = pose
    tdx = bbox[0] + ((bbox[2] - bbox[0]) / 2)
    tdy = bbox[1] + ((bbox[3] - bbox[1]) / 2)
    rendered_frame = draw_axis(np.asarray(rendered_frame), yaw, pitch, roll, tdx=tdx, tdy=tdy, size=1000 / scale)
from img2pose.
Yes, the order is pitch, yaw, roll, horizontal translation, vertical translation, and scale.
From your pose example it looks like you are doing it right, but double-check that you are passing the pose mean and standard deviation when creating the model, or setting them afterwards.
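To keep the six values from getting mixed up, the order described above can be captured in a small named tuple. This is only an illustrative sketch: `Pose6DoF` and `unpack_pose` are hypothetical names, not part of img2pose, and the example values are made up.

```python
from typing import NamedTuple, Sequence


class Pose6DoF(NamedTuple):
    """Hypothetical container mirroring the order given in this thread."""
    pitch: float
    yaw: float
    roll: float
    tx: float     # horizontal translation
    ty: float     # vertical translation
    scale: float


def unpack_pose(pose: Sequence[float]) -> Pose6DoF:
    """Turn a flat 6-element pose into named fields."""
    if len(pose) != 6:
        raise ValueError(f"expected 6 values, got {len(pose)}")
    return Pose6DoF(*(float(v) for v in pose))


# Made-up values, only to show the field order.
example = unpack_pose([0.1, -0.2, 0.05, 0.3, -0.1, 8.0])
print(example.yaw, example.scale)
```

Accessing `example.yaw` instead of `pose[1]` makes it harder to pass pitch and yaw in the wrong order to a drawing function.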
from img2pose.
Thanks for the quick response.
This is how I load and prepare the model:
def load():
    renderer = Renderer(
        vertices_path="/app/detectors/img2pose/pose_references/vertices_trans.npy",
        triangles_path="/app/detectors/img2pose/pose_references/triangles.npy"
    )
    threed_points = np.load('/app/detectors/img2pose/pose_references/reference_3d_5_points_trans.npy')
    transform = transforms.Compose([transforms.ToTensor()])
    DEPTH = 18
    MAX_SIZE = 1400
    MIN_SIZE = 600
    POSE_MEAN = "/app/detectors/img2pose/models/WIDER_train_pose_mean_v1.npy"
    POSE_STDDEV = "/app/detectors/img2pose/models/WIDER_train_pose_stddev_v1.npy"
    MODEL_PATH = "/app/detectors/img2pose/models/img2pose_v1.pth"
    pose_mean = np.load(POSE_MEAN)
    pose_stddev = np.load(POSE_STDDEV)
    img2pose_model = img2poseModel(
        DEPTH, MIN_SIZE, MAX_SIZE,
        pose_mean=pose_mean, pose_stddev=pose_stddev,
        threed_68_points=threed_points,
    )
    load_model(img2pose_model.fpn_model, MODEL_PATH, cpu_mode=str(img2pose_model.device) == "cpu", model_only=True)
    img2pose_model.evaluate()
    threshold = 0.9
    return renderer, img2pose_model, transform, threshold, threed_points
This is how I process the current frame:
def process_image_img2pose(frame, img2pose_model, renderer, transform, threshold, threed_points):
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    img = Image.fromarray(frame)
    (w, h) = img.size
    image_intrinsics = np.array([[w + h, 0, w // 2], [0, w + h, h // 2], [0, 0, 1]])
    res = img2pose_model.predict([transform(img)])[0]
    all_bboxes = res["boxes"].cpu().numpy().astype('float')
    poses = []
    bboxes = []
    for i in range(len(all_bboxes)):
        if res["scores"][i] > threshold:
            bbox = all_bboxes[i]
            pose_pred = res["dofs"].cpu().numpy()[i].astype('float')
            pose_pred = pose_pred.squeeze()
            poses.append(pose_pred)
            bboxes.append(bbox)
    aligned_faces = align_faces_lm(threed_points, img, poses)
    if not aligned_faces:
        aligned_faces = []
    return aligned_faces, render_plot(img.copy(), poses, bboxes, renderer), poses
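The per-detection loop above can also be written as a boolean mask. A minimal sketch, assuming the boxes, scores, and poses have already been converted to NumPy arrays (the `.cpu().numpy()` step in the snippet above); `filter_detections` is a hypothetical helper, not an img2pose function, and the sample detections are made up.

```python
import numpy as np


def filter_detections(boxes, scores, dofs, threshold=0.9):
    """Keep only detections whose score exceeds the threshold.

    boxes: (N, 4) array-like, scores: (N,) array-like, dofs: (N, 6) array-like.
    Returns the filtered boxes and poses as NumPy arrays.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    dofs = np.asarray(dofs, dtype=float)
    keep = scores > threshold  # boolean mask, one entry per detection
    return boxes[keep], dofs[keep]


# Made-up detections: only the first one passes the 0.9 threshold.
boxes = [[10, 20, 110, 140], [5, 5, 50, 60]]
scores = [0.97, 0.42]
dofs = [[0.1, -0.2, 0.0, 0.3, -0.1, 8.0], [0.0, 0.0, 0.0, 0.0, 0.0, 5.0]]
kept_boxes, kept_poses = filter_detections(boxes, scores, dofs)
print(kept_boxes.shape, kept_poses.shape)
```

Because boxes and poses are filtered with the same mask, their indices stay aligned, which is what the `bboxes[i]` lookup in the drawing loop relies on.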
Video processing
while capture.isOpened():
    detections = []
    ret, frame = capture.read()
    if not ret:
        break
    r, rendered_frame, poses = process_image_img2pose(frame, model, renderer, transform, threshold, threed_points)
    for i, pose in enumerate(poses):
        print(pose[0:3])
        draw_axis(rendered_frame, pose[0:3], np.mean(r[i], axis=0))
    for lms in r:
        for lm in lms:
            point = (int(lm[0]), int(lm[1]))
            cv2.circle(rendered_frame, point, 3, (255, 100, 100), 1, cv2.LINE_AA)
    result = {
        'lms': [lms.tolist() for lms in r],
        'pose': [p.tolist() for p in poses]
    }
    json_result[frame_id] = result
    sink.write(rendered_frame)
    frame_id += 1
with open(json_out_path, 'w') as json_file:
    json.dump(json_result, json_file, indent=2)
I am trying to process some videos quickly for a qualitative assessment on my data. Currently I can produce pretty good videos containing the 5 points and the face mask, but I would like to have the axes as well.
I will try to use img2pose as the base detector for auto-annotation with multiple detectors, since it is quite robust according to my experiments so far.
from img2pose.
No problem!
Everything looks good in the snippets you sent, and I believe the draw axis code will work as well.
Just one thing: if you care about the bbox at all, pass the 68-point 3D reference instead of the 5-point one in "threed_68_points=threed_points,", as the bbox will then capture the face better.
In early experiments, I used the following code to draw the axes:
from math import cos, sin

import cv2

def draw_axis(img, yaw, pitch, roll, tdx=None, tdy=None, size=50):
    yaw = -yaw
    # Fall back to the image centre when no origin is given
    if tdx is None or tdy is None:
        height, width = img.shape[:2]
        tdx = width / 2
        tdy = height / 2
    # X-Axis pointing to right drawn in red
    x1 = size * (cos(yaw) * cos(roll)) + tdx
    y1 = size * (cos(pitch) * sin(roll) + cos(roll) * sin(pitch) * sin(yaw)) + tdy
    # Y-Axis | drawn in green
    x2 = size * (-cos(yaw) * sin(roll)) + tdx
    y2 = size * (cos(pitch) * cos(roll) - sin(pitch) * sin(yaw) * sin(roll)) + tdy
    # Z-Axis (out of the screen) drawn in blue
    x3 = size * (sin(yaw)) + tdx
    y3 = size * (-cos(yaw) * sin(pitch)) + tdy
    # Colours are given in OpenCV's BGR order
    cv2.line(img, (int(tdx), int(tdy)), (int(x1), int(y1)), (0, 0, 255), 3)
    cv2.line(img, (int(tdx), int(tdy)), (int(x2), int(y2)), (0, 255, 0), 3)
    cv2.line(img, (int(tdx), int(tdy)), (int(x3), int(y3)), (255, 0, 0), 2)
    return img
Called like this:
pitch, yaw, roll, _, _, scale = pose
tdx = bbox[0] + ((bbox[2] - bbox[0]) / 2)
tdy = bbox[1] + ((bbox[3] - bbox[1]) / 2)
res_img = draw_axis(np.asarray(img), yaw, pitch, roll, tdx=tdx, tdy=tdy, size=1000 / scale)
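To sanity-check the axis directions without drawing anything, the end-point formulas from `draw_axis` can be evaluated on their own. The sketch below copies those formulas into a pure function; `axis_endpoints` and `bbox_center` are hypothetical helper names, and the box and `size=50` are made-up test values.

```python
from math import cos, sin


def axis_endpoints(yaw, pitch, roll, tdx, tdy, size):
    """Return the 2D end points of the X, Y and Z axes, as in draw_axis."""
    yaw = -yaw  # same sign flip as in draw_axis
    x_axis = (size * (cos(yaw) * cos(roll)) + tdx,
              size * (cos(pitch) * sin(roll) + cos(roll) * sin(pitch) * sin(yaw)) + tdy)
    y_axis = (size * (-cos(yaw) * sin(roll)) + tdx,
              size * (cos(pitch) * cos(roll) - sin(pitch) * sin(yaw) * sin(roll)) + tdy)
    z_axis = (size * sin(yaw) + tdx,
              size * (-cos(yaw) * sin(pitch)) + tdy)
    return x_axis, y_axis, z_axis


def bbox_center(bbox):
    """Centre of an [x1, y1, x2, y2] box, computed as in the call above."""
    return (bbox[0] + (bbox[2] - bbox[0]) / 2,
            bbox[1] + (bbox[3] - bbox[1]) / 2)


# All angles zero, origin at the centre of a made-up box.
tdx, tdy = bbox_center([100, 50, 300, 250])
x_axis, y_axis, z_axis = axis_endpoints(0.0, 0.0, 0.0, tdx, tdy, size=50)
print(x_axis, y_axis, z_axis)
```

With all angles at zero, the X axis ends 50 px to the right of the centre, the Y axis ends 50 px below it (image y grows downward), and the Z axis collapses onto the centre because it points straight out of the screen.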
from img2pose.
Thank you very much, I will try your code and post the results.
from img2pose.
Regarding the bounding box: I need the 5-point format and conversion to WIDER FACE so I can retrain some models.
from img2pose.
Regarding the bounding box: I need the 5-point format and conversion to WIDER FACE so I can retrain some models.
Yes, you can still use the 5 points for that, but change this part so that the output bbox covers more of the face:
threed_68_points = np.load('/app/detectors/img2pose/pose_references/reference_3d_68_points_trans.npy')
img2pose_model = img2poseModel(
    DEPTH, MIN_SIZE, MAX_SIZE,
    pose_mean=pose_mean, pose_stddev=pose_stddev,
    threed_68_points=threed_68_points,
)
Then you can continue to pass the 5-point version to the alignment call:
aligned_faces = align_faces_lm(threed_points, img, poses)
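For the WIDER FACE conversion mentioned in this thread, a rough sketch of turning `[x1, y1, x2, y2]` boxes into annotation text might look like the following. It assumes the layout of the WIDER FACE ground-truth text files (file name, number of boxes, then one `x1 y1 w h blur expression illumination invalid occlusion pose` line per face) and fills the six attribute flags with zeros because the detector output does not provide them. `to_wider_line` and `wider_entry` are hypothetical helpers, and the file name and box are made up.

```python
def to_wider_line(bbox):
    """Convert an [x1, y1, x2, y2] box to 'x1 y1 w h' plus six zeroed flags."""
    x1, y1, x2, y2 = (float(v) for v in bbox[:4])
    x, y = int(round(x1)), int(round(y1))
    w, h = int(round(x2 - x1)), int(round(y2 - y1))
    # Assumption: attribute flags are unknown here, so they are written as 0.
    return f"{x} {y} {w} {h} 0 0 0 0 0 0"


def wider_entry(filename, bboxes):
    """Build one annotation entry: file name, box count, one line per box."""
    lines = [filename, str(len(bboxes))]
    lines.extend(to_wider_line(b) for b in bboxes)
    return "\n".join(lines)


# Made-up frame name and detection.
entry = wider_entry("frames/000001.jpg", [[100, 50, 300, 250]])
print(entry)
```

Check the exact layout expected by the training code you plan to use before relying on this, since some repositories use their own variants of the annotation file, for example with landmark coordinates appended.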
from img2pose.