Code implementation of advanced human pose estimation using MediaPipe, OpenCV and Matplotlib
Human pose estimation is a computer vision technique that turns visual data into actionable information about human movement. Using machine learning models such as MediaPipe's BlazePose together with libraries such as OpenCV, developers can track body keypoints accurately. In this tutorial, we explore how these components fit together, showing how a Python-based framework can enable pose detection in fields ranging from sports analysis to healthcare monitoring and interactive applications.
First, we install the required libraries:
!pip install mediapipe opencv-python-headless matplotlib
Then we import the libraries needed for the implementation:
import cv2
import mediapipe as mp
import matplotlib.pyplot as plt
import numpy as np
We initialize the MediaPipe Pose model in static image mode with segmentation enabled, a model complexity of 1, and a minimum detection confidence of 0.5. We also set up the utilities for drawing landmarks and applying drawing styles.
mp_pose = mp.solutions.pose
mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
pose = mp_pose.Pose(
    static_image_mode=True,
    model_complexity=1,
    enable_segmentation=True,
    min_detection_confidence=0.5
)
Here we define the detect_pose function, which reads an image, processes it with MediaPipe to detect human pose landmarks, and returns the annotated image along with the detected landmarks. If landmarks are found, they are drawn using the default style.
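def detect_pose(image_path):
    # Read the image from disk (OpenCV loads it in BGR order)
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    # MediaPipe expects RGB input
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = pose.process(image_rgb)
    # Draw on a copy so the converted input stays untouched
    annotated_image = image_rgb.copy()
    if results.pose_landmarks:
        mp_drawing.draw_landmarks(
            annotated_image,
            results.pose_landmarks,
            mp_pose.POSE_CONNECTIONS,
            landmark_drawing_spec=mp_drawing_styles.get_default_pose_landmarks_style()
        )
    return annotated_image, results.pose_landmarks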
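The model is created with enable_segmentation=True, yet detect_pose only returns the landmarks. As a minimal sketch of what the mask could be used for, assume you also keep results.segmentation_mask, which MediaPipe provides as an H x W float array when segmentation is enabled (and as None when no person is found). The helper apply_segmentation_mask below is hypothetical and not part of MediaPipe; the threshold and background color are illustrative choices, and the 2 x 2 arrays only stand in for a real image and mask.

```python
import numpy as np

def apply_segmentation_mask(image_rgb, mask, threshold=0.5, bg_color=(192, 192, 192)):
    """Keep pixels where mask > threshold and paint the rest with bg_color.

    Hypothetical helper for illustration; threshold and bg_color are assumptions.
    """
    # Build a flat-colored background with the same shape and dtype as the image
    background = np.zeros_like(image_rgb)
    background[:] = bg_color
    # Repeat the H x W mask across three channels so it matches the image shape
    keep = np.stack((mask,) * 3, axis=-1) > threshold
    return np.where(keep, image_rgb, background)

# Synthetic 2 x 2 data standing in for a real image and segmentation mask
demo_image = np.full((2, 2, 3), 255, dtype=np.uint8)
demo_mask = np.array([[0.9, 0.1], [0.2, 0.8]], dtype=np.float32)
demo_out = apply_segmentation_mask(demo_image, demo_mask)
```

With a real image, one would pass the RGB image and results.segmentation_mask after checking that the mask is not None.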
We define a visualization function that uses Matplotlib to display the original image and the pose-annotated image side by side. The extract_keypoints function converts the detected pose landmarks into a dictionary of named keypoints with x, y, and z coordinates and visibility scores.
def visualize_pose(original_image, annotated_image):
    plt.figure(figsize=(16, 8))
    # Left panel: original image (converted from BGR to RGB for Matplotlib)
    plt.subplot(1, 2, 1)
    plt.title('Original Image')
    plt.imshow(cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    # Right panel: image with the pose landmarks drawn on it
    plt.subplot(1, 2, 2)
    plt.title('Pose Estimation')
    plt.imshow(annotated_image)
    plt.axis('off')
    plt.tight_layout()
    plt.show()
def extract_keypoints(landmarks):
    if landmarks:
        keypoints = {}
        for idx, landmark in enumerate(landmarks.landmark):
            # Map each landmark index to its name, e.g. NOSE or LEFT_ELBOW
            keypoints[mp_pose.PoseLandmark(idx).name] = {
                'x': landmark.x,
                'y': landmark.y,
                'z': landmark.z,
                'visibility': landmark.visibility
            }
        return keypoints
    return None
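The x and y values in this dictionary are normalized by the image width and height, so they usually need scaling before they can be used as pixel positions. The sketch below shows one way to do that, assuming a keypoint dictionary shaped like the one extract_keypoints returns. The helper to_pixel_coords, the min_visibility cutoff of 0.5, and the sample values are all hypothetical and chosen only for illustration.

```python
def to_pixel_coords(keypoints, width, height, min_visibility=0.5):
    """Convert normalized keypoints to integer pixel coordinates.

    Hypothetical helper; keypoints below min_visibility are skipped.
    """
    pixels = {}
    for name, kp in keypoints.items():
        if kp['visibility'] < min_visibility:
            continue
        # Normalized values can fall outside [0, 1] when a body part is
        # off-frame, so clamp the result to the image bounds
        px = min(max(int(kp['x'] * width), 0), width - 1)
        py = min(max(int(kp['y'] * height), 0), height - 1)
        pixels[name] = (px, py)
    return pixels

# Made-up sample values in the same shape as the extract_keypoints output
sample_keypoints = {
    'NOSE': {'x': 0.5, 'y': 0.25, 'z': -0.1, 'visibility': 0.99},
    'LEFT_ANKLE': {'x': 0.4, 'y': 1.2, 'z': 0.0, 'visibility': 0.2},
}
pixel_points = to_pixel_coords(sample_keypoints, width=640, height=480)
print(pixel_points)
```

For a real image loaded with OpenCV, the height and width can be taken from image.shape[:2].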
Finally, we load the image from the specified path, detect and visualize the human pose landmarks using MediaPipe, and then extract and print the coordinates and visibility of each detected keypoint.
image_path = "/content/Screenshot 2025-03-26 at 12.56.05 AM.png"
original_image = cv2.imread(image_path)
annotated_image, landmarks = detect_pose(image_path)
visualize_pose(original_image, annotated_image)
keypoints = extract_keypoints(landmarks)
if keypoints:
    print("Detected Keypoints:")
    for name, details in keypoints.items():
        print(f"{name}: {details}")
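Once the keypoints are in a dictionary, they can feed simple measurements of the kind used in sports analysis. The sketch below computes the angle at a joint from three named keypoints, assuming the dictionary layout printed above. The helper joint_angle and the sample coordinates are hypothetical; the width and height parameters are an added assumption so that angles are measured in pixel space rather than in normalized coordinates, which would distort them on non-square images.

```python
import math

def joint_angle(keypoints, first, middle, last, width=1.0, height=1.0):
    """Return the 2D angle in degrees at `middle`, formed with `first` and `last`.

    Hypothetical helper; only x and y are used, z is ignored.
    """
    def point(name):
        kp = keypoints[name]
        return kp['x'] * width, kp['y'] * height

    ax, ay = point(first)
    bx, by = point(middle)
    cx, cy = point(last)
    # Direction of each limb segment, measured from the middle joint
    angle = math.atan2(cy - by, cx - bx) - math.atan2(ay - by, ax - bx)
    degrees = abs(math.degrees(angle))
    # Fold the result into the range [0, 180]
    return 360.0 - degrees if degrees > 180.0 else degrees

# Made-up coordinates describing an arm bent at a right angle
sample_arm = {
    'LEFT_SHOULDER': {'x': 0.5, 'y': 0.2, 'z': 0.0, 'visibility': 0.9},
    'LEFT_ELBOW': {'x': 0.5, 'y': 0.4, 'z': 0.0, 'visibility': 0.9},
    'LEFT_WRIST': {'x': 0.7, 'y': 0.4, 'z': 0.0, 'visibility': 0.9},
}
elbow_angle = joint_angle(sample_arm, 'LEFT_SHOULDER', 'LEFT_ELBOW', 'LEFT_WRIST')
print(f"Elbow angle: {elbow_angle:.1f} degrees")
```

Because the depth coordinate is ignored, this is only a rough estimate when the limb points toward or away from the camera.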
In this tutorial, we explored human pose estimation using MediaPipe and OpenCV, demonstrating an end-to-end method for detecting body keypoints. We implemented a pipeline that converts images into detailed skeleton maps, covering the key steps of library installation, pose detection function creation, visualization, and keypoint extraction. Using pretrained machine learning models, we showed how developers can transform raw visual data into meaningful movement insights for applications such as sports analytics and healthcare monitoring.
The post "Code Implementation of Advanced Human Pose Estimation Using MediaPipe, OpenCV and Matplotlib" first appeared on Marktechpost.