
CN111798486B - Multi-view human motion capture method based on human motion prediction - Google Patents

Multi-view human motion capture method based on human motion prediction

Info

Publication number
CN111798486B
CN111798486B CN202010546274.6A CN202010546274A CN111798486B CN 111798486 B CN111798486 B CN 111798486B CN 202010546274 A CN202010546274 A CN 202010546274A CN 111798486 B CN111798486 B CN 111798486B
Authority
CN
China
Prior art keywords
dimensional
human body
human
key point
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010546274.6A
Other languages
Chinese (zh)
Other versions
CN111798486A (en)
Inventor
周晓巍
鲍虎军
方琦
帅青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202010546274.6A
Publication of CN111798486A
Application granted
Publication of CN111798486B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/207 Analysis of motion for motion estimation over a hierarchy of resolutions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-view human motion capture method based on human motion prediction. Three-dimensional human reconstruction is performed on pictures shot synchronously by a plurality of cameras from different views to obtain a three-dimensional skeleton of each human body. For each subsequent frame, a predicted position and a confidence are given for each three-dimensional human key point of the current frame according to the three-dimensional skeleton reconstructed in the previous frame. For key-frame pictures, a human body detector is run to detect a two-dimensional bounding box of each human body. For non-key-frame pictures, the three-dimensional skeleton obtained by motion prediction from the previous frame is projected into the image of each view of the current frame to quickly obtain the two-dimensional bounding box of each human body in every view, which reduces the cost of the human body detector and improves algorithm efficiency. The invention also uses temporal information to calculate the visibility of human key points and improves the accuracy of three-dimensional human skeleton reconstruction based on that visibility.

Description

Multi-view human motion capture method based on human motion prediction
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a multi-view human motion capture method based on human motion prediction.
Background
Multi-view human motion capture refers to recovering the three-dimensional motion of human bodies from multi-view video. Most existing methods comprise two stages, detection and reconstruction. Although they achieve good results on public data sets, their detection and reconstruction modules are separate: the detector cannot benefit from the reconstruction result of the previous frame, and motion prediction is lacking. Furthermore, the visibility of key points is not well exploited.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide a human body detection method based on motion prediction, and to calculate the visibility of each key point by using temporal information, so as to improve the accuracy and efficiency of multi-person motion capture.
The purpose of the invention is realized by the following technical scheme: a multi-view human motion capture method based on human motion prediction comprises the following steps:
(1) inputting pictures synchronously shot by a plurality of cameras at different visual angles, and performing three-dimensional human body reconstruction to obtain a three-dimensional skeleton of each human body as an initial frame result;
(2) motion prediction: for each subsequent frame, maintaining a series of motion-related variables according to the three-dimensional skeleton reconstructed in the previous frame, and giving a predicted position and a confidence for each three-dimensional human key point of the current frame by using a motion prediction method;
(3) for key-frame pictures, running a human body detector to detect a two-dimensional bounding box of each human body; for non-key-frame pictures, projecting the three-dimensional skeleton obtained by motion prediction from the previous frame into the image of each view of the current frame to quickly obtain the two-dimensional bounding box of each human body in every view of the current frame;
(4) cropping each single human body from the image by using its two-dimensional bounding box, inputting the crop into a two-dimensional human key point detector, outputting a heat map for each key point, and obtaining the position and confidence of each two-dimensional human key point from the heat map; calculating the visibility of each key point by using temporal information, and setting the confidence of key points judged to be invisible to zero; and reconstructing the three-dimensional skeleton by triangulation, thereby achieving multi-view human motion capture.
Further, step (1) specifically comprises: performing two-dimensional key point detection on each picture, establishing a matching relation between the two-dimensional key points across views, and reconstructing the three-dimensional human bodies by triangulation to obtain the three-dimensional coordinates of each human key point.
Further, in the step (2), the motion-related variables include positions and velocities of three-dimensional key points of the human body.
Further, in step (2), a Kalman filter is adopted for the motion prediction: the position and velocity of each three-dimensional key point are used as state variables and the position of the three-dimensional key point is used as the observation variable, so as to establish a linear system; the Kalman filter gives a prediction of the three-dimensional key point position that takes the movement velocity into account, and the resulting covariance matrix provides confidence information; the prediction process of the Kalman filter can be expressed as:
$$x_t = F x_{t-1}$$
$$P_t = F P_{t-1} F^{\top} + Q$$
$$F = \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix}$$
wherein $x_{t-1}$ and $x_t$ are the state vectors at times $t-1$ and $t$, respectively, $P_{t-1}$ and $P_t$ are the covariance matrices of $x_{t-1}$ and $x_t$, respectively, $Q$ is the noise covariance matrix, $F$ is the state transition matrix, and $\Delta t$ is the time interval between adjacent frames.
Further, in step (2), the motion prediction adopts a neural network; the network memorizes the reconstruction results over a past period of time, outputs the predicted positions of the three-dimensional key points at the current time according to this temporal memory, and gives a confidence.
Further, in step (3), the key frames are determined as follows: one frame is taken as a key frame at regular intervals, or the interval is adjusted according to the confidence in the prediction process, the density of key frames being increased if the confidence remains low.
Further, in step (3), given the three-dimensional key point positions of each human body estimated in the previous frame, each human bone segment is approximated by a cylinder, and the visibility of each human key point is judged according to the occlusion relation between the cylinders and the front-back order of the human bodies in a given camera coordinate system.
Further, in step (3), the visibility judgment specifically comprises: approximating each human bone by a cylinder with radius r and height h, the center of the cylinder being located at the mean of the three-dimensional positions, reconstructed in the previous frame, of the two adjacent key points of the corresponding bone; in each view, determining a line segment from the camera center to a given key point, calculating whether the line segment intersects each cylinder in three-dimensional space, determining the occlusion relation between the line segment and each cylinder according to the front-back order of the human bodies reconstructed in the previous frame, that is, judging whether a joint of a person behind is occluded by a joint of a person in front, and thereby calculating the visibility of all key points in that view.
The invention has the beneficial effects that: according to the invention, the three-dimensional human body skeleton of the previous frame is projected to the image of each visual angle of the current frame after motion prediction to obtain the two-dimensional human body bounding box, so that the expense of a human body detector is reduced, and the algorithm efficiency is improved. The invention also utilizes the time sequence information to calculate the visibility of key points of the human body, and improves the accuracy of the reconstruction of the three-dimensional skeleton of the human body based on the visibility.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic overall flow chart of an embodiment of the present invention, wherein sub-figure (a) shows the detection result of the two-dimensional human bounding boxes, sub-figure (b) shows the two-dimensional human key point prediction result, and sub-figure (c) shows the reconstruction result of the three-dimensional key points.
FIG. 2 is a schematic diagram of a three-dimensional human key point reconstruction result and its motion prediction result according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a two-dimensional human key point heat map according to an embodiment of the invention.
FIG. 4 is a schematic diagram of the visibility judgment of human key points according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, embodiments are described in detail below with reference to the accompanying drawings.
Numerous specific details are set forth in the following description to provide a thorough understanding of the present invention. However, the present invention may be practiced in ways other than those specifically described herein, and those skilled in the art can make modifications without departing from the spirit and scope of the present invention; the present invention is therefore not limited by the specific embodiments disclosed below.
As shown in FIG. 1, the present invention provides a multi-view human motion capture method based on human motion prediction, which specifically includes the following steps:
1. A plurality of pictures shot synchronously from different views by calibrated cameras are input, and an existing multi-view human motion capture method (such as mvpose) is first used to perform three-dimensional human reconstruction, the result serving as the initial frame.
Specifically, two-dimensional key point detection is performed on each picture, a matching algorithm establishes the correspondence between two-dimensional key points across views, and the three-dimensional human bodies are reconstructed by triangulation to obtain the three-dimensional coordinates of each human key point, that is, the three-dimensional skeleton of each human body.
2. Motion prediction: for each subsequent frame, a series of motion-related variables, including the positions and velocities of the three-dimensional human key points, are maintained according to the three-dimensional skeleton reconstructed in the previous frame, and a motion prediction method such as a Kalman filter or a neural network gives a predicted position and a confidence for each three-dimensional human key point of the current frame. In FIG. 1 this step is indicated by the arrow pointing from the reconstruction result of the previous frame to the picture of the current frame. The advantage of this step is as follows: without motion prediction, matching and tracking rely only on the projection of the previous frame and the observation of the current frame, so tracking is easily lost for a fast-moving human body; motion prediction solves this problem.
Specifically, for the Kalman filter, a linear system can be established with the position and velocity of each three-dimensional key point as state variables and the position of the three-dimensional key point as the observation variable. The filter gives a prediction of the three-dimensional key point position that takes the movement velocity into account, and the resulting covariance matrix provides confidence information. Let $x_{t-1}$ and $x_t$ be the state vectors at times $t-1$ and $t$, respectively, let $P_{t-1}$ and $P_t$ be the covariance matrices of $x_{t-1}$ and $x_t$, respectively, let $Q$ be the noise covariance matrix, $F$ the state transition matrix, and $\Delta t$ the time interval between adjacent frames. The prediction process of the Kalman filter can be expressed as:
$$x_t = F x_{t-1}$$
$$P_t = F P_{t-1} F^{\top} + Q$$
wherein
$$F = \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix}$$
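The prediction step can be sketched for a single coordinate of a single key point as follows. This is a minimal illustration only, assuming a constant-velocity state transition with state (position, velocity); the function name `kalman_predict` and the values of `Q` are hypothetical and are not taken from the patent.

```python
def kalman_predict(x, P, dt, Q):
    """One Kalman prediction step for one coordinate of one key point.

    x  : [position, velocity] at time t-1
    P  : 2x2 covariance of x (list of rows)
    dt : time interval between adjacent frames
    Q  : 2x2 noise covariance (illustrative values, an assumption)
    """
    # x_t = F x_{t-1}, assuming F = [[1, dt], [0, 1]] (constant velocity)
    pos, vel = x
    x_pred = [pos + dt * vel, vel]

    # P_t = F P_{t-1} F^T + Q, written out for the 2x2 case
    (p00, p01), (p10, p11) = P
    P_pred = [
        [p00 + dt * (p01 + p10) + dt * dt * p11 + Q[0][0],
         p01 + dt * p11 + Q[0][1]],
        [p10 + dt * p11 + Q[1][0],
         p11 + Q[1][1]],
    ]
    return x_pred, P_pred
```

One possible reading of "confidence" here is that a larger predicted position variance `P_pred[0][0]` means a less reliable prediction; how the covariance is mapped to a confidence value is not specified in the text.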
Specifically, for a neural network such as an LSTM, the network memorizes the reconstruction results over a past period of time, outputs the predicted positions of the three-dimensional key points at the current time according to this temporal memory, and gives a confidence.
3. For key-frame pictures, the invention runs a human body detector (such as YOLO) to detect the two-dimensional bounding box of each person, consistent with existing methods: the input of the human body detector is the current frame picture and its output is the two-dimensional bounding box of each person. For non-key-frame pictures, the human body detector does not need to be run. As shown in FIG. 1(a), the three-dimensional skeleton obtained by motion prediction from the previous frame is projected into the image of each view of the current frame, which quickly yields the two-dimensional bounding box of each human body in every view of the current frame. The advantage of this step is that, because the position of the human body in the image at the next moment is estimated by motion prediction, the human body detector does not have to be run on every frame; only a subset of key frames needs detection, which effectively reduces the running time.
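The projection of a predicted skeleton into one view can be sketched as below. This is an illustrative sketch, assuming each calibrated camera is described by a 3x4 projection matrix; the names `project_point` and `bbox_from_skeleton` and the `margin` padding are hypothetical and do not come from the patent.

```python
def project_point(P, X):
    """Project a 3D point X with a 3x4 projection matrix P (list of rows)."""
    Xh = (X[0], X[1], X[2], 1.0)
    u, v, w = (sum(p * x for p, x in zip(row, Xh)) for row in P)
    return u / w, v / w


def bbox_from_skeleton(P, keypoints_3d, margin=0.2):
    """Return (x_min, y_min, x_max, y_max) around the projected key points.

    margin enlarges the box by a fraction of its width and height. This
    padding is an assumption: joints lie inside the body outline, so a
    tight box around them would cut off part of the person.
    """
    pts = [project_point(P, X) for X in keypoints_3d]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    dx, dy = margin * (x1 - x0), margin * (y1 - y0)
    return x0 - dx, y0 - dy, x1 + dx, y1 + dy
```

Calling `bbox_from_skeleton` once per camera with the same predicted skeleton gives one box per view without running a detector.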
Specifically, the choice of key frames depends on actual requirements. In most cases, one frame can simply be taken as a key frame at regular intervals. The choice can also be based on the confidence in the prediction process: if the confidence stays low, which indicates that the human body may be moving too fast or in a complex way, key frames can be made denser.
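A key-frame schedule of this kind might look like the following sketch. The thresholds, the halving rule, and the name `next_key_interval` are all assumptions chosen for illustration; the text only states that key frames become denser when the confidence stays low.

```python
def next_key_interval(recent_confidences, base_interval=10,
                      min_interval=2, min_conf=0.5):
    """Return the number of frames until the next key frame.

    recent_confidences : prediction confidences of the last few frames
    base_interval      : regular spacing of key frames (assumed value)
    min_conf           : confidences below this count as low (assumed value)
    """
    # Confidence stayed low over the whole window: use denser key frames.
    if recent_confidences and max(recent_confidences) < min_conf:
        return max(min_interval, base_interval // 2)
    # Otherwise keep the regular spacing.
    return base_interval
```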
4. As shown in FIG. 1(b), each single person is cropped from the image by using the two-dimensional bounding box, and the crop is input to a two-dimensional human key point detector (such as HRNet), which outputs a heat map for each key point; the position and confidence of each two-dimensional human key point can be extracted from the heat map. Finally, the three-dimensional skeleton is reconstructed by triangulation (no matching is required, because the two-dimensional detections in all views originate from the projection of the same three-dimensional skeleton), achieving multi-view human motion capture, as shown in FIG. 1(c).
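Triangulation of one key point from several views can be sketched as a confidence-weighted linear least-squares problem. The patent does not state which triangulation variant is used, so the formulation below (inhomogeneous linear triangulation solved through 3x3 normal equations) and the names `solve3` and `triangulate` are assumptions for illustration.

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting.

    Assumes the system is non-degenerate (at least two useful views).
    """
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(3):
        piv = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, 3):
            f = M[r][c] / M[c][c]
            for k in range(c, 4):
                M[r][k] -= f * M[c][k]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x


def triangulate(cameras, points_2d, weights):
    """Weighted linear triangulation of one key point.

    cameras   : list of 3x4 projection matrices (lists of rows)
    points_2d : list of (u, v) detections, one per view
    weights   : per-view confidences; a zero weight removes that view
    """
    AtA = [[0.0] * 3 for _ in range(3)]
    Atb = [0.0] * 3
    for P, (u, v), w in zip(cameras, points_2d, weights):
        # Each view yields two equations a . (X, Y, Z, 1) = 0 with
        # a = u * P[2] - P[0] and a = v * P[2] - P[1].
        for coord, row in ((u, P[0]), (v, P[1])):
            a = [w * (coord * P[2][k] - row[k]) for k in range(4)]
            for i in range(3):
                Atb[i] -= a[i] * a[3]
                for j in range(3):
                    AtA[i][j] += a[i] * a[j]
    return solve3(AtA, Atb)
```

Because the weights multiply each view's equations, setting a confidence to zero has the same effect as leaving that view out.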
FIG. 2 shows the three-dimensional human skeleton at a certain time together with the skeleton predicted for a future time by the motion prediction method. With the motion prediction mechanism, even for large-amplitude motions such as jumping, the two-dimensional bounding box obtained from the projected prediction stays as close as possible to the human body in the current frame, which avoids tracking loss.
The invention also provides a method for improving the accuracy of three-dimensional human skeleton reconstruction by using the visibility of key points. As mentioned above and shown in FIG. 3, the cropped single person is input to the two-dimensional key point detector, and the direct output of the detector is a two-dimensional key point heat map. The local peak of the heat map gives the position of the two-dimensional key point, and the value at the peak gives the confidence of that key point. However, the confidence may be affected by many factors, such as whether the key point is occluded, the size of the key point, and whether the overall pose of the person is common. For example, in FIG. 3 the head of the person behind is occluded by the head of the person in front, yet the response region of the corresponding heat map is still large, which may lead to an unreliable estimate. Confidence therefore does not fully represent the visibility of a key point. Because visibility acts as a weight in the subsequent triangulation, the invention, unlike prior methods that use the confidence directly, calculates the visibility of each key point by using temporal information, thereby suppressing invisible two-dimensional key points and preventing them from interfering with the subsequent reconstruction.
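Reading a key point and its confidence off a heat map can be sketched as follows. For simplicity this sketch takes the global maximum of the map, which is an assumption that suits a crop containing one person; the function name `keypoint_from_heatmap` is hypothetical.

```python
def keypoint_from_heatmap(heatmap):
    """Return ((x, y), confidence) at the maximum of a 2D heat map.

    heatmap is a list of rows; x is the column index and y the row index.
    """
    val, x, y = max(
        ((val, x, y) for y, row in enumerate(heatmap) for x, val in enumerate(row)),
        key=lambda item: item[0],
    )
    return (x, y), val
```

A strong peak only says that the detector responded strongly at that location; as the paragraph above notes, it does not by itself establish that the key point is visible.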
The visibility judgment method is as follows: given the three-dimensional key point positions of each person estimated in the previous frame, the method approximates each human bone segment by a cylinder and judges the visibility of each human key point according to the occlusion relation between the cylinders and the front-back order of the human bodies in a given camera coordinate system. The resulting key point visibility is then used to suppress some of the two-dimensional key points, which provides more reliable input for the reconstruction.
Specifically, as shown in FIG. 4, each human bone is approximated by a cylinder with radius r and height h, where r and h are determined according to the physical size of an actual person or obtained from statistical data. For example, with the statistical human body model SMPL, a cylinder parameterized by r and h can be fitted to the point cloud of the corresponding SMPL bone to obtain statistical values of r and h. The center of the cylinder is located at the mean of the three-dimensional positions, reconstructed in the previous frame, of the two adjacent key points of the corresponding bone. In each view, a line segment is determined from the camera center to a given key point, and whether this segment intersects each cylinder in three-dimensional space is calculated geometrically (by comparing the distance from the segment to the cylinder axis with the radius). The occlusion relation between the segment and each cylinder is then determined according to the front-back order of the persons reconstructed in the previous frame, that is, whether a joint of a person behind is occluded by a joint of a person in front, and the visibility of all key points in that view is calculated. For key points judged to be invisible, the confidence is set to zero so that they are not considered in the subsequent triangulation.
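The intersection test can be sketched with a segment-to-segment distance. This sketch makes two simplifying assumptions that are not stated in the patent: the cylinder axis is taken to be the segment between the two joints of a bone (so h equals the bone length), and the caller passes only bones that do not contain the key point being tested. The function names are hypothetical.

```python
def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _clamp(x):
    return max(0.0, min(1.0, x))


def segment_distance(p1, q1, p2, q2):
    """Minimum distance between segments p1-q1 and p2-q2 in 3D."""
    d1, d2, r = _sub(q1, p1), _sub(q2, p2), _sub(p1, p2)
    a, e, f = _dot(d1, d1), _dot(d2, d2), _dot(d2, r)
    eps = 1e-12
    if a <= eps and e <= eps:          # both segments are points
        s = t = 0.0
    elif a <= eps:                     # first segment is a point
        s, t = 0.0, _clamp(f / e)
    else:
        c = _dot(d1, r)
        if e <= eps:                   # second segment is a point
            t, s = 0.0, _clamp(-c / a)
        else:
            b = _dot(d1, d2)
            denom = a * e - b * b      # zero when the segments are parallel
            s = _clamp((b * f - c * e) / denom) if denom > eps else 0.0
            t = (b * s + f) / e
            if t < 0.0:
                t, s = 0.0, _clamp(-c / a)
            elif t > 1.0:
                t, s = 1.0, _clamp((b - c) / a)
    c1 = [p1[i] + s * d1[i] for i in range(3)]
    c2 = [p2[i] + t * d2[i] for i in range(3)]
    diff = _sub(c1, c2)
    return _dot(diff, diff) ** 0.5


def is_visible(camera_center, keypoint, other_bones, radius):
    """True if no bone cylinder blocks the camera-to-keypoint segment.

    other_bones : list of (joint_a, joint_b) 3D positions from the previous
                  frame, excluding bones attached to this key point
    radius      : cylinder radius r
    """
    return all(
        segment_distance(camera_center, keypoint, a, b) >= radius
        for a, b in other_bones
    )
```

Because the tested segment ends at the key point, bones lying behind the key point as seen from the camera are not reported as occluders, which reflects the front-back ordering described above. A key point for which `is_visible` returns False would have its confidence set to zero before triangulation.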
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments.
The foregoing is only a preferred embodiment of the present invention. Although the present invention has been disclosed by way of preferred embodiments, these are not intended to limit it. Using the methods and technical content disclosed above, those skilled in the art can make many possible variations and modifications to the technical solution of the present invention, or modify it into equivalent embodiments, without departing from the scope of the technical solution. Therefore, any simple modification, equivalent change or modification made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still falls within the scope of protection of the technical solution of the present invention.

Claims (8)

1. A multi-view human motion capture method based on human motion prediction is characterized by comprising the following steps:
(1) inputting pictures shot by a plurality of cameras synchronously at different visual angles, and performing three-dimensional human body reconstruction to obtain a three-dimensional skeleton of each human body as an initial frame result;
(2) motion prediction: for each subsequent frame, maintaining a series of motion-related variables according to the three-dimensional skeleton reconstructed in the previous frame, and giving a predicted position and a confidence for each three-dimensional human key point of the current frame by using a motion prediction method;
(3) for key-frame pictures, running a human body detector to detect a two-dimensional bounding box of each human body; for non-key-frame pictures, projecting the three-dimensional skeleton obtained by motion prediction from the previous frame into the image of each view of the current frame to quickly obtain the two-dimensional bounding box of each human body in every view of the current frame;
(4) cropping each single human body from the image by using its two-dimensional bounding box, inputting the crop into a two-dimensional human key point detector, outputting a heat map for each key point, and obtaining the position and confidence of each two-dimensional human key point from the heat map; calculating the visibility of each key point by using temporal information, and setting the confidence of key points judged to be invisible to zero; and reconstructing the three-dimensional skeleton by triangulation, thereby achieving multi-view human motion capture.
2. The multi-view human motion capture method based on human motion prediction of claim 1, wherein step (1) specifically comprises: performing two-dimensional key point detection on each picture, establishing a matching relation between the two-dimensional key points across views, and reconstructing the three-dimensional human bodies by triangulation to obtain the three-dimensional coordinates of each human key point.
3. The multi-view human motion capture method based on human motion prediction of claim 1, wherein in the step (2), the motion-related variables comprise positions and velocities of three-dimensional key points of the human body.
4. The multi-view human motion capture method based on human motion prediction of claim 1, wherein in step (2), a Kalman filter is adopted for the motion prediction: the position and velocity of each three-dimensional key point are used as state variables and the position of the three-dimensional key point is used as the observation variable, so as to establish a linear system; the Kalman filter gives a prediction of the three-dimensional key point position that takes the movement velocity into account, and the resulting covariance matrix provides confidence information; the prediction process of the Kalman filter can be expressed as:
$$x_t = F x_{t-1}$$
$$P_t = F P_{t-1} F^{\top} + Q$$
$$F = \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix}$$
wherein $x_{t-1}$ and $x_t$ are the state vectors at times $t-1$ and $t$, respectively, $P_{t-1}$ and $P_t$ are the covariance matrices of $x_{t-1}$ and $x_t$, respectively, $Q$ is the noise covariance matrix, $F$ is the state transition matrix, and $\Delta t$ is the time interval between adjacent frames.
5. The multi-view human motion capture method based on human motion prediction of claim 1, wherein in the step (2), the motion prediction employs a neural network, the neural network memorizes the reconstruction result of a past period of time, outputs the predicted position of the three-dimensional key point of the current time according to time sequence memory, and gives a confidence.
6. The multi-view human motion capture method based on human motion prediction of claim 1, wherein in step (3), the key frames are determined as follows: one frame is taken as a key frame at regular intervals, or the interval is adjusted according to the confidence in the prediction process, the density of key frames being increased if the confidence remains low.
7. The multi-view human motion capture method based on human motion prediction of claim 1, wherein in step (3), given the three-dimensional key point positions of each human body estimated in the previous frame, each human bone segment is approximated by a cylinder, and the visibility of each human key point is judged according to the occlusion relation between the cylinders and the front-back order of the human bodies in a given camera coordinate system.
8. The multi-view human motion capture method based on human motion prediction of claim 7, wherein in step (3), the visibility judgment specifically comprises: approximating each human bone by a cylinder with radius r and height h, the center of the cylinder being located at the mean of the three-dimensional positions, reconstructed in the previous frame, of the two adjacent key points of the corresponding bone; in each view, determining a line segment from the camera center to a given key point, calculating whether the line segment intersects each cylinder in three-dimensional space, determining the occlusion relation between the line segment and each cylinder according to the front-back order of the human bodies reconstructed in the previous frame, that is, judging whether a joint of a person behind is occluded by a joint of a person in front, and thereby calculating the visibility of all key points in that view.
CN202010546274.6A 2020-06-16 2020-06-16 Multi-view human motion capture method based on human motion prediction Active CN111798486B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010546274.6A CN111798486B (en) 2020-06-16 2020-06-16 Multi-view human motion capture method based on human motion prediction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010546274.6A CN111798486B (en) 2020-06-16 2020-06-16 Multi-view human motion capture method based on human motion prediction

Publications (2)

Publication Number Publication Date
CN111798486A 2020-10-20
CN111798486B 2022-05-17

Family

ID=72804764

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010546274.6A Active CN111798486B (en) 2020-06-16 2020-06-16 Multi-view human motion capture method based on human motion prediction

Country Status (1)

Country Link
CN (1) CN111798486B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112381003B (en) * 2020-11-16 2023-08-22 网易(杭州)网络有限公司 Motion capture method, motion capture device, motion capture equipment and storage medium
CN112837409B (en) * 2021-02-02 2022-11-08 浙江大学 Method for reconstructing three-dimensional human body by using mirror
CN113642565B (en) * 2021-10-15 2022-02-11 腾讯科技(深圳)有限公司 Object detection method, device, equipment and computer readable storage medium
CN117372471A (en) * 2022-07-01 2024-01-09 上海青瞳视觉科技有限公司 Label-free human body posture optical detection method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108710868A (en) * 2018-06-05 2018-10-26 中国石油大学(华东) A kind of human body critical point detection system and method based under complex scene
CN109242950A (en) * 2018-07-11 2019-01-18 天津大学 Multi-angle of view human body dynamic three-dimensional reconstruction method under more close interaction scenarios of people
CN110544301A (en) * 2019-09-06 2019-12-06 广东工业大学 Three-dimensional human body action reconstruction system, method and action training system
CN110796699A (en) * 2019-06-18 2020-02-14 叠境数字科技(上海)有限公司 Optimal visual angle selection method and three-dimensional human skeleton detection method of multi-view camera system
CN110874865A (en) * 2019-11-14 2020-03-10 腾讯科技(深圳)有限公司 Three-dimensional skeleton generation method and computer equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8023726B2 (en) * 2006-11-10 2011-09-20 University Of Maryland Method and system for markerless motion capture using multiple cameras
KR101307341B1 (en) * 2009-12-18 2013-09-11 한국전자통신연구원 Method and apparatus for motion capture of dynamic object

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Deep Volumetric Video From Very Sparse Multi-View Performance Capture; Zeng Huang et al.; CVF; 20181231; full text *
Fast and Robust Multi-Person 3D Pose Estimation from Multiple Views; Junting Dong et al.; CVPR; 20191231; full text *
Research on Multi-View Motion Capture Technology (基于多视角的运动捕捉技术研究); 洪佳枫; China Master's Theses Full-text Database; 20190515; full text *

Also Published As

Publication number Publication date
CN111798486A (en) 2020-10-20

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant