CN111798486B - Multi-view human motion capture method based on human motion prediction - Google Patents
Multi-view human motion capture method based on human motion prediction
- Publication number
- CN111798486B (publication of application CN202010546274.6A)
- Authority
- CN
- China
- Prior art keywords
- dimensional
- human body
- human
- key point
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06T7/207—Analysis of motion for motion estimation over a hierarchy of resolutions (under G06T7/20—Analysis of motion, G06T7/00—Image analysis)
- G06N3/045—Combinations of networks (under G06N3/04—Architecture, e.g. interconnection topology, G06N3/02—Neural networks)
- G06N3/08—Learning methods (under G06N3/02—Neural networks)
- G06T2207/10004—Still image; Photographic image (under G06T2207/10—Image acquisition modality)
- G06T2207/10012—Stereo images (under G06T2207/10—Image acquisition modality)
- G06T2207/20024—Filtering details (under G06T2207/20—Special algorithmic details)
Abstract
The invention discloses a multi-view human motion capture method based on human motion prediction. Pictures shot synchronously by a plurality of cameras from different views are first used for three-dimensional human body reconstruction, yielding a three-dimensional skeleton for each human body. For each subsequent frame, a prediction result and a confidence are assigned to the three-dimensional human key point positions of the current frame according to the three-dimensional skeleton reconstructed in the previous frame. For the pictures of key frames, a human body detector is run to detect the two-dimensional bounding box of each human body; for the pictures of non-key frames, the three-dimensional skeleton predicted from the motion of the previous frame is projected into the images of all views of the current frame, which quickly yields the two-dimensional bounding boxes of the human bodies in every view, reduces the overhead of the human body detector and improves the efficiency of the algorithm. The invention further uses temporal information to compute the visibility of human key points and, based on this visibility, improves the accuracy of the reconstructed three-dimensional human skeleton.
Description
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a multi-view human motion capture method based on human motion prediction.
Background
Multi-view human motion capture refers to recovering the three-dimensional motion of human bodies from multi-view video. Most existing methods consist of two stages, detection and reconstruction. Although they achieve good results on public datasets, their detection and reconstruction modules are separate: the detector cannot benefit from the reconstruction result of the previous frame, and motion prediction is missing. Furthermore, the visibility of key points is not well exploited.
Disclosure of Invention
In view of the deficiencies of the prior art, the invention aims to provide a human body detection method based on motion prediction, in which the visibility of each key point is computed from temporal information, so as to improve both the accuracy and the efficiency of multi-person motion capture.
The purpose of the invention is achieved by the following technical solution: a multi-view human motion capture method based on human motion prediction comprises the following steps:
(1) inputting pictures shot synchronously by a plurality of cameras at different views, and performing three-dimensional human body reconstruction to obtain the three-dimensional skeleton of each human body as the result of the initial frame;
(2) motion prediction: for each subsequent frame, maintaining a set of motion-related variables according to the three-dimensional skeleton reconstructed in the previous frame, and using a motion prediction method to assign a prediction result and a confidence to the three-dimensional human key point positions of the current frame;
(3) for the pictures of key frames, running a human body detector to detect the two-dimensional bounding box of each human body; for the pictures of non-key frames, projecting the three-dimensional skeleton predicted from the motion of the previous frame into the images of each view of the current frame to quickly obtain the two-dimensional bounding box of each human body in each view of the current frame;
(4) cropping a single human body from the image using the two-dimensional bounding box, inputting the crop into a two-dimensional human key point detector, which outputs a heat map for each key point, and taking the position and confidence of each two-dimensional key point from the heat map; computing the visibility of each key point using temporal information and setting the confidence of key points judged invisible to zero; reconstructing the three-dimensional skeleton by triangulation, thereby realizing multi-view human motion capture (a schematic sketch of this per-frame loop is given below).
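The following is a minimal, hedged sketch of the per-frame loop outlined in steps (1) to (4). It is not the patented implementation; every helper (initial reconstruction, motion prediction, key frame test, detection, projection, visibility suppression, triangulation) is a placeholder passed in by the caller.

```python
def capture_sequence(frames, reconstruct_initial, predict_motion, is_keyframe,
                     detect_people, project_to_boxes, detect_keypoints_2d,
                     zero_invisible, triangulate):
    """Schematic per-frame loop for steps (1)-(4); all helpers are caller-supplied callables."""
    skeletons = reconstruct_initial(frames[0])                 # step (1): initial-frame reconstruction
    results = [skeletons]
    for t, images in enumerate(frames[1:], start=1):
        predicted, conf = predict_motion(skeletons)            # step (2): predict 3D key points + confidence
        if is_keyframe(t, conf):
            boxes = detect_people(images)                      # step (3): run the detector on key frames
        else:
            boxes = project_to_boxes(predicted)                # step (3): project the prediction otherwise
        kps_2d, kps_conf = detect_keypoints_2d(images, boxes)  # step (4): 2D key points from heat maps
        kps_conf = zero_invisible(kps_conf, skeletons)         # zero confidences of invisible key points
        skeletons = triangulate(kps_2d, kps_conf)              # reconstruct the 3D skeleton
        results.append(skeletons)
    return results
```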
Further, step (1) specifically comprises: performing two-dimensional key point detection on each picture, establishing matching relations between the two-dimensional key points across views, and reconstructing the three-dimensional human body by triangulation to obtain the three-dimensional coordinates of each human key point.
Further, in step (2), the motion-related variables include the positions and velocities of the three-dimensional human key points.
Further, in step (2), a Kalman filter is adopted for the motion prediction: a linear system is established with the position and velocity of each three-dimensional key point as the state variables and the position of the three-dimensional key point as the observation variable. The Kalman filter gives a prediction of the three-dimensional key point position that takes the motion velocity into account, and the resulting covariance matrix provides confidence information. The prediction process of the Kalman filter can be expressed as:
x_t = F x_{t-1}
P_t = F P_{t-1} F^T + Q
where x_{t-1} and x_t are the state vectors at times t-1 and t, P_{t-1} and P_t are the covariance matrices of x_{t-1} and x_t respectively, Q is the noise covariance matrix, and F is the state transition matrix, which depends on Δt, the time interval between adjacent frames.
Further, in step (2), the motion prediction adopts a neural network: the network memorizes the reconstruction results over a past period of time, outputs the predicted positions of the three-dimensional key points at the current time according to this temporal memory, and gives a confidence.
Further, in step (3), the key frames are determined as follows: a frame is taken as a key frame at regular intervals, or the schedule is adjusted according to the confidence in the prediction process; if the confidence remains unsatisfactory, the density of key frames is increased.
Further, in step (3), given the three-dimensional key point positions of each human body estimated in the previous frame, each human bone segment is approximated by a cylinder, and the visibility of each human key point is determined according to the occlusion relations between the cylinders and the depth ordering of the human bodies with respect to a given camera.
Further, in step (3), the visibility judgment specifically comprises: approximating each human bone by a cylinder of radius r and height h whose center lies at the mean of the three-dimensional positions, reconstructed in the previous frame, of the two key points adjacent to that bone; in each view, determining the line segment from the camera center to a given key point, computing whether this segment intersects each cylinder in three-dimensional space, and determining the occlusion relation between them according to the depth ordering of the human bodies reconstructed in the previous frame, i.e. judging whether the joints of a person behind are occluded by the joints of a person in front, thereby computing the visibility of all key points in that view.
The invention has the following beneficial effects: the three-dimensional human skeleton of the previous frame is motion-predicted and projected into the images of each view of the current frame to obtain the two-dimensional human bounding boxes, which reduces the overhead of the human body detector and improves the efficiency of the algorithm. The invention also uses temporal information to compute the visibility of human key points and, based on this visibility, improves the accuracy of the reconstructed three-dimensional human skeleton.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. It is apparent that the drawings in the following description show some embodiments of the present invention, and that other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a schematic overall flow chart of an embodiment of the present invention, in which sub-figure (a) shows the detection result of the two-dimensional human bounding boxes, sub-figure (b) shows the prediction result of the two-dimensional human key points, and sub-figure (c) shows the reconstruction result of the three-dimensional key points.
Fig. 2 is a schematic diagram of a human body three-dimensional key point reconstruction result and a motion prediction result thereof according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a two-dimensional key point heat map of a human body according to an embodiment of the invention.
Fig. 4 is a schematic view illustrating the visibility determination of key points of a human body according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention. However, the invention may also be practiced in ways other than those described here; those skilled in the art can generalize it without departing from its spirit, and the invention is therefore not limited to the specific embodiments disclosed below.
As shown in fig. 1, the present invention provides a multi-view human motion capture method based on human motion prediction, which specifically includes the following steps:
1. Pictures shot synchronously from different views by calibrated cameras are input, and an existing multi-view human motion capture method (such as mvpose) is first used to perform three-dimensional human body reconstruction as the result of the initial frame.
Specifically, two-dimensional key point detection is performed on each picture, a matching algorithm is used to establish matching relations between the two-dimensional key points across all views, and triangulation is used to reconstruct the three-dimensional human body, yielding the three-dimensional coordinates of each human key point, i.e. the three-dimensional skeleton of each human body.
2. Motion prediction: for each subsequent frame, a set of motion-related variables, including the positions and velocities of the three-dimensional human key points, is maintained according to the three-dimensional skeleton reconstructed in the previous frame, and a motion prediction method such as a Kalman filter or a neural network is used to assign a prediction result and a confidence to the three-dimensional key point positions of the current frame; this part corresponds to the arrow in the figure pointing from the reconstruction result of the previous frame to the picture of the current frame. The advantage of this step is that, without motion prediction, a fast-moving human body is easily lost when tracking relies only on matching the projection of the previous frame with the observation of the current frame; motion prediction solves this problem.
Specifically, for the Kalman filter, a linear system can be established with the position and velocity of each three-dimensional key point as the state variables and the position of the three-dimensional key point as the observation variable. The filter gives a prediction of the three-dimensional key point position that takes the motion velocity into account, and the resulting covariance matrix provides confidence information. Let x_{t-1} and x_t be the state vectors at times t-1 and t, let P_{t-1} and P_t be their covariance matrices, let Q be the noise covariance matrix, and let F be the state transition matrix, which encodes Δt, the time interval between adjacent frames. The prediction process of the Kalman filter can be expressed as:
x_t = F x_{t-1}
P_t = F P_{t-1} F^T + Q
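As an illustration only, a minimal sketch of this prediction step for a single key point is given below, assuming a constant-velocity model whose state stacks the 3D position and velocity; the noise scale q and all variable names are illustrative assumptions, not values prescribed by the patent.

```python
import numpy as np

def predict_keypoint(x_prev, P_prev, dt, q=1e-3):
    """One Kalman prediction step for a single 3D key point (constant-velocity model).

    x_prev : (6,) state [px, py, pz, vx, vy, vz] estimated in the previous frame
    P_prev : (6, 6) covariance of x_prev
    dt     : time interval between adjacent frames
    q      : process-noise scale (illustrative value)
    """
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)        # position += velocity * dt
    Q = q * np.eye(6)                 # noise covariance

    x_pred = F @ x_prev               # x_t = F x_{t-1}
    P_pred = F @ P_prev @ F.T + Q     # P_t = F P_{t-1} F^T + Q
    return x_pred, P_pred
```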
Specifically, for a neural network such as an LSTM, the network memorizes the reconstruction results over a past period of time, outputs the predicted positions of the three-dimensional key points at the current time based on this temporal memory, and gives a confidence.
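One possible form of such a predictor is sketched below, assuming PyTorch and an LSTM with illustrative layer sizes (the patent does not prescribe a particular architecture): it consumes a window of past reconstructed key points and regresses the next-frame positions together with per-joint confidences.

```python
import torch
import torch.nn as nn

class KeypointLSTMPredictor(nn.Module):
    """Predicts next-frame 3D key points and per-joint confidences from a history window."""

    def __init__(self, num_joints=17, hidden=256):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_joints * 3, hidden_size=hidden, batch_first=True)
        self.pos_head = nn.Linear(hidden, num_joints * 3)   # predicted 3D positions
        self.conf_head = nn.Linear(hidden, num_joints)      # per-joint confidence

    def forward(self, history):
        # history: (batch, T, num_joints * 3) reconstructed key points of the past T frames
        _, (h_n, _) = self.lstm(history)
        h = h_n[-1]                                  # final hidden state of the last layer
        pos = self.pos_head(h)                       # (batch, num_joints * 3)
        conf = torch.sigmoid(self.conf_head(h))      # (batch, num_joints), in [0, 1]
        return pos, conf
```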
3. For the pictures of key frames, the invention runs a human body detector (such as YOLO) to detect the two-dimensional bounding box of each person, consistent with existing methods: the detector takes the current frame picture as input and outputs the two-dimensional bounding box of each person. For the pictures of non-key frames, the detector does not need to be run; as shown in fig. 1(a), the three-dimensional skeleton predicted from the motion of the previous frame is projected into the images of each view of the current frame, and the two-dimensional bounding boxes of the human bodies in every view of the current frame are obtained quickly. The advantage of this step is that, by estimating the position of each person in the image at the next moment from the motion prediction, the human body detector does not have to be run in every frame but only on a subset of key frames, which effectively reduces the running time. A sketch of the projection step is given below.
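The following is a hedged illustration of this projection step, assuming a calibrated pinhole camera with intrinsics K and world-to-camera pose R, t; the padding margin is an assumption. The predicted 3D joints of one person are projected into one view and enclosed in a padded box.

```python
import numpy as np

def bbox_from_projection(joints_3d, K, R, t, margin=0.1):
    """Project predicted 3D joints into one calibrated view and return a padded 2D box.

    joints_3d : (J, 3) predicted key points in world coordinates (assumed in front of the camera)
    K         : (3, 3) camera intrinsics
    R, t      : (3, 3) rotation and (3,) translation, world -> camera
    margin    : relative padding around the projected joints
    """
    cam = joints_3d @ R.T + t                # world -> camera coordinates
    uv = cam @ K.T                           # pinhole projection (homogeneous)
    uv = uv[:, :2] / uv[:, 2:3]              # divide by depth to get pixel coordinates

    x_min, y_min = uv.min(axis=0)
    x_max, y_max = uv.max(axis=0)
    pad_x, pad_y = margin * (x_max - x_min), margin * (y_max - y_min)
    return (x_min - pad_x, y_min - pad_y, x_max + pad_x, y_max + pad_y)
```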
In particular, the choice of key frames depends on the actual requirements. In most cases one frame can simply be taken as a key frame at regular intervals. The schedule can also be adjusted according to the confidence in the prediction process: if the confidence stays low, indicating that a person may be moving too fast or performing complex motion, the key frames can be made denser.
As shown in fig. 1(b), each person is cropped from the image using the two-dimensional bounding box, and the crop is fed into a two-dimensional human key point detector (such as HRNet), which outputs a heat map for each key point; the position and confidence of each two-dimensional key point are then extracted from the heat map. Finally, the three-dimensional skeleton is reconstructed by triangulation (no matching is required at this point, because the two-dimensional detections are projections of the same three-dimensional skeleton), achieving multi-view human motion capture, as shown in fig. 1(c). A sketch of confidence-weighted triangulation is given below.
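A minimal sketch of such a triangulation for one key point seen in several views, assuming standard weighted linear (DLT) triangulation with the per-view confidences as weights; the patent does not commit to this exact formulation, and the camera projection matrices are assumed given.

```python
import numpy as np

def triangulate_weighted(points_2d, proj_mats, weights):
    """Confidence-weighted DLT triangulation of a single key point.

    points_2d : (V, 2) detected 2D positions in V views
    proj_mats : (V, 3, 4) camera projection matrices P = K [R | t]
    weights   : (V,) per-view confidences (zero for suppressed / invisible key points)
    Returns the 3D point, or None if fewer than two views are usable.
    """
    rows = []
    for (u, v), P, w in zip(points_2d, proj_mats, weights):
        if w <= 0:
            continue                          # skip views where the key point is invisible
        rows.append(w * (u * P[2] - P[0]))    # each view contributes two weighted equations
        rows.append(w * (v * P[2] - P[1]))
    if len(rows) < 4:                         # need at least two usable views
        return None
    _, _, vh = np.linalg.svd(np.stack(rows))
    X = vh[-1]
    return X[:3] / X[3]                       # dehomogenize
```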
Fig. 2 shows the three-dimensional human skeleton at a given time and the skeleton predicted by the motion prediction method for a future time. With the motion prediction mechanism, even for large-amplitude motions such as jumping, the bounding box projected from the prediction stays close to the person in the current frame, avoiding tracking loss.
The invention also provides a method for improving the accuracy of three-dimensional human skeleton reconstruction using key point visibility. As mentioned above and shown in fig. 3, the cropped person is fed into the two-dimensional key point detector, whose direct output is a heat map for each key point. The local peak of the heat map gives the position of the two-dimensional key point, and its value gives the confidence of that key point. However, the confidence is affected by many factors, such as whether the key point is occluded, its size, and whether the person's overall pose is common. For example, in fig. 3 the head of the person behind is occluded by the head of the person in front, yet the response region of the corresponding heat map still appears large, which can lead to unreliable estimates. Confidence therefore does not fully represent the visibility of a key point. Because this value acts as a weight in the subsequent triangulation, the invention, unlike prior methods that use the confidence directly, computes the visibility of each key point from temporal information, thereby suppressing invisible two-dimensional key points and preventing them from interfering with the subsequent reconstruction.
The visibility judgment is as follows: given the three-dimensional key point positions of each person estimated in the previous frame, each human bone segment is approximated by a cylinder, and the visibility of each human key point is determined according to the occlusion relations between the cylinders and the depth ordering of the people with respect to a given camera. The obtained visibility is then used to suppress part of the two-dimensional key points, providing more reliable input for the reconstruction.
Specifically, as shown in fig. 4, each human bone is approximated by a cylinder of radius r and height h, where r and h are determined from the physical size of an actual person or obtained from statistical data. For example, for the statistical human body model SMPL, a cylinder parameterized by r and h can be fitted to the point cloud of the corresponding SMPL bone, yielding statistical values for r and h. The center of the cylinder lies at the mean of the three-dimensional positions, reconstructed in the previous frame, of the two key points adjacent to that bone. For each view, a line segment is determined from the camera center to a given key point; whether this segment intersects each cylinder in three-dimensional space can be computed geometrically (by comparing the distance to the cylinder axis with the radius), and the occlusion relation is determined from the depth ordering of the people reconstructed in the previous frame, i.e. whether the joints of a person behind are occluded by the joints of a person in front, giving the visibility of all key points in that view. Key points judged invisible have their confidence set to zero, so that they are not considered in the subsequent triangulation. A simplified sketch of this test is given below.
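The following is a simplified sketch of that visibility test, with bones approximated by cylinders of radius r centered on the segment between their two previously reconstructed endpoints. Instead of a full segment-cylinder intersection it compares the closest distance between the viewing segment and the bone segment with the radius, which follows the distance-to-axis criterion described above. The radius value and the convention that the caller passes only bones of people reconstructed in front of the key point are assumptions.

```python
import numpy as np

def _seg_seg_distance(p1, p2, q1, q2):
    """Minimum distance between segments p1-p2 and q1-q2 (numpy arrays of shape (3,))."""
    d1, d2, r = p2 - p1, q2 - q1, p1 - q1
    a, e = d1 @ d1, d2 @ d2
    b, c, f = d1 @ d2, d1 @ r, d2 @ r
    denom = a * e - b * b
    s = np.clip((b * f - c * e) / denom, 0.0, 1.0) if denom > 1e-12 else 0.0
    t = np.clip((b * s + f) / e, 0.0, 1.0) if e > 1e-12 else 0.0
    s = np.clip((b * t - c) / a, 0.0, 1.0) if a > 1e-12 else 0.0
    return np.linalg.norm((p1 + s * d1) - (q1 + t * d2))

def keypoint_visible(cam_center, keypoint, occluding_bones, radius=0.06):
    """True if the segment from the camera center to the key point stays farther than
    `radius` from every occluding bone segment (endpoints reconstructed in the previous frame).

    occluding_bones : iterable of (joint_a, joint_b) 3D endpoints, restricted by the caller
                      to bones of people in front of this key point in the current view.
    """
    cam_center = np.asarray(cam_center, dtype=float)
    keypoint = np.asarray(keypoint, dtype=float)
    for a, b in occluding_bones:
        a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
        if _seg_seg_distance(cam_center, keypoint, a, b) < radius:
            return False                      # viewing ray grazes an occluding bone cylinder
    return True
```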
The embodiments in this specification are described in a progressive manner; for parts that are the same or similar between embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the others.
The foregoing is only a preferred embodiment of the present invention. Although the invention has been disclosed through preferred embodiments, they are not intended to limit it. Those skilled in the art can, without departing from the scope of the technical solution of the invention, use the methods and technical contents disclosed above to make many possible variations and modifications, or to adapt them into equivalent embodiments. Therefore, any simple modification, equivalent change or adaptation made to the above embodiments according to the technical essence of the invention, without departing from the content of the technical solution of the invention, still falls within the protection scope of the technical solution of the invention.
Claims (8)
1. A multi-view human motion capture method based on human motion prediction, characterized by comprising the following steps:
(1) inputting pictures shot synchronously by a plurality of cameras at different views, and performing three-dimensional human body reconstruction to obtain the three-dimensional skeleton of each human body as the result of the initial frame;
(2) motion prediction: for each subsequent frame, maintaining a set of motion-related variables according to the three-dimensional skeleton reconstructed in the previous frame, and using a motion prediction method to assign a prediction result and a confidence to the three-dimensional human key point positions of the current frame;
(3) for the pictures of key frames, running a human body detector to detect the two-dimensional bounding box of each human body; for the pictures of non-key frames, projecting the three-dimensional skeleton predicted from the motion of the previous frame into the images of each view of the current frame to quickly obtain the two-dimensional bounding box of each human body in each view of the current frame;
(4) cropping a single human body from the image using the two-dimensional bounding box, inputting the crop into a two-dimensional human key point detector, which outputs a heat map for each key point, and taking the position and confidence of each two-dimensional key point from the heat map; computing the visibility of each key point using temporal information and setting the confidence of key points judged invisible to zero; reconstructing the three-dimensional skeleton by triangulation, thereby realizing multi-view human motion capture.
2. The multi-view human motion capture method based on human motion prediction according to claim 1, characterized in that step (1) specifically comprises: performing two-dimensional key point detection on each picture, establishing matching relations between the two-dimensional key points across views, and reconstructing the three-dimensional human body by triangulation to obtain the three-dimensional coordinates of each human key point.
3. The multi-view human motion capture method based on human motion prediction according to claim 1, characterized in that in step (2) the motion-related variables comprise the positions and velocities of the three-dimensional human key points.
4. The multi-view human motion capture method based on human motion prediction according to claim 1, characterized in that in step (2) a Kalman filter is adopted for the motion prediction: a linear system is established with the position and velocity of each three-dimensional key point as the state variables and the position of the three-dimensional key point as the observation variable; the Kalman filter gives a prediction of the three-dimensional key point position that takes the motion velocity into account, and the resulting covariance matrix provides confidence information; the prediction process of the Kalman filter can be expressed as:
x_t = F x_{t-1}
P_t = F P_{t-1} F^T + Q
where x_{t-1} and x_t are the state vectors at times t-1 and t, P_{t-1} and P_t are the covariance matrices of x_{t-1} and x_t respectively, Q is the noise covariance matrix, and F is the state transition matrix, which depends on Δt, the time interval between adjacent frames.
5. The multi-view human motion capture method based on human motion prediction according to claim 1, characterized in that in step (2) the motion prediction adopts a neural network, which memorizes the reconstruction results over a past period of time, outputs the predicted positions of the three-dimensional key points at the current time according to this temporal memory, and gives a confidence.
6. The multi-view human motion capture method based on human motion prediction according to claim 1, characterized in that in step (3) the key frames are determined as follows: a frame is taken as a key frame at regular intervals, or the schedule is adjusted according to the confidence in the prediction process, and if the confidence remains unsatisfactory the density of key frames is increased.
7. The multi-view human motion capture method based on human motion prediction according to claim 1, characterized in that in step (3), given the three-dimensional key point positions of each human body estimated in the previous frame, each human bone segment is approximated by a cylinder, and the visibility of each human key point is determined according to the occlusion relations between the cylinders and the depth ordering of the human bodies with respect to a given camera.
8. The multi-view human motion capture method based on human motion prediction according to claim 7, characterized in that in step (3) the visibility judgment specifically comprises: approximating each human bone by a cylinder of radius r and height h whose center lies at the mean of the three-dimensional positions, reconstructed in the previous frame, of the two key points adjacent to that bone; in each view, determining the line segment from the camera center to a given key point, computing whether this segment intersects each cylinder in three-dimensional space, and determining the occlusion relation between them according to the depth ordering of the human bodies reconstructed in the previous frame, i.e. judging whether the joints of a person behind are occluded by the joints of a person in front, thereby computing the visibility of all key points in that view.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202010546274.6A (CN111798486B) | 2020-06-16 | 2020-06-16 | Multi-view human motion capture method based on human motion prediction |
Publications (2)

| Publication Number | Publication Date |
| --- | --- |
| CN111798486A | 2020-10-20 |
| CN111798486B | 2022-05-17 |
Family

ID=72804764

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
| --- | --- | --- | --- |
| CN202010546274.6A (CN111798486B, Active) | Multi-view human motion capture method based on human motion prediction | 2020-06-16 | 2020-06-16 |

Country Status (1)

| Country | Link |
| --- | --- |
| CN (1) | CN111798486B |
Families Citing this family (4)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN112381003B * | 2020-11-16 | 2023-08-22 | 网易(杭州)网络有限公司 | Motion capture method, motion capture device, motion capture equipment and storage medium |
| CN112837409B * | 2021-02-02 | 2022-11-08 | 浙江大学 | Method for reconstructing three-dimensional human body by using mirror |
| CN113642565B * | 2021-10-15 | 2022-02-11 | 腾讯科技(深圳)有限公司 | Object detection method, device, equipment and computer readable storage medium |
| CN117372471A * | 2022-07-01 | 2024-01-09 | 上海青瞳视觉科技有限公司 | Label-free human body posture optical detection method |
Citations (5)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN108710868A * | 2018-06-05 | 2018-10-26 | 中国石油大学(华东) | A kind of human body critical point detection system and method based under complex scene |
| CN109242950A * | 2018-07-11 | 2019-01-18 | 天津大学 | Multi-angle of view human body dynamic three-dimensional reconstruction method under more close interaction scenarios of people |
| CN110544301A * | 2019-09-06 | 2019-12-06 | 广东工业大学 | Three-dimensional human body action reconstruction system, method and action training system |
| CN110796699A * | 2019-06-18 | 2020-02-14 | 叠境数字科技(上海)有限公司 | Optimal visual angle selection method and three-dimensional human skeleton detection method of multi-view camera system |
| CN110874865A * | 2019-11-14 | 2020-03-10 | 腾讯科技(深圳)有限公司 | Three-dimensional skeleton generation method and computer equipment |
Family Cites Families (2)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| US8023726B2 * | 2006-11-10 | 2011-09-20 | University Of Maryland | Method and system for markerless motion capture using multiple cameras |
| KR101307341B1 * | 2009-12-18 | 2013-09-11 | 한국전자통신연구원 | Method and apparatus for motion capture of dynamic object |
Non-Patent Citations (3)

| Title |
| --- |
| Zeng Huang et al.; "Deep Volumetric Video From Very Sparse Multi-View Performance Capture"; CVF; 2018 |
| Junting Dong et al.; "Fast and Robust Multi-Person 3D Pose Estimation from Multiple Views"; CVPR; 2019 |
| 洪佳枫; "基于多视角的运动捕捉技术研究" (Research on multi-view motion capture technology); 中国优秀硕士学位论文全文数据库 (China Masters' Theses Full-text Database); 2019-05-15 |
Legal Events

| Code | Title |
| --- | --- |
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |