Videos with various length and fps #741
Comments
Hi, |
Hi @kkjh0723 , If I understand correctly, these are the features you are asking for:
|
Thanks, @JanuszL, Currently, I'm using the 3D-ResNet-based code. For data loading, it extracts the frames from the videos and saves them as JPEG files, then loads and processes the frames using multiple workers with the PyTorch DataLoader. Thanks @Kh4L , |
I am also looking for a feature that supports loading videos of different lengths. I want to mention that an alternative way of dealing with this kind of data is to load full videos in a mini-batch and then pad the shorter ones to the length of the longest video. This is also very common in video tasks such as captioning. It would be much appreciated if this could be supported in DALI one day. |
@aBlueDragon - what kind of padding do you have in mind, dummy, replication of the last frame, something else? |
@JanuszL Padding with zeros would be fine, but the dataloader has to return the actual length of each sample so that we know the real length of the padded videos. |
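The padding scheme requested above (zero-pad to the longest video and return the true lengths) could look roughly like the following. This is only a minimal NumPy sketch, not DALI functionality: the helper name `pad_video_batch` is hypothetical, and it assumes every video is an array of shape `(frames, height, width, channels)` with identical frame size and dtype.

```python
import numpy as np

def pad_video_batch(videos):
    """Zero-pad a list of videos to the longest one (hypothetical helper).

    Assumes each video has shape (frames, H, W, C) with the same H, W, C and dtype.
    Returns the padded batch and the original number of frames of each video.
    """
    lengths = np.array([v.shape[0] for v in videos], dtype=np.int64)
    max_len = int(lengths.max())
    frame_shape = videos[0].shape[1:]
    # Start from zeros so that every position past a video's end is padding.
    batch = np.zeros((len(videos), max_len) + frame_shape, dtype=videos[0].dtype)
    for i, video in enumerate(videos):
        batch[i, : video.shape[0]] = video
    return batch, lengths
```

Downstream code can then build a mask from `lengths` to ignore the padded frames.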
@aBlueDragon - thanks for the explanation. |
I vote for supporting varying lengths. We can re-encode videos to the same resolution and the same fps, but there is NO way to bring them to the same length. |
@raofengyun Usually you need to sample N frames from the video using one of the following approaches:
@JanuszL I think adding these three sampling techniques to your framework would be quite helpful for the video understanding community.
|
@mzolfaghari I agree with your idea. For action recognition, we need different frame samplers. |
Hi,
|
The second one is implemented in https://github.com/SunGaofeng/DALI |
@huangjun12 - if you think that is useful for the rest of the community, you can file a PR using the code from https://github.com/SunGaofeng/DALI. |
The VideoReader for the TSN model was developed by modifying the code of the original DALI repo. I did this in such a hurry that little thought was given to staying compatible with the original code or to fitting more models. Maybe @huangjun12 can put more time into this to support more classical models. |
Hi,
Thanks for the nice library. I found DALI while looking for a video loader for action recognition. I found that DALI cannot yet handle varying resolutions, as discussed in issue #725, which is necessary for public datasets such as Kinetics.
Another necessary component might be processing videos with varying lengths and fps.
It seems `VideoReader` only supports extracting a whole video into a batch of sequences of length `sequence_length`. I'm not sure, because I've only tested `video_test.py`.
[…] `sequence_length` and `step`) from one video. It seems that this approach is commonly used in the training phase on the Kinetics dataset. For evaluation, people often extract several clips along the whole video at equal intervals.
[…] `ffmpeg` with the `fps` filter when I extract the frames manually.
Hopefully these processes are already possible; if not, do you have any plans to support these features?
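For the fps part, one way to think about it is as a mapping from output frames at a target rate back to source frame indices. The sketch below is only an approximation that picks the source frame at or just before each output timestamp, dropping frames when downsampling and repeating them when upsampling. It is not a description of how ffmpeg's `fps` filter works internally, and the function name is hypothetical.

```python
def resample_frame_indices(num_frames, src_fps, dst_fps):
    """Map each output frame at dst_fps to a source frame index at src_fps.

    Approximation: assumes a constant source frame rate and takes the source
    frame at or just before each output timestamp.
    """
    if src_fps <= 0 or dst_fps <= 0:
        raise ValueError("frame rates must be positive")
    num_out = int(round(num_frames * dst_fps / src_fps))
    ratio = src_fps / dst_fps
    # Clamp so rounding at the end of the video never points past the last frame.
    return [min(num_frames - 1, int(i * ratio)) for i in range(num_out)]
```

For example, a 60-frame video at 30 fps resampled to 15 fps keeps every second frame.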