# Gato Datasets

The various datasets mentioned in the Gato paper are not all publicly available, and some (like "Playroom") are not even described in detail. Here is what we could find on each of them.

## Control Environments

| Environment | Tasks | Episodes | Approx. Tokens | Sample Weight | Agent used | Open-Source Repo | Additional information |
|---|---|---|---|---|---|---|---|
| DM Lab | 254 | 16.4M | 194B | | IMPALA | DM Lab | Appendix F.5 of the Gato paper mentions that an IMPALA agent was trained on a set of 18 parent DM Lab levels. "Data was collected by executing the agent on these 18 levels, as well as an additional set of 237 levels handcrafted to test a diverse set of skills." We don't have much information on how those 18 "parent levels" and 237 "handcrafted levels" are defined, but there are many levels here: https://github.com/deepmind/lab/tree/master/game_scripts/levels. See also this paper, which claims SOTA with an IMPALA agent on DMLab-30: https://arxiv.org/pdf/1809.04474v1.pdf. A minimal level-loading sketch follows the table. |
| ALE Atari | 51 | 63.4K | 1.26B | | Muesli agent for 200M steps per environment | ALE Atari | |
| ALE Atari Extended | 28 | 28.4K | 565M | | Muesli agent for 200M steps per environment | ALE Atari | |
| Sokoban | 1 | 27.2K | 298M | | Muesli agent | Sokoban | |
| BabyAI | 46 | 4.61M | 22.8B | | Built-in BabyAI bot with 100,000 episodes for each level | BabyAI | |
| DM Control Suite | 30 | 395K | 22.5B | | | DM Control | In Appendix F.4 of the Gato paper, the authors mention that "for each task in the control suite, they collect two disjoint sets of data, one using only state features and another using only pixels". They use a D4PG agent to collect data from tasks with state features and an MPO-based agent to collect data with pixels. They also collect data for randomized versions of the control suite tasks with a D4PG agent, randomizing the actuator gear, joint range, stiffness, damping, geom size, and density over a small interval and a large interval. Some SOTA agents are listed here: https://paperswithcode.com/dataset/deepmind-control-suite. A sketch of the two observation modes follows the table. |
| DM Control Suite Pixels | 28 | 485K | 35.5B | | D4PG for tasks with state features, MPO for data using pixels; randomized versions with D4PG | DM Control | |
| DM Control Suite Random Small | 26 | 10.6M | 313B | | | DM Control | |
| DM Control Suite Random Large | 26 | 26.1M | 791B | | | DM Control | |
| Meta-World | 45 | 94.6K | 3.39B | | MPO agent | Meta-World | Appendix F.9 of the Gato paper mentions that data was collected from all train and test tasks in the MT50 mode by training an MPO agent with unlimited environment seeds and access to the state of the MuJoCo physics engine. The collected data also contains the MuJoCo physics engine state. A sketch of instantiating the MT50 tasks follows the table. |
| Procgen Benchmark | 16 | 1.6M | 4.46B | | R2D2 agent | Procgen | Appendix F.6 of the Gato paper mentions that an R2D2 agent was trained on the 16 environments at the hard difficulty setting, except for maze and heist, which were set to easy (see the sketch after the table). OpenRL Benchmark has some results here: https://wandb.ai/openrlbenchmark/openrlbenchmark/reportlist |
| RGB Stacking Simulator | 1 | 387K | 24.4B | | | RGB Stacking | The repo contains specialist agents. |
| RGB Stacking real robot | 1 | 15.7K | 980M | | | | |
| Modular RL | 38 | 843K | 69.6B | | D4PG for a total of 140M steps with 30 random seeds | Modular RL | Appendix F.7 of the Gato paper mentions that a D4PG agent was trained on each variant for a total of 140M actor steps, with 30 random seeds per variant. |
| DM Manipulation Playground | 4 | 286K | 6.58B | | | | The Gato paper mentions that it contains 4 tasks with a simulated Kinova Jaco arm, but we can't find any specific repo or source for the "DM Manipulation Playground". Searching for "jaco" in the DM Control Suite repo yields multiple results, so it may be included in the DM Control Suite repo. |
| Playroom | 1 | 829K | 118B | | | | The word "Playroom" appears only once in the paper. There is a reference to a "Playroom" environment in a repo from Google Research: https://github.com/google-research/google-research/tree/master/playrooms |
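
To make the DM Lab row more concrete, here is a minimal sketch of loading one of the open-source levels from the `game_scripts/levels` directory linked above, using the `deepmind_lab` Python module built from the deepmind/lab repo. The level name and render settings are arbitrary examples; this does not reproduce Gato's (non-public) 18 parent / 237 handcrafted level split.

```python
# Minimal sketch: stepping a no-op policy in one open-source DM Lab level.
# Assumes the `deepmind_lab` Python module built from github.com/deepmind/lab;
# the level name below is only an example, not one of Gato's 18 parent levels.
import numpy as np
import deepmind_lab

env = deepmind_lab.Lab(
    "seekavoid_arena_01",                    # any level under game_scripts/levels
    ["RGB_INTERLEAVED"],                     # observation channels to request
    config={"width": "96", "height": "72"},  # config values must be strings
)
env.reset()

spec = env.action_spec()                 # list of dicts describing each action dimension
action = np.zeros(len(spec), dtype=np.intc)

for _ in range(100):
    if not env.is_running():             # episode ended, start a new one
        env.reset()
    reward = env.step(action, num_steps=4)
    frame = env.observations()["RGB_INTERLEAVED"]
```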
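
For the DM Control Suite rows, the state-features vs. pixels split described in Appendix F.4 maps directly onto the dm_control API. Here is a minimal sketch (not from the Gato codebase; the cartpole/swingup task and the 84x84 render size are arbitrary choices):

```python
# Minimal sketch: the two observation modes described in Appendix F.4,
# using dm_control. Domain, task, and render size are arbitrary examples.
from dm_control import suite
from dm_control.suite.wrappers import pixels

# State-feature observations (the paper collects these with a D4PG agent).
state_env = suite.load(domain_name="cartpole", task_name="swingup")

# Pixel-only observations (the paper collects these with an MPO-based agent).
pixel_env = pixels.Wrapper(
    suite.load(domain_name="cartpole", task_name="swingup"),
    pixels_only=True,
    render_kwargs={"height": 84, "width": 84, "camera_id": 0},
)

print(state_env.reset().observation.keys())           # e.g. position, velocity
print(pixel_env.reset().observation["pixels"].shape)  # e.g. (84, 84, 3)
```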
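
The Meta-World row refers to the MT50 benchmark, which the open-source `metaworld` package exposes directly. A minimal sketch of instantiating its tasks (the MPO data-collection agent from the paper is not public, so a random action stands in):

```python
# Minimal sketch: enumerating and stepping MT50 tasks with the open-source
# `metaworld` package. The paper's MPO data-collection agent is not public,
# so a random action is used purely as a placeholder.
import random
import metaworld

mt50 = metaworld.MT50()                   # the 50-task multi-task benchmark
print(len(mt50.train_classes))            # -> 50

name, env_cls = next(iter(mt50.train_classes.items()))
env = env_cls()
task = random.choice([t for t in mt50.train_tasks if t.env_name == name])
env.set_task(task)                        # each class has many parametric task variants
env.reset()
env.step(env.action_space.sample())
```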
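
The Procgen row's difficulty split is straightforward to reproduce when instantiating the environments. A minimal sketch, assuming the `procgen` package and classic Gym:

```python
# Minimal sketch: building the 16 Procgen environments with the difficulty
# split described in Appendix F.6 (hard everywhere, easy for maze and heist).
import gym  # the `procgen` package registers the environments with Gym

PROCGEN_GAMES = [
    "bigfish", "bossfight", "caveflyer", "chaser", "climber", "coinrun",
    "dodgeball", "fruitbot", "heist", "jumper", "leaper", "maze",
    "miner", "ninja", "plunder", "starpilot",
]

def make_procgen(name: str):
    mode = "easy" if name in ("maze", "heist") else "hard"
    return gym.make(f"procgen:procgen-{name}-v0", distribution_mode=mode)

envs = {name: make_procgen(name) for name in PROCGEN_GAMES}
```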

## Vision/Language datasets

| Dataset | Sample Weight | Open-Source? | Repo | Open-Source equivalent | Additional info |
|---|---|---|---|---|---|
| MassiveText | | No | | The Pile | Web pages, books, news articles, and code. https://vaclavkosar.com/ml/massivetext-dataset-pretraining-deepminds-gopher |
| MultiModal MassiveWeb (M3W) | | No | | Maybe this: Big Interleaved Dataset | Introduced in the Flamingo paper: https://openreview.net/pdf?id=EbMuimAbPbs |
| ALIGN | | No | | Can't find any | Introduced by Google: https://ai.googleblog.com/2021/05/align-scaling-up-visual-and-vision.html |
| MS-COCO Captions | | Yes | MS-COCO (pretty sure it's in there) | | See the loading sketch after the table. |
| Conceptual Captions | | Yes | Google | | |
| LTIP | | No | | | Proprietary DeepMind dataset, introduced in the Flamingo paper |
| OKVQA | | Yes | OKVQA | | |
| VQAv2 | | Yes | VisualQA | | |
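
Of the open-source vision/language entries, MS-COCO Captions is the easiest to load locally. A minimal sketch using torchvision; the two paths are placeholders for wherever the COCO images and caption annotations were downloaded:

```python
# Minimal sketch: reading (image, captions) pairs from a local MS-COCO
# download with torchvision. Both paths are placeholders.
from torchvision.datasets import CocoCaptions

dataset = CocoCaptions(
    root="path/to/coco/train2017",                               # image directory
    annFile="path/to/coco/annotations/captions_train2017.json",  # caption annotations
)

image, captions = dataset[0]   # a PIL image and a list of caption strings
print(len(dataset), captions[0])
```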