TL;DR; The existing AXTreeData and AXTreeUpdate structures can be walked in the browser process using ax_tree.h and ax_tree_update.h when no alternatives are available. The trees describe complex documents and can be supplied by compromised renderers. Care should be taken when processing AX data, especially when processing updates. New features using the tree should be hosted in a sandboxed process.
One of Chromium's key features is its multi-process architecture. This brings numerous benefits but presents one key challenge for accessibility.
Operating system accessibility apis expect the ability to synchronously request data about the screen presented as a tree of semantic nodes. These nodes contain numerous attributes describing the node such as its role, display name, bounds, and more. Such an api is at the heart of technologies used for persons with disabilities. This includes software that reads the web page to those with visual impairments, highlights words to those with print disabilities, auto-scans the actionable views on a page for those with motor impairments, and much more.
Since Chromium’s renderers hold the semantics about the web page and the browser requires this data synchronously, Chromium developers had to come up with a way to bridge this gap while keeping the system itself performant.
After trying various approaches, Chromium settled on a model where accessibility trees get serialized from renderers and deserialized in a destination process such as the browser.
Although designed for features and extensions that improve Chrome’s accessibility, many other features make use of the trees and updates. The AX tree provides a convenient way to snapshot embedded frames, and surface page structure, without the features needing to walk the DOM or inject scripts into renderers.
In Chrome, the rule-of-two requires any C++ code in the browser process to only act on trustworthy data from renderers. This is usually achieved by passing data to the browser over mojo, which validates the data’s format and enforces that it is properly typed. AXTreeData and AXTreeUpdate (along with other AX mojom types) serve this purpose for accessibility tree snapshots and updates. Consumers of these structures should access them via ax_tree.h and ax_tree_update.h and need not further validate the /format/ of the data they represent. Normal web contents and javascript should not be able to generate or send invalid snapshots or updates.
However the AX Tree is complex and its attributes fields are not validated. Fundamentally the rule-of-two exists to prevent us acting on web content with bugprone C++ code. As the AXTree embeds and is controlled by web content we would implement it differently today. Its current design is necessary to work with accessibility tooling on operating systems that were not designed from the ground up to host web content.
Today, a compromised renderer can cause unexpected trees or updates to be sent to the browser process. Any code performing complex processing of the trees such as comparison of snapshots, or updating of internal data structures from updates, should guard itself against receiving manipulated snapshots or updates. Code using ax_tree.h can trust that the data structures are well-formed, but cannot trust that they hold valid data. This includes the contents of free-form fields such as strings, or the objects or types of objects referenced by a given ID. The only safe way to process this information is in a sandboxed environment or memory-safe language.
If your feature is doing simple things with the AX data you should be fine - it will make security review quicker if you briefly highlight what you are doing and why it is safe in your Security Considerations section.
The existing ax_tree.h, ax_tree_update.h and associated wrappers are accepted for limited uses in the browser.
Consumers must use the ax_tree.h methods to deserialize trees or apply updates and not reimplement this.
Consumers should not use the mojom structure of the tree directly and instead should rely on the deserialization & updating methods in ui/accessibility/.
Some fields in the tree represent sensitive data from the origin in a given renderer, or tokens that can drive features in the browser, and should not be shared into unrelated renderer processes:
If you are doing complicated things with the structures - or if you relate more than one snapshot, or update internal structures from snapshots or updates, then you enter hairy territory - a compromised render has significant control over the tree and can send malicious snapshots or updates.
You should do this processing in a sandboxed service process or the renderer, and your security considerations section should outline how your feature deals with malicious incoming data.
The content of the tree is untrustworthy and may come from a compromised renderer.
String might not have expected values or formats:
A compromised renderer might set or add unexpected attributes or values to updates or snapshots, for instance:
The AX Tree ID can sometimes come from a renderer - it is also the embedding token of the frame - so be very careful when tracking or observing these IDs. These tokens should not be shared between renderers and are unguessable.
No. Accessibility software needs to access and control applications for the person using Chrome. This is intended and supported behavior.
No. Accessibility software needs to access and control applications at the same level as a person using Chrome. This is intended and supported behavior.
No. This data represents the contents and structure of the page for accessibility tools, and other Chrome features. Data is safely passed via mojo.
The tree represents a collection of data from all renderer processes in a tab’s frame tree, and holds sensitive content that should not be passed back down to renderers to avoid data leaks between them. The accessibility tree is complex and any new complex processing should happen in a sandboxed process.
No. This data represents the contents and structure of the page for accessibility tools. Data is safely passed via mojo.
Like all IPC in Chrome a compromised renderer can lie about some fields. Privileged processes accessing this data do so via the ax_tree.h APIs, and treat any strings or offsets as untrustworthy. The tree itself is complex and features comparing trees should do so in a sandboxed process or memory safe language. See “Guidance” above for more discussion.
Frames should only pass data to frames that should have it. For instance the accessibility tree id is also the frame’s embedder token, and should not be available to unrelated frames.
All approaches have merits. Extensions are intrinsically safe and allow for experimentation outside of the Chromium. Content script injection is more fragile but can access everything it needs within the injected site. The ax tree provides an easy way to access page structure and contents without needing to reinvent methods for transmitting the structure, or updates to the structure. The accessibility tree is intended for accessibility surfaces and should not be extended for other features in Chrome. The ax tree can also be queried safely within a renderer.
Both approaches have merits. If a feature needs only limited data from a renderer then it will be better to query the DOM or use the ax tree within the renderer, then pass information to the browser via feature-specific mojo definitions. If a feature needs to know the full structure of the page, then it might be best to use the ax tree (via the ax_tree.h APIs) as they provide a semi-hardened way to access this.
A compromised renderer can supply whatever string or list attributes it wants - even when you access these through the ax_tree.h apis - and there is no guarantee they will have the format or contents you expect. Features in the browser or a web ui should sanitize or validate the data (e.g. by rejecting input containing invalid characters, or displaying fields using safe html elements in web ui). Accessibility helpers installed in the OS should perform their own validation.
See “Guidance” above for more discussion.