Open Bug 1898053 Opened 3 months ago Updated 3 months ago

Unclosed BDC can lead to have too much depth in the text layer

Categories

(Firefox :: PDF Viewer, defect, P1)

ASSIGNED

People

(Reporter: rodolfo.orlandini, Assigned: calixte)

Attachments

(2 files, 1 obsolete file)

Attached file Docs Viewer.zip (obsolete) (deleted) —

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36

Steps to reproduce:

Configuration:

Web browser and its version:

  • Chromium-based: 125.0.6422.60 (crashes with STATUS_STACK_OVERFLOW)
  • Mozilla Firefox 126.0 (slow and CPU-heavy, but does not crash)

Operating system and its version:

  • Windows Pro 11 Version: 10.0.22631 Build: 22631

PDF.js version:

  • v4.2.67
  • v3.11.174
  • v2.9.359

Open the document in pdf.js by browsers:

  • Firefox
  • Chromium-based

Information I have and can provide:

  • This PDF was powered by Crystal Reports; modified using iText® 7.1.13 @ 2000-2020 iText Group NV (AGPL-version)
  • This PDF has been altered or corrupted since it was signed (Information obtained when I open it in Adobe Acrobat)

Actual results:

Using Firefox browser:

When I open the document in Mozilla Firefox with the browser's built-in pdf.js, CPU usage spikes and the entire browser interface slows down and freezes, but it does not crash. Firefox also displays the message that this tab is slowing down the browser and asks whether I want to stop it.

[ATTACHED VIDEO: Mozilla Firefox browser.mp4]

Using Chromium-based browser:

When I open the document in Chromium-based browsers (the video shows Brave, but the same problem occurs in Edge and Chrome), the browser crashes after a few seconds with STATUS_STACK_OVERFLOW.

[ATTACHED VIDEO: Chromium-based browser.mp4]

Expected results:

  • Firefox: Do not cause slowness.
  • Chromium-based: Don't crash.
Component: Untriaged → PDF Viewer
Group: firefox-core-security → mozilla-employee-confidential

This was filed to share confidential data to help with debugging https://github.com/mozilla/pdf.js/issues/18135

I checked a few PDFs and they contain some BDC commands without the closing EMC:

BT
  BDC ...
ET

This is invalid (see https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#M13.9.21059.Table.caption.wide.Table85.Marked.content.operators).
For each BDC we build a span whose parent is the span corresponding to the previous BDC. Consequently the DOM tree becomes too deep, which causes a crash in Chrome. Firefox copes with it, but relying on that is probably a bad idea because another PDF could make it crash or render slowly.
So we should auto-close any pending BDC when we reach an ET.
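
A minimal sketch of the idea above, not pdf.js code: it models only the nesting depth of the spans, treats each operator as a plain string, and uses a hypothetical helper name (`maxSpanDepth`). It also ignores marked content opened outside a text object, which a real fix would need to consider.

```javascript
// Hypothetical model: each BDC opens one nested span, each EMC closes one.
// With autoCloseAtET, every span still open at ET is closed there.
function maxSpanDepth(ops, autoCloseAtET = false) {
  let depth = 0;    // spans currently open
  let maxDepth = 0; // deepest nesting reached so far
  for (const op of ops) {
    if (op === "BDC") {
      depth++;
      maxDepth = Math.max(maxDepth, depth);
    } else if (op === "EMC") {
      if (depth > 0) depth--;
    } else if (op === "ET" && autoCloseAtET) {
      depth = 0; // auto-close everything still pending
    }
  }
  return maxDepth;
}

// 1000 text objects, each with one unclosed BDC.
const brokenOps = [];
for (let i = 0; i < 1000; i++) brokenOps.push("BT", "BDC", "ET");

console.log(maxSpanDepth(brokenOps));       // nesting grows with every BT/ET pair
console.log(maxSpanDepth(brokenOps, true)); // nesting stays bounded
```

Without the auto-close, the depth grows by one for every unclosed BDC in the page; with it, the depth is bounded by what a single text object opens.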

Severity: -- → S3
Priority: -- → P3

(In reply to Calixte Denizet (:calixte) from comment #2)

You're right. To test your idea, I went into pdf.worker.js and commented out the BDC entry in opMap(); the problematic PDFs now open without hanging.

...
},
//BDC: {
//  id: _util.OPS.beginMarkedContentProps,
//  numArgs: 2,
//  variableArgs: false
//},
EMC: {
...

But I still have doubts about whether this is a bug in pdf.js or in Chromium-based browsers: even if it is a bug in pdf.js, it seems strange to me that Firefox copes with the problem and the others don't.

The real bug is in the PDF itself.
In pdf.js we should handle such a case better to avoid any problems.
Chrome probably has a function that calls itself on each node's children, so an overly deep DOM induces too much recursion and then a stack overflow. I know we have some functions like this in Gecko, so a PDF with more BDCs could probably crash Firefox too.
That said, if you're able to identify the tool that adds these BDCs, you should file a bug against it.
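
The recursion concern can be illustrated with a small JavaScript analogy. This is only a sketch: the browser functions in question are native code that is not shown in this bug, and all names below (`makeChain`, `depthRecursive`, `depthIterative`) are hypothetical. A recursive walk uses one stack frame per tree level, while an iterative walk keeps its pending nodes in an array on the heap.

```javascript
// Build a chain of `depth` nested nodes, shaped like spans nested by unclosed BDCs.
function makeChain(depth) {
  const root = { children: [] };
  let node = root;
  for (let i = 0; i < depth; i++) {
    const child = { children: [] };
    node.children.push(child);
    node = child;
  }
  return root;
}

// Recursive walk: call-stack usage grows with the depth of the tree.
function depthRecursive(node) {
  let max = 0;
  for (const child of node.children) {
    max = Math.max(max, 1 + depthRecursive(child));
  }
  return max;
}

// Iterative walk: pending nodes live in an explicit array instead of the call stack.
function depthIterative(root) {
  let max = 0;
  const stack = [[root, 0]];
  while (stack.length > 0) {
    const [node, d] = stack.pop();
    max = Math.max(max, d);
    for (const child of node.children) stack.push([child, d + 1]);
  }
  return max;
}

console.log(depthRecursive(makeChain(100)));    // fine for shallow trees
console.log(depthIterative(makeChain(200000))); // also handles very deep chains
```

On a sufficiently deep chain the recursive version is expected to fail with a stack-overflow error in typical engines, at a depth that depends on the engine and its stack size.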

(In reply to Calixte Denizet (:calixte) from comment #2)

So we should auto-close any pending BDC when we have an ET.

Yes, but implementing that may be less straightforward since the specification also says: "Marked-content sequences may be nested one within another, [...]"
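
One possible reading of this concern is that a sequence opened before BT may legitimately stay open past ET, so only the sequences opened inside the text object should be closed at ET. The sketch below assumes that reading. It is not pdf.js code: operators are plain strings and `closePendingAtET` is a hypothetical helper that returns the operator list with the missing EMCs inserted.

```javascript
// Insert an EMC before ET for every BMC/BDC opened inside the text object
// and still pending, leaving sequences opened before BT untouched.
function closePendingAtET(ops) {
  const out = [];
  let depth = 0;     // marked-content sequences currently open
  let depthAtBT = 0; // value of `depth` when the current text object began
  let inText = false;
  for (const op of ops) {
    switch (op) {
      case "BT":
        inText = true;
        depthAtBT = depth;
        out.push(op);
        break;
      case "BMC":
      case "BDC":
        depth++;
        out.push(op);
        break;
      case "EMC":
        // Assumption of this sketch: an EMC with nothing open is dropped.
        if (depth > 0) {
          depth--;
          if (inText && depth < depthAtBT) depthAtBT = depth;
          out.push(op);
        }
        break;
      case "ET":
        while (inText && depth > depthAtBT) {
          out.push("EMC"); // synthesized close for a pending sequence
          depth--;
        }
        inText = false;
        out.push(op);
        break;
      default:
        out.push(op);
    }
  }
  return out;
}

// The outer BDC (opened before BT) stays open; the two inner ones are closed at ET.
console.log(closePendingAtET(["BDC", "BT", "BDC", "BDC", "ET", "EMC"]).join(" "));
// → BDC BT BDC BDC EMC EMC ET EMC
```

Properly nested input passes through unchanged, so valid documents would not be affected by such a normalization step.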

Attached file bug1898053_minimal.pdf (obsolete) (deleted) —

Here's a minimal test case.
:rodolfo, could you remove your attachment? Then I'll make this bug public, thank you.

Attachment #9403046 - Attachment is obsolete: true

(In reply to Calixte Denizet (:calixte) from comment #6)

Created attachment 9403364 [details]
bug1898053_minimal.pdf

Apparently I can only make it obsolete, will that solve the problem?

The content of attachment 9403364 [details] has been deleted for the following reason:

Deleted at author request
The content of attachment 9403046 [details] has been deleted for the following reason:

Deleting the right one
Attached file bug1898053_minimal.pdf
Assignee: nobody → cdenizet
Group: mozilla-employee-confidential
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Priority: P3 → P1
Summary: Opening PDF documents in Chromium-based browsers causes a STATUS_STACK_OVERFLOW failure → Unclosed BDC can lead to have too much depth in the text layer