Desktop: Importing from OneNote: Fix PDF printouts are imported as broken images by personalizedrefrigerator · Pull Request #15124 · laurent22/joplin

personalizedrefrigerator · 2026-04-16T15:28:32Z

Problem

When attaching a PDF file, OneNote prompts users to insert the PDF as a "printout". These printouts were imported into Joplin as broken image links.

In .one files, PDF printout pages seem to be PNGs, but are stored as attachments with a PDF filename. Joplin preserved the PDF extension and, as a result, the attachments would fail to display in Joplin (since they're actually PNGs).

Solution

Add an additional check before creating an image with a .PDF extension. If the image is actually a PNG, add a .png extension.

Update: It seems that PDF printouts are only sometimes inserted as PNGs. In other cases (e.g. one of the test files from this issue), they seem to be XPS files (which Electron doesn't support displaying as images).
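The renaming rule described in this section can be sketched as follows. This is a minimal, standalone illustration rather than the PR's actual `determine_image_filename`: the helper name `printout_filename` is hypothetical, and it uses `std::path::Path` in place of the converter's own `fs_driver()`.

```rust
use std::path::Path;

/// Hypothetical helper (not the PR's code): appends `.png` when a file named
/// `*.pdf` actually starts with PNG data; otherwise returns the name as-is.
fn printout_filename(name: &str, initial_bytes: &[u8]) -> String {
    // Compare the extension case-insensitively so ".PDF" is handled too.
    let has_pdf_extension = Path::new(name)
        .extension()
        .and_then(|ext| ext.to_str())
        .map_or(false, |ext| ext.eq_ignore_ascii_case("pdf"));

    // Every PNG file begins with this 8-byte signature.
    let is_png = initial_bytes.starts_with(b"\x89PNG\r\n\x1a\n");

    if has_pdf_extension && is_png {
        format!("{name}.png")
    } else {
        name.to_string()
    }
}
```

Under these assumptions, a printout named `test4_1.pdf` whose bytes are PNG data becomes `test4_1.pdf.png`, while a real PDF or an XPS-backed printout keeps its original name.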

Testing

Automated testing: A new automated test checks that a single-page PDF printout is imported as a PNG.
Manual testing:

1. Create a .one file from OneNote by attaching a PDF to a note and choosing the "printout" option.
2. Import the .one file into Joplin. Verify that the PDF printout displays in the note viewer.

coderabbitai · 2026-04-16T15:33:56Z

Walkthrough

Render pipeline now detects PNG data in image bytes and uses that to adjust output filenames: if an image is originally named with a .pdf extension but its bytes are PNG, the filename is suffixed with .png. A test was added to validate this behaviour.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| PNG detection & image filename logic: `packages/onenote-converter/renderer/src/utils.rs`, `packages/onenote-converter/renderer/src/page/image.rs` | Added `pub(crate) fn detect_png(header: &[u8]) -> bool`. `determine_image_filename` now accepts `initial_bytes: &[u8]` and, when an original filename has a `.pdf` extension but `detect_png(initial_bytes)` is true, the generated name is suffixed with `.png`. `render_image` now passes image bytes into filename determination; calls to `section.to_unique_safe_filename` use the possibly-modified name. |
| Test coverage: `packages/onenote-converter/renderer/tests/convert.rs` | Added `#[test] fn convert_printout()` which runs conversion on `Printout.one` and asserts the rendered HTML references the PNG-suffixed filename (and does not reference the original PDF-only name). |

Suggested labels

renderer

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly describes the main change: fixing the import of PDF printouts from OneNote that were being imported as broken images by detecting PNG data and adding the .png extension. |
| Description check | ✅ Passed | The pull request description clearly relates to the changeset, explaining the problem (PDF printouts imported as broken images), the solution (detect PNG data and add .png extension), and testing approach. |


coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

packages/onenote-converter/renderer/tests/convert.rs (1)

141-153: Strengthen the test by asserting the renamed asset file exists.

The current checks validate HTML content, but Line 151 passing does not prove test4_1.pdf.png was actually written to disk.

Suggested test enhancement

     // Should convert the input page to an HTML file
     let content_file = output_dir.join("Printout").join("Test.html");
     assert!(content_file.exists());
+    assert!(
+        output_dir.join("Printout").join("test4_1.pdf.png").exists(),
+        "renamed printout asset should exist on disk"
+    );

     let rendered_file = fs::read_to_string(content_file).expect("should read the content file");

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/onenote-converter/renderer/tests/convert.rs` around lines 141 - 153,
The test currently asserts the rendered HTML contains "test4_1.pdf.png" but
doesn't ensure the asset file was written; add an assertion that the actual file
exists on disk by checking output_dir.join("Printout").join("test4_1.pdf.png")
(use Path::exists or fs::metadata) after creating rendered_file, referencing the
existing variables content_file, rendered_file, and output_dir in convert.rs to
locate the Printout folder and verify the renamed asset file is present.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/onenote-converter/renderer/src/page/image.rs`:
- Around line 58-63: The extension check currently compares
fs_driver().get_file_extension(name) == ".pdf" which misses uppercase variants
like ".PDF"; change the comparison to use a case-insensitive check (e.g., call
eq_ignore_ascii_case on the returned extension) so the conditional that appends
".png" runs for any case variant; update the conditional around
fs_driver().get_file_extension(name) and detect_png(initial_bytes) (the block
that sets name = if ... { format!("{name}.png") } else { ... }) to use the
case-insensitive comparison.

In `@packages/onenote-converter/renderer/src/utils.rs`:
- Around line 118-125: The PNG detection in detect_png incorrectly requires
header.len() > 4 causing 4-byte headers to be rejected; change the length check
to header.len() >= 4 so indices 0..3 are valid, keeping the existing comparisons
for header[0]..header[3] unchanged.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6ea6cf58-92fb-4726-9e65-0605fea5d870

📥 Commits

Reviewing files that changed from the base of the PR and between 7f2e78b and d65163a.

📒 Files selected for processing (4)

packages/onenote-converter/renderer/src/page/image.rs
packages/onenote-converter/renderer/src/utils.rs
packages/onenote-converter/renderer/tests/convert.rs
packages/onenote-converter/test-data/Printout.one

coderabbitai

♻️ Duplicate comments (1)

packages/onenote-converter/renderer/src/utils.rs (1)

118-128: ⚠️ Potential issue | 🟡 Minor

Validate the full PNG signature (all 8 bytes).

Line 120 and Lines 121-127 currently validate only the first 7 PNG signature bytes. Adding the final byte check (0x0A) avoids edge-case misclassification.

Suggested patch

 pub(crate) fn detect_png(header: &[u8]) -> bool {
     // PNGs start with a specific set of bytes. See https://en.wikipedia.org/wiki/PNG
-    header.len() > 6
+    header.len() >= 8
         && header[0] == 0x89
         && header[1] == 0x50 // 'P'
         && header[2] == 0x4E // 'N'
         && header[3] == 0x47 // 'G'
         && header[4] == 0x0D // \r
         && header[5] == 0x0A // \n
-        && header[6] == 0x1A
+        && header[6] == 0x1A
+        && header[7] == 0x0A // \n
 }
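As an alternative to indexing each byte, the same full-signature check can be written as a slice prefix comparison, which also removes the need for a separate length guard. This is only a sketch: the name `is_png` and the `PNG_SIGNATURE` constant are illustrative and are not taken from the PR.

```rust
/// The full 8-byte PNG signature: 0x89, "PNG", CR, LF, 0x1A, LF.
const PNG_SIGNATURE: [u8; 8] = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];

/// Hypothetical equivalent of `detect_png`: true only when `header` begins
/// with all 8 signature bytes. Inputs shorter than 8 bytes return false
/// because `starts_with` never reads past the end of the slice.
fn is_png(header: &[u8]) -> bool {
    header.starts_with(&PNG_SIGNATURE)
}
```

With this form, a header that matches only the first 7 bytes is rejected, which is the edge case the comment above is concerned with.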

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/onenote-converter/renderer/src/utils.rs` around lines 118 - 128, The
PNG signature check in detect_png currently only validates 7 bytes and can
misclassify files; update it to require at least 8 bytes (use header.len() >= 8)
and add a check for header[7] == 0x0A so the function verifies the full 8-byte
PNG signature (function detect_png, update the length guard and append the
header[7] == 0x0A condition).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4021fcb0-770b-4a33-ada0-f73814996e29

📥 Commits

Reviewing files that changed from the base of the PR and between d65163a and 9d52d18.

📒 Files selected for processing (2)

packages/onenote-converter/renderer/src/page/image.rs
packages/onenote-converter/renderer/src/utils.rs

personalizedrefrigerator added 4 commits April 16, 2026 07:54

Add test

8de0532

Fix file type detection for printouts

3e84373

Improve comment

8015ab5

Update comment

d65163a

coderabbitai bot added the `bug`, `desktop`, and `import` labels Apr 16, 2026

coderabbitai bot reviewed Apr 16, 2026

Comment thread packages/onenote-converter/renderer/src/page/image.rs

Comment thread packages/onenote-converter/renderer/src/utils.rs

personalizedrefrigerator added 3 commits April 16, 2026 09:20

Apply Coderabbit feedback: Use eq_ignore_ascii_case

ef10fa5

Apply CodeRabbit feedback: Check more of the file signature

d34d161

Fix clippy warning

9d52d18

coderabbitai bot added the `renderer` label and removed the `bug`, `desktop`, and `import` labels Apr 16, 2026

coderabbitai bot reviewed Apr 16, 2026

personalizedrefrigerator wants to merge 7 commits into laurent22:dev from personalizedrefrigerator:pr/desktop/onenote-import/fix-printout-file-type-detection
