Desktop: Importing from OneNote: Fix PDF printouts are imported as broken images #15124

Open
personalizedrefrigerator wants to merge 7 commits into laurent22:dev from personalizedrefrigerator:pr/desktop/onenote-import/fix-printout-file-type-detection

Conversation

@personalizedrefrigerator
Collaborator

@personalizedrefrigerator personalizedrefrigerator commented Apr 16, 2026

Problem

When attaching a PDF file, OneNote prompts users to insert the PDF as a "printout". These printouts were imported into Joplin as broken image links.

In .one files, PDF printout pages seem to be PNGs, but are stored as attachments with a PDF filename. Joplin preserved the PDF extension and, as a result, the attachments would fail to display in Joplin (since they're actually PNGs).

Solution

Add an additional check before creating an image with a .PDF extension. If the image is actually a PNG, add a .png extension.
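
A minimal sketch of the check described above, for illustration only. The names `looks_like_png` and `printout_filename` are hypothetical; the PR's actual changes live in `detect_png` and `determine_image_filename`, and this sketch assumes nothing about the rest of the renderer.

```rust
use std::path::Path;

/// The 8-byte signature that every PNG file starts with.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// Returns true when `bytes` begins with the full PNG signature.
/// `starts_with` returns false for inputs shorter than the signature,
/// so no separate length guard is needed.
fn looks_like_png(bytes: &[u8]) -> bool {
    bytes.starts_with(&PNG_SIGNATURE)
}

/// Hypothetical helper: pick the filename to store a printout image under.
/// If the original name ends in `.pdf` (any case) but the data is a PNG,
/// a `.png` suffix is appended so that the viewer treats it as an image.
fn printout_filename(original: &str, bytes: &[u8]) -> String {
    let has_pdf_extension = Path::new(original)
        .extension()
        .and_then(|ext| ext.to_str())
        .map_or(false, |ext| ext.eq_ignore_ascii_case("pdf"));

    if has_pdf_extension && looks_like_png(bytes) {
        format!("{original}.png")
    } else {
        original.to_string()
    }
}
```

Under these assumptions, a printout named `test4_1.pdf` whose data starts with the PNG signature is stored as `test4_1.pdf.png`, while a real PDF (data starting with `%PDF`) keeps its original name.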

Update: It seems that PDF printouts are only sometimes inserted as PNGs. In other cases (e.g. one of the test files from this issue), they seem to be XPS files (which Electron doesn't support displaying as images).
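
One way to tell the two cases apart is to inspect the leading bytes of the attachment. The sketch below is an assumption, not part of the PR: `PrintoutKind` and `classify_printout` are hypothetical names. XPS documents are ZIP-based packages, so a ZIP local-file header only marks a *candidate* XPS file; confirming it would require inspecting the package contents.

```rust
/// Coarse classification of a printout attachment's data (hypothetical type).
#[derive(Debug, PartialEq)]
enum PrintoutKind {
    /// Starts with the 8-byte PNG signature.
    Png,
    /// Starts with a ZIP local-file header. This is consistent with an
    /// XPS package, but also with any other ZIP-based format.
    ZipContainer,
    /// Anything else, including empty or truncated data.
    Unknown,
}

/// Classifies attachment data by its leading bytes.
fn classify_printout(bytes: &[u8]) -> PrintoutKind {
    const PNG_SIGNATURE: &[u8] = &[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];
    const ZIP_LOCAL_HEADER: &[u8] = b"PK\x03\x04";

    if bytes.starts_with(PNG_SIGNATURE) {
        PrintoutKind::Png
    } else if bytes.starts_with(ZIP_LOCAL_HEADER) {
        PrintoutKind::ZipContainer
    } else {
        PrintoutKind::Unknown
    }
}
```

Only the `Png` case can be displayed by adding a `.png` extension; the other cases would need separate handling.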

Testing

Automated testing: A new automated test checks that a single-page PDF printout is imported as a PNG.
Manual testing:

  1. Create a .one file from OneNote by attaching a PDF to a note and choosing the "printout" option.
  2. Import the .one file from Joplin. Verify that the PDF printout displays in the note viewer.

@coderabbitai coderabbitai bot added the labels bug (It's a bug), desktop (All desktop platforms), and import (Related to importing files such as ENEX, JEX, etc.) on Apr 16, 2026
@coderabbitai
Contributor

coderabbitai bot commented Apr 16, 2026

📝 Walkthrough

Render pipeline now detects PNG data in image bytes and uses that to adjust output filenames: if an image is originally named with a .pdf extension but its bytes are PNG, the filename is suffixed with .png. A test was added to validate this behaviour.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| PNG detection & image filename logic: packages/onenote-converter/renderer/src/utils.rs, packages/onenote-converter/renderer/src/page/image.rs | Added pub(crate) fn detect_png(header: &[u8]) -> bool. determine_image_filename now accepts initial_bytes: &[u8] and, when an original filename has a .pdf extension but detect_png(initial_bytes) is true, the generated name is suffixed with .png. render_image now passes image bytes into filename determination; calls to section.to_unique_safe_filename use the possibly-modified name. |
| Test coverage: packages/onenote-converter/renderer/tests/convert.rs | Added #[test] fn convert_printout() which runs conversion on Printout.one and asserts the rendered HTML references the PNG-suffixed filename (and does not reference the original PDF-only name). |

Suggested labels

renderer

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly describes the main change: fixing the import of PDF printouts from OneNote that were being imported as broken images by detecting PNG data and adding the .png extension. |
| Description check | ✅ Passed | The pull request description clearly relates to the changeset, explaining the problem (PDF printouts imported as broken images), the solution (detect PNG data and add .png extension), and testing approach. |


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
packages/onenote-converter/renderer/tests/convert.rs (1)

141-153: Strengthen the test by asserting the renamed asset file exists.

The current checks validate HTML content, but Line 151 passing does not prove test4_1.pdf.png was actually written to disk.

Suggested test enhancement
     // Should convert the input page to an HTML file
     let content_file = output_dir.join("Printout").join("Test.html");
     assert!(content_file.exists());
+    assert!(
+        output_dir.join("Printout").join("test4_1.pdf.png").exists(),
+        "renamed printout asset should exist on disk"
+    );

     let rendered_file = fs::read_to_string(content_file).expect("should read the content file");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/onenote-converter/renderer/tests/convert.rs` around lines 141 - 153,
The test currently asserts the rendered HTML contains "test4_1.pdf.png" but
doesn't ensure the asset file was written; add an assertion that the actual file
exists on disk by checking output_dir.join("Printout").join("test4_1.pdf.png")
(use Path::exists or fs::metadata) after creating rendered_file, referencing the
existing variables content_file, rendered_file, and output_dir in convert.rs to
locate the Printout folder and verify the renamed asset file is present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/onenote-converter/renderer/src/page/image.rs`:
- Around line 58-63: The extension check currently compares
fs_driver().get_file_extension(name) == ".pdf" which misses uppercase variants
like ".PDF"; change the comparison to use a case-insensitive check (e.g., call
eq_ignore_ascii_case on the returned extension) so the conditional that appends
".png" runs for any case variant; update the conditional around
fs_driver().get_file_extension(name) and detect_png(initial_bytes) (the block
that sets name = if ... { format!("{name}.png") } else { ... }) to use the
case-insensitive comparison.

In `@packages/onenote-converter/renderer/src/utils.rs`:
- Around line 118-125: The PNG detection in detect_png incorrectly requires
header.len() > 4 causing 4-byte headers to be rejected; change the length check
to header.len() >= 4 so indices 0..3 are valid, keeping the existing comparisons
for header[0]..header[3] unchanged.
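
The two inline findings above can be sketched in isolation as follows. Both helper names are hypothetical, and the sketch assumes, as the quoted comparison with ".pdf" suggests, that the extension string includes its leading dot.

```rust
/// Case-insensitive match for a dot-prefixed extension such as ".pdf".
/// Accepts ".pdf", ".PDF", ".Pdf", and so on.
fn is_pdf_extension(extension: &str) -> bool {
    extension.eq_ignore_ascii_case(".pdf")
}

/// Checks the first four signature bytes without manual indexing.
/// `starts_with` returns false when `header` is shorter than the prefix,
/// so a 4-byte header is accepted and a 3-byte header is rejected without
/// an explicit (and easy to get wrong) length comparison.
fn starts_with_png_magic(header: &[u8]) -> bool {
    header.starts_with(&[0x89, b'P', b'N', b'G'])
}
```

Comparing against a prefix slice sidesteps the off-by-one risk of pairing `header.len() > N` with direct indexing.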

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6ea6cf58-92fb-4726-9e65-0605fea5d870

📥 Commits

Reviewing files that changed from the base of the PR and between 7f2e78b and d65163a.

📒 Files selected for processing (4)
  • packages/onenote-converter/renderer/src/page/image.rs
  • packages/onenote-converter/renderer/src/utils.rs
  • packages/onenote-converter/renderer/tests/convert.rs
  • packages/onenote-converter/test-data/Printout.one

Comment thread packages/onenote-converter/renderer/src/page/image.rs
Comment thread packages/onenote-converter/renderer/src/utils.rs
@coderabbitai coderabbitai bot added the label renderer (About the note renderer) and removed the labels bug (It's a bug), desktop (All desktop platforms), and import (Related to importing files such as ENEX, JEX, etc.) on Apr 16, 2026
Contributor

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
packages/onenote-converter/renderer/src/utils.rs (1)

118-128: ⚠️ Potential issue | 🟡 Minor

Validate the full PNG signature (all 8 bytes).

Line 120 and Lines 121-127 currently validate only the first 7 PNG signature bytes. Adding the final byte check (0x0A) avoids edge-case misclassification.

Suggested patch
 pub(crate) fn detect_png(header: &[u8]) -> bool {
     // PNGs start with a specific set of bytes. See https://en.wikipedia.org/wiki/PNG
-    header.len() > 6
+    header.len() >= 8
         && header[0] == 0x89
         && header[1] == 0x50 // 'P'
         && header[2] == 0x4E // 'N'
         && header[3] == 0x47 // 'G'
         && header[4] == 0x0D // \r
         && header[5] == 0x0A // \n
-        && header[6] == 0x1A
+        && header[6] == 0x1A
+        && header[7] == 0x0A // \n
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/onenote-converter/renderer/src/utils.rs` around lines 118 - 128, The
PNG signature check in detect_png currently only validates 7 bytes and can
misclassify files; update it to require at least 8 bytes (use header.len() >= 8)
and add a check for header[7] == 0x0A so the function verifies the full 8-byte
PNG signature (function detect_png, update the length guard and append the
header[7] == 0x0A condition).
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4021fcb0-770b-4a33-ada0-f73814996e29

📥 Commits

Reviewing files that changed from the base of the PR and between d65163a and 9d52d18.

📒 Files selected for processing (2)
  • packages/onenote-converter/renderer/src/page/image.rs
  • packages/onenote-converter/renderer/src/utils.rs

Labels

renderer About the note renderer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant