Disable pdf-text searching with pdfBox

Submitted by 江枫思渺然 on 2021-01-28 06:07:10

Question


I have a PDF document (without forms) in which I want to disable text searching using PDFBox (Java). I can imagine the following possibilities:

  • Flatten the text
  • Remove the text information (without removing the visible text itself)
  • Add an overlay to the document

Currently I have no idea how to implement any of these. Does anyone have an idea how to solve this?


Answer 1:


Many thanks for your help here. I think I found a way that fits the requirements (honestly, it is not really clean):

  1. Add a rectangle over the address sections
  2. Convert the PDF to an image
  3. Convert the image back to a PDF

This loses all text information, and the user can no longer see the critical information. Because this is only for display (the original PDF document is not changed), this is OK for now.
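
Steps 2 and 3 could be sketched roughly as follows. This is not the answerer's actual code, only a minimal illustration that assumes PDFBox 2.0.x on the classpath; the class name `PdfRasterizer`, the method `rasterize`, and the `dpi` parameter are hypothetical names chosen for this example.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.image.LosslessFactory;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

// Hypothetical helper: rebuilds a PDF so that every page is a single bitmap.
class PdfRasterizer {

    /**
     * Returns a new document whose pages are bitmap renderings of the
     * pages of the source document. The caller must close the result.
     */
    static PDDocument rasterize(PDDocument source, float dpi) throws IOException {
        PDFRenderer renderer = new PDFRenderer(source);
        PDDocument target = new PDDocument();
        for (int i = 0; i < source.getNumberOfPages(); i++) {
            // Step 2: render the page to an image.
            BufferedImage image = renderer.renderImageWithDPI(i, dpi, ImageType.RGB);

            // Size the new page from the image (72 PDF units per inch).
            float width = image.getWidth() * 72f / dpi;
            float height = image.getHeight() * 72f / dpi;
            PDPage page = new PDPage(new PDRectangle(width, height));
            target.addPage(page);

            // Step 3: draw the image so that it fills the new page.
            PDImageXObject pdImage = LosslessFactory.createFromImage(target, image);
            try (PDPageContentStream cs = new PDPageContentStream(target, page)) {
                cs.drawImage(pdImage, 0, 0, width, height);
            }
        }
        return target;
    }
}
```

A caller would load the original with `PDDocument.load(...)`, pass it to `rasterize` with a resolution such as 150 dpi, and save the returned document under a new name. A higher dpi gives sharper pages but a larger file, and the result contains no selectable or searchable text.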




Answer 2:


It depends on your goals:

  • make some text completely unrecoverable: print the document, black out the text with ink, and scan it again;

  • delete sensitive text: you have to scan the text inside the document and remove or replace it (with PDFBox), but this is risky (some text is split into several pieces);

  • mask some text for the viewer: find the text and add a black rectangle over it (with PDFBox), but this is not very safe. Someone can remove the rectangle or use another tool to read the text. Usually, if the text is still inside the file, some tool can find it;

  • prevent copy/paste of the text (but not searching or viewing): use the security options, with a password:

see: https://pdfbox.apache.org/2.0/cookbook/encryption.html

import java.io.File;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.encryption.AccessPermission;
import org.apache.pdfbox.pdmodel.encryption.StandardProtectionPolicy;

PDDocument doc = PDDocument.load(new File("filename.pdf"));

// Define the length of the encryption key.
// Possible values are 40, 128 or 256.
// (256 crashed in the original author's environment.)
int keyLength = 128;

AccessPermission ap = new AccessPermission();

// Disable printing
ap.setCanPrint(false);

// Disable content extraction (copy/paste), including for accessibility tools
ap.setCanExtractContent(false);
ap.setCanExtractForAccessibility(false);

// Owner password (to open the file with all permissions) is "12345"
// User password (to open the file with restricted permissions) is empty here
StandardProtectionPolicy spp = new StandardProtectionPolicy("12345", "", ap);
spp.setEncryptionKeyLength(keyLength);
doc.protect(spp);

doc.save("filename-encrypted2.pdf");
doc.close();
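
For the masking option in the list above (adding a black rectangle with PDFBox), a minimal sketch could look like the following. It assumes PDFBox 2.0.x; the class name `RectangleMask`, the method `cover`, and its parameters are hypothetical names for this illustration. The coordinates are in PDF user space, with the origin at the bottom-left corner of the page.

```java
import java.awt.Color;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;

// Hypothetical helper: paints an opaque rectangle on top of existing content.
class RectangleMask {

    /** Draws a filled black rectangle over the given area of one page. */
    static void cover(PDDocument doc, int pageIndex,
                      float x, float y, float width, float height) throws IOException {
        PDPage page = doc.getPage(pageIndex);
        // APPEND keeps the existing page content and draws after (above) it.
        try (PDPageContentStream cs =
                 new PDPageContentStream(doc, page, AppendMode.APPEND, true, true)) {
            cs.setNonStrokingColor(Color.BLACK);
            cs.addRect(x, y, width, height);
            cs.fill();
        }
    }
}
```

As the answer warns, this only hides the text visually. The text operators stay in the page's content stream, so search and text extraction tools still find the covered text unless the page is also rasterized or the text is removed.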


Source: https://stackoverflow.com/questions/49507182/disable-pdf-text-searching-with-pdfbox
