Question
I have successfully changed the color of the underlines using the code from the question linked below. Can anyone help me remove those underlines from the PDF? I locate the underlines with the code from that question.
Traverse whole PDF and change blue color to black ( Change color of underlines as well) + iText
Below is my code, which finds the hyperlinks and changes their color to black. I need to modify it so that it removes the underlines instead.
    PdfCanvasEditor editor = new PdfCanvasEditor() {
        @Override
        protected void write(PdfCanvasProcessor processor, PdfLiteral operator, List<PdfObject> operands)
        {
            String operatorString = operator.toString();

            // Replace a blue fill color ("0 0 1 rg") with black ("0 g").
            if (SET_FILL_RGB.equals(operatorString) && operands.size() == 4) {
                if (isApproximatelyEqual(operands.get(0), 0) &&
                        isApproximatelyEqual(operands.get(1), 0) &&
                        isApproximatelyEqual(operands.get(2), 1)) {
                    super.write(processor, new PdfLiteral("g"), Arrays.asList(new PdfNumber(0), new PdfLiteral("g")));
                    return;
                }
            }

            // Replace a blue stroke color ("0 0 1 RG") with black ("0 G").
            if (SET_STROKE_RGB.equals(operatorString) && operands.size() == 4) {
                if (isApproximatelyEqual(operands.get(0), 0) &&
                        isApproximatelyEqual(operands.get(1), 0) &&
                        isApproximatelyEqual(operands.get(2), 1)) {
                    super.write(processor, new PdfLiteral("G"), Arrays.asList(new PdfNumber(0), new PdfLiteral("G")));
                    return;
                }
            }

            // Everything else is copied unchanged.
            super.write(processor, operator, operands);
        }

        boolean isApproximatelyEqual(PdfObject number, float reference) {
            return number instanceof PdfNumber && Math.abs(reference - ((PdfNumber)number).floatValue()) < 0.01f;
        }

        final String SET_FILL_RGB = "rg";
        final String SET_STROKE_RGB = "RG";
    };

    for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++) {
        editor.editPage(pdfDocument, i);
    }
Edit:
The accepted answer does not work for the following files:
https://raad-dev-test.s3.ap-south-1.amazonaws.com/36/2019-08-30/021549Orig1s025_aprepitant_clinpharm_prea_Mac.pdf (Page 41)
https://raad-dev-test.s3.ap-south-1.amazonaws.com/36/2019-08-30/400_206494S5_avibactam_and_ceftazidine_unireview_prea_Mac.pdf (Page 60).
Please Help.
Answer 1:
As described in a comment in the context of the referenced question:
it is easy to make the editor class above remove vector graphics by replacing fill or stroke instructions with instructions that drop the current path without drawing it. Doing so only when the applicable current color is blue would likely do the job for your example PDFs. But beware: in documents containing other graphics with blue elements (e.g. logos), these would be mutilated, too.
This is what the following content editor does:
    class PdfGraphicsRemoverByColor extends PdfCanvasEditor {
        public PdfGraphicsRemoverByColor(Color color) {
            this.color = color;
        }

        @Override
        protected void write(PdfCanvasProcessor processor, PdfLiteral operator, List<PdfObject> operands)
        {
            String operatorString = operator.toString();

            // Current fill color matches: drop the fill part of the painting operator.
            if (color.equals(getGraphicsState().getFillColor())) {
                switch (operatorString) {
                case "f":
                case "f*":
                case "F":
                    operatorString = "n";
                    break;
                case "b":
                case "b*":
                    operatorString = "s";
                    break;
                case "B":
                case "B*":
                    operatorString = "S";
                    break;
                }
            }

            // Current stroke color matches: drop the stroke part of the (possibly already rewritten) operator.
            if (color.equals(getGraphicsState().getStrokeColor())) {
                switch (operatorString) {
                case "s":
                case "S":
                    operatorString = "n";
                    break;
                case "b":
                case "B":
                    operatorString = "f";
                    break;
                case "b*":
                case "B*":
                    operatorString = "f*";
                    break;
                }
            }

            // The operator is also the last entry of the operands list, so replace it there, too.
            operator = new PdfLiteral(operatorString);
            operands.set(operands.size() - 1, operator);
            super.write(processor, operator, operands);
        }

        final Color color;
    }
(RemoveGraphicsByColor helper class)
Applied like this:
    try ( PdfReader pdfReader = new PdfReader(INPUT);
          PdfWriter pdfWriter = new PdfWriter(OUTPUT);
          PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter) )
    {
        PdfCanvasEditor editor = new PdfGraphicsRemoverByColor(ColorConstants.BLUE);
        for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++)
        {
            editor.editPage(pdfDocument, i);
        }
    }
(RemoveGraphicsByColor tests)
to the example files Control_of_nitrosamine_impurities_in_sartans__rev.pdf, EDQM_reports_issues_of_non-compliance_with_tooth__Mac.pdf, and originalFile.pdf from the referenced question, one gets the following results:
[three result screenshots, one per file; images not preserved]
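The operator rewriting done by the two switch statements can be checked in isolation. The following is a minimal sketch, assuming a hypothetical helper class PaintOperatorRewriter (not part of iText or of the code above) that receives the operator name and two flags telling whether the current fill and stroke colors match. It relies only on the meaning of the PDF path-painting operators: f, F, and f* fill; S strokes; s closes and strokes; B and B* fill and stroke; b and b* close, fill, and stroke; n ends the path without painting.

```java
// Hypothetical helper for illustration only; it mirrors the two switch
// statements of the content editor as a pure function on operator names.
final class PaintOperatorRewriter {

    static String rewrite(String op, boolean fillMatches, boolean strokeMatches) {
        if (fillMatches) {
            // Remove the fill part of the painting operator.
            switch (op) {
                case "f": case "f*": case "F": op = "n"; break;
                case "b": case "b*":           op = "s"; break;
                case "B": case "B*":           op = "S"; break;
                default: break;
            }
        }
        if (strokeMatches) {
            // Remove the stroke part; this also sees the result of the first step.
            switch (op) {
                case "s": case "S":   op = "n";  break;
                case "b": case "B":   op = "f";  break;
                case "b*": case "B*": op = "f*"; break;
                default: break;
            }
        }
        return op;
    }

    public static void main(String[] args) {
        String[] samples = { "f", "S", "B", "b*", "re" };
        for (String op : samples) {
            System.out.println(op
                    + " | fill matches: " + rewrite(op, true, false)
                    + " | stroke matches: " + rewrite(op, false, true)
                    + " | both match: " + rewrite(op, true, true));
        }
    }
}
```

Because the second switch works on the output of the first, an operator like B with both colors matching first becomes S and then n, so the path is dropped entirely. Operators that do not paint a path, such as re, pass through unchanged.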
Beware, this is merely a proof-of-concept, not a final and complete solution. In particular:
Only RGB blue is considered. This might be an issue particularly in case of documents explicitly designed for printing (likely using CMYK colors).
All path fills and strokes are dropped whenever the applicable current color is blue. Depending on your documents this may have to be filtered further.
PdfCanvasEditor only inspects and edits the content stream of the page itself, not the content streams of displayed form XObjects or patterns; thus, some content may not be found. It can be generalized fairly easily.
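To illustrate the first caveat, here is a rough sketch of how a CMYK color might be classified as blue. It assumes the four components are available as a float array (in iText 7 they would presumably come from getColorValue() of a DeviceCmyk color), uses the naive, non-color-managed CMYK to RGB conversion, and applies an illustrative blue-dominance threshold. The class CmykBlueCheck, its method names, and the thresholds are hypothetical and would need tuning against real documents.

```java
// Hypothetical sketch, not part of iText: classify CMYK component values as "blue".
final class CmykBlueCheck {

    // Naive CMYK -> RGB conversion without any ICC color management (assumption).
    static float[] toNaiveRgb(float c, float m, float y, float k) {
        return new float[] {
            (1 - c) * (1 - k),
            (1 - m) * (1 - k),
            (1 - y) * (1 - k)
        };
    }

    // Illustrative threshold: blue must be fairly bright and clearly dominate red and green.
    static boolean isCmykBlue(float[] cmyk) {
        if (cmyk == null || cmyk.length != 4) {
            return false;
        }
        float[] rgb = toNaiveRgb(cmyk[0], cmyk[1], cmyk[2], cmyk[3]);
        float r = rgb[0], g = rgb[1], b = rgb[2];
        return b > .5f && r < .9f * b && g < .9f * b;
    }

    public static void main(String[] args) {
        System.out.println("C=1 M=1 Y=0 K=0 (blue):  " + isCmykBlue(new float[] { 1, 1, 0, 0 }));
        System.out.println("C=1 M=0 Y=0 K=0 (cyan):  " + isCmykBlue(new float[] { 1, 0, 0, 0 }));
        System.out.println("C=0 M=0 Y=0 K=1 (black): " + isCmykBlue(new float[] { 0, 0, 0, 1 }));
    }
}
```

A check like this could be passed to a color-matching editor in place of the exact comparison with a single color, provided the editor accepts an arbitrary color test.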
Different shades of blue from other RGB'ish color spaces
Testing the code above, you found documents in which the blue lines were not removed. As it turned out, these blue colors were not from the standard DeviceRGB color space but from ICCBased color spaces, profiled RGB color spaces to be more exact. Furthermore, one document did not use a pure blue 0 0 1 but a .17255 .3098 .63529 blue.
To also deal with these documents, the approach above must be generalized; for instance, we can use a Predicate<Color> instead of a single, specific Color, like this:
    class PdfGraphicsRemoverByColorPredicate extends PdfCanvasEditor {
        public PdfGraphicsRemoverByColorPredicate(Predicate<Color> colorPredicate) {
            this.colorPredicate = colorPredicate;
        }

        @Override
        protected void write(PdfCanvasProcessor processor, PdfLiteral operator, List<PdfObject> operands)
        {
            String operatorString = operator.toString();

            // Current fill color matches the predicate: drop the fill part of the painting operator.
            if (colorPredicate.test(getGraphicsState().getFillColor())) {
                switch (operatorString) {
                case "f":
                case "f*":
                case "F":
                    operatorString = "n";
                    break;
                case "b":
                case "b*":
                    operatorString = "s";
                    break;
                case "B":
                case "B*":
                    operatorString = "S";
                    break;
                }
            }

            // Current stroke color matches the predicate: drop the stroke part as well.
            if (colorPredicate.test(getGraphicsState().getStrokeColor())) {
                switch (operatorString) {
                case "s":
                case "S":
                    operatorString = "n";
                    break;
                case "b":
                case "B":
                    operatorString = "f";
                    break;
                case "b*":
                case "B*":
                    operatorString = "f*";
                    break;
                }
            }

            // The operator is also the last entry of the operands list, so replace it there, too.
            operator = new PdfLiteral(operatorString);
            operands.set(operands.size() - 1, operator);
            super.write(processor, operator, operands);
        }

        final Predicate<Color> colorPredicate;
    }
(RemoveGraphicsByColor helper class)
Applied like this:
    try ( PdfReader pdfReader = new PdfReader(INPUT);
          PdfWriter pdfWriter = new PdfWriter(OUTPUT);
          PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter) )
    {
        PdfCanvasEditor editor = new PdfGraphicsRemoverByColorPredicate(RemoveGraphicsByColor::isRgbBlue);
        for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++)
        {
            editor.editPage(pdfDocument, i);
        }
    }
(RemoveGraphicsByColor testRemoveAllBlueLinesFrom* tests)
to the new example files using this predicate method
    public static boolean isRgbBlue(Color color) {
        // Only consider color spaces with three RGB-like components.
        if (color instanceof CalRgb || color instanceof DeviceRgb || (color instanceof IccBased && color.getNumberOfComponents() == 3)) {
            float[] components = color.getColorValue();
            float r = components[0];
            float g = components[1];
            float b = components[2];
            // Blue must be reasonably bright and clearly dominate red and green.
            return b > .5f && r < .9f*b && g < .9f*b;
        }
        return false;
    }
(RemoveGraphicsByColor helper method)
one gets the following results:
[two result screenshots; images not preserved]
Beware, the warnings from above still apply.
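The thresholds in isRgbBlue can be tried out on plain component values without iText. The sketch below copies only the comparison from that method into a hypothetical BlueHeuristicDemo class (an illustration, not part of the RemoveGraphicsByColor code) and evaluates it for the two blues mentioned above plus a few other colors.

```java
// Hypothetical demo class: the same comparison as in isRgbBlue, applied to raw floats.
final class BlueHeuristicDemo {

    static boolean isBlueish(float r, float g, float b) {
        return b > .5f && r < .9f * b && g < .9f * b;
    }

    public static void main(String[] args) {
        // The two blues found in the example documents.
        System.out.println("0 0 1:                " + isBlueish(0f, 0f, 1f));
        System.out.println(".17255 .3098 .63529:  " + isBlueish(.17255f, .3098f, .63529f));
        // Colors that should be left alone.
        System.out.println("0 0 0 (black):        " + isBlueish(0f, 0f, 0f));
        System.out.println("1 1 1 (white):        " + isBlueish(1f, 1f, 1f));
        // A dark blue that the b > .5 condition excludes.
        System.out.println("0 0 .4 (dark blue):   " + isBlueish(0f, 0f, .4f));
    }
}
```

Both blues from the example documents satisfy the condition, while black and white do not. A dark blue such as 0 0 .4 is not matched either because of the b > .5 condition, so the thresholds may have to be adjusted for documents that use darker link colors.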
Source: https://stackoverflow.com/questions/58029533/traverse-whole-pdf-and-remove-underlines-of-hyperlinks-annotations-only-itex