Extract images from a PDF using PDFBox

刺人心 2020-11-28 09:22

I'm trying to extract images from a PDF using PDFBox. The example PDF is here.

But I'm only getting blank images.

The code I'm trying:

public st         


        
8 answers
  •  隐瞒了意图╮
    2020-11-28 10:08

    This is a Kotlin version of @Matt's answer.

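    import java.awt.image.RenderedImage
    import org.apache.pdfbox.pdmodel.PDResources
    import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject
    import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject

    // Applies block to every image, recursing into form XObjects that nest their own resources.
    fun <R> PDResources.onImageResources(block: (RenderedImage) -> R): List<R> =
            this.xObjectNames.flatMap {
                when (val xObject = this.getXObject(it)) {
                    is PDFormXObject -> xObject.resources?.onImageResources(block) ?: emptyList()
                    is PDImageXObject -> listOf(block(xObject.image))
                    else -> emptyList()
                }
            }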
    

    You can use it on a PDPage's resources like this:

    page.resources.onImageResources { image ->
        Files.createTempFile("image", ".xxx").also { path ->
            // ImageIO.write returns false when no writer handles the format
            if (!ImageIO.write(image, "xxx", path.toFile()))
                throw IllegalStateException("Couldn't write image to file")
        }
    }
    

    Here "xxx" is the image format you need (for example "jpeg").
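The file-writing step in the usage above can also be pulled out into a small helper. The following is only a sketch under stated assumptions: `writeImageToTempFile` is a hypothetical name, not part of PDFBox, and it relies only on the JDK's `ImageIO`. It is useful because `ImageIO.write` reports an unsupported format by returning `false` rather than throwing, so the result has to be checked explicitly.

```kotlin
import java.awt.image.BufferedImage
import java.awt.image.RenderedImage
import java.nio.file.Files
import java.nio.file.Path
import javax.imageio.ImageIO

// Hypothetical helper (not a PDFBox API): writes one image to a temp file
// and returns the path, or throws if ImageIO cannot encode it.
fun writeImageToTempFile(image: RenderedImage, format: String): Path {
    // Use ".<format>" as the suffix so the temp file gets a real extension.
    val path = Files.createTempFile("image", ".$format")
    // ImageIO.write returns false when no registered writer can encode
    // this image in the requested format.
    if (!ImageIO.write(image, format, path.toFile())) {
        Files.deleteIfExists(path) // do not leave an empty temp file behind
        throw IllegalStateException("No ImageIO writer for format '$format'")
    }
    return path
}
```

Assuming the `onImageResources` extension from the answer is in scope, it could be called as `page.resources.onImageResources { writeImageToTempFile(it, "png") }`, which would return the list of written paths.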
