Wednesday, November 28, 2007

Generating a thumbnail image of a PDF document

Just as thumbnails exist for HTML pages, I wanted to create a thumbnail (preview) image of a PDF document, like the one I can get in Wind*ws Explorer, for example.
On the web, given the time it takes to open a PDF document in the Reader, it is better to know beforehand what it looks like, so I don't have to download the wrong one.

So I used the PDFBox library. It can easily convert a PDF document to an image.

First, load the document, get its first page, and convert it to an image:

// Converting the first page to an image
URL url = new URL(documentPath);
PDDocument document = PDDocument.load(url);
PDDocumentCatalog catalog = document.getDocumentCatalog();
List allPages = catalog.getAllPages();
if (null == allPages || allPages.isEmpty()) {
    throw new Exception("The document is empty");
}
// The list is untyped, so the first page has to be cast
PDPage page = (PDPage) allPages.get(0);
BufferedImage image = page.convertToImage();
// Call document.close() once you are done with the document

Then just write the image.
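
For the writing step, here is a minimal sketch using the standard javax.imageio API. The class name ThumbnailWriter and the method writePng are my own hypothetical names, not part of PDFBox, and PNG is just an assumed output format; any format ImageIO supports should work the same way.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of PDFBox.
public class ThumbnailWriter {

    /** Writes the image to the target file as PNG and returns that file. */
    public static File writePng(BufferedImage image, File target) throws IOException {
        // ImageIO.write returns false when no writer handles the requested format
        if (!ImageIO.write(image, "png", target)) {
            throw new IOException("No PNG writer available for " + target);
        }
        return target;
    }
}
```

With the image from the snippet above, the call would be something like ThumbnailWriter.writePng(image, new File("preview.png")), where the file name is only an example.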

All of this can also be done by calling the PDFToImage main class directly and passing the right arguments (start page and end page = 1).

In a second step, I tried to reduce the size of the image (to make it small like a thumbnail) using the code from here, but the result is often of really bad quality. On a web page, it is better to simply set the display size of the image.
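
If you still want to shrink the image in Java, one common way to limit the quality loss is to halve the image repeatedly with bilinear interpolation instead of scaling in one jump. The sketch below is not the code I linked to, and I have not compared it against PDFBox output; ThumbnailScaler and scaleToWidth are hypothetical names, and the output is assumed to be opaque RGB.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFBox.
public class ThumbnailScaler {

    /** Scales src to targetWidth, keeping the aspect ratio. */
    public static BufferedImage scaleToWidth(BufferedImage src, int targetWidth) {
        if (targetWidth <= 0) {
            throw new IllegalArgumentException("targetWidth must be positive");
        }
        int w = src.getWidth();
        int h = src.getHeight();
        int targetHeight = Math.max(1, (int) Math.round((double) h * targetWidth / w));

        BufferedImage current = src;
        // Halve the size step by step while the result stays at or above the target
        while (w / 2 >= targetWidth && h / 2 >= targetHeight) {
            w /= 2;
            h /= 2;
            current = draw(current, w, h);
        }
        // Final pass to reach the exact target size, if not already there
        if (w != targetWidth || h != targetHeight) {
            current = draw(current, targetWidth, targetHeight);
        }
        return current;
    }

    // Draws src into a new RGB image of the given size (any alpha channel is dropped)
    private static BufferedImage draw(BufferedImage src, int w, int h) {
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return dst;
    }
}
```

For example, ThumbnailScaler.scaleToWidth(image, 100) turns an 800x1000 page image into a 100x125 one.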

Also, the underlying code is "beta quality", so you shouldn't expect too much from it. (I tried random PDFs from the net, and some of them didn't work. Basically, if it is standard text, it should work.)