has many :code_blocks

Random thoughts and postings to keep track of my learning process

How to extract images from a pdf


Let’s say you have a pdf document containing a number of jpeg pictures which you want to extract. For simplicity, let’s say you have a 20 page pdf document and there’s only one jpg on each page.

The easiest way I have found to do this is with Ghostscript, which you can download from http://pages.cs.wisc.edu/~ghost/

Ghostscript provides a set of command line tools you can use to get the jpg files out of the pdf. I will use the command line tool gswin32c.exe.

Create a folder, let’s say c:\extract_files, and put your pdf in that folder. Cd into that folder and type the following command:

gswin32c.exe -dBATCH -dNOPAUSE -r200x200 -dSAFER -dJPEGQ=95 -sDEVICE=jpeg -sOutputFile=%d.jpg yourpdf.pdf

Make sure that gswin32c.exe is in your path, otherwise the command will not be found.
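If you would rather check the path from a script than by trial and error, here is a minimal Python sketch of the same step. The helper names `find_ghostscript`, `build_command` and `extract` are made up for illustration, and the fallback executable names `gswin64c` and `gs` are my assumption about other installs, not something this post relies on. The JPEG quality is passed as `-dJPEGQ`, the switch Ghostscript documents for its jpeg device.

```python
import shutil
import subprocess


def find_ghostscript(candidates=("gswin32c", "gswin64c", "gs")):
    """Return the full path of the first Ghostscript executable on PATH, or None.

    The candidate names beyond gswin32c are assumptions for other installs.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_command(gs, pdf, dpi=200, quality=95, pattern="%d.jpg"):
    """Build the argument list for one Ghostscript run (one jpg per page)."""
    return [
        gs,
        "-dBATCH",
        "-dNOPAUSE",
        "-dSAFER",
        f"-r{dpi}x{dpi}",          # resolution in dots per inch
        f"-dJPEGQ={quality}",      # JPEG quality, 0 to 100
        "-sDEVICE=jpeg",
        f"-sOutputFile={pattern}",  # %d is replaced by the page number
        pdf,
    ]


def extract(pdf):
    """Run Ghostscript on pdf, or report that it is not on PATH."""
    gs = find_ghostscript()
    if gs is None:
        print("Ghostscript was not found on PATH")
        return False
    subprocess.run(build_command(gs, pdf), check=True)
    return True
```

Calling `extract("yourpdf.pdf")` from inside c:\extract_files should then behave like typing the command by hand, with a readable message instead of a "not recognized" error when the path is wrong.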

The parameter -r200x200 tells gswin32c.exe to render each page at a resolution of 200×200 dots per inch. Higher values give larger, more detailed jpg files; lower values give smaller ones. Try different resolutions to see what happens.
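To get a feel for what a resolution means before running anything, you can estimate the size of the output images. A pdf page is measured in points (1/72 of an inch), so the pixel size is roughly points / 72 × dpi. The sketch below assumes a US Letter page of 612×792 points, which is my assumption and not something taken from the pdf in this post; `pixel_size` is a made-up helper name, and Ghostscript's own rounding may differ by a pixel.

```python
def pixel_size(width_pt, height_pt, dpi):
    """Approximate rendered size in pixels for a page given in points."""
    return (round(width_pt / 72 * dpi), round(height_pt / 72 * dpi))


# Assumed page size: US Letter, 612 x 792 points (8.5 x 11 inches).
for dpi in (100, 200, 300):
    print(dpi, pixel_size(612, 792, dpi))
# 100 (850, 1100)
# 200 (1700, 2200)
# 300 (2550, 3300)
```

So going from 100 to 200 dots per inch doubles each side and gives about four times as many pixels per page.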

Written by nkartcode

16/11/2011 at 12:21 PM

Posted in Ghostscript
