Sunday, December 2, 2007

[Tech tips] Convert a scanned book pdf file (A3) into a sequence of individual pages (A4)

I recently needed to scan sections of a book and compile them into a PDF file. Scanning is much faster two pages at a time (i.e. both facing pages, with the book flat on the scanner glass). However, the final PDF needed to have exactly one book page per PDF page.



I couldn't find any utilities or Acrobat functions to split scanned pdf pages in half, so I quickly wrote my own simple JavaScript for this purpose. (I have significant Java programming experience, but this was my first practical use of JavaScript--I hope I haven't made any stupid mistakes.)

Feel free to use and modify it. It would be great if you could post any improvements or corrections.


Script
/* Adobe Acrobat JavaScript for dividing A3 scanned pdf to A4 of half width*/

var inch = 72; var centimeter = 28.3464567;
/* ==========================*/
/* Modify the following four variables based on your own document properties */
/* Units are specified in points (i.e. 1 inch = 72 points) */
/* ==========================*/
var cropFromTop = 0.5 * centimeter;
var cropFromBottom = 3 * centimeter;
/* Smaller LR crop margin (e.g. left crop for odd pages, which show the left half) */
var cropFromLRSmaller = 1 * centimeter;
/* Bigger LR crop margin (e.g. right crop for odd pages, which show the left half) */
var cropFromLRBigger = 22 * centimeter;
/* ==========================*/

var originalNumPages = this.numPages;

/* The following loop processes pages from the back of the document, */
/* so inserting a duplicate never shifts the index of a page still to be processed. */
/* Note that page indexes are 0-based, as in Java (i.e. the first page is page 0). */
for (var i = originalNumPages - 1; i >= 0; i--) {

    // First: insert a duplicate of page i directly after page i
    this.insertPages({
        nPage: i,
        cPath: this.path,
        nStart: i,
        nEnd: i
    });

    // Second: crop page i and page i+1 so that they only show the left or right half, respectively
    // aRect is [left, top, right, bottom] in points
    var aRect = this.getPageBox({
        cBox: "Media",
        nPage: i
    });
    //var width = aRect[2] - aRect[0];
    //var height = aRect[1] - aRect[3];

    // Page i: small crop on the left, big crop on the right (keeps the left half)
    var cropOddArray = [
        aRect[0] + cropFromLRSmaller,
        aRect[1] - cropFromTop,
        aRect[2] - cropFromLRBigger,
        aRect[3] + cropFromBottom
    ];
    // Page i+1: big crop on the left, small crop on the right (keeps the right half)
    var cropEvenArray = [
        aRect[0] + cropFromLRBigger,
        aRect[1] - cropFromTop,
        aRect[2] - cropFromLRSmaller,
        aRect[3] + cropFromBottom
    ];

    this.setPageBoxes({
        cBox: "Media",
        nStart: i,
        nEnd: i,
        rBox: cropOddArray
    });
    this.setPageBoxes({
        cBox: "Media",
        nStart: i + 1,
        nEnd: i + 1,
        rBox: cropEvenArray
    });

}
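
To see what rectangles the script produces, the crop arithmetic can be pulled out into a plain function and checked outside Acrobat. The following is only an illustrative sketch, not part of the script itself: splitRects and sizeInCm are hypothetical helper names, the box is assumed to be [left, top, right, bottom] in points (which is how the script indexes aRect), and the sample page is an assumed landscape A3 scan (42 x 29.7 cm) with its origin at the bottom-left corner.

```javascript
var centimeter = 72 / 2.54; // points per centimeter (about 28.3465)

// Hypothetical helper: returns the two boxes that the script's loop
// would pass to setPageBoxes for one scanned page.
// rect is [left, top, right, bottom] in points; m holds the four margins.
function splitRects(rect, m) {
    var left = [
        rect[0] + m.lrSmaller,
        rect[1] - m.top,
        rect[2] - m.lrBigger,
        rect[3] + m.bottom
    ];
    var right = [
        rect[0] + m.lrBigger,
        rect[1] - m.top,
        rect[2] - m.lrSmaller,
        rect[3] + m.bottom
    ];
    return { left: left, right: right };
}

// Hypothetical helper: [width, height] of a box, in centimeters
function sizeInCm(rect) {
    return [(rect[2] - rect[0]) / centimeter, (rect[1] - rect[3]) / centimeter];
}

// Assumed sample page: landscape A3, origin at the bottom-left corner
var a3Landscape = [0, 29.7 * centimeter, 42 * centimeter, 0];
var margins = {
    top: 0.5 * centimeter,
    bottom: 3 * centimeter,
    lrSmaller: 1 * centimeter,
    lrBigger: 22 * centimeter
};
var halves = splitRects(a3Landscape, margins);
```

With the margin values used in the script, both halves come out 19 cm wide and 26.2 cm tall: the left half spans 1 cm to 20 cm across the scan, and the right half spans 22 cm to 41 cm.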




How to use the script

First, open the batch processing dialog box.



Make a new batch processing sequence.



Choose "select commands" to populate the processing sequence.



Add an "execute JavaScript" command to the processing sequence. (In my own sequence, I also rotate the PDF before the cropping step and run OCR afterwards.) Select the "execute JavaScript" command, and press "Edit."



Paste the JavaScript code from above into the dialog box. Don't forget to set the document margins within the script (four variables) to meet your particular needs.
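
If you would rather measure your scan than guess the four values, one possible way to derive them is sketched below. This is an assumption-laden illustration rather than part of the script: marginsFor is a hypothetical helper, and it assumes you know the scan width, the width of one book page, and the blank strip at the outer edge, all in centimeters. The returned property names mirror the four variables in the script.

```javascript
var centimeter = 72 / 2.54; // points per centimeter (about 28.3465)

// Hypothetical helper: all inputs are in centimeters, results are in points.
// scanWidthCm: full width of one scanned (two-page) sheet
// pageWidthCm: width of a single book page to keep
// outerCm:     strip to discard at the outer edge of each book page
function marginsFor(scanWidthCm, pageWidthCm, outerCm, topCm, bottomCm) {
    // Whatever is left after the outer strip and one page is the bigger crop
    var biggerCm = scanWidthCm - outerCm - pageWidthCm;
    if (pageWidthCm <= 0 || biggerCm < 0) {
        throw new Error("Measurements do not fit on the scan");
    }
    return {
        cropFromTop: topCm * centimeter,
        cropFromBottom: bottomCm * centimeter,
        cropFromLRSmaller: outerCm * centimeter,
        cropFromLRBigger: biggerCm * centimeter
    };
}

// Example: a 42 cm wide scan, 19 cm book pages, 1 cm outer strip
var m = marginsFor(42, 19, 1, 0.5, 3);
```

For these sample measurements the bigger left/right margin works out to 22 cm, which matches the value used in the script.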



The batch processing sequence is now ready to run.



Further information

Acrobat JavaScript Guide and Reference Manual
About page crop boxes
