Monday, August 27, 2018

Find bad or corrupted PDF files

Hi everyone,

 I work with several thousand PDF files and wanted a good way to find which ones are bad or corrupted. I came up with this batch script, which works well on Windows. It uses CPDF, which must be downloaded and installed separately. The batch file then calls cpdf.exe on each PDF and renames any file whose output reports an error. Download here: https://www.coherentpdf.com/

NOTE:

 The script must be able to find CPDF on the filesystem. I chose to place it in the C:\Windows\System32 directory, but you may place it somewhere else. If you do, either add the CPDF folder to the PATH environment variable so the batch script can find it, or put the full path in the script itself, for example:

 REPLACE THE LINE: for /f "delims=?" %%W in ('cpdf -page-info "%%Z" 2^>^&1 ^| find /c "error"') do set NumberOfErrors=%%W

 WITH SOMETHING LIKE THIS: for /f "delims=?" %%W in ('c:\windows\system32\cpdf -page-info "%%Z" 2^>^&1 ^| find /c "error"') do set NumberOfErrors=%%W
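
 If you would rather not edit the cpdf line, here is a minimal sketch of the PATH approach. It assumes a hypothetical install folder, C:\Tools\cpdf, held in a variable I have called CPDF_DIR. Neither name comes from CPDF itself, so change them to match your setup. The lines would go at the top of the batch file:

```bat
@echo off
setlocal
rem Hypothetical install folder: change this to wherever cpdf.exe actually lives.
set "CPDF_DIR=C:\Tools\cpdf"
rem Prepend the folder to PATH for this script only. setlocal discards the change on exit.
set "PATH=%CPDF_DIR%;%PATH%"
rem Check that cpdf can be found before processing any files.
where cpdf >nul 2>&1
if errorlevel 1 (
    echo cpdf was not found in "%CPDF_DIR%" or anywhere else on PATH.
    exit /b 1
)
echo cpdf found.
```

 Because the change is made after setlocal, it only lasts while the script runs and does not touch your system-wide PATH.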

 USAGE: Copy the script below, save it in a .bat file, and then run it in the folder that has all your PDF files.

 =============================================================================================================

 Here is the actual Batch script:


:: This will cycle through PDF files and if any have errors or are corrupted
:: it will rename the problem files with a prefix.
:: EXAMPLE: myFile.pdf will be renamed to _BadPDF_myFile.pdf
:: Uses the CPDF application, which must be downloaded...

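 setlocal enabledelayedexpansion
 rem Loop over every PDF in the current folder. "delims=?" keeps file names with spaces in one piece.
 for /f "delims=?" %%Z in ('dir /b *.pdf') do (

     rem @echo "%%Z"
     rem Reset the count so a previous file's result cannot carry over.
     set NumberOfErrors=0
     rem Count the cpdf output lines, including error output, that contain "error".
     for /f "delims=?" %%W in ('cpdf -page-info "%%Z" 2^>^&1 ^| find /c "error"') do set NumberOfErrors=%%W
     rem @echo !NumberOfErrors!
     if !NumberOfErrors! GTR 0 ren "%%Z" "_BadPDF_%%Z"

 )

 rem Wait 15 seconds so any messages can be read, then close the window.
 timeout /T 15

 exit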
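
 If you want to see which files would be flagged before anything is renamed, here is a rough dry-run sketch of the same loop. It is not part of my original script, and the log file name BadPDFs.txt is just a name I picked for the example. It assumes cpdf can be found the same way as above:

```bat
@echo off
setlocal enabledelayedexpansion
rem Dry run: list suspect PDFs in a log file instead of renaming them.
rem BadPDFs.txt is a hypothetical log name; change it if you like.
set "LogFile=BadPDFs.txt"
if exist "%LogFile%" del "%LogFile%"
for /f "delims=?" %%Z in ('dir /b *.pdf') do (
    rem Reset the count for each file.
    set "NumberOfErrors=0"
    rem Same check as the main script: count cpdf output lines containing "error".
    for /f "delims=?" %%W in ('cpdf -page-info "%%Z" 2^>^&1 ^| find /c "error"') do set "NumberOfErrors=%%W"
    rem Append the file name to the log when at least one error line was found.
    if !NumberOfErrors! GTR 0 >>"%LogFile%" echo %%Z
)
```

 Once the list in the log looks right, run the renaming script above on the same folder.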
