Verifying Web Links in PDF Files

TidBITS

Friday March 14, 2008. 09:31 PM

Our ebooks have tons of Web links in them, and for a long time, one of the most tedious production tasks was verifying that the links had remained valid since the author added them to the manuscript. In an effort to simplify this task, I came up with the following process.
Unfortunately for those trying to replicate it, my process relies on an expensive plug-in, the $699 Aerialist Pro from ARTS PDF. I initially purchased Aerialist Pro because it can generate PDF links from page numbers to the associated pages in a PDF; I used it to link up all the page numbers in the index of the ebook version of iPhoto '08: Visual QuickStart Guide. That task would have taken many hours using the astonishingly bad linking tool in Acrobat Professional 8, so I was able to justify the price. On the Mac, Aerialist Pro runs only in Acrobat Professional 7, so I was glad I kept that version around, and copies still seem to be available via Amazon.com.
Aerialist Pro has other useful features, including the capability to produce a report listing all external links, which gave me what I needed to develop the rest of my process. (Unhappily, another Aerialist Pro feature that I would love to use, the capability to set link properties like zoom level and appearance en masse, turns out to have a bug that causes problems with documents viewed in Continuous mode. ARTS PDF has confirmed the bug, and I hope they fix it, along with enabling Aerialist Pro to work inside Acrobat Professional 8.)
Aerialist Pro's external link report is itself a PDF, so my first step is to save the report from Acrobat as a plain text file called Dependency Report.txt (the extension isn't optional). In the end, though, I need a .html file, so I set up Noodlesoft's Hazel to look for a file called Dependency Report.txt in a specific folder, rename it uniquely with a .html extension, and open it in BBEdit.
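For readers without Hazel, the rename step above can be approximated with a short script. This is only a sketch, not the Hazel rule itself: the function name `rename_report` and the timestamp-based naming scheme are assumptions made for illustration, and the script does not open the result in BBEdit.

```python
from datetime import datetime
from pathlib import Path


def rename_report(folder, name="Dependency Report.txt"):
    """Rename the saved report to a unique .html file.

    Returns the new path, or None if no report is waiting in the folder.
    The timestamp naming scheme is an assumption, not Hazel's behavior.
    """
    source = Path(folder) / name
    if not source.exists():
        return None

    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.stem} {stamp}.html")

    # Avoid overwriting an earlier report saved within the same second.
    counter = 1
    while target.exists():
        target = source.with_name(f"{source.stem} {stamp}-{counter}.html")
        counter += 1

    source.rename(target)
    return target
```

Calling `rename_report("/path/to/watched/folder")` after saving the report from Acrobat returns the path of the renamed file, which can then be opened in a text editor by hand.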
Once I have the file in BBEdit, I run a text factory that takes the rather plain output from Aerialist Pro, strips out the cosmetic parts, and turns all of the links into proper HREFs. It's a lot of grep pattern matching, and while it wasn't trivial to create, it wasn't all that hard.
The next trick is to check all the links. After much searching and testing, I found a $25 utility called Braxton's Link Tester (BLT) that does a nice job of checking links and reporting back on which ones have problems. After running the BBEdit text factory and saving the file, I drag the file's proxy icon (the little icon in the title bar of every window; just click, hold, and drag to use it as though you were dragging the file's Finder icon) to BLT's Dock icon. In BLT, I then click the Check Links button and go do something else for a few minutes while it visits all the links.
What I like about BLT is that it's easy to deselect the green checkmark tab that shows all the good links, since I don't care about those, and focus on the broken links (for the screenshot below, I left the good ones showing). BLT goes beyond a simple thumbs-up/thumbs-down display, identifying failed links, forbidden links, links that time out, links forbidden by robots.txt, server errors, email links that must be verified manually, and protocols that BLT doesn't recognize.
Most of the time there are only a couple of broken links, if any, and then it's just a matter of going back into the original Word document and the working PDF file and either removing the links or replacing them with correct ones.
I won't pretend this is the only way to automate link checking. It might be possible, for instance, to write an AppleScript that would identify and check the links, reporting back on which ones had trouble. But I do hope this will give you a sense of how you might be able to eliminate a manual step in producing PDF files that work as they should.
Copyright © 2008 Adam C. Engst.
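As a rough illustration of the scripted alternative mentioned in the article, the link extraction, HREF conversion, and checking steps could be sketched in Python as follows. This is not the BBEdit text factory or BLT: the function names, the URL pattern, and the assumption that links appear in the report as bare http or https URLs are all hypothetical, and the check reports only a simple status rather than BLT's categories (it does not consult robots.txt or handle email links).

```python
import html
import re
import urllib.error
import urllib.request

# Assumption: links appear in the report text as bare http(s) URLs.
URL_PATTERN = re.compile(r"https?://[^\s<>\"]+")


def extract_urls(report_text):
    """Return the unique URLs in the report, in first-seen order."""
    seen = []
    for match in URL_PATTERN.findall(report_text):
        # Drop trailing punctuation that usually belongs to the prose.
        url = match.rstrip(".,;:)")
        if url not in seen:
            seen.append(url)
    return seen


def to_html(urls):
    """Build a minimal HTML page with one anchor per URL."""
    items = "\n".join(
        f'<li><a href="{html.escape(u)}">{html.escape(u)}</a></li>'
        for u in urls
    )
    return f"<html><body><ul>\n{items}\n</ul></body></html>\n"


def check_url(url, timeout=10):
    """Fetch one URL and return a short status string."""
    try:
        request = urllib.request.Request(
            url, headers={"User-Agent": "link-check-sketch"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return f"ok ({response.status})"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403 or 404.
        return f"http error ({err.code})"
    except (OSError, ValueError) as err:
        # Covers DNS failures, refused connections, timeouts, malformed URLs.
        return f"failed ({err})"
```

A typical run would read the saved report text, pass it to `extract_urls`, and then print `check_url(url)` for each result, paying attention only to the entries that do not start with "ok". Stripping a trailing parenthesis is a simplification that would mangle a URL that legitimately ends in one.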
TidBITS is copyright © 2008 TidBITS Publishing Inc. If you're reading this article on a Web site other than TidBITS.com, please let us know, because if it was republished without attribution, by a commercial site, or in modified form, it violates our Creative Commons License.
Read more at TidBITS: db.tidbits.com/article/9500?rss

 
