wreed
Mon Aug 14 15:11:28 CDT 2006
What I am trying to do is build a crawler, so to speak, that checks
whether links work. They are all internal links and there are tons of
them, so I would prefer not to maintain a txt file listing the links.
I have used programs to check the links (Xenu's Link Sleuth is great),
but I need a script that runs on the server and produces a report of
broken links. I am just not sure how to approach it. I assume I would
open a FileSystemObject and loop through the folders and what not, but
I cannot actually check for 404s that way.
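
One possible way to avoid a hand-maintained list is to pull the links out of each page, then request each one over HTTP so the server's status code (200, 404, ...) can be seen. The sketch below covers only the extraction step and flags hardcoded absolute links. It is written in Python rather than VBScript so it can be tried without IIS; the names `LinkCollector` and `classify_links` and the host `intranet.example` are made up for illustration and are not from this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the raw href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def classify_links(page_url, html):
    """Return (resolved_url, is_hardcoded_absolute) for each internal link."""
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    results = []
    for href in collector.hrefs:
        resolved = urljoin(page_url, href)
        parsed = urlparse(resolved)
        # Skip mailto:, javascript:, and links that point at other hosts.
        if parsed.scheme not in ("http", "https") or parsed.netloc != site:
            continue
        # A host name in the raw href means the link was written absolutely.
        is_absolute = bool(urlparse(href).netloc)
        results.append((resolved, is_absolute))
    return results


if __name__ == "__main__":
    page = "http://intranet.example/reports/index.html"
    sample = '<a href="q1.xls">Q1</a> <a href="http://intranet.example/reports/q2.xls">Q2</a>'
    for url, hardcoded in classify_links(page, sample):
        print(("ABS " if hardcoded else "REL ") + url)
```

Each resolved URL could then be requested and anything other than a 200 written to the report; links flagged as absolute are the candidates for conversion to relative links.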
McKirahan wrote:
> "wreed" <whreed@gmail.com> wrote in message
> news:1155577191.992082.305270@74g2000cwt.googlegroups.com...
> > I prefer it to check each link because we generate a lot of Excel
> > files; if a program bombs, we want to know that the link is no
> > longer going to work because the Excel file was not generated. We
> > have multiple apps that generate Excel files that are then linked
> > to from the site.
> >
> > We have some pages that have hardcoded (absolute) links, and I would
> > like to find these and know which to change to relative links.
>
> [snip]
>
> I'm not sure I understand the approach you'd like to use.
> Could you clarify -- perhaps with examples?
>
> Here is a script that will read a file and, for each line that contains
> "http", it will fetch the URL and generate a log file of the results;
> (a "+" prefix indicates success; a "-" prefix indicates a failure.)
>
> Option Explicit
> '*
> '* Declare Constants
> '*
> Const cVBS = "links.vbs"
> Const cTXT = "links.txt"
> Const cLOG = "links.log"
> '*
> '* Declare Variables
> '*
> Dim intINS
> Dim arrTXT
> Dim intTXT
> Dim strTXT
> Dim intURL
> intURL = 0
> Dim strURL
> Dim booXML
> '*
> '* Declare Objects
> '*
> Dim objFSO
> Set objFSO = CreateObject("Scripting.FileSystemObject")
> Dim objLOG
> Set objLOG = objFSO.OpenTextFile(cLOG,2,true)
> Dim objTXT
> Set objTXT = objFSO.OpenTextFile(cTXT,1)
> Dim objXML
> Set objXML = CreateObject("Microsoft.XMLHTTP")
> '*
> '* Check Links
> '*
> strTXT = objTXT.ReadAll
> arrTXT = Split(strTXT,vbCrLf)
> For intTXT = 0 To UBound(arrTXT)
>     strTXT = arrTXT(intTXT)
>     intINS = InStr(strTXT,"http")
>     If intINS > 0 Then
>         strURL = Trim(Mid(strTXT,intINS))
>         intURL = intURL + 1
>         '*
>         On Error Resume Next
>         Err.Clear
>         objXML.Open "GET",strURL,False
>         objXML.Send
>         If Err.Number <> 0 Or objXML.Status <> 200 Then
>             booXML = "-"
>         Else
>             booXML = "+"
>         End If
>         On Error GoTo 0
>         '*
>         objLOG.WriteLine(booXML & " " & strTXT)
>     End If
> Next
> '*
> '* Close Files and Destroy Objects
> '*
> objTXT.Close
> objLOG.Close
> Set objXML = Nothing
> Set objTXT = Nothing
> Set objLOG = Nothing
> Set objFSO = Nothing
> '*
> '* Finish
> '*
> MsgBox intURL & " links checked",vbInformation,cVBS
>
>
> The "links.txt" file can either just list the URLs:
>
>   http://www.google.com
>   http://www.msn.com
>
> or it may identify the page that each URL is on; for example:
>
>   Google Home Page http://www.google.com
>   MSN Home Page http://www.msn.com
>
> It just expects the URL ("http" prefix) to be at the end of the line.
>
>
> Note that in some environments "Microsoft.XMLHTTP" may
> not work; if it doesn't, then try one of the following:
> "MSXML2.XMLHTTP.5.0"
> "MSXML2.XMLHTTP.4.0"
> "MSXML2.XMLHTTP.3.0"
> "MSXML2.XMLHTTP"
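
For comparison, the quoted script's logic (take everything from the first "http" to the end of the line, fetch it, and log "+" for a 200 or "-" for anything else) can be sketched in Python using only the standard library. This is an illustration under assumptions, not a port of the VBScript: the function names `extract_url`, `fetch_status`, and `check_lines` are invented here, and `fetch` is passed as a parameter so the parsing can be tried without touching the network.

```python
import http.client
import urllib.error
import urllib.request


def extract_url(line):
    """Return the text from the first 'http' to the end of the line, or None."""
    pos = line.find("http")
    if pos < 0:
        return None
    return line[pos:].strip()


def fetch_status(url, timeout=10):
    """Return the HTTP status code, or None if the request could not complete."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a missing file
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return None  # DNS failure, refused connection, malformed URL, ...


def check_lines(lines, fetch=fetch_status):
    """Build report lines: '+' prefix for a 200 response, '-' otherwise."""
    report = []
    for line in lines:
        url = extract_url(line)
        if url is None:
            continue  # lines without a URL are ignored
        mark = "+" if fetch(url) == 200 else "-"
        report.append(mark + " " + line.strip())
    return report
```

Calling `check_lines(open("links.txt"))` and writing the result to a log file would give a report in the same "+"/"-" format described above.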