I am working on an ASP page that parses text using the VBScript.RegExp
regular expression object. My regular expression right now is as follows:

[a-z]+\.[a-z]+\.[a-z]+/

And it finds URLs like windowsupdate.microsoft.com, www.cnn.com, etc.
with no problem.

But I also need to find any URL, like these:

www.amazon.com/books/atoz/index.html
OR
msdn.microsoft.com/newsgroups/default.aspx

That is, URLs with a deeper path than something.something.com, if that
makes sense. Any ideas?

Re: Regular Expressions to find URL's in text by David

David
Fri Oct 08 09:04:53 CDT 2004

Nah...

What happens if someone writes a sentence and forgets to put a space between
the last word of the sentence, the period and the first word of the next
sentence?

URLs can take many forms and definitely don't need three parts. Some have
two, some have four. What happens if someone puts in an IP address?

To get round the path/page name problem, you should be able to specify that
your pattern matches anywhere in the string, rather than matching exactly.

Sorry to be the bearer of bad news.
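
The concerns above can be illustrated with a small sketch. It is written in
Python rather than VBScript, purely for illustration; the character classes,
groups, and quantifiers used here should also be accepted by the Pattern
property of VBScript.RegExp, but that is an assumption to verify. The pattern,
the function name find_candidates, and the sample text are all hypothetical and
not from the original post.

```python
import re

# Hypothetical pattern: two or more dot-separated labels, then an optional
# path that runs up to the next whitespace. Not a complete URL grammar.
HOST_AND_PATH = re.compile(
    r"[a-z0-9-]+(?:\.[a-z0-9-]+)+(?:/[^\s]*)?",
    re.IGNORECASE,
)

def find_candidates(text):
    """Return every substring that looks like host[/path], anywhere in text."""
    return [m.group(0) for m in HOST_AND_PATH.finditer(text)]

# Sample text (made up) containing a deep path, a two-part host, an IP
# address, and a sentence boundary with a missing space.
sample = ("See www.amazon.com/books/atoz/index.html or cnn.com "
          "or 192.168.0.1 today.Tomorrow is fine")

for candidate in find_candidates(sample):
    print(candidate)
```

The search picks up the deep path, the two-part host, and the IP address, but
it also reports today.Tomorrow, which is exactly the missing-space false
positive described above.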




Re: Regular Expressions to find URL's in text by larrybud2002

larrybud2002
Fri Oct 08 09:43:53 CDT 2004

Why don't you just parse it up to the first / character, and see if that part conforms to your pattern?
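
One way to read that suggestion, as a rough sketch in Python (for illustration
only, not VBScript): cut each candidate token at the first / and test only the
host part. The two patterns and the name looks_like_url are assumptions made up
for this example, and the host check is deliberately loose.

```python
import re

# Hypothetical host check: dot-separated labels ending in an alphabetic
# label of two or more letters. Illustrative, not exhaustive.
HOST_NAME = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,}$", re.IGNORECASE)

# Hypothetical dotted-quad check; it does not verify that each part is <= 255.
IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def looks_like_url(token):
    """Split at the first '/' and test whether the host part conforms."""
    host, _, _path = token.partition("/")
    return bool(HOST_NAME.match(host) or IPV4.match(host))

print(looks_like_url("www.amazon.com/books/atoz/index.html"))
print(looks_like_url("msdn.microsoft.com/newsgroups/default.aspx"))
print(looks_like_url("hello"))
```

Whatever follows the first / is ignored, so the depth of the path no longer
matters to the check.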

Re: Regular Expressions to find URL's in text by SROSeaner

SROSeaner
Sun Oct 10 19:11:01 CDT 2004

Thanks for your help. I got my parser to find URLs in many forms,
including IP addresses, all from a disorganized HTML file. It is possible,
just a bugger to get going.