Greetings,

Example Code:
-----------------------------------------------
testString = "<div>Hi there, what's new?</div>"
Set objRegEx = New RegExp
objRegEx.Pattern = "<div>(.*)</div>"
objRegEx.Global = True
objRegEx.IgnoreCase = True

str = objRegEx.Replace(testString, "The results are: $1")
----------------------------------------------

Unless I have something wrong, it appears that $1 is only "returned"
by VBScript. In other words, I can't put it into a variable or call
functions on it. I need to put the value of $1 into a JavaScript
variable and need to escape all of the single quotes, backslashes,
newlines, etc. Is there any way I can change the last line to:

str = objRegEx.Replace(testString, "The results are:" & replace("$1", "\", "\\'"))

and also have it work?

Help appreciated!
DRN

Re: VBScript/Regex Question by Ayush

Ayush
Fri Feb 23 13:05:36 CST 2007

Replied to [Drn]s message :

I think this is the only way in VBScript:

testString = "<div>Hi there\, what's new?</div>"
Set objRegEx = New RegExp
objRegEx.Pattern = "<div>(.*)</div>"
objRegEx.Global = True      ' without this, Execute returns only the first match
objRegEx.IgnoreCase = True

Set coll = objRegEx.Execute(testString)
str = ""
For Each match In coll
    ' SubMatches(0) holds what $1 would hold in a replacement string
    str = str & Replace(match.SubMatches(0), "\", "\\")
Next

MsgBox str

Good Luck, Ayush.
--
Scripting Solutions Center : http://snipurl.com/Scripting_Solutions

Re: VBScript/Regex Question by ekkehard

ekkehard
Fri Feb 23 13:31:13 CST 2007

Some code:

Dim aTests : aTests = Array( _
      "<div>Hi there, what's new?</div>" _
    , "<div>Hi there, what's new?</div>REs are greedy<div>by default.</div>" _
    , "<div>Hi there, what's new? And CRLF" + vbCrLf + " is nasty too</div>" _
)
Dim aPatts : aPatts = Array( _
      "<div>(.*)</div>" _
    , "<div>(.*?)</div>" _
    , "<div>([\s\S]*?)</div>" _
)
Dim sTest, sPatt
For Each sTest In aTests
    For Each sPatt In aPatts
        WScript.Echo Replace( Embrace( Array( sTest, sPatt ), "|" ), vbCrLf, "\r\n" )
        Dim oRE : Set oRE = New RegExp
        oRE.Pattern = sPatt
        Dim oMTS : Set oMTS = oRE.Execute( sTest )
        If 1 = oMTS.Count Then
            WScript.Echo Replace( " |" + oMTS( 0 ).SubMatches( 0 ) + "|", vbCrLf, "\r\n" )
        Else
            WScript.Echo "Not: 1 = oMTS.Count"
        End If
        WScript.Echo "-----------------"
    Next
Next

Function Embrace( aItems, sDelim )
    Embrace = sDelim + Join( aItems, sDelim ) + sDelim
End Function

and output:

=== greedyRE: Regexps are greedy by default ===================================
|<div>Hi there, what's new?</div>|<div>(.*)</div>|
|Hi there, what's new?|
-----------------
|<div>Hi there, what's new?</div>|<div>(.*?)</div>|
|Hi there, what's new?|
-----------------
|<div>Hi there, what's new?</div>|<div>([\s\S]*?)</div>|
|Hi there, what's new?|
-----------------
|<div>Hi there, what's new?</div>REs are greedy<div>by default.</div>|<div>(.*)</div>|
|Hi there, what's new?</div>REs are greedy<div>by default.|
-----------------
|<div>Hi there, what's new?</div>REs are greedy<div>by default.</div>|<div>(.*?)</div>|
|Hi there, what's new?|
-----------------
|<div>Hi there, what's new?</div>REs are greedy<div>by default.</div>|<div>([\s\S]*?)</div>|
|Hi there, what's new?|
-----------------
|<div>Hi there, what's new? And CRLF\r\n is nasty too</div>|<div>(.*)</div>|
Not: 1 = oMTS.Count
-----------------
|<div>Hi there, what's new? And CRLF\r\n is nasty too</div>|<div>(.*?)</div>|
Not: 1 = oMTS.Count
-----------------
|<div>Hi there, what's new? And CRLF\r\n is nasty too</div>|<div>([\s\S]*?)</div>|
|Hi there, what's new? And CRLF\r\n is nasty too|
-----------------
=== greedyRE: 0 done (00:00:00) ===============================================

(sorry about the word wrap) to show you:

(1) your pattern is dangerous because .* is greedy (it tries to match
the longest possible sequence)

(2) .*? is non-greedy (matches the shortest possible sequence)

(3) . means: everything except newline; so to match across lines you need to
use the tricky [\s\S] (all whitespace and all non-whitespace characters,
i.e. really everything)

(4) you can use the MatchCollection object to get the Matches (here just
one) and a Match to get at its SubMatches (= the material collected
by the () in the pattern; just be aware that $1, $2, ... are one-based
while .SubMatches(0), .SubMatches(1), ... are zero-based)

I don't understand the necessity of mixing languages and all the things you
want to do to the resulting string, but I'd consider carefully whether using
the DOM to access/manipulate the elements would be easier/safer.

Re: VBScript/Regex Question by Miyahn

Miyahn
Fri Feb 23 15:53:44 CST 2007

"Drn" wrote in message news:1172255309.743382.295190@s48g2000cws.googlegroups.com
You can use the GetRef function like this:

Option Explicit
Dim testString
testString = "<div>Hi there\, what's new?</div>"
With New RegExp
    .Pattern = "<div>(.*)</div>"
    .Global = True
    .IgnoreCase = True
    WScript.Echo "The results are:" & _
        .Replace(testString, GetRef("MyFormat"))
End With

' Called once per match; sMatch1 is the first submatch ($1)
Function MyFormat(Match, sMatch1, Pos, Src)
    With New RegExp
        .Pattern = "['\\]"
        .Global = True
        ' $& is the matched character, so this prefixes it with a backslash
        MyFormat = .Replace(sMatch1, "\$&")
    End With
End Function

--
Miyahn
Microsoft MVP for Microsoft Office - Excel(Jan 2004 - Dec 2007)
https://mvp.support.microsoft.com/profile=e971f039-a892-426c-9544-83d372c269b4

Re: VBScript/Regex Question by Paul

Paul
Fri Feb 23 16:21:50 CST 2007


"Drn" <dspafford@adelphia.net> wrote in message
news:1172255309.743382.295190@s48g2000cws.googlegroups.com...
I'm not good with Regular Expressions and I haven't tested this myself, but
isn't there a read-only RegExp.$n property? Or does it only work in
JScript?

RegExp.$n
Returns the nine most-recently memorized portions found during pattern
matching.

-Paul Randall



Re: VBScript/Regex Question by Anthony

Anthony
Fri Feb 23 16:54:18 CST 2007


"Drn" <dspafford@adelphia.net> wrote in message
news:1172255309.743382.295190@s48g2000cws.googlegroups.com...
You've had quite a bit of input already but it needs putting together.
However, there is an important piece of information missing. Can the content
of the DIVs you are collecting also contain inner DIVs? If so, it's going to
be really tricky. On the other hand, is the content of the DIVs purely text
with no other HTML markup at all? If so, it's more straightforward.

As Ekkehard has pointed out, your pattern is flawed in that . won't match
newlines and (.*) will happily match </div> as well. A pattern to extract
the content of divs which do not contain HTML themselves is:-

"<div>([^<]*)</div>"

Ayush has shown that you can use For Each over the collection of matches
returned by Execute, then use SubMatches to get the inner content. However,
you want to perform a replace on more than just \ found in the text. Miyahn
demonstrated using a function reference as the second parameter of the
Replace method; I've used the same technique below, but for a different
purpose.

Set RegExpDivs = New RegExp
RegExpDivs.Pattern = "<div>([^<]*)</div>"
RegExpDivs.Global = True
RegExpDivs.IgnoreCase = True

' Matches each character that needs escaping: backslash, CR, LF, single quote
Set RegExpText = New RegExp
RegExpText.Pattern = "\\|\r|\n|'"
RegExpText.Global = True

Set oMatches = RegExpDivs.Execute(Input)

ReDim arrJSLiterals(oMatches.Count - 1)
i = 0
For Each oMatch In oMatches
    arrJSLiterals(i) = RegExpText.Replace(oMatch.SubMatches(0), _
        GetRef("ReplaceTextItem"))
    i = i + 1
Next

' Called once per matched character; drops CR, turns LF into \n,
' and prefixes backslash and single quote with a backslash
Function ReplaceTextItem(sMatch, lPos, sInput)
    Select Case sMatch
        Case vbCr : ReplaceTextItem = ""
        Case vbLf : ReplaceTextItem = "\n"
        Case Else : ReplaceTextItem = "\" & sMatch
    End Select
End Function

If the divs can contain other HTML markup then the RegExpDivs pattern will
need the non-greedy form:-

"<div>([\s\S]*?)</div>"

If the divs can contain other DIVs then well ... can they?
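
To make the difference between the two patterns concrete, here is a minimal sketch. It is written in JavaScript rather than VBScript (the regex syntax used here is the same in both), and the helper name inner and the sample strings are made up for illustration:

```javascript
// The two patterns discussed above, as JavaScript regex literals.
var textOnly = /<div>([^<]*)<\/div>/i;     // content must not contain any "<"
var lazyAny  = /<div>([\s\S]*?)<\/div>/i;  // anything, up to the first </div>

// Hypothetical helper: return the first submatch, or null if nothing matches.
function inner(re, html) {
  var m = re.exec(html);
  return m ? m[1] : null;
}

var plain  = "<div>Hi there</div>";
var markup = "<div>Hi <b>there</b></div>";
var nested = "<div>outer <div>inner</div> tail</div>";

console.log(inner(textOnly, plain));   // "Hi there"
console.log(inner(textOnly, markup));  // null: the <b> tag stops [^<]*
console.log(inner(lazyAny, markup));   // "Hi <b>there</b>"
console.log(inner(lazyAny, nested));   // "outer <div>inner": cut at the first </div>
console.log(inner(textOnly, nested));  // "inner": only the innermost div qualifies
```

The last two lines are why nested DIVs are the awkward case: neither pattern returns the full content of the outer div.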




Re: VBScript/Regex Question by Ayush

Ayush
Fri Feb 23 16:53:11 CST 2007

Replied to [Paul Randall]s message :
> I'm not good with Regular Expressions and I haven't tested this myself, but
> isn't there a read-only RegExp.$n property? Or does it only work in
> JScript?


Only in JS :(
support for Regular Expressions is much better in JS..

Good Luck, Ayush.
--
Windows Script Host Reference : http://snipurl.com/WSH_Reference

Re: VBScript/Regex Question by Anthony

Anthony
Sat Feb 24 09:02:14 CST 2007


"Ayush" <"ayushmaan.j[aatt]gmail.com"> wrote in message
news:u%23ePZ45VHHA.388@TK2MSFTNGP04.phx.gbl...
> Replied to [Paul Randall]s message :
> > I'm not good with Regular Expressions and I haven't tested this myself, but
> > isn't there a read-only RegExp.$n property? Or does it only work in
> > JScript?
>
>
> Only in JS :(
> support for Regular Expressions is much better in JS..

I wouldn't say it was 'much better', just marginally better. You can't
simply get a collection of matches each with its set of submatches in JS.
You need more code and multiple calls into the regex object to do it.

However the replace callback function is much more flexible in JS.
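
For comparison, a rough JavaScript sketch of both points: looping over exec() to build something like the collection VBScript's Execute returns in one call, and a replace callback that receives the submatch and position directly. The names collectMatches and tagInner, and the shape of the result objects, are invented for this example.

```javascript
// Build an array of { value, index, subMatches } by calling exec() repeatedly.
function collectMatches(pattern, text) {
  var re = new RegExp(pattern, "gi"); // "g" is needed so exec() advances
  var results = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    if (m[0] === "") { re.lastIndex++; } // avoid looping forever on empty matches
    results.push({ value: m[0], index: m.index, subMatches: m.slice(1) });
  }
  return results;
}

// The replace callback gets the whole match, each submatch, then the offset.
function tagInner(text) {
  return text.replace(/<div>([\s\S]*?)<\/div>/gi, function (whole, inner, offset) {
    return "[" + offset + ":" + inner.toUpperCase() + "]";
  });
}

var found = collectMatches("<div>([\\s\\S]*?)</div>", "<div>a</div>x<DIV>b</DIV>");
console.log(found.length);                          // 2
console.log(found[1].subMatches[0]);                // "b"
console.log(tagInner("<div>a</div>x<DIV>b</DIV>")); // "[0:A]x[13:B]"
```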



Re: VBScript/Regex Question by Drn

Drn
Tue Feb 27 10:55:04 CST 2007

THE SOLUTION:

Thank you all very much for your input. Nothing like posting a problem
to a group like this to realize what little I really know LOL.

Here is the ASP function I ended up using:

function turnOnEditableAreas(str)
    dim objRegEx, matches, matchText, matchTextNew
    set objRegEx = new regexp

    objRegEx.Pattern = "<div(?:.*?)class\s*=\s*""*editable""*.*?>([\s\S]*?)</div>"
    objRegEx.Global = true
    objRegEx.IgnoreCase = true

    ' matches(0) yields the whole matched <div>...</div> (default Value property)
    set matches = objRegEx.execute(str)
    matchText = matches(0)
    matchTextNew = replace(replace(replace(replace(replace(matchText, _
        "\", "\\"), "'", "\'"), vbcr, "\r"), vblf, "\n"), vbtab, "\t")

    ' the inner Replace reduces the escaped <div> to its contents ($1)
    str = objRegEx.Replace(str, "document.write('" & _
        objRegEx.Replace(matchTextNew, "$1") & "');")

    turnOnEditableAreas = str
end function

To clarify, for the future confused, my goal is to pass in the text of
a complete HTML page as a string. Buried in the HTML is a certain
<div> tag that ALWAYS has the attribute/value of class="editable". I
wanted to pluck out that <div> from the code, get the contents, escape
the characters that are not safe in JavaScript, and use the escaped
text in a JavaScript statement (hope that makes sense).

To clarify earlier points: as of right now, there is only 1 <div> tag in
the document that will match, and it is always guaranteed to be there.
My original, and still ideal, goal is to be able to do this to 0 or
more <div> tags in the document. Also, the contents of the <div>s DO
include HTML but, at this point in time, do NOT include any inner <div>
tags.

So, to explain the code: the regex pattern matches any <div> that has
the attribute/value class="editable". I then execute the regex and
take the first (and only, for now) match. I then call the regex
.Replace function, which replaces the matching <div> tag in the string
passed in to the function with a document.write() call containing the
JavaScript-escaped innards of the original <div> tag.
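
For the "0 or more <div> tags" case, here is a hedged sketch of the same transformation using a replace callback. It is JavaScript, not the VBScript above, so treat it as an illustration of the idea rather than a drop-in for the ASP function. The helper escapeForJsLiteral is made up for this example, and the pattern uses [^>]* inside the tag where the original uses .*?, which is my assumption, not part of the original.

```javascript
// Escape the characters the function above escapes: \ ' CR LF TAB.
function escapeForJsLiteral(text) {
  return text.replace(/[\\'\r\n\t]/g, function (ch) {
    switch (ch) {
      case "\r": return "\\r";
      case "\n": return "\\n";
      case "\t": return "\\t";
      default:   return "\\" + ch; // backslash or single quote
    }
  });
}

// Replace every <div ... class="editable" ...>...</div> with a document.write().
function turnOnEditableAreas(html) {
  var re = /<div[^>]*class\s*=\s*"?editable"?[^>]*>([\s\S]*?)<\/div>/gi;
  return html.replace(re, function (whole, inner) {
    return "document.write('" + escapeForJsLiteral(inner) + "');";
  });
}

console.log(turnOnEditableAreas('<div class="editable">It\'s here</div>'));
// document.write('It\'s here');
```

Because the callback runs once per match, a page with no editable div comes back unchanged and a page with several gets one document.write() per div. Text returned from a callback is also inserted literally, so a "$1" inside the div content is not treated as a replacement token. As before, inner <div> tags are not handled.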

Phew. Thanks again for the help and my apologies for being so
confusing :)

Peace,
DRN