Hi,

This seems a bit of an old chestnut but please humour me ;)

My aim is to compare two arrays and identify values in List A which
don't appear in List B. And vice versa. My arrays are one dimensional
and contain names. The number of elements in each array is not fixed.

I've done some searching and there are two basic algorithms. The first
contains a double loop where each element of the inner loop is compared
to elements in the outer loop. At best it kicks out of the inner loop
when a match is found.

The second method seems to be more elegant but assumes the two arrays
have already been sorted. Basically the indices of the two arrays are
incremented when an unmatched value is found. Example code can be
found below; it works ... until you place a unique value as the last
element :(

For example, if I have:

ListA = Array("Andy", "Bob", "Cath", "Dexter")
ListB = Array("Andy", "Bob", "Cath")

BOTH "Cath" and "Dexter" are found to be unique! I switched the values
of list A to B and came up with the same poor result.

I'd appreciate it if someone could give this a fresh look and find the
logic error. I've been staring at it for hours :(

Many thanks in advance,

Andy

Script:
Option Explicit

Dim ListA ' To contain a sorted list of names, unknown size.
Dim ListB ' To contain a sorted list of names, unknown size.
Dim lngAIdx ' Index for list A.
Dim lngBIdx ' Index for list B.
Dim lngMaxAIdx ' Upper value of the A list index.
Dim lngMaxBIdx ' Upper value of the B list index.
Dim intComparison ' Result of comparing two strings
'
ListA = Array("Andy", "Bob", "Cath", "Dexter")
ListB = Array("Bob", "Cath", "Dexter")

lngAIdx = LBound(ListA)
lngBIdx = LBound(ListB)
lngMaxAIdx = UBound(ListA)
lngMaxBIdx = UBound(ListB)
'
Do While (lngAIdx < lngMaxAIdx And lngBIdx < lngMaxBIdx)
    'Perform a case insensitive comparison.
    intComparison = StrComp(ListA(lngAIdx), ListB(lngBIdx), vbTextCompare)
    If intComparison = -1 Then ' ListA(lngAIdx) < ListB(lngBIdx)
        ' Value in A list is not present in the B list.
        WScript.Echo "Unique A Value: " & ListA(lngAIdx)
        lngAIdx = lngAIdx + 1 ' Update the A list index.
    ElseIf intComparison = 1 Then ' ListA(lngAIdx) > ListB(lngBIdx)
        ' Value in B list is not present in the A list.
        WScript.Echo "Unique B Value: " & ListB(lngBIdx)
        lngBIdx = lngBIdx + 1 ' Update the B list index.
    Else ' intComparison = 0
        ' The value appears in both lists
        lngAIdx = lngAIdx + 1
        lngBIdx = lngBIdx + 1
    End If
Loop
'
'Having reached this point, one of the lists has finished.
'Display the remaining unique values.
If lngAIdx < lngMaxAIdx Then
    For lngAIdx = lngAIdx To lngMaxAIdx
        WScript.Echo "Remaining Unique A value: " & ListA(lngAIdx)
    Next
End If
'
If lngBIdx < lngMaxBIdx Then
    For lngBIdx = lngBIdx To lngMaxBIdx
        WScript.Echo "Remaining Unique B value: " & ListB(lngBIdx)
    Next
End If

WScript.Quit

Re: Compare the values of two sorted arrays of variable size. by Richard

Richard
Mon Jun 20 20:05:16 CDT 2005

Hi,

I don't know where your arrays come from, so perhaps it would help if one
(or both) are dictionary objects. Dictionary objects are key, item pairs,
where the key values must be unique. The object has an efficient Exists
method that makes it easy to check for duplicates.

Even if you start with your two arrays, maybe you can create dictionary
objects, then use them to check the arrays. In brief (and not tested):

' Create two dictionary objects.
Set objListA = CreateObject("Scripting.Dictionary")
Set objListB = CreateObject("Scripting.Dictionary")
' Make comparisons case insensitive.
objListA.CompareMode = vbTextCompare
objListB.CompareMode = vbTextCompare

' Enumerate all elements of array ListA.
For j = 0 To UBound(ListA)
    ' Check if this element is already in dictionary object A.
    If Not objListA.Exists(ListA(j)) Then
        ' Add to dictionary object.
        objListA(ListA(j)) = True
    End If
Next

' Enumerate the second array and find elements missing from the first.
For j = 0 To UBound(ListB)
    ' Check if this element is in the first array.
    If Not objListA.Exists(ListB(j)) Then
        Wscript.Echo ListB(j) & " is in ListB but not ListA"
    End If
    ' Check if this element is already in dictionary object B.
    If Not objListB.Exists(ListB(j)) Then
        ' Add to dictionary object.
        objListB(ListB(j)) = True
    End If
Next

' Enumerate the first array and find elements missing from the second.
For j = 0 To UBound(ListA)
    ' Check if this element is in the second array.
    If Not objListB.Exists(ListA(j)) Then
        Wscript.Echo ListA(j) & " is in ListA but not ListB"
    End If
Next

In these dictionary objects, the name in the array is the key value. For the
item value I simply use True, as I only care about existence.

--
Richard
Microsoft MVP Scripting and ADSI
Hilltop Lab web site - http://www.rlmueller.net



Re: Compare the values of two sorted arrays of variable size. by Rafael

Rafael
Mon Jun 20 20:37:53 CDT 2005

Andy,

This algorithm does not check every possible location in each array.
The easiest way to check would be two loops (like the first idea you had).

Grab one item from the first array and compare it against all items in the
second array, then move to the second item and compare it against all items
in the second array, and so on.
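
The two-loop approach described above can be sketched as follows. This is a minimal, hedged example rather than anyone's posted code: the ReportMissing helper and its parameter names are made up for illustration, and the sample lists are the ones from the question.

```vbscript
Option Explicit

Dim ListA, ListB

ListA = Array("Andy", "Bob", "Cath", "Dexter")
ListB = Array("Andy", "Bob", "Cath")

' Run the check in both directions.
ReportMissing ListA, ListB, "Unique A value: "
ReportMissing ListB, ListA, "Unique B value: "

' Hypothetical helper: echo every element of arrSource that has no
' case-insensitive match anywhere in arrOther.
Sub ReportMissing(arrSource, arrOther, strLabel)
    Dim i, j, blnFound
    For i = LBound(arrSource) To UBound(arrSource)
        blnFound = False
        For j = LBound(arrOther) To UBound(arrOther)
            If StrComp(arrSource(i), arrOther(j), vbTextCompare) = 0 Then
                blnFound = True
                Exit For ' Stop scanning once a match is found.
            End If
        Next
        If Not blnFound Then WScript.Echo strLabel & arrSource(i)
    Next
End Sub
```

With these sample lists only "Dexter" is reported, as a unique A value. Neither list needs to be sorted, but the work grows with the product of the two list lengths.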

If you need an example, let me know. If you need an explanation of why the
code you have does not work as expected, let me know and I will explain it
to you.
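
As for why the posted script misbehaves, one likely reading is that the sorted-list walk itself is sound and the trouble is in the bounds tests. UBound returns the last valid index, so testing with < stops one element early: with the lists from the question, the loop exits before "Cath" is compared with "Cath", and the clean-up loop then reports both "Cath" and "Dexter". A sketch of the same walk with <= tests, assuming both arrays are already sorted case-insensitively and hold no duplicates:

```vbscript
Option Explicit

Dim ListA, ListB, lngAIdx, lngBIdx, lngMaxAIdx, lngMaxBIdx, intComparison

ListA = Array("Andy", "Bob", "Cath", "Dexter")
ListB = Array("Andy", "Bob", "Cath")

lngAIdx = LBound(ListA)
lngBIdx = LBound(ListB)
lngMaxAIdx = UBound(ListA)
lngMaxBIdx = UBound(ListB)

' UBound is the last valid index, so the tests use <= rather than <.
Do While lngAIdx <= lngMaxAIdx And lngBIdx <= lngMaxBIdx
    intComparison = StrComp(ListA(lngAIdx), ListB(lngBIdx), vbTextCompare)
    If intComparison < 0 Then
        ' The A value sorts first, so it cannot appear later in B.
        WScript.Echo "Unique A value: " & ListA(lngAIdx)
        lngAIdx = lngAIdx + 1
    ElseIf intComparison > 0 Then
        ' The B value sorts first, so it cannot appear later in A.
        WScript.Echo "Unique B value: " & ListB(lngBIdx)
        lngBIdx = lngBIdx + 1
    Else
        ' Same value in both lists: advance both indices.
        lngAIdx = lngAIdx + 1
        lngBIdx = lngBIdx + 1
    End If
Loop

' Whatever is left in either list has no partner in the other.
Do While lngAIdx <= lngMaxAIdx
    WScript.Echo "Remaining unique A value: " & ListA(lngAIdx)
    lngAIdx = lngAIdx + 1
Loop
Do While lngBIdx <= lngMaxBIdx
    WScript.Echo "Remaining unique B value: " & ListB(lngBIdx)
    lngBIdx = lngBIdx + 1
Loop
```

With the sample lists this prints only "Remaining unique A value: Dexter". Each element is visited once, so the cost grows with the sum of the two list lengths rather than their product.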
RT



Re: Compare the values of two sorted arrays of variable size. by Csaba

Csaba
Mon Jun 20 20:54:21 CDT 2005

You're looking for aRay1 - aRay2 and aRay2 - aRay1:

'Assume aRay1 and aRay2
Dim aRay2m1, aRay1m2
'First we need a quick way to test whether an element is in aRay1
Set oDict = CreateObject("Scripting.Dictionary")
For i = LBound(aRay1) To UBound(aRay1)
    If Not oDict.Exists(aRay1(i)) Then oDict.Add aRay1(i), True
Next
'This section computes all values in aRay2 not in aRay1
aRay2m1 = Array()
For i = LBound(aRay2) To UBound(aRay2)
    If oDict.Exists(aRay2(i)) Then
        'Mark the key as also present in aRay2
        oDict.Item(aRay2(i)) = False
    Else
        Append aRay2m1, aRay2(i)
    End If
Next
'This section computes all values in aRay1 not in aRay2
aRay1m2 = Array()
For Each i In oDict.Keys
    If oDict.Item(i) Then Append aRay1m2, i
Next

Sub Append(ByRef aRay, val)
    'An empty array from Array() has UBound = -1, one below LBound
    If UBound(aRay) < LBound(aRay) Then
        aRay = Array(val)
    Else
        ReDim Preserve aRay(UBound(aRay) + 1)
        aRay(UBound(aRay)) = val
    End If
End Sub

Csaba Gabor from Vienna


Re: Compare the values of two sorted arrays of variable size. by Andy

Andy
Tue Jun 21 01:55:50 CDT 2005

Hi Richard, Rafael & Csaba,

Many thanks for your time and help. I was missing the fact that not
every location is checked. Although the possibility of duplicate
entries is unlikely, I'll keep this test too.

For those of you who are curious, the names are actually the addresses
of web sites known to send spam. The two lists represent an old list
and a newly updated list. New entries are identified and email filters
updated.

Thanks again,

Andy


Re: Compare the values of two sorted arrays of variable size. by Rafael

Rafael
Tue Jun 21 02:42:17 CDT 2005

Andy,

It would be better to have a database to store all those web sites, as
databases keep information in a way that is easier to manage.
VBScript can interact with Access and SQL Server very easily. Other
databases also work as long as an OLE connection can be created.

Also there are websites that track those addresses, so maybe your best bet
would be to link to those databases rather than create one yourself.

RT



Re: Compare the values of two sorted arrays of variable size. by Andy

Andy
Tue Jun 21 04:09:31 CDT 2005

Hi Rafael,

The source of the spamming domains is a web page:
http://spamcheck.freeapp.net/top-sites-domains

The page has one domain name per line and there are approximately 2500
domains listed.

I download the page (already automated) and can use a VBScript to
read the domain names into a dictionary object ;-) The previously
downloaded page can be opened and its contents placed in another
dictionary object. The two lists are compared and new anti-spam email
rules generated accordingly.
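
A rough sketch of that workflow, for anyone following along. It assumes the old and new downloads have been saved as plain text with one domain per line; both file names and the LoadDomains helper are hypothetical placeholders, not part of my actual script.

```vbscript
Option Explicit

Const ForReading = 1

Dim objFSO, objOld, objNew, strDomain

Set objFSO = CreateObject("Scripting.FileSystemObject")

' Both file names are hypothetical; substitute the real download paths.
Set objOld = LoadDomains("old-domains.txt")
Set objNew = LoadDomains("new-domains.txt")

' Report every domain that appears only in the new list.
For Each strDomain In objNew.Keys
    If Not objOld.Exists(strDomain) Then
        WScript.Echo "New domain: " & strDomain
    End If
Next

' Hypothetical helper: read one domain per line into a
' case-insensitive dictionary, skipping blank lines.
Function LoadDomains(strPath)
    Dim objDict, objFile, strLine
    Set objDict = CreateObject("Scripting.Dictionary")
    objDict.CompareMode = vbTextCompare
    Set objFile = objFSO.OpenTextFile(strPath, ForReading)
    Do Until objFile.AtEndOfStream
        strLine = Trim(objFile.ReadLine)
        If Len(strLine) > 0 Then objDict(strLine) = True
    Loop
    objFile.Close
    Set LoadDomains = objDict
End Function
```

Because the dictionary keys are unique, duplicate lines in either file collapse automatically, and neither file needs to be sorted.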

The use of a database (MS Access / sql server) is not yet justified,
but thanks for the advice.

Cheers,

Andy