I made a typo in the code I sent.
This
exPattern = @"(\D+[\.]* +\D* *\d+ *,0x *\d+)";
should be
exPattern = @"(\D+[\.]* +\D* *\d+ *, *\d+)";
Even with the correction, I am experiencing the same behavior.
I made the change that you suggested:
Match mt = rx.Match(inText);
and saw it take a long time. This is a little different from the scenario
that I was seeing. In the program I sent, it would get past the
match in the blink of an eye. The place where it hangs is
accessing the Count property:
mc = rx.Matches(inText);
if (mc != null)
{
count += mc.Count; <=== HERE
If I am using the debugger, and step so I am sitting on the
if (mc != null)
line, drag the mc variable to the watch window, and expand it, the
operation will take about 15 seconds. Looking at the fields, several
of the fields show errors:
Count = "error: cannot obtain value"
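The difference between the two calls can probably be reproduced in isolation. The following is only a minimal sketch, assuming (as the .NET documentation describes) that Regex.Matches fills its MatchCollection lazily, so the scan is paid for when Count is read rather than when Matches returns. The class name and the inText value are hypothetical stand-ins, not my real program or documents, and no particular timings are implied:

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

class LazyMatchesDemo
{
    static void Main()
    {
        string exPattern = @"(\D+[\.]* +\D* *\d+ *, *\d+)";

        // Hypothetical input: many space-separated words and no digits at all,
        // so every attempt to match has to backtrack and then fail.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++)
        {
            sb.Append("lorem ipsum ");
        }
        string inText = sb.ToString();

        Regex rx = new Regex(exPattern);

        Stopwatch sw = Stopwatch.StartNew();
        MatchCollection mc = rx.Matches(inText);   // no scanning is forced here
        Console.WriteLine("Matches(): {0} ms", sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        int count = mc.Count;                      // forces the whole input to be scanned
        Console.WriteLine("Count = {0}: {1} ms", count, sw.ElapsedMilliseconds);
    }
}
```

If that assumption holds, the watch window would be doing the same work as the Count line when it expands mc, which may be why it stalls and then reports an error.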
"Can you tell me what kind of pattern you are going to match so that we can
figure out another pattern for it?"
I am processing documents in multiple languages with dates / times / numerics
in different formats. The date formats I am handling are:
MM/dd/yy
MM/dd/yyyy
ddd, MMMM dd, yyyy
dd-MMM-yyyy
dd MMMM yyyy
MMM. dd, yy
MMMM dd, yyyy
yyyy/MM/dd
Some languages use two words to represent a single month. Taking
this into account, as well as different separators and numerics that are
sometimes preceded by a zero, I have boiled the above formats down to 4 patterns:
1) exPattern = @"(\d+(?<mark>[-| |/|\.])[^\d|^ ]+ *[^\d|^ ]*\k<mark>\d+)";
2) exPattern = @"(\d+ *(?<mark>[ |/|\.|-]) *\d+ *\k<mark> *\d+)";
3) exPattern = @"(\D+[\.]* +\d+ *, *\d+)";
4) exPattern = @"(\D+[\.]* +\D* *\d+ *, *\d+)";
The first 3 patterns work without any problem. The 4th one is the one that
causes the hang.
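One possible reason pattern 4 differs from pattern 3: in pattern 4 both \D+ and \D* can consume spaces and each other's text, so on input with no digits the engine may try a very large number of ways to split the same run of characters before giving up. As a stopgap, the scan can be bounded with a match timeout. This is only a sketch, assuming a framework version that has the Regex constructor overload taking a TimeSpan (.NET Framework 4.5 or later). CountMatches, the 2-second limit, and the sample text are hypothetical choices for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

class TimeoutDemo
{
    // Returns the number of matches, or -1 if the scan exceeded the timeout.
    static int CountMatches(string pattern, string text, TimeSpan timeout)
    {
        Regex rx = new Regex(pattern, RegexOptions.None, timeout);
        try
        {
            // Count forces the lazy MatchCollection to be populated,
            // so a timeout would surface here rather than at Matches().
            return rx.Matches(text).Count;
        }
        catch (RegexMatchTimeoutException)
        {
            return -1;
        }
    }

    static void Main()
    {
        string exPattern = @"(\D+[\.]* +\D* *\d+ *, *\d+)";
        // Hypothetical sample text containing one "MMMM dd, yyyy" style date.
        string inText = "Printed on September 14, 2004 in draft form.";

        int count = CountMatches(exPattern, inText, TimeSpan.FromSeconds(2));
        Console.WriteLine(count < 0 ? "timed out" : "matches: " + count);
    }
}
```

A timeout does not fix the pattern itself. It only turns a hang into an error that the calling code can handle.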
Thanks,
Dave