jg
Fri Aug 19 00:38:22 CDT 2005
Great, it works even after taking out the $ and the spaces around the |. I
did add \b before the entire expression to make sure the first part of the
date starts on a word boundary. This way I can avoid some presumably
low-probability errors, such as odd catalogue numbers in dot or dash notation.
Now all I have to do is make it work with January, February, ... (fully
spelled month names). I guess I can always add another 12 | parts to the
month expressions.
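
For anyone following along, here is a rough sketch of that idea, written in Python rather than .NET, so treat it as an illustration and not the exact expression used above. The names MONTH, DATE_RE and find_dates are made up for this example. Two things differ from the original: Python's re module rejects duplicate group names, so each alternative gets numbered names (y1, m1, d1, ...), and the four alternatives are wrapped in a non-capturing group so that the leading \b applies to all of them and not only to the first. Instead of adding 12 more | parts, each abbreviation carries an optional remainder, for example MAR(?:CH)?.

```python
import re

# Abbreviation plus optional remainder, so "MAR" and "MARCH" both match.
MONTH = (
    r"JAN(?:UARY)?|FEB(?:RUARY)?|MAR(?:CH)?|APR(?:IL)?|MAY|JUNE?|"
    r"JULY?|AUG(?:UST)?|SEP(?:TEMBER)?|OCT(?:OBER)?|NOV(?:EMBER)?|DEC(?:EMBER)?"
)

# The four alternatives from the thread, without the trailing $ anchors.
# The (?: ... ) wrapper makes the leading \b cover every alternative.
PATTERN = r"""
    \b(?:
        (?P<y1>\d{4})[-./\s](?P<m1>\d{1,2})[-./\s](?P<d1>\d{1,2})
      | (?P<y2>\d{4})[-\s](?P<m2>%(month)s)[-\s](?P<d2>\d{1,2})
      | (?P<m3>%(month)s)\s(?P<d3>\d{1,2}),\s*?(?P<y3>\d{4}|\d{2})
      | (?P<d4>\d{1,2})\s(?P<m4>%(month)s)\s(?P<y4>\d{4}|\d{2})
    )
""" % {"month": MONTH}

# re.VERBOSE and re.IGNORECASE play the role of the IgnoreWhitespace and
# IgnoreCase options mentioned in the thread.
DATE_RE = re.compile(PATTERN, re.VERBOSE | re.IGNORECASE)


def _pick(groups, prefix):
    """Return the value from whichever numbered group (e.g. y1..y4) matched."""
    return next(
        groups[prefix + n] for n in "1234" if groups[prefix + n] is not None
    )


def find_dates(text):
    """Return (year, month, day) string triples for each date found in text."""
    found = []
    for match in DATE_RE.finditer(text):
        groups = match.groupdict()
        found.append(
            (_pick(groups, "y"), _pick(groups, "m").upper(), _pick(groups, "d"))
        )
    return found


if __name__ == "__main__":
    sample = "Shipped 2005-03-08, returned March 8, 2005 and again 8 mar 05."
    for triple in find_dates(sample):
        print(triple)
```

The values come back as raw strings (two-digit years and month names are not normalised), so converting them to real dates is still a separate step.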
"jg" <junk@mail.pls> wrote in message
news:%23VC0r1BpFHA.3380@TK2MSFTNGP12.phx.gbl...
> That is absolutely wonderful and helpful. Thank you very much. Your
> efforts are much appreciated.
> Thank you very much again for testing and explaining.
>
> I will try that out.
>
> "Oliver Sturm" <oliver@sturmnet.org> wrote in message
> news:%23HZlBO9oFHA.2472@TK2MSFTNGP15.phx.gbl...
>> jg wrote:
>>
>>> I know I have to deal with
>>> yyyy-mm-dd ( and variants thereof with dot or slash as separator
>>> instead of dash, single digit month or day)
>>> yyyy-MMM-dd ( or just space instead of -)
>>> MMM d, yy ( or yyyy)
>>> and the tougher ones like
>>> d MMM yyyy
>>> d MMM yy
>>
>> I have created a regex for you that works with all those samples. Here it
>> is:
>>
>> (?<year>\d{4})[-\./\s](?<month>\d{1,2})[-\./\s](?<day>\d{1,2})$ |
>> (?<year>\d{4})[-\s](?<month>JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)[-\s](?<day>\d{1,2})$
>> |
>> (?<month>JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)\s(?<day>\d{1,2}),\s*?(?<year>\d{4}|\d{2})$
>> |
>> (?<day>\d{1,2})\s(?<month>JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)\s(?<year>\d{4}|\d{2})$
>>
>> I tried this with the following samples, constructed from the templates
>> you gave:
>>
>> 2005-03-08
>> 2005.03.08
>> 2005/03/08
>> 2005 03 08
>> 2005 3 08
>> 2005 3 8
>> 2005 03 8
>> 2005-MAR-08
>> 2005 MAR 08
>> 2005 MAR 8
>> MAR 8, 2005
>> MAR 08, 2005
>> MAR 8, 05
>> MAR 08, 05
>> 8 MAR 2005
>> 8 MAR 05
>> 08 MAR 2005
>> 08 MAR 05
>>
>> As you can see, the expression consists of four different parts. Each
>> of these has a $ sign at the end, which you'll want to get rid of before
>> using the expression with your own long string. The $ is only needed to
>> test the expression in Regulator with multiple samples.
>>
>> I tried this with the IgnoreWhitespace and the IgnoreCase options
>> switched on.
>>
>> Hope this helps!
>>
>> (If you have any trouble with the regex, I could send you the saved
>> Regulator file. Just in case things get mangled in the message or
>> something.)
>>
>>
>> Oliver Sturm
>> --
>> omnibus ex nihilo ducendis sufficit unum
>> Spaces inserted to prevent google email destruction:
>> MSN oliver @ sturmnet.org Jabber sturm @ amessage.de
>> ICQ 27142619
>> http://www.sturmnet.org/blog
>
>