Regular Expressions' Journal
 

    Thursday, August 2nd, 2007
    5:29 pm
    [owenblacker]
    Possible .Net regex bug?

    For reasons too dull to explain, I’m trying to use regular expressions to post-process an HTML stream. I want to find all anchor (<a>) tags that link within our site, identified in this case by the domain name.

    My regular expression looks right to me, but .Net is convinced I don’t have enough closing parentheses. I’ve added line breaks for clarity:

       (?<=<a[^>]* href=['"]?)
       (?<before>https?://[a-z0-9.-]*uswitch\.[a-z]+/[-\w_,.%/~]+)
       (?<querystring>\??[-\w&=~]*)
       (?<fragment>#?[-\w&=~]*)
       (?=['"]?[^>]*>)

    I’ve tested both the above code with the line breaks removed and the original code (which is compiled with RegexOptions.IgnorePatternWhitespace and has embedded comments for ease of maintenance). Either way, I get a System.ArgumentException: parsing "..." - Not enough )'s.

    Yet I’m quite certain they’re perfectly matched.

    Anyone?

    Cross-posted to [info]ms_dot_net.
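
    One thing that might be worth checking: RegexOptions.IgnorePatternWhitespace also enables comments that start with an unescaped # and run to the end of the line, so the # in the fragment group could be swallowing that group's closing parenthesis. The sketch below is only an analogy in Python, whose re.VERBOSE flag treats # the same way. It tests a simplified stand-in for the fragment group alone (Python needs (?P<name>...) for named groups and fixed-width lookbehinds, so the full pattern does not port directly), and the helper names are made up for illustration.

```python
import re

# Simplified stand-ins for the post's fragment group. Python spells named
# groups (?P<name>...) where .NET accepts (?<name>...).
UNESCAPED = r"(?P<fragment>#?[-\w&=~]*)"   # '#' left bare, as in the post
ESCAPED = r"(?P<fragment>\#?[-\w&=~]*)"    # '#' escaped so it stays literal


def compiles_verbose(pattern):
    """Return True if the pattern compiles with re.VERBOSE enabled."""
    try:
        re.compile(pattern, re.VERBOSE)
        return True
    except re.error:
        # In verbose mode the bare '#' starts a comment, so the closing
        # parenthesis is never seen and the group is left unterminated.
        return False


def fragment_of(text):
    """Return the 'fragment' group if the escaped pattern matches all of text."""
    match = re.compile(ESCAPED, re.VERBOSE).fullmatch(text)
    return match.group("fragment") if match else None


if __name__ == "__main__":
    print(compiles_verbose(UNESCAPED))
    print(compiles_verbose(ESCAPED))
    print(fragment_of("#top"))
```

    If the same thing is happening in .Net, writing the character as \# or [#] in the original pattern may be enough. This would only explain the failure in tests where the option was still switched on.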



    Current Mood: frustrated
    Current Music: Elton John — Goodbye Yellow Brick Road