Regular Expression Problem - PHP

  1. #1
    Flash M0nkey
    Join Date
    Sep 2001
    Posts
    3,447

    Regular Expression Problem - PHP

    OK, I'm at my wits' end with this one. No luck on Google either, and it feels like I'm beating my head against a brick wall.

    The problem: I have a string containing some HTML source. Within this source there are various links to different locations.

    I need to find all the links and append a set of variables to each of them using a regular expression.

    But I need to ensure that the URLs remain properly formed.

    Examples:

    <a href="www.somesite.com">Some Site!</a>
    Becomes - <a href="www.somesite.com?var1=foo&var2=bar">Some Site!</a>

    <a href="http://www.somesite.com/mydir/index.php?somevar=value" title="A link" class="myLinks">Some Site!</a>
    Becomes - <a href="http://www.somesite.com/mydir/index.php?var1=foo&var2=bar&somevar=value" title="A link" class="myLinks">Some Site!</a>

    So basically I need to ensure that all other information (the other attributes and any existing query string) remains untouched.

    Any help to save me from going nuts would be appreciated.

    cheers

    v_Ln

  2. #2
    Jaded Network Admin nebulus200's Avatar
    Join Date
    Jun 2002
    Posts
    1,356
    Heh, you have my sympathies... Matching or search/replacing URLs can be very, very tricky because webmasters are often sloppy and certainly not consistent.

    For example:

    href=www.somesite.com/dir/subdir/subdir/../../url
    href=www.somesite.com/dir/url
    href=www.somesite.com/dir/
    href='www.somesite.com/dir'
    href="www.somesite.com/dir"
    href="www.somesite.com/dir%2fsubdir%2fsubdir%2f%2e%2e%2f%2e%2e%2f"
    onClick=document.open('www.somesite.com/dir')

    Those can all point at essentially the same place, and I am sure there are tons of other variations. Do you have any control over what is being searched/replaced, or do you have to take all the millions of little variations into account?
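
    If the input is limited to the quoted and unquoted href forms listed above, one way to approach it is preg_replace_callback() with one alternative per quoting style. This is only a minimal sketch, not a full HTML parser: the function names add_query_vars and append_vars_to_links are made up for illustration, the variables are assumed to be URL-encoded already, and links built in script (like the onClick case) are left alone.

    ```php
    <?php
    // Hypothetical helper: put $vars at the front of a URL's query string,
    // keeping any existing query and #fragment intact.
    function add_query_vars(string $url, string $vars): string
    {
        // Split off a #fragment so the variables land before it.
        $fragment = '';
        $hashPos = strpos($url, '#');
        if ($hashPos !== false) {
            $fragment = substr($url, $hashPos);
            $url = substr($url, 0, $hashPos);
        }

        $queryPos = strpos($url, '?');
        if ($queryPos === false) {
            // No query string yet: start one.
            return $url . '?' . $vars . $fragment;
        }

        // Existing query string: new variables first, old ones after.
        $base = substr($url, 0, $queryPos);
        $query = substr($url, $queryPos + 1);
        $joined = ($query === '') ? $vars : $vars . '&' . $query;

        return $base . '?' . $joined . $fragment;
    }

    // Hypothetical helper: rewrite every href value found in $html.
    function append_vars_to_links(string $html, string $vars): string
    {
        // Group 1 is the "href=" prefix. The URL is in group 2 (double-quoted),
        // group 3 (single-quoted) or group 4 (unquoted).
        $pattern = '/(\bhref\s*=\s*)(?:"([^"]*)"|\'([^\']*)\'|([^\s"\'>]+))/i';

        $callback = function (array $m) use ($vars): string {
            // PHP drops trailing unmatched groups, so the highest index
            // that is set tells us which quoting style matched.
            if (isset($m[4])) {
                $quote = '';
                $url = $m[4];
            } elseif (isset($m[3])) {
                $quote = "'";
                $url = $m[3];
            } else {
                $quote = '"';
                $url = $m[2];
            }

            // Leave empty, fragment-only, mailto: and javascript: links alone.
            if ($url === '' || $url[0] === '#'
                || preg_match('/^(mailto|javascript):/i', $url)) {
                return $m[0];
            }

            return $m[1] . $quote . add_query_vars($url, $vars) . $quote;
        };

        // preg_replace_callback() returns null on error; fall back to the input.
        return preg_replace_callback($pattern, $callback, $html) ?? $html;
    }

    // Usage with the variables from the first post.
    echo append_vars_to_links(
        '<a href="www.somesite.com">Some Site!</a>',
        'var1=foo&var2=bar'
    ), "\n";
    ```

    Everything outside the href value (title, class, link text) passes through unchanged because only the matched attribute is rebuilt. The pattern would also match attributes such as data-href, so tighten it if that matters for your source.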

    EDIT: Somewhere in addicts I posted a script that parses through and breaks down URLs; you might look for it (it was Perl). On another site (I think you know the one I mean) I have an updated Ruby version.
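
    Since the question is about PHP, breaking a URL into its parts can also be tried with the built-in parse_url() and parse_str() functions. This is a small sketch of that idea, not the Perl or Ruby script mentioned above; the variable names are only for illustration.

    ```php
    <?php
    // Break a full URL (taken from the first post) into its components.
    $parts = parse_url('http://www.somesite.com/mydir/index.php?somevar=value');
    // $parts now has 'scheme', 'host', 'path' and 'query' keys.

    // Without a scheme, parse_url() cannot tell a host from a path,
    // so the whole string comes back under 'path'.
    $bare = parse_url('www.somesite.com/dir');

    // parse_str() turns the query string into an associative array.
    parse_str($parts['query'], $query);

    print_r($parts);
    print_r($bare);
    print_r($query);
    ```

    The scheme-less case is worth noting: sloppy links like the ones above often omit http://, so the host cannot be relied on to show up under the 'host' key.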
    There is only one constant, one universal, it is the only real truth: causality. Action. Reaction. Cause and effect...There is no escape from it, we are forever slaves to it. Our only hope, our only peace is to understand it, to understand the 'why'. 'Why' is what separates us from them, you from me. 'Why' is the only real social power, without it you are powerless.

    (Merovingian - Matrix Reloaded)
