  1. MariaDB Server
  2. MDEV-6965

non-captured group \2 in regexp_replace

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.11
    • Fix Version/s: 10.0.15
    • Component/s: OTHER
    • Labels:

      Description

      select regexp_replace('1 foo and bar', '(\\d+) foo and (\\d+ )?bar', '\\1 this and \\2that')
      

      expected result: 1 this and that
      actual result: 1 this and 2that


            Activity

            elenst Elena Stepanova added a comment (edited):

            Thinking about it, it sort of makes sense.
            The second group does not match an empty value; it is not set at all, so the ambiguous sequence backslash-backslash-2 defaults to '2'.
            It's counter-intuitive, though.

            My quick search through the PCRE documentation hasn't found anything saying whether this is intentional, so I'm assigning it to Alexander Barkov to confirm (or not).
            Or maybe there is a flag that controls it; I haven't found one either.
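
The distinction drawn above, between a group that matched an empty string and a group that was never set, can be sketched in JavaScript. This is only an illustration under stated assumptions: it runs on JavaScript's regex engine, not on the PCRE library used by MariaDB, and the names `subject`, `pattern`, `match`, and `replaced` are hypothetical.

```javascript
// Illustrative sketch only: JavaScript semantics, not MariaDB/PCRE.
const subject = "1 foo and bar";
const pattern = /(\d+) foo and (\d+ )?bar/;

const match = pattern.exec(subject);
console.log(match[1]);            // "1": group 1 took part in the match
console.log(match[2]);            // undefined: group 2 was never set

// In a JavaScript replacement string, a reference to an unset group
// expands to the empty string rather than to a literal "2".
const replaced = subject.replace(pattern, "$1 this and $2that");
console.log(replaced);            // "1 this and that"
```

The unset group is reported as `undefined`, not as `""`, which is the situation the replacement code has to decide how to handle.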

            julian.ladisch Julian Ladisch added a comment:

            Perl and JavaScript also produce the expected result "1 this and that":

            perl -e '$_="1 foo and bar\n"; s/(\d+) foo and (\d+ )?bar/\1 this and \2that/; print;'
            
            document.write("1 foo and bar".replace(/(\d+) foo and (\d+ )?bar/, "$1 this and $2that"));
            

            http://www.pcre.org/pcre.txt
            "PCRE_JAVASCRIPT_COMPAT
            If this option is set […] a back reference to an unset subpattern group matches an empty string (by default this causes the current matching alternative to fail). A pattern such as (\1)(a) succeeds when this option is set (assuming it can find an "a" in the subject), whereas it fails by default, for Perl compatibility."
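
The JavaScript behaviour that this option is named after can be checked with the same pattern from the quoted text. This is a minimal sketch on JavaScript's own engine, not a PCRE call; the variable name `selfRef` is hypothetical.

```javascript
// Illustrative sketch only: JavaScript semantics, not a PCRE call.
// Inside group 1, \1 refers to a group that is not set yet,
// so the back reference matches the empty string and the match succeeds.
const selfRef = /(\1)(a)/.exec("a");

console.log(selfRef !== null);            // true: the pattern matches
console.log(JSON.stringify(selfRef[1]));  // "" (group 1 captured an empty string)
console.log(selfRef[2]);                  // "a"
```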


              People

              • Assignee:
                bar Alexander Barkov
                Reporter:
                julian.ladisch Julian Ladisch