How To Get Re.sub To Add A Single Backslash Between Group Placeholders

When trying to build a regex to escape single quotes (') in a string, I can't get the right number of backslashes to produce the desired output (\'). data=''' { value1: 'b

Solution 1:

I think it's important to point out there is a fundamental difference between how a string is represented vs how it is printed.

When you evaluate re.sub() in the interactive console, what appears on screen is the repr() of the returned string, not the string as it would be printed. In the repr(), every literal backslash is shown doubled.

A good way to see the difference:

>>> import re
>>> x = re.sub(r'(.*?)(".*?)', r'\1\\\2', data, flags=re.MULTILINE)  # flags must be passed by keyword; the 4th positional argument is count
>>> x
'    {\n    value1: \\"blah\\",\n    value2: \'foo<a href=\\"example.com\\">bar</a>\',\n}'
>>> print(x)
    {
    value1: \"blah\",
    value2: 'foo<a href=\"example.com\">bar</a>',
}

Notice the PRINTED string has the right number of backslashes in front of the double quotes.
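One way to confirm this without relying on how the console displays the string is to count the characters directly. The sketch below uses a short, hypothetical sample string (not the data from the question) and the same pattern and replacement as above.

```python
import re

# Hypothetical sample input; the question's original data is not reproduced here.
sample = 'value1: "blah"'

# Same pattern and replacement as above: put one backslash before each double quote.
escaped = re.sub(r'(.*?)(".*?)', r'\1\\\2', sample)

# Count the actual characters in the result, independent of display.
backslash_count = escaped.count('\\')
quote_count = escaped.count('"')

print(escaped)                       # the string as printed
print(repr(escaped))                 # the string as the console would echo it
print(backslash_count, quote_count)  # one backslash per double quote
```

If the two counts match, each quote received exactly one backslash, even though the repr() output shows two.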

Explanation

The difference is between str() and repr().

repr() shows you the "code equivalent" of the string. If you were to directly copy and paste it into your script it would create the string properly.

str() shows you how the string would look when printing it.
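A minimal sketch of that difference, using a made-up three-character string. The use of ast.literal_eval here is only an illustration of the "copy and paste it into your script" idea, not something from the original answer.

```python
import ast

s = 'a\\b'  # three characters: a, one backslash, b

as_str = str(s)    # what print() would show
as_repr = repr(s)  # the source-code form, with quotes and a doubled backslash

print(as_str)
print(as_repr)
print(len(s), len(as_repr))

# Parsing the repr() as a Python literal gives back the original string.
round_trip = ast.literal_eval(as_repr)
print(round_trip == s)
```

The doubled backslash exists only in the repr() text; the string itself still contains a single backslash.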

What is most likely causing the confusion is that when you evaluate an expression in the console, it effectively does the following without telling you:

>>> x
# is the equivalent of 
>>> print(repr(x))
# but not at all the same thing as 
>>> print(x)
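This can be checked from a script as well. The sketch below assumes the default interactive echo is what sys.__displayhook__ implements, and captures its output to compare it with the two print() calls. The capture helper and the sample string are hypothetical names introduced for illustration.

```python
import contextlib
import io
import sys

x = 'value1: \\"blah\\"'  # hypothetical string containing two real backslashes


def capture(func, *args):
    """Run func(*args) and return whatever it wrote to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        func(*args)
    return buf.getvalue()


echoed = capture(sys.__displayhook__, x)  # what typing `x` at the prompt shows
printed_repr = capture(print, repr(x))
printed_str = capture(print, x)

print(echoed == printed_repr)  # the echo matches print(repr(x))
print(echoed == printed_str)   # and differs from print(x)
```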
