re.sub(pattern, repl, string, count=0, flags=0)
Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn’t found, string is returned unchanged. repl can be a string or a function; if it is a string, any backslash escapes in it are processed. That is, \n is converted to a single newline character, \r is converted to a carriage return, and so forth. Unknown escapes such as \& are left alone. Backreferences, such as \6, are replaced with the substring matched by group 6 in the pattern. For example:
>>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
...        r'static PyObject*\npy_\1(void)\n{',
...        'def myfunc():')
'static PyObject*\npy_myfunc(void)\n{'
If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string. For example:
>>> def dashrepl(matchobj):
...     if matchobj.group(0) == '-': return ' '
...     else: return '-'
>>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
'pro--gram files'
>>> re.sub(r'\sAND\s', ' & ', 'Baked Beans And Spam', flags=re.IGNORECASE)
'Baked Beans & Spam'
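A replacement function can also compute its result from the matched text instead of choosing between fixed strings. The following is a minimal sketch; the helper name double_number and the sample input are made up for illustration.

```python
import re


def double_number(matchobj):
    """Return the matched integer doubled, as a string."""
    # group(0) is the whole match; the return value must be a string.
    return str(int(matchobj.group(0)) * 2)


# Every run of digits is passed to double_number in turn.
doubled = re.sub(r'\d+', double_number, 'a1 b22 c333')
print(doubled)
```

The function receives one match object per occurrence, so it can use any of its methods, such as group() or start(), to build the replacement.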
The pattern may be a string or an RE object.
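As a short illustration of the two pattern forms, the sketch below passes the same expression once as a string and once as a compiled object. The names WHITESPACE and text are assumptions chosen for the example.

```python
import re

# Hypothetical example pattern: one or more whitespace characters.
WHITESPACE = re.compile(r'\s+')

text = 'Baked   Beans \t and  Spam'

# Pattern given as a plain string.
via_string = re.sub(r'\s+', ' ', text)

# Pattern given as a compiled RE object.
via_module = re.sub(WHITESPACE, ' ', text)

# The compiled object also offers the same operation as a method.
via_method = WHITESPACE.sub(' ', text)

print(via_string)
```

All three calls produce the same result. Compiling once is mostly useful when the same pattern is applied many times.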
The optional argument count is the maximum number of pattern occurrences to be replaced; count must be a non-negative integer. If omitted or zero, all occurrences will be replaced. Empty matches for the pattern are replaced only when not adjacent to a previous match, so sub('x*', '-', 'abc') returns '-a-b-c-'.
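The effect of count, and the empty-match case quoted above, can be checked with a small sketch. The sample string is invented for the example, and count is passed by keyword to keep the call unambiguous.

```python
import re

text = 'a-b-c-d'

# count omitted (or 0): every occurrence is replaced.
all_replaced = re.sub('-', '+', text)

# count=2: only the two leftmost occurrences are replaced.
first_two = re.sub('-', '+', text, count=2)

# 'x*' matches the empty string at each position of 'abc'.
empty_matches = re.sub('x*', '-', 'abc')

print(all_replaced, first_two, empty_matches)
```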
In string-type repl arguments, in addition to the character escapes and backreferences described above, \g<name> will use the substring matched by the group named name, as defined by the (?P<name>...) syntax. \g<number> uses the corresponding group number; \g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character '0'. The backreference \g<0> substitutes in the entire substring matched by the RE.
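The \g<...> forms can be tried out as follows. This is only a sketch: the date pattern, the group names year, month and day, and the sample inputs are assumptions made for the example.

```python
import re

# Hypothetical pattern with three named groups.
date_re = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'

# \g<name> refers to a named group.
by_name = re.sub(date_re, r'\g<day>/\g<month>/\g<year>', '2015-09-13')

# \g<number> refers to the same groups by position.
by_number = re.sub(date_re, r'\g<3>/\g<2>/\g<1>', '2015-09-13')

# \g<2>0 is group 2 followed by a literal '0'.
group_then_zero = re.sub(r'(a)(b)', r'\g<2>0', 'ab')

# \20 is read as group 20, which this pattern does not have.
try:
    re.sub(r'(a)(b)', r'\20', 'ab')
    raised = False
except re.error:
    raised = True

# \g<0> stands for the entire match.
whole_match = re.sub(r'\d+', r'[\g<0>]', 'room 101')

print(by_name, group_then_zero, whole_match)
```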
Changed in version 3.1: Added the optional flags argument.
Changed in version 3.5: Unmatched groups are replaced with an empty string.
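A small sketch of the 3.5 behaviour, assuming Python 3.5 or later; the pattern with an optional second group is invented for the example.

```python
import re

# Group 2 is optional, so it does not take part in every match.
pattern = r'(\w+)(-\d+)?'

# For 'beta', group 2 is unmatched and contributes an empty string.
result = re.sub(pattern, r'<\1\2>', 'alpha-1 beta')
print(result)
```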
Deprecated since version 3.5, will be removed in version 3.6: Unknown escapes consisting of '\' and an ASCII letter now raise a deprecation warning and will be forbidden in Python 3.6.
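When the replacement really should contain a backslash followed by a letter, two approaches avoid relying on unknown escapes. The sketch below is one possible way to do it, not the only one; the inputs are made up for the example.

```python
import re

# Option 1: double the backslash in the replacement string, so the
# template contains an escaped backslash followed by a plain 'd'.
doubled = re.sub('x', r'\\d', 'axb')

# Option 2: use a function; its return value is inserted as-is,
# without backslash-escape processing.
via_function = re.sub('x', lambda m: r'\d', 'axb')

print(doubled)
```

Both calls insert the two characters backslash and d in place of each x.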