How to Replace a String in Python
The post content below is copied from Real Python.
Replacing strings in Python is a fundamental skill. You can use the .replace() method for straightforward replacements, while re.sub() allows for more advanced pattern matching and replacement. Both of these tools help you clean and sanitize text data.
In this tutorial, you’ll work with a chat transcript to remove or replace sensitive information and unwanted words with emojis. To achieve this, you’ll use both direct replacement methods and more advanced regular expressions.
By the end of this tutorial, you’ll understand that:
- You can replace strings in Python using the .replace() method and re.sub().
- You replace parts of a string by chaining .replace() calls or using regex patterns with re.sub().
- You replace a letter in a string by specifying it as the first argument in .replace().
- You remove part of a string by replacing it with an empty string using .replace() or re.sub().
- You replace all occurrences of substrings in a string by using .replace().
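The points above can be sketched in a few lines. The sample string and variable names here are made up for illustration and aren't part of the tutorial's transcript:

```python
import re

text = "spam, spam, and eggs"  # hypothetical sample string

# .replace() swaps every occurrence of a substring.
all_replaced = text.replace("spam", "ham")
print(all_replaced)  # ham, ham, and eggs

# A single letter works the same way: pass it as the first argument.
letter_replaced = text.replace("s", "S")
print(letter_replaced)  # Spam, Spam, and eggS

# Replacing with an empty string removes the substring.
removed = text.replace(", and eggs", "")
print(removed)  # spam, spam

# Chained calls handle several different substrings.
chained = text.replace("spam", "ham").replace("eggs", "toast")
print(chained)  # ham, ham, and toast

# re.sub() takes a regex pattern instead of a literal substring.
pattern_replaced = re.sub(r"sp\w+", "ham", text)
print(pattern_replaced)  # ham, ham, and eggs
```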
You’ll be playing the role of a developer for a company that provides technical support through a one-to-one text chat. You’re tasked with creating a script that’ll sanitize the chat, removing any personal data and replacing any swear words with emojis.
You’re only given one very short chat transcript:
[support_tom] 2025-01-24T10:02:23+00:00 : What can I help you with?
[johndoe] 2025-01-24T10:03:15+00:00 : I CAN'T CONNECT TO MY BLASTED ACCOUNT
[support_tom] 2025-01-24T10:03:30+00:00 : Are you sure it's not your caps lock?
[johndoe] 2025-01-24T10:04:03+00:00 : Blast! You're right!
Even though this transcript is short, it’s typical of the type of chats that agents have all the time. It has user identifiers, ISO time stamps, and messages.
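To make the structure of a line concrete, here's a minimal sketch that splits one transcript line into its three parts. It assumes every line follows the `[user] timestamp : message` layout shown above, and the variable names are illustrative only:

```python
from datetime import datetime

line = "[johndoe] 2025-01-24T10:04:03+00:00 : Blast! You're right!"

# Separate the header from the message at the first " : ".
header, message = line.split(" : ", 1)

# The header holds the bracketed user identifier and the ISO timestamp.
user, timestamp = header.split(" ")
user = user.strip("[]")

# datetime.fromisoformat() understands this timestamp, including the UTC offset.
when = datetime.fromisoformat(timestamp)

print(user)     # johndoe
print(when)     # 2025-01-24 10:04:03+00:00
print(message)  # Blast! You're right!
```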
In this case, the client johndoe filed a complaint, and company policy is to sanitize and simplify the transcript, then pass it on for independent evaluation. Sanitizing the message is your job!
The first thing you’ll want to do is to take care of any swear words.
How to Remove or Replace a Python String or Substring
The most basic way to replace a string in Python is to use the .replace() string method:
>>> "Fake Python".replace("Fake", "Real")
'Real Python'
As you can see, you can chain .replace() onto any string and provide the method with two arguments. The first is the string that you want to replace, and the second is the replacement.
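Beyond those two arguments, .replace() also accepts an optional third argument that limits how many occurrences are replaced. The tutorial text here doesn't use it, so treat this as a side sketch with a made-up string:

```python
title = "Fake Python, Fake Python"  # hypothetical sample string

# With a count of 1, only the first occurrence is replaced.
first_only = title.replace("Fake", "Real", 1)
print(first_only)  # Real Python, Fake Python

# Without a count, every occurrence is replaced.
every = title.replace("Fake", "Real")
print(every)  # Real Python, Real Python
```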
Note: Although the Python shell displays the result of .replace(), the string itself stays unchanged. You can see this more clearly by assigning your string to a variable:
>>> name = "Fake Python"
>>> name.replace("Fake", "Real")
'Real Python'
>>> name
'Fake Python'
>>> name = name.replace("Fake", "Real")
>>> name
'Real Python'
Notice that when you simply call .replace(), the value of name doesn’t change. But when you assign the result of name.replace() to the name variable, 'Fake Python' becomes 'Real Python'.
Now it’s time to apply this knowledge to the transcript:
>>> transcript = """\
... [support_tom] 2025-01-24T10:02:23+00:00 : What can I help you with?
... [johndoe] 2025-01-24T10:03:15+00:00 : I CAN'T CONNECT TO MY BLASTED ACCOUNT
... [support_tom] 2025-01-24T10:03:30+00:00 : Are you sure it's not your caps lock?
... [johndoe] 2025-01-24T10:04:03+00:00 : Blast! You're right!"""
>>> print(transcript.replace("BLASTED", "😤"))
[support_tom] 2025-01-24T10:02:23+00:00 : What can I help you with?
[johndoe] 2025-01-24T10:03:15+00:00 : I CAN'T CONNECT TO MY 😤 ACCOUNT
[support_tom] 2025-01-24T10:03:30+00:00 : Are you sure it's not your caps lock?
[johndoe] 2025-01-24T10:04:03+00:00 : Blast! You're right!
Loading the transcript as a triple-quoted string and then using the .replace() method on one of the swear words works fine. But there’s another swear word that’s not getting replaced because in Python, the string needs to match exactly:
>>> "Fake Python".replace("fake", "Real")
'Fake Python'
As you can see, even if the casing of just one letter doesn’t match, no replacement happens. This means that if you’re using the .replace() method, you’ll need to call it once for each variation. In this case, you can just chain on another call to .replace():
>>> print(transcript.replace("BLASTED", "😤").replace("Blast", "😤"))
[support_tom] 2025-01-24T10:02:23+00:00 : What can I help you with?
[johndoe] 2025-01-24T10:03:15+00:00 : I CAN'T CONNECT TO MY 😤 ACCOUNT
[support_tom] 2025-01-24T10:03:30+00:00 : Are you sure it's not your caps lock?
[johndoe] 2025-01-24T10:04:03+00:00 : 😤! You're right!
Success! But you’re probably thinking that this isn’t the best way to do this for something like a general-purpose transcription sanitizer. You’ll want to move toward some way of having a list of replacements, instead of having to type out .replace() each time.
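One way to keep the replacements in a list is sketched below. The names replacements and sanitize() are illustrative assumptions, not necessarily what the full article uses:

```python
transcript = """\
[support_tom] 2025-01-24T10:02:23+00:00 : What can I help you with?
[johndoe] 2025-01-24T10:03:15+00:00 : I CAN'T CONNECT TO MY BLASTED ACCOUNT
[support_tom] 2025-01-24T10:03:30+00:00 : Are you sure it's not your caps lock?
[johndoe] 2025-01-24T10:04:03+00:00 : Blast! You're right!"""

# Hypothetical list of (old, new) pairs; add entries as new words turn up.
replacements = [
    ("BLASTED", "😤"),
    ("Blast", "😤"),
]


def sanitize(text, pairs):
    """Apply each (old, new) replacement in order and return the new string."""
    for old, new in pairs:
        # Strings are immutable, so reassign the result on every pass.
        text = text.replace(old, new)
    return text


sanitized = sanitize(transcript, replacements)
print(sanitized)
```

Because the pairs are applied in order, the sequence matters whenever one target string contains another. Here the match is case-sensitive, so "BLASTED" and "Blast" don't interfere with each other.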
Read the full article at https://realpython.com/replace-string-python/ »
January 15, 2025 at 07:30PM