Let’s say I have this pandas Series:
$ python3 -c 'import pandas as pd; print(pd.Series(["1","2","3","4"]))'
0 1
1 2
2 3
3 4
dtype: object
I'd like to "wrap" the strings "1", "2", "3", "4" so that they are prefixed with "a" and suffixed with "b"; that is, I want to get "a1b", "a2b", "a3b", "a4b". So I tried Series.str.replace (https://pandas.pydata.org/docs/reference/api/pandas.Series.str.replace.html):
$ python3 -c 'import pandas as pd; print(pd.Series(["1","2","3","4"]).str.replace("(.*)", r"a\1b", regex=True))'
0 a1bab
1 a2bab
2 a3bab
3 a4bab
dtype: object
So I did get "1" wrapped into "a1b", but then "ab" is repeated one more time?
(Trying this regex on regex101.com, I noticed that I get the same "ghost copies" of "ab" at the end if the g flag is enabled; so maybe pandas .str.replace somehow enables it? But the default is flags=0 for pandas .str.replace, as per the docs?!)
How can I get the entire contents of a column cell “wrapped” in only those characters that I want?
Change (.*) to (.+), so that the pattern can no longer match the empty string left over at the end of each value:
$ python3 -c 'import pandas as pd; print(pd.Series(["1","2","3","4"]).str.replace("(.+)", r"a\1b", regex=True))'
0 a1b
1 a2b
2 a3b
3 a4b
dtype: object
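As a rough illustration of why this helps, the same effect can be reproduced with Python's re module alone, with no pandas involved. The sketch below assumes Python 3.7 or later, where re.sub also replaces an empty match that directly follows a non-empty one; the variable names are only illustrative.

```python
import re

# (.*) first matches "1", then matches the empty string at the end of the input.
matches = re.findall(r"(.*)", "1")

# Every match is replaced, so the empty match contributes an extra "ab".
greedy_star = re.sub(r"(.*)", r"a\1b", "1")

# (.+) needs at least one character, so there is no second, empty match.
greedy_plus = re.sub(r"(.+)", r"a\1b", "1")

print(matches)      # ['1', '']
print(greedy_star)  # a1bab
print(greedy_plus)  # a1b
```

So no hidden flag is involved: replacing all matches is the default behaviour, and regex101 only shows the second match once its g flag is switched on.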
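Another possible solution, without a regex, is plain string concatenation:
import pandas as pd
s = pd.Series(range(1, 5))
# Convert the integers to str, then add the prefix and suffix.
'a' + s.astype(str) + 'b'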
Output:
0 a1b
1 a2b
2 a3b
3 a4b
dtype: object
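If the Series already holds strings, as in the question, the astype(str) step should not be needed. The sketch below assumes the object-dtype Series from the question and shows the concatenation next to an anchored variant of the regex; the names wrapped_concat and wrapped_regex are made up for this example.

```python
import pandas as pd

s = pd.Series(["1", "2", "3", "4"])

# Concatenation works element-wise on a Series of strings.
wrapped_concat = "a" + s + "b"

# With ^ and $ (and no re.MULTILINE flag), the pattern can only match
# starting at position 0, so the trailing empty match cannot occur.
wrapped_regex = s.str.replace(r"^(.*)$", r"a\1b", regex=True)

print(wrapped_concat.tolist())  # ['a1b', 'a2b', 'a3b', 'a4b']
print(wrapped_regex.tolist())   # ['a1b', 'a2b', 'a3b', 'a4b']
```

Unlike (.+), the anchored (.*) form would also wrap an empty string into "ab", which may or may not be what you want.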