I have the following DataFrame, df, to process:
Name                City
Hat, Richards       Paris
Adams               New york
Tim, Mathews        Sanfrancisco
chris, Moya De      Las Vegas
kate, Moris         Atlanta
Grisham HA          Middleton
James, Tom, greval  Rome
My expected DataFrame is as follows:
Name        Last_name   City
Hat         Richards    Paris
            Adams       New york
Tim         Mathews     Sanfrancisco
chris       Moya De     Las Vegas
kate        Moris       Atlanta
            Grisham HA  Middleton
James, Tom  greval      Rome
The split should happen at the last ','. If a name contains no ',', the whole word or phrase should go into the 'Last_name' column and the 'Name' column should stay empty.
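The rule above can be sketched for a single string in plain Python. This is only an illustration of the requirement, not one of the answers; the helper name split_last_comma is hypothetical, and it assumes the separator is always a comma followed by one space.

```python
def split_last_comma(full_name):
    """Return (name, last_name), splitting on the last ', ' only."""
    # str.rpartition splits at the LAST occurrence of the separator.
    # When the separator is absent it returns ('', '', full_name),
    # so a value without a comma lands entirely in last_name.
    name, _sep, last_name = full_name.rpartition(", ")
    return name, last_name


samples = ["Hat, Richards", "Adams", "James, Tom, greval"]
for s in samples:
    print(repr(s), "->", split_last_comma(s))
```

The pandas answers apply the same idea to a whole column at once.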
3 Answers
#1
Use str.rsplit with radd to prepend ', ' to every value, then str.lstrip to remove it again:
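# Prepend ', ' so every value has a separator, then split once from the right
df[['first','last']] = df['Name'].radd(', ').str.rsplit(', ', n=1, expand=True)
# Strip the leading ', ' that radd added
df['first'] = df['first'].str.lstrip(', ')
print(df)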
   Name                City          first       last
0  Hat, Richards       Paris         Hat         Richards
1  Adams               New york                  Adams
2  Tim, Mathews        Sanfrancisco  Tim         Mathews
3  chris, Moya De      Las Vegas     chris       Moya De
4  kate, Moris         Atlanta       kate        Moris
5  Grisham HA          Middleton                 Grisham HA
6  James, Tom, greval  Rome          James, Tom  greval
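To see why prepending ', ' makes the no-comma rows work, the three steps can be traced on one string with the built-in str methods. This is a sketch of the per-value logic, assuming the pandas .str methods behave like their plain-string counterparts here; the function name trace_last_comma_split is hypothetical.

```python
def trace_last_comma_split(name):
    """Mimic radd(', ') -> rsplit(', ', n=1) -> lstrip(', ') on one string."""
    prefixed = ", " + name                  # radd(', '): guarantees a separator
    first, last = prefixed.rsplit(", ", 1)  # split once, starting from the right
    first = first.lstrip(", ")              # remove the leading ', ' added above
    return first, last


print(trace_last_comma_split("Adams"))
print(trace_last_comma_split("James, Tom, greval"))
```

One caveat: lstrip(', ') removes any leading commas and spaces, not just the added prefix, so a first name that itself starts with one of those characters would be trimmed as well.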
#2
Use str.split with n=1 (the default is n=-1, which splits on every separator; change it as needed). Note that str.split with n=1 splits at the first ', ', while str.rsplit with n=1 splits at the last one.
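# Split once on ', '; rows without a comma get None in column 1, which ffill copies over from column 0
newdf = df.Name.str.split(', ', expand=True, n=1).ffill(axis=1)
# Equal columns mean there was no comma, so blank the first-name column
newdf.loc[newdf[0] == newdf[1], 0] = ''
newdf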
Out[923]:
   0      1
0  Hat    Richards
1         Adams
2  Tim    Mathews
3  chris  Moya De
4  kate   Moris
5         Grisham HA
df[['Name','LastName']]=newdf
df
Out[925]:
   Name   City          LastName
0  Hat    Paris         Richards
1         New york      Adams
2  Tim    Sanfrancisco  Mathews
3  chris  Las Vegas     Moya De
4  kate   Atlanta       Moris
5         Middleton     Grisham HA
#3
Quick and dirty: use pandas.Series.str.split, then str[::-1] to reverse the order of each split list:
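# Split on ', ', reverse each list so the last name comes first, then expand the lists into columns
df[['Last_name', 'Name']] = df.Name.str.split(', ').str[::-1].apply(pd.Series)
df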
   Name   City          Last_name
0  Hat    Paris         Richards
1  NaN    New york      Adams
2  Tim    Sanfrancisco  Mathews
3  chris  Las Vegas     Moya De
4  kate   Atlanta       Moris
5  NaN    Middleton     Grisham HA
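As a further sketch, not taken from any of the answers, Series.str.rpartition splits at the last separator and leaves the part before it empty when there is no separator, which matches the stated rule directly. The sample data is rebuilt from the question; the intermediate name parts is an assumption made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Hat, Richards", "Adams", "Tim, Mathews", "chris, Moya De",
             "kate, Moris", "Grisham HA", "James, Tom, greval"],
    "City": ["Paris", "New york", "Sanfrancisco", "Las Vegas",
             "Atlanta", "Middleton", "Rome"],
})

# rpartition returns three columns: 0 = text before the last ', ',
# 1 = the separator itself, 2 = text after it.
parts = df["Name"].str.rpartition(", ")
df["Name"] = parts[0]        # empty string when the value has no comma
df["Last_name"] = parts[2]   # the whole value when it has no comma
df = df[["Name", "Last_name", "City"]]
print(df)
```

Unlike a plain split with n=1, this keeps "James, Tom" together in the Name column for the three-part row.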