Split a string on the first space
I'd like to split a vector of character strings (people's names) into two columns (vectors). The problem is that some people have a two-word last name. I'd like to split the first and last names into two columns. I can split out and take the first names using the code below, but the last name eludes me. (Look at obs 29 in the sample set below to get an idea: the Ford has a "last name" of Pantera L that must be kept together.)
What I have attempted so far:
x <- rownames(mtcars)
# Splits on a space followed by anything, so only the first word of each name is returned
unlist(strsplit(x, " .*"))
What I would like it to look like:
   MANUF    MAKE
27 Porsche  914-2
28 Lotus    Europa
29 Ford     Pantera L
30 Ferrari  Dino
31 Maserati Bora
32 Volvo    142E
The regular expression rexp matches the word at the start of the string, an optional space, then the rest of the string. The parentheses are subexpressions accessed as backreferences \\1 and \\2.
rexp <- "^(\\w+)\\s?(.*)$"
y <- data.frame(MANUF = sub(rexp, "\\1", x), MAKE = sub(rexp, "\\2", x))
tail(y)
#    MANUF    MAKE
# 27 Porsche  914-2
# 28 Lotus    Europa
# 29 Ford     Pantera L
# 30 Ferrari  Dino
# 31 Maserati Bora
# 32 Volvo    142E
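
If you would rather avoid backreferences, the same split can probably be done by locating the first space and cutting the string there. This is only a minimal sketch in base R, not part of the answer above; the names pos, manuf, make and y2 are made up for illustration, and it assumes a plain space character is the only separator you care about.

```r
# Names taken from the built-in mtcars data set, as in the question.
x <- rownames(mtcars)

# Position of the first literal space in each name; -1 when there is none.
pos <- as.integer(regexpr(" ", x, fixed = TRUE))

# Everything before the first space (the whole string if there is no space).
manuf <- ifelse(pos > 0, substr(x, 1, pos - 1), x)

# Everything after the first space (an empty string if there is no space).
make <- ifelse(pos > 0, substr(x, pos + 1, nchar(x)), "")

# Illustrative result, with the same column names as in the question.
y2 <- data.frame(MANUF = manuf, MAKE = make, stringsAsFactors = FALSE)
tail(y2)
```

Names without a space, such as Valiant, end up with an empty MAKE, which is also what the regular expression version gives because the space in rexp is optional.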