Split a string on the first space

Question:

I'd like to split a vector of character strings (people's names) into two columns (vectors). The problem is that some people have a two-word last name. I'd like to split the first and last names into two columns. I can split out and take the first names using the code below, but the last name eludes me. (Look at obs 29 in the sample set below to get an idea: the Ford has a "last name" of Pantera L that must be kept together.)

What I have attempted so far:

x<-rownames(mtcars)
unlist(strsplit(x, " .*"))

What I'd like it to look like:

            MANUF       MAKE
27          Porsche     914-2
28          Lotus       Europa
29          Ford        Pantera L
30          Ferrari     Dino
31          Maserati    Bora
32          Volvo       142E

Answer

The regular expression rexp matches the word at the start of the string, an optional space, then the rest of the string. The parentheses are subexpressions accessed as backreferences \\1 and \\2.

rexp <- "^(\\w+)\\s?(.*)$"
y <- data.frame(MANUF=sub(rexp,"\\1",x), MAKE=sub(rexp,"\\2",x))
tail(y)
#       MANUF      MAKE
# 27  Porsche     914-2
# 28    Lotus    Europa
# 29     Ford Pantera L
# 30  Ferrari      Dino
# 31 Maserati      Bora
# 32    Volvo      142E
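
To see how the same pattern treats the edge cases, here is a minimal sketch on a small hand-picked vector. The names x_small, manuf and make are made up for illustration and are not part of the original answer; the three strings are taken from rownames(mtcars). Because the space is optional in the pattern, a name with no space at all should end up with an empty second part.

```r
# Hypothetical sample: a two-word "last name", a name with no space,
# and a name with two spaces.
x_small <- c("Ford Pantera L", "Valiant", "Hornet 4 Drive")

# Same pattern as above: first word, optional space, rest of the string.
rexp <- "^(\\w+)\\s?(.*)$"

manuf <- sub(rexp, "\\1", x_small)  # first capture group: the leading word
make  <- sub(rexp, "\\2", x_small)  # second capture group: everything after it

data.frame(MANUF = manuf, MAKE = make)
```

Only the first space acts as the separator, so "Pantera L" and "4 Drive" stay together, while "Valiant" keeps its full name in MANUF and gets an empty string in MAKE.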
