How to add a suffix and prefix to all columns in a Python/PySpark data frame

Problem description:

I have a data frame in PySpark with more than 100 columns. For every column name, I would like to add a backtick (`) at the start and at the end of the name.

For example:

The column name is testing user. I want `testing user`.

Is there a way to do this in PySpark/Python? Applying the code should return a data frame.

You can use the DataFrame's withColumnRenamed method, which returns a new DataFrame with the given column renamed:

df.withColumnRenamed('testing user', '`testing user`')
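Since the question covers every column rather than a single one, the same call can be repeated over df.columns. The following is a minimal sketch, not a definitive implementation: the helper name wrap_columns_one_by_one and its prefix/suffix parameters are hypothetical, and it assumes df is a PySpark DataFrame (any object exposing columns and withColumnRenamed would work).

```python
from functools import reduce


def wrap_columns_one_by_one(df, prefix="`", suffix="`"):
    """Rename every column of df to prefix + name + suffix.

    Hypothetical helper: it chains one withColumnRenamed call per
    column and returns the resulting new DataFrame.
    """
    return reduce(
        # acc is the DataFrame built so far; name is an original column name
        lambda acc, name: acc.withColumnRenamed(name, prefix + name + suffix),
        df.columns,  # read once from the original DataFrame
        df,          # starting value for the fold
    )
```

Usage would look like new_df = wrap_columns_one_by_one(df). Each withColumnRenamed call adds a step to the query plan, so for frames with hundreds of columns, renaming all columns in one call may be cheaper to plan.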

Edit: if you have a list of column names, you can do the following:

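old = "First Last Age"
# Wrap each name in backticks; split() breaks names that contain spaces, so use df.columns for those
new = ["`" + field + "`" for field in old.split()]
# toDF(*names) renames the columns positionally and returns a new DataFrame, with no round trip through the RDD
df.toDF(*new)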

Output:

DataFrame[`First`: string, `Last`: string, `Age`: string]
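The list-based approach above hard-codes the names in a string, which does not scale to 100+ columns and splits names such as testing user in two. A hedged variant, assuming df is a PySpark DataFrame, reads the names from df.columns instead. The function name wrap_all_columns and its parameters are hypothetical, chosen only for illustration.

```python
def wrap_all_columns(df, prefix="`", suffix="`"):
    """Return a new DataFrame whose column names are prefix + name + suffix.

    Hypothetical helper: it builds the full list of new names from
    df.columns and applies them in a single toDF call.
    """
    # df.columns keeps each name intact, including names with spaces
    new_names = [prefix + name + suffix for name in df.columns]
    # toDF takes the new names as positional arguments, in column order
    return df.toDF(*new_names)
```

Calling wrap_all_columns(df) on a frame with the columns First, Last, Age should yield the same schema as the output shown above, and passing a different prefix or suffix covers the general case in the title.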