Replacing Unicode characters in a pandas DataFrame column
I have a problem with a pandas DataFrame that, among other things, contains the number of rooms in an apartment (stored as strings).
The data contains the Unicode character u"\u00BD" (½) (https://www.fileformat.info/info/unicode/char/00bd/index.htm).
How do I efficiently replace this character with a decimal value, so that the data reads 2.5, 3.5, 4.5, etc. (still as strings) instead of containing the Unicode character?
It currently looks like this: 2½, 3½, 4½, etc.
I want the values in the column to be 2.5, 3.5, 4.5, etc.
You can fix your column with:
df['rooms'] = df['rooms'].str.replace("½", ".5")
To convert it to float as well:
df['rooms'] = df['rooms'].str.replace("½", ".5").apply(float)
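For reference, here is a minimal, self-contained sketch of the approach above. The sample values are made up for illustration, and the column name rooms is taken from the answer. Passing regex=False is an optional addition of mine that makes the literal (non-regex) match explicit, and astype(float) is used as an alternative to apply(float) that should behave the same for this data.

```python
import pandas as pd

# Hypothetical sample data; the real DataFrame has other columns as well.
df = pd.DataFrame({"rooms": ["2½", "3½", "4", "4½"]})

# Replace the literal character U+00BD with ".5"; the values remain strings.
rooms_str = df["rooms"].str.replace("\u00BD", ".5", regex=False)

# Optional: convert the cleaned strings to floats.
rooms_float = rooms_str.astype(float)

print(rooms_str.tolist())    # ['2.5', '3.5', '4', '4.5']
print(rooms_float.tolist())  # [2.5, 3.5, 4.0, 4.5]
```

Values without the ½ character (such as "4") pass through the replacement unchanged, so the same line works on mixed data.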