Transform a table into a one-hot encoding of a single column's values

Problem description:

I have a table with two columns:

+---------+--------+
| keyword | color  |
+---------+--------+
| foo     | red    |
| bar     | yellow |
| fobar   | red    |
| baz     | blue   |
| bazbaz  | green  |
+---------+--------+

I need to apply some kind of one-hot encoding in PostgreSQL and transform the table into:

+---------+-----+--------+-------+------+
| keyword | red | yellow | green | blue |
+---------+-----+--------+-------+------+
| foo     |   1 |      0 |     0 |    0 |
| bar     |   0 |      1 |     0 |    0 |
| fobar   |   1 |      0 |     0 |    0 |
| baz     |   0 |      0 |     0 |    1 |
| bazbaz  |   0 |      0 |     1 |    0 |
+---------+-----+--------+-------+------+

Is it possible to do this with SQL only? Any tips on how to get started?

If I understand correctly, you need conditional aggregation: group by keyword, and for each color count only the rows whose color matches.

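select keyword,
       count(case when color = 'red' then 1 end) as red,
       count(case when color = 'yellow' then 1 end) as yellow,
       count(case when color = 'green' then 1 end) as green,
       count(case when color = 'blue' then 1 end) as blue
       -- add one count(...) column per additional color
from t
group by keyword;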
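
The same conditional aggregation can also be written with the aggregate FILTER clause, which PostgreSQL supports from version 9.4. The following is only a sketch: the table name t is taken from the query above, while the column types and the setup statements are assumptions, since the question does not give the table definition.

```sql
-- Assumed setup: table name t as in the query above; column types are a guess.
create table t (keyword text, color text);
insert into t (keyword, color) values
    ('foo',    'red'),
    ('bar',    'yellow'),
    ('fobar',  'red'),
    ('baz',    'blue'),
    ('bazbaz', 'green');

-- One output column per color; FILTER restricts count(*) to the matching rows.
select keyword,
       count(*) filter (where color = 'red')    as red,
       count(*) filter (where color = 'yellow') as yellow,
       count(*) filter (where color = 'green')  as green,
       count(*) filter (where color = 'blue')   as blue
from t
group by keyword;
```

If a keyword can appear more than once with the same color, count returns values greater than 1. One option in that case is max((color = 'red')::int), which keeps each column at 0 or 1. Either way the list of colors is fixed in the query text, so a new color needs a new column expression.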