Selecting the number of leaf nodes of a dendrogram in heatmap.2 in R

Problem description:

In MATLAB you can specify the number of nodes to plot as part of the dendrogram function: dendrogram(tree,P) generates a dendrogram plot with no more than P leaf nodes.

My attempts to do the same with heatmap.2 in R have failed miserably. Posts on Stack Overflow and Biostars suggest using cutree, but heatmap.2 fails when I follow their suggestions for the Rowv option. Here "TAD" is the data matrix, 8 columns by 831 rows.

# cluster it
hr <- hclust(dist(TAD, method="manhattan"), method="average")

# draw the heat map
heatmap.2(TAD, main="Hierarchical Cluster",
          Rowv=as.dendrogram(cutree(hr, k=5)),
          Colv=NA, dendrogram="row", col=my_palette, density.info="none", trace="none")

This returns the message:

Error in UseMethod("as.dendrogram") : 
  no applicable method for 'as.dendrogram' applied to an object of class "c('integer', 'numeric')"

Is using cutree the correct avenue to explore for plotting a restricted dendrogram? Is there an easier way to do this, akin to MATLAB?

The question is what you mean by "selecting number of leaf nodes".

The Rowv parameter in heatmap.2 needs a dendrogram or a TRUE/FALSE value. From the help file:

Rowv = determines if and how the row dendrogram should be reordered. By default, it is TRUE, which implies dendrogram is computed and reordered based on row means. If NULL or FALSE, then no dendrogram is computed and no reordering is done. If a dendrogram, then it is used "as-is", ie without any reordering. If a vector of integers, then dendrogram is computed and reordered based on the order of the vector.

So cutree(hr, k=5) returns a vector of integers that tells you which cluster each item belongs to in a cut that produces 5 clusters. That vector is not a tree, and as.dendrogram has no method for it, hence Rowv=as.dendrogram(cutree(hr, k=5)) throws an error.
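To make the difference concrete, here is a minimal sketch in base R. It uses the built-in mtcars data as a stand-in for the asker's TAD matrix (an assumption for illustration only), with the same distance and linkage settings as in the question:

```r
# Illustration on the built-in mtcars data, not the asker's TAD matrix.
x  <- as.matrix(mtcars)
hr <- hclust(dist(x, method = "manhattan"), method = "average")

# cutree() returns cluster memberships: one integer per row, not a tree.
groups <- cutree(hr, k = 5)
class(groups)

# as.dendrogram() does have a method for hclust objects,
# so it is the clustering itself that should be converted.
row_dend <- as.dendrogram(hr)
class(row_dend)
```

An object like row_dend is what Rowv expects; the groups vector is only useful for labeling or subsetting rows.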

If you want to highlight some of the branches in your tree, I invite you to look into the dendextend package to see which solution works best for you. Here is an example that may be what you are asking for:

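library(gplots)
library(dendextend)  # provides %>%, set() and rotate_DendSer() (the latter needs the DendSer package installed)
data(mtcars)
x  <- as.matrix(mtcars)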

# now let's spice up the dendrograms a bit:
Rowv  <- x %>% dist %>% hclust %>% as.dendrogram %>%
   set("branches_k_color", k = 3) %>% set("branches_lwd", 4) %>%
   rotate_DendSer(ser_weight = dist(x))
Colv  <- x %>% t %>% dist %>% hclust %>% as.dendrogram %>%
   set("branches_k_color", k = 2) %>% set("branches_lwd", 4) %>%
   rotate_DendSer(ser_weight = dist(t(x)))

heatmap.2(x, Rowv = Rowv, Colv = Colv)

This produces the following output: [image not preserved]

Consider also looking at the recently published dendextend tutorial. You may want to work with the branches_attr_by_labels function (in the tutorial, it is under the section "Adjusting branches based on labels"), which lets you manipulate dendrograms to create plots such as this: [image not preserved]

If what you want is to remove nodes and leave only a few of them to be plotted, you should probably just create the heatmap for a subset of the data. You can also look at the prune function in dendextend (for the general purpose of looking at smaller dendrograms), but for a heatmap it is better to work with just the relevant subset of your data.
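One way the subset approach might look is sketched below, again on mtcars rather than the asker's data. Picking the largest of the 5 clusters is an arbitrary choice made only for illustration, and the heatmap.2 call is left as a comment because it requires gplots:

```r
# Illustration on the built-in mtcars data, not the asker's TAD matrix.
x  <- as.matrix(mtcars)
hr <- hclust(dist(x, method = "manhattan"), method = "average")

# Use cutree() for what it is good at: assigning each row to a cluster.
groups <- cutree(hr, k = 5)

# Keep only the rows of one cluster (here the largest, an arbitrary choice).
biggest <- as.integer(names(which.max(table(groups))))
x_sub   <- x[groups == biggest, , drop = FALSE]

# Re-cluster the subset so the dendrogram matches the rows being plotted.
hr_sub   <- hclust(dist(x_sub, method = "manhattan"), method = "average")
sub_dend <- as.dendrogram(hr_sub)

# With gplots loaded, the smaller heatmap could then be drawn as:
# heatmap.2(x_sub, Rowv = sub_dend, Colv = NA, dendrogram = "row", trace = "none")
```

Because the dendrogram is rebuilt from the subset, its leaves and the heatmap rows stay in one-to-one correspondence, which is not guaranteed if a tree is pruned separately from the data.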