存储稀疏矩阵在.NET中的最佳方法

存储稀疏矩阵在.NET中的最佳方法

问题描述:

我们有一个存储一个稀疏矩阵的应用程序。这个矩阵具有大多是围绕矩阵的主对角线存在条目。我想知道是否有任何有效的算法(或现有的库),可以有效地处理这种稀疏矩阵? preferably,这将是一个通用的实现,其中每个矩阵条目可以是用户定义的类型。

We have an application that stores a sparse matrix. This matrix has entries that mostly exist around the main diagonal of the matrix. I was wondering if there were any efficient algorithms (or existing libraries) that can efficiently handle sparse matrices of this kind? Preferably, this would be a generic implementation where each matrix entry can be a user-defined type.

编辑在回答记者提问/响应:

Edit in response to a question/response:

当我说大多是围绕主对角线我的意思是大多数的矩阵的特征将是最条目聚集断的主对角线但有可能是零附近的对角,而且有可能是非零值远从对角线。我要为'最'的情况下这里的东西有效的。

When I say mostly around the main diagonal I mean that the characteristics of most of the matrices will be that most entries are clustered off of the main diagonal but there could be zeroes close to the diagonal and there could be non-zero values far out from the diagonal. I want something efficient for 'most' cases here.

我将要使用这个呢?我需要能够有高效的访问中的行或所有值的所有值中的一列。存储的值是布尔值。一个例子是:

What will I be using this for? I need to be able to have efficient access to all values in a row or all values in a column. The values stored would be Boolean values. An example would be:

  1. 对于行中的所有真值,的foreach列一个真正的出现在列中设置的所有条目的东西
  2. 对于连续全是假的值,设置入口的东西

这是所有做与链表previously但非常混乱来实现。我希望用一个稀疏矩阵我可以改进算法,但要找到正确的稀疏矩阵算法已被证明很难。

This was all done with linked lists previously but was very confusing to implement. I was hoping that with a sparse matrix I could improve the algorithm but finding the 'right' type of sparse matrix algorithm has proved difficult.

P.S。感谢您的答复迄今

p.s. Thanks for the responses thus far

您可以使用基于细胞的[行,列]的索引。由于数据是在一个对角线,储存的行索引的典型方法和相关联的列indeces与数据不是最佳的。下面是一些code,你可以用它来做到这一点:

You could use an index based on the [row,col] of the cell. Since the data is on a diagonal, the typical approach of storing the row index and the associated column indeces with data is not optimal. Here is some code you could use to do it:

    public class SparseMatrix<T>
    {
        public int Width { get; private set; }
        public int Height { get; private set; }
        public long Size { get; private set; }

        private Dictionary<long, T> _cells = new Dictionary<long, T>();

        public SparseMatrix(int w, int h)
        {
            this.Width = w;
            this.Height = h;
            this.Size = w * h;
        }

        public bool IsCellEmpty(int row, int col)
        {
            long index = row * Width + col;
            return _cells.ContainsKey(index);
        }

        public T this[int row, int col]
        {
            get
            {
                long index = row * Width + col;
                T result;
                _cells.TryGetValue(index, out result);
                return result;
            }
            set
            {
                long index = row * Width + col;
                _cells[index] = value;
            }
        }
    }

    static void Main()
    {
        var sm = new SparseMatrix<int>(512, 512);
        sm[42, 42] = 42;
        int val1 = sm[13, 13];
        int val2 = sm[42, 42];

        Console.WriteLine("VAL1 = " + val1); // prints out 0
        Console.WriteLine("VAL2 = " + val2); // prints out 42

        Console.ReadLine();
    }

请注意,当T是一个结构,您可能需要调用IsCellEmpty因为得到一个单元格的内容将不能为空,将有该类型的默认值。您还可以扩大code给你根据尺寸属性和 _cells.Count $ C $快速SparseRatio C>。

Note that when T is a struct, you might have to call the IsCellEmpty since getting the contents of a cell will not be null and will have the default value for that type. You can also expand the code to give you a quick "SparseRatio" based on the Size property and _cells.Count.

编辑:

好了,如果你有兴趣的是速度,你可以做权衡的空间VS速度。而不是只有一个解释,有三个!它三倍的空间,但它让你想真正的轻松任何方式枚举研究。下面是一些新的code,它表明:

Well, if you are interesting is speed, you can do the trade-off of space vs speed. Instead of having only one dictionary, have three! It triples your space, but it makes enumerating in any way you want real easy. Here is some new code that shows that:

    public class SparseMatrix<T>
    {
        public int Width { get; private set; }
        public int Height { get; private set; }
        public long MaxSize { get; private set; }
        public long Count { get { return _cells.Count; } }

        private Dictionary<long, T> _cells = new Dictionary<long, T>();

        private Dictionary<int, Dictionary<int, T>> _rows = 
            new Dictionary<int, Dictionary<int, T>>();

        private Dictionary<int, Dictionary<int, T>> _columns = 
            new Dictionary<int, Dictionary<int, T>>();

        public SparseMatrix(int w, int h)
        {
            this.Width = w;
            this.Height = h;
            this.MaxSize = w * h;
        }

        public bool IsCellEmpty(int row, int col)
        {
            long index = row * Width + col;
            return _cells.ContainsKey(index);
        }

        public T this[int row, int col]
        {
            get
            {
                long index = row * Width + col;
                T result;
                _cells.TryGetValue(index, out result);
                return result;
            }
            set
            {
                long index = row * Width + col;
                _cells[index] = value;

                UpdateValue(col, row, _columns, value);
                UpdateValue(row, col, _rows, value);
            }
        }

        private void UpdateValue(int index1, int index2, 
            Dictionary<int, Dictionary<int, T>> parent, T value)
        {
            Dictionary<int, T> dict;
            if (!parent.TryGetValue(index1, out dict))
            {
                parent[index2] = dict = new Dictionary<int, T>();
            }
            dict[index2] = value;
        }
    }

如果你想遍历所有条目,使用 _cells 。如果你想给定列使用 _columns 中的所有行。如果你想在一个给定行使用的所有列 _rows

If you want to iterate over all the entries, use _cells. If you want all the rows for a given column use _columns. If you want all the columns in a given row use _rows.

如果你想在有序进行迭代,你就可以开始添加LINQ到组合和/或使用排序列表与封装的条目(这将需要存储的行或列,实施 IComparable的&LT; T&GT; 分拣工作)

If you want to iterate in sorted order, you can start to add LINQ into the mix and/or use a sorted list with an inner class that encapsulates an entry (which would have to store the row or column and implement IComparable<T> for sorting to work).