Processing a large volume of data from SQL Server
I have a stored procedure (SQL Server 2016) which currently returns 100K to 200K rows, depending on the parameters passed to it.
Each row can be 100KB to 200KB in size, so the total can be around 10GB to 20GB.
My client (a background job) has to call this SP, process all the rows, and send them to another client.
What is the best approach to handle such a scenario?
Currently I am thinking of using a streaming enumerator built on yield: fetch each record as dataReader.Read() reads a row, process it, and send it to the other client.
using (var dataReader = command.ExecuteReader())
{
    while (dataReader.Read())
    {
        var obj = new SomeClass();
        // populate SomeClass from the current row
        yield return obj;
    }
}
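With rows this large it may also help to pass CommandBehavior.SequentialAccess to ExecuteReader and stream big columns via reader.GetStream(i) instead of buffering each value. A minimal sketch of the chunked copy this implies (CopyInChunks is a hypothetical helper name; the reader and stream setup are assumed):

```csharp
using System.IO;

static class StreamingHelper
{
    // Copy from a column stream (e.g. reader.GetStream(i) under
    // CommandBehavior.SequentialAccess) to a destination without
    // ever holding the whole column value in memory.
    public static long CopyInChunks(Stream source, Stream destination, int bufferSize = 8192)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```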
Is this approach sufficient to handle such large data?
Is there a better approach (such as multithreading)? If so, how should I go about it? Any pointers to refer to?
The SP has multiple joins and runs a couple of times a day.
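On the multithreading question: a common pattern is a single reader thread feeding a bounded BlockingCollection, with several worker threads processing and sending rows, so slow processing never stalls the reader and the bounded capacity keeps memory flat. A minimal sketch with an in-memory stand-in for the data reader (Pipeline.Run and the processAndSend delegate are hypothetical names, not part of any library):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class Pipeline
{
    // rows: whatever the streaming enumerator yields (one item per Read()).
    // The bounded capacity applies back-pressure on the producer.
    public static int Run(IEnumerable<string> rows, Action<string> processAndSend,
                          int workers = 4, int capacity = 1000)
    {
        using var queue = new BlockingCollection<string>(capacity);
        int processed = 0;

        var consumers = new Task[workers];
        for (int i = 0; i < workers; i++)
        {
            consumers[i] = Task.Run(() =>
            {
                foreach (var row in queue.GetConsumingEnumerable())
                {
                    processAndSend(row);
                    Interlocked.Increment(ref processed);
                }
            });
        }

        foreach (var row in rows)   // single producer: the data reader loop
            queue.Add(row);
        queue.CompleteAdding();

        Task.WaitAll(consumers);
        return processed;
    }
}
```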
Based on your description, I believe this is a good scenario for an SSIS (Integration Services) package, which can manage the final result set, write it to a CSV file, and let the client exchange it.
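If SSIS is not an option, the same idea (stream rows straight to a CSV file as they are read, so memory stays constant) can be sketched in plain C#. CsvExport and WriteCsvLine are hypothetical names; the quoting follows the common RFC 4180 convention:

```csharp
using System.IO;
using System.Linq;

static class CsvExport
{
    // Quote a field only if it contains a comma, quote, or newline.
    public static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\n', '\r' }) < 0)
            return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Write one row of fields as a CSV line; call once per dataReader.Read().
    public static void WriteCsvLine(TextWriter writer, params string[] fields)
    {
        writer.WriteLine(string.Join(",", fields.Select(Escape)));
    }
}
```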