Is it possible to get SQL Server to convert collations to UTF-8/UTF-16?

Problem description:

In a project I am working on, my data is stored in SQL Server with the collation Danish_Norwegian_CI_AS. The data is output through FreeTDS and ODBC to Python, which handles the data as UTF-8. Some of the characters, such as å, ø and æ, are not being encoded correctly, which has brought the project to a halt.

I spent a couple of hours reading about the confusing world of encodings, collations and code pages, and feel like I have a better understanding of the whole picture.

Some of the articles I have read make me think it would be possible to specify in the SQL SELECT statement that the collated data should be encoded as UTF-8 when it is output.

The reason I think this is possible is this article, which shows an example of how to get two tables with different collations to play nicely together.

Any pointers in the direction of converting a collation to UTF-8/UTF-16 would be greatly appreciated!

I have read that SQL Server provides a Unicode option through nchar, nvarchar and ntext, and that the other string types, char, varchar and text, are encoded according to the collation that is set. I have also read that the Unicode types mentioned above are stored in UCS-2, a variant of UTF-16 (I hope I am remembering that right). So, for tables with a locale collation and Unicode tables to play nicely together, there should be a conversion function, no?
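To make the symptom concrete, here is a minimal Python sketch of what probably goes wrong when collation-encoded bytes are treated as UTF-8. It assumes the varchar data sits in Windows code page 1252, which is my understanding of the code page behind Danish_Norwegian_CI_AS; the sample string is made up for illustration.

```python
# Assumption: the varchar column is stored in Windows code page 1252.
text = "å, ø and æ"

# The bytes as a cp1252 varchar column would hold them.
raw = text.encode("cp1252")

# Reading those bytes as UTF-8 fails: 0xE5 announces a multi-byte
# sequence, but the byte that follows is not a continuation byte.
try:
    raw.decode("utf-8")
    decoded_as_utf8_ok = True
except UnicodeDecodeError:
    decoded_as_utf8_ok = False

# Decoding with the matching code page first, then re-encoding,
# yields valid UTF-8 or UTF-16.
recovered = raw.decode("cp1252")
as_utf8 = recovered.encode("utf-8")
as_utf16 = recovered.encode("utf-16-le")
```

The point of the sketch is only that some layer has to know the source code page before anything can be converted to UTF-8 or UTF-16.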

Four months on, I finally found the answer to my problem. It turns out it had nothing to do with the FreeTDS driver or the database collation:

It was pyodbc's connect function, which apparently requires a flag: unicode_results=True
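A minimal sketch of how that flag might be passed, for anyone who wants something to copy. The helper name fetch_rows and its connect parameter are hypothetical, added only so the sketch can be exercised without a database. The unicode_results keyword is taken from the fix above; whether a given pyodbc version accepts it is an assumption, since I believe later releases reworked their Unicode handling.

```python
def fetch_rows(conn_str, query, connect=None):
    """Run a query and return all rows, asking pyodbc for unicode results.

    `connect` is a hypothetical hook that defaults to pyodbc.connect.
    """
    if connect is None:
        import pyodbc  # imported lazily so the sketch loads without the driver
        connect = pyodbc.connect

    # The flag from the fix above; assumed to be supported by the
    # installed pyodbc version.
    conn = connect(conn_str, unicode_results=True)
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        # Close the connection even if the query raises.
        conn.close()
```

A call would look something like fetch_rows("DSN=my_dsn;UID=user;PWD=secret", "SELECT name FROM people"), where the DSN, credentials and table name are placeholders.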

Posted here to help other unfortunate souls doomed to wander aimlessly in the dark, looking for a clue.