MySQL regex and UTF-8
I am trying to get data from a MySQL database via REGEXP, matching words with or without special UTF-8 characters.
Let me explain with an example:
If the user enters a word like sirena, it should return rows that include words like sirena, siréna, šíreňá, and so on. It should also work the other way around: when the user enters siréná, it should return the same results.
I am trying to search with REGEXP; my query looks like this:
SELECT * FROM `content` WHERE `text` REGEXP '[sšŠ][iíÍ][rŕŔřŘ][eéÉěĚ][nňŇ][AaáÁäÄ0]'
It works only when the database contains the word sirena, but not when it contains the word siréňa.
Is this something to do with UTF-8 and MySQL? (The collation of the MySQL column is utf8_general_ci.)
Thank you!
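MySQL's regular expression library does not support UTF-8. It works byte by byte, so a multi-byte character inside a bracket expression such as [eé] is treated as separate single bytes rather than as one character.
See Bug #30241 "Regular expression problems", which has been open since 2007. They will have to change the regular expression library they use before that can be fixed, and I haven't found any announcement of when or if they will do this.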
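To make the byte-wise behaviour concrete, here is a rough sketch in Python rather than MySQL. It assumes that running Python's re module over raw UTF-8 bytes is a fair stand-in for a byte-oriented engine; the helper names matches_bytewise and matches_charwise are made up for this illustration.

```python
import re

# Character classes copied from the query in the question.
PATTERN = "[sšŠ][iíÍ][rŕŔřŘ][eéÉěĚ][nňŇ][AaáÁäÄ0]"


def matches_bytewise(pattern, text):
    """Match over raw UTF-8 bytes, so every bracket expression consumes one byte."""
    return re.search(pattern.encode("utf-8"), text.encode("utf-8")) is not None


def matches_charwise(pattern, text):
    """Match over Unicode code points, which is what the question expects."""
    return re.search(pattern, text) is not None


for word in ("sirena", "siréňa"):
    print(word, matches_bytewise(PATTERN, word), matches_charwise(PATTERN, word))
```

In the byte-wise case, [eéÉěĚ] consumes only the first byte of é (0xC3), and the following byte (0xA9) is not in [nňŇ], so siréňa is rejected while plain sirena still matches.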
The only workaround I've seen is to search for specific HEX strings:
mysql> SELECT * FROM `content` WHERE HEX(`text`) REGEXP 'C3A9C588';
+----------+
| text     |
+----------+
| siréňa   |
+----------+
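The hex string in that query is just the UTF-8 encoding of the two adjacent characters éň. A small Python sketch for producing such strings follows; utf8_hex and hex_regexp are hypothetical helper names, and the '^(..)*' prefix is my own addition (not part of the query above), meant to keep a match aligned on whole bytes of the HEX() output.

```python
def utf8_hex(fragment):
    """Return the uppercase hex digits of the fragment's UTF-8 bytes, as HEX() would show them."""
    return fragment.encode("utf-8").hex().upper()


def hex_regexp(fragment):
    """Build a pattern for HEX(column) that only matches on byte boundaries.

    '^(..)*' consumes an even number of hex digits (whole bytes) before the
    fragment, so the match cannot start in the middle of a byte.
    """
    return "^(..)*" + utf8_hex(fragment)


print(utf8_hex("éň"))
print(hex_regexp("éň"))
```

The resulting string would be used as WHERE HEX(`text`) REGEXP ..., ideally passed as a bound parameter.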
Re your comment:
No, I don't know of any solution with MySQL.
You might have to switch to PostgreSQL, because that RDBMS supports \u codes for Unicode characters in its regular expression syntax.
Try alternation instead of bracket expressions, something like ... REGEXP '(a|b|[ab])'. For your query:
SELECT * FROM `content` WHERE `text` REGEXP '(s|š|Š|[sšŠ])(i|í|Í|[iíÍ])(r|ŕ|Ŕ|ř|Ř|[rŕŔřŘ])(e|é|É|ě|Ě|[eéÉěĚ])(n|ň|Ň|[nňŇ])(A|a|á|Á|ä|Ä|0|[AaáÁäÄ0])'
It works for me!
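A likely reason this works: in an alternation, a multi-byte character is just a literal byte sequence, so a byte-oriented engine can still match it. Writing these patterns by hand gets tedious, so here is a hedged Python sketch that builds one from whatever the user typed. The VARIANTS table is hypothetical (taken from the classes in the question, without the stray 0), and build_pattern and strip_accents are names invented for this example.

```python
import re
import unicodedata

# Hypothetical variant table, based on the character classes in the question.
VARIANTS = {
    "s": "sšŠ",
    "i": "iíÍ",
    "r": "rŕŔřŘ",
    "e": "eéÉěĚ",
    "n": "nňŇ",
    "a": "AaáÁäÄ",
}


def strip_accents(word):
    """Reduce 'siréná' to 'sirena' by decomposing and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_pattern(word):
    """Build one alternation group per letter, e.g. '(s|š|Š)(i|í|Í)...'.

    Assumes the input contains only letters and digits; regex metacharacters
    are not escaped in this sketch.
    """
    groups = []
    for ch in strip_accents(word).lower():
        variants = VARIANTS.get(ch, ch)  # unknown characters stay as they are
        groups.append("(" + "|".join(variants) + ")")
    return "".join(groups)


pattern = build_pattern("siréná")
print(pattern)

# Byte-wise sanity check, mimicking an engine that only sees UTF-8 bytes.
for word in ("sirena", "siréňa", "šíreňá"):
    print(word, re.search(pattern.encode("utf-8"), word.encode("utf-8")) is not None)
```

Because the input is stripped of accents first, entering sirena or siréná yields the same pattern, which covers the "backwards" case from the question. The pattern is best passed to the query as a bound parameter rather than concatenated into the SQL string.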
Use the lib_mysqludf_preg library from the MySQL UDF repository to run PCRE regular expressions directly in MySQL.
Although MySQL's built-in regular expression library does not support UTF-8, lib_mysqludf_preg provides UTF-8-compatible PCRE regular expressions that can be called from within MySQL.
http://www.mysqludf.org/ https://github.com/mysqludf/lib_mysqludf_preg#readme