MySQL and Unicode
- pixnet blog: Mysql調整成全UTF-8語系 (some values are old names for MySQL 4.x only)
- Article: Unicode development with DbForms, MySQL 4.1, and friends
- Unicode support is available since MySQL 4.1
- At the end of the table definition add: ENGINE=InnoDB DEFAULT CHARSET=utf8;
- Must use –default-character-set=utf8 option with mysqldump for backup to retain Unicode data
- MySQL 5.0 Manual – Unicode
- To save space with UTF-8, use VARCHAR instead of CHAR. Otherwise, MySQL must reserve three bytes for each character in a CHAR
- Run: [set names “utf8″] before do anything
- Use this to connect to server: mysql –default-character-set=utf8 -uyour_username -p -h your_databasehost.your_domain.com your_database
- Convert database to Unicode
- How to convert a complete Database to Unicode (UTF-8), using Windows.
- Create a dump file (use Mysqldump or phpMyAdmin)
- Open in Notepad
- Save in Unicode format
- Load to MySQL with: mysql –user=root –password databasename file.txt
- *Maybe* another way
- MySQL ODBC 3.x may not be Unicode friendly and still display the wrong charset
- How to convert a complete Database to Unicode (UTF-8), using Windows.
- Tested Result:
- Webpage charset = utf8, table charset=utf8, via php, will store in latin1? When mysql client connect as latin1, can get back the utf8 bytecode data. Saving the value charset is utf8
- Access update db, table charset=utf8, access and mysql cli can see back the value. Saving the value charset is big5.
- Access update db, table charset=utf8, ODBC connect set utf8, access cannot read value!?
- change the charset variables at server runtime does not change the value return to mysql cli, seems settings are per connection
set character_set_client=utf8; set character_set_connection=utf8; set character_set_database=utf8; set character_set_results=utf8; set character_set_server=utf8;
But even set these values, in mysql win32 CLI, update value with 2 Chinese characters still show 3 x ? values (24 bit, 1st char ASCII63,
- Bug: MyODBC 3.51 is not yet support Unicode, wait MyODBC 3.53
More -
- function to check database values: SELECT charset(value)
- Create Database syntax (for create database with unicode support)
create database dbname default character set utf8;- To complete convert a db from latin1 to utf8, a complete dump / restore maybe needed
- Solution to DB_DataObject
use a constructor to execute connection setting, mysql library seems don’t have related settings- Unicode solution:
- Make sure create the database with default charset utf8, or set the server variable ‘default-character-set=utf8’ under [mysqld]
- Set init_connect for non-root users at /etc/my.cnf, so any non-root users will set the connection to utf8:
[mysqld] init_connect='SET NAMES utf8'
- when connect with mysql client with a non-root user, \s should see:
Server characterset: utf8 Db characterset: utf8 Client characterset: utf8 Conn. characterset: utf8
- For php, make sure the website default charset is set to: utf-8
- For Microsoft Windows ODBC, need to set “init statement" of the ODBC driver to: big5 (or anything match the system locale),
due to the MyODBC 3.51 is not yet support utf8 - Problem fixed if use MyODBC 5.1 driver, and set connection parameter to “SET NAMES utf8″. Microsoft Office ODBC data source can get data in Unicode format properly.
- mysql.exe in win32 does NOT use unicode, so it follow the system locale