While building a site on one of our servers we had no issues with the character set. We were using latin1 (iso-8859-1) for the HTML files meta tag and also on the MySql database. However, that didn’t happen when we were going live on our client’s server. Their Apache was using UTF-8 and on the HTTP headers it returned “Content-Type: text/html; charset=UTF-8″ so our charset was a problem. And we had to do the transition.
Changing files from latin1 to UTF-8 was kind of easy with a script. However, changing the database from latin1 to UTF-8 was kind of a pain… one solved easily at the end but still a pain. So, to spare some time for whoever needs this, here’s what I did.
I tried first with “ALTER TABLE tbl_name CONVERT TO CHARACTER SET charset_name;“, but apparently what that does is to change the default character set for the table and set the space to save data in that particular character set. Then I tried to modify each column with “ALTER TABLE table MODIFY column TEXT CHARACTER SET utf8;” but that didn’t work either, apparently that changes the character set for a column but doesn’t change its contents.
Soooooo, the solution is to first pass each column to a binary character set** and then pass it to the character set needed, like this:
ALTER TABLE table MODIFY title BLOB;
ALTER TABLE table MODIFY title VARCHAR(250) CHARACTER SET latin1;
ALTER TABLE table MODIFY title VARCHAR(250) CHARACTER SET utf8;
Keep in mind to change every columns data type on the first and third line with what you actually need. This conversion isn’t needed for columns with data types different than text.
**: http://dev.mysql.com/doc/refman/5.1/en/alter-table.html:
If you specify CONVERT TO CHARACTER SET binary, the CHAR, VARCHAR, and TEXT columns are converted to their corresponding binary string types (BINARY, VARBINARY, BLOB). This means that the columns no longer will have a character set and a subsequent CONVERT TO operation will not apply to them.