I just completed a project supporting 6 languages, including Chinese and
Japanese, using both MySQL and MSSQL as databases with PHP. To keep
myself sane I used the UTF-8 approach.
First things first: you have to set up PHP to use character sets with
multiple bytes. That means using the mbstring module. On Linux that
means compiling PHP with --enable-mbstring. If you're using the new 4.3
PHP versions (rc1 or rc2), here's a blurb from the manual about it:
the option --enable-mbstring will be enabled by default and replaced
with --with-mbstring[=LANG] to support Chinese, Korean and Russian
language support. Japanese character encoding is supported by default.
If --with-mbstring=cn is used, simplified chinese encoding will be
supported. If --with-mbstring=tw is used, traditional chinese
encoding will be supported. If --with-mbstring=kr is used, korean
encoding will be supported. If --with-mbstring=ru is used, russian
encoding will be supported. If --with-mbstring=all is added, all
supported character encoding in mbstring will be enabled, but the
binary size of PHP will be maximized because of huge Unicode character
maps. Note that Chinese, Korean and Russian encoding is experimentally
supported in PHP 4.3.0.
That might be Greek to you; it just means multibyte functions will be
included automatically when you compile, and you just tell it what
languages you want. On Windows it's even easier to get the multibyte
functions: just make sure the extension=php_mbstring.dll line in
php.ini is uncommented.
Now, you can read the PHP manual about the functions:
http://us2.php.net/manual/en/ref.mbstring.php
Or you can do it the easy way: use UTF-8 and do this in php.ini.
Find output_buffering and make sure it's turned on or set to a value
(4096 is the default). Just below that is a line called
output_handler. That needs to be changed to:
output_handler = mb_output_handler
All your multibyte output will then be handled correctly and
automagically. You HAVE to use multibyte functions whether you use
UTF-8 or another character set; it's just easier not to have to change
encoding all the time. Multibyte functions are one of the newer
features of PHP, which means you'll probably need a fairly new version.
I'd recommend at least 4.2.3.
Remember to restart your webserver!
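To see why the multibyte functions matter, here is a small sketch comparing the byte-oriented string functions with their mb_ counterparts. It assumes the mbstring module is loaded and that the script file itself is saved as UTF-8; the variable names are just for illustration.

```php
<?php
// "日本語" is three characters, but nine bytes in UTF-8.
$text = "日本語";

// strlen() counts bytes, so it over-counts multibyte text.
$byteCount = strlen($text);

// mb_strlen() counts characters when told the encoding.
$charCount = mb_strlen($text, 'UTF-8');

// mb_substr() cuts on character boundaries...
$firstTwo = mb_substr($text, 0, 2, 'UTF-8');

// ...while substr() cuts on byte boundaries and can split a character.
$brokenCut = substr($text, 0, 2);

echo $byteCount, " bytes, ", $charCount, " characters\n";
echo $firstTwo, "\n";
```

The substr() result here is two bytes out of a three-byte character, which is exactly the kind of garbage you get on the page when the plain string functions are used on multibyte text.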
Now that you have the module you need and the ini set up, it's time
to create a page.
First, at the top of every page on the website you need this line:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
That simply tells the browser what character set you're using. If
you're not using UTF-8 you'll have to change that for every language
so they get their characters right. For your information, your
character sets are kind of dictated to you by PHP. From the manual:
Character encodings work with PHP:
ISO-8859-*, EUC-JP, UTF-8
Character encodings do NOT work with PHP:
JIS, SJIS
EUC-JP is the Japanese one to use with PHP.
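If you do end up serving a page in EUC-JP instead of UTF-8, the text has to be converted to match the charset you declare. A rough sketch using mb_convert_encoding(), assuming mbstring was built with Japanese support and the script file is saved as UTF-8; the variable names are made up for the example:

```php
<?php
// Text as it would come out of a UTF-8 database column.
$utf8 = "日本語";

// Convert to EUC-JP for a page declared with charset=euc-jp.
$eucjp = mb_convert_encoding($utf8, 'EUC-JP', 'UTF-8');

// Converting back should give the original bytes again.
$roundTrip = mb_convert_encoding($eucjp, 'UTF-8', 'EUC-JP');

// Same three characters, different byte counts per encoding.
echo strlen($utf8), " bytes as UTF-8, ", strlen($eucjp), " bytes as EUC-JP\n";
```

This is the "changing encoding all the time" work that sticking to UTF-8 everywhere lets you skip.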
Then you need some method of getting information into the database in
the right encoding. I wrote a quick administration area program to do
it. If you will have information entered by users you can use the
same method. The trick is sending the right encoding in the form you
use on the webpage.
<form name="translatetext" action="text.php" method="post" accept-charset="utf-8">
<textarea rows="8" cols="32" name="text"></textarea>
<input type="submit" name="changetext" class="submit" value="Translate/Change Text" />
</form>
Notice the accept-charset attribute on the top line of the form: it
tells the browser to submit all entered information in UTF-8, so it
doesn't matter what is put in. From Chinese to Hindi, it will all be in
UTF-8 when you get it. If you're using multiple character sets, you're
going to have to have multiple forms, each with a different character set.
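Since you are relying on the browser to honor accept-charset, it may be worth checking what actually arrives before storing it. A minimal sketch, assuming mb_check_encoding() is available (it was added in a later PHP release than the versions mentioned in this post); clean_utf8_text() is a hypothetical helper name:

```php
<?php
// Return the submitted text if it is valid UTF-8, or null if it is
// missing or contains byte sequences that are not valid UTF-8.
function clean_utf8_text($value)
{
    if (!is_string($value) || !mb_check_encoding($value, 'UTF-8')) {
        return null;
    }
    return $value;
}

// Usage with the form above: 'text' is the textarea's name.
$text = clean_utf8_text(isset($_POST['text']) ? $_POST['text'] : null);
```

If the helper returns null you can re-show the form instead of putting mangled bytes into the database.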
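Then you connect to your database and use a simple insert statement to
put it into the db:
$link = mysql_connect("mysql_host", "mysql_user", "mysql_password")
    or die("Could not connect");
mysql_select_db("my_database") or die("Could not select database");
// Escape the submitted text so quotes in it cannot break (or hijack) the query.
// mysql_real_escape_string() needs PHP 4.3.0+; mysql_escape_string() is the older one.
$text = mysql_real_escape_string($_POST['text'], $link);
$query = "insert into my_table (mytext) values ('$text')";
$result = mysql_query($query) or die("Query failed");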
I'd stick an auto_increment id column on the table for each text value;
it makes getting it out easier. To get it out, you do a select, and then
a mysql_fetch_row or mysql_fetch_assoc to get the information. A simple
PHP echo will display it on the page unless you're doing something a
bit more esoteric.
$query = "select mytext from my_table where id=1";
$result = mysql_query($query) or die("Query failed");
// mysql_fetch_assoc() needs the result handle returned by mysql_query().
$info = mysql_fetch_assoc($result);
echo $info['mytext'];
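You can sniff the browser's Accept-Language header to get a user's
preferred language:
if (!isset($_SESSION['lang']))
{
    // The header can be missing, so fall back to an empty string.
    $lang = isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) ? $_SERVER['HTTP_ACCEPT_LANGUAGE'] : '';
    // Keep only the two-letter code of the first listed language.
    $lang = strtolower(substr($lang, 0, 2));
    // List every language you're going to support here.
    $supported = array('en', 'it', 'es', 'zh', 'ja', 'fr', 'de');
    if (!in_array($lang, $supported))
    {
        $lang = 'en';
    }
    $_SESSION['lang'] = $lang;
}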
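The snippet above only looks at the first language the browser lists. If you'd rather fall through to the user's second or third choice before defaulting to English, something like this sketch could work. pick_language() is a made-up helper name, and it deliberately ignores the q= weights and just trusts the order the browser sent:

```php
<?php
// Return the first supported two-letter language code found in an
// Accept-Language header, or $default if none of the entries match.
function pick_language($header, $supported, $default = 'en')
{
    // Walk the comma-separated entries in the order the browser sent them.
    foreach (explode(',', (string) $header) as $entry) {
        // Keep only the two-letter primary tag, e.g. "zh" from " zh-CN;q=0.8".
        $code = strtolower(substr(trim($entry), 0, 2));
        if (in_array($code, $supported, true)) {
            return $code;
        }
    }
    return $default;
}

// A browser that prefers Russian (unsupported here), then Japanese.
echo pick_language('ru,ja;q=0.5', array('en', 'zh', 'ja')), "\n";
```

You would call it with $_SERVER['HTTP_ACCEPT_LANGUAGE'] (when set) and store the result in $_SESSION['lang'] as before.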
Notice I assigned it as a session variable. Remember to start a
session at the top of the page with session_start(), then you can get
to the language from anywhere. Then add another column to your text
table called lang, and your query can change the where clause to
"where id=1 and lang='$_SESSION[lang]'"
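A minimal sketch of building that per-language query, assuming the text column is mytext (as in the insert earlier) and the new column is named lang as in the where clause above. build_text_query() and the $supported list are hypothetical names for illustration:

```php
<?php
// Build the select for one text id in one language. The id is forced
// to an integer and the language must be in the supported list, so
// nothing from the session or request is pasted into the SQL unchecked.
function build_text_query($id, $lang, $supported)
{
    if (!in_array($lang, $supported, true)) {
        $lang = 'en'; // same fallback as the browser sniffing code
    }
    return sprintf(
        "select mytext from my_table where id=%d and lang='%s'",
        (int) $id,
        $lang
    );
}

echo build_text_query(1, 'ja', array('en', 'zh', 'ja')), "\n";
```

The returned string goes to mysql_query() the same way as the earlier select.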
Anyway, I hope this helped. It took me forever to figure out that you
had to have PHP set up to use multibyte stuff to get it to work right.
If you have problems with MySQL not holding information in multiple
character sets, try upgrading to MySQL 4.0.x (whatever version they're
at now) or using UTF-8 instead.
Have fun!