Microsoft Office Outlook supports Unicode (Unicode: A character encoding standard developed by the Unicode Consortium. By using more than one byte to represent each character, Unicode enables almost all of the written languages in the world to be represented by using a single character set.) and provides full support for multilingual data. If you work in a multinational organization or share messages and items (item: An item is the basic element that holds information in Outlook (similar to a file in other programs). Items include e-mail messages, appointments, contacts, tasks, journal entries, notes, posted items, and documents.) with people who use Outlook on systems that run in other languages, you can take advantage of Unicode support in Outlook.
More about Unicode support
Outlook can run in one of two mailbox modes with Exchange accounts: Unicode or non-Unicode. Unicode mode is recommended and is the default mode if the configurations of your profile, the Exchange account, and administrator settings allow it. Outlook determines the mode automatically based on these settings, and the mode cannot be changed manually.
Running Outlook in Unicode mode will enable you to work with messages and items that are composed in different languages. If Outlook is running in non-Unicode mode with your Exchange account and if you would like to switch to Unicode mode, contact your administrator.
Note Earlier versions of Outlook provided support for multilingual Unicode data in the body of Outlook items. However, Outlook data, such as the To and Subject lines of messages and the ContactName and BusinessTelephoneNumber properties of contact items, was limited to characters defined by your system code page. This limitation no longer applies in Microsoft Office Outlook 2003 and Microsoft Office Outlook 2007, provided Outlook is running in Unicode mode with an Exchange account.
POP3 (POP3: A common protocol that is used to retrieve e-mail messages from an Internet e-mail server.) accounts also have the capability to support multilingual Unicode data in Microsoft Office Outlook 2003 and Office Outlook 2007, provided the items are delivered to a Personal Folders file (.pst) (Personal Folders file (.pst): Data file that stores your messages and other items on your computer. You can assign a .pst file to be the default delivery location for e-mail messages. You can use a .pst to organize and back up items for safekeeping.) that can support multilingual Unicode data. By default, new POP3 profiles that deliver to a new .pst file created in Microsoft Office Outlook 2003 and Office Outlook 2007 support multilingual Unicode data.
Note Other accounts such as IMAP (IMAP (Internet Message Access Protocol): Unlike Internet e-mail protocols such as POP3, IMAP creates folders on a server to store/organize messages for retrieval by other computers. You can read message headers only and select which messages to download.) and HTTP (HTTP (Hypertext Transfer Protocol): Protocol that is used when you access Web pages from the Internet. Outlook uses HTTP as an e-mail protocol.) do not support Unicode.
Understanding scripts and code pages
Scripts
Multilingual messages and items can contain text in languages that require different scripts. A single script can be used to represent many languages.
For example, the Latin or Roman script has character shapes — glyphs — for the 26 letters (both uppercase and lowercase) of the English alphabet, as well as accented (extended) characters used to represent sounds in other Western European languages.
The Latin script has glyphs to represent all of the characters in most European languages and a few others. Other European languages, such as Greek or Russian, have characters for which there are no glyphs in the Latin script; these languages have their own scripts.
Some Asian languages use ideographic scripts that have glyphs based on Chinese characters. Other languages, such as Thai and Arabic, use scripts that have glyphs that are composed of several smaller glyphs or glyphs that must be shaped differently depending on adjacent characters.
A common way to store plain text is to represent each character by using a single byte. The value of each byte is a numeric index, or code point, in a table of characters; a code point corresponds to a character in the default code page of the computer on which the text document is created. For example, the byte value 189 (decimal) represents different characters in different code pages.
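The dependence of a byte value on the code page can be illustrated with a short Python sketch. It is not part of Outlook. The three code pages below (cp1252, iso8859_5, cp874) are chosen here only as examples of Latin, Cyrillic, and Thai tables, and the codec names are the ones that Python's standard library uses.

```python
# Decode the single byte value 189 (0xBD) under several code pages.
# The code pages are illustrative choices, not a list taken from Outlook.
raw = bytes([189])

code_pages = {
    "cp1252": "Western European (Latin)",
    "iso8859_5": "Cyrillic",
    "cp874": "Thai",
}

# Same byte, different table, different character.
decoded = {name: raw.decode(name) for name in code_pages}

for name, label in code_pages.items():
    char = decoded[name]
    print(f"{label:26} {name:10} -> {char!r} (U+{ord(char):04X})")
```

The byte itself never changes; only the table used to look it up does, which is why text stored this way is readable only on a system that uses the same code page.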
Code pages
A table of characters grouped together is called a code page. For single-byte code pages, each code page contains a maximum of 256 byte values; because each character in the code page is represented by a single byte, a code page can contain as many as 256 characters.
One code page with its limit of 256 characters cannot accommodate all languages because all languages together use far more than 256 characters. Therefore, different scripts use separate code pages. There is one code page for Greek, another for Japanese, and so on.
In addition, single-byte code pages cannot accommodate most Asian languages, which commonly use more than 5,000 Chinese-based characters. Double-byte code pages were developed to support these languages.
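As a rough illustration of these limits, the following Python sketch counts how many of the 256 possible byte values a single-byte code page assigns, and shows that a Japanese character occupies two bytes in a double-byte code page. The helper name assigned_characters is hypothetical, and cp1252, cp1253, and cp932 are used here as assumed examples of Latin, Greek, and Japanese code pages.

```python
def assigned_characters(code_page: str) -> int:
    """Count the byte values (0-255) that a single-byte code page maps to a character."""
    count = 0
    for value in range(256):
        try:
            bytes([value]).decode(code_page)
            count += 1
        except UnicodeDecodeError:
            pass  # this byte value is unassigned in the code page
    return count

# A single-byte code page can never hold more than 256 characters.
latin_count = assigned_characters("cp1252")
greek_count = assigned_characters("cp1253")
print("cp1252 assigns", latin_count, "characters")
print("cp1253 assigns", greek_count, "characters")

# A double-byte code page uses two bytes for a Chinese-based character,
# while plain ASCII letters still take one byte.
kanji_bytes = "語".encode("cp932")
ascii_bytes = "A".encode("cp932")
print("kanji:", len(kanji_bytes), "bytes; ASCII letter:", len(ascii_bytes), "byte")
```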
Benefits of running in Unicode mode with Microsoft Exchange accounts
The Unicode character encoding standard enables the sharing of messages and other items in a multilingual environment when the languages involved span multiple code pages (code page: A table that relates the binary character codes used by a program to keys on the keyboard or to the appearance of characters on the display. Code pages are a means of providing support for the languages used in different countries/regions.).
Non-Unicode systems typically use a code page–based environment, in which each script has its own table of characters. Items based on the code page of one operating system rarely map well to the code page of another operating system. In some cases, the items cannot contain text that uses characters from more than one script.
For example, consider two people — one is running the English version of the Microsoft Windows XP operating system with the Latin code page and the second person is running the Japanese version of the Microsoft Windows XP operating system with the Japanese code page. The second person creates a meeting request in the Japanese version of Microsoft Outlook 2002 with Japanese characters in the Location field and sends it to the first person. When the person using the English version of Outlook 2002 opens the meeting request, the code points of the Japanese code page are mapped to unexpected or nonexistent characters in the Latin script, and the resulting text is unintelligible.
Note Since Microsoft Outlook 2000, the body of Outlook items (item: An item is the basic element that holds information in Outlook (similar to a file in other programs). Items include e-mail messages, appointments, contacts, tasks, journal entries, notes, posted items, and documents.) is Unicode, and the body of the item can be read irrespective of the language in which the item was created. However, all the other item properties such as the To, Location, and Subject lines of messages and meeting items and the ContactName and BusinessTelephoneNumber properties of contact items will be unintelligible in versions earlier than Outlook 2003.
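The meeting request scenario can be approximated outside Outlook with a few lines of Python. This is only a sketch of the code page mismatch, not of how Outlook stores items: the sample Location text is made up, cp932 stands in for the Japanese code page, and cp1252 stands in for the Latin code page.

```python
# Hypothetical Location text ("meeting room") typed on the Japanese system.
location = "会議室"

# The Japanese system stores the text by using its own code page.
stored_bytes = location.encode("cp932")

# The English system interprets the same bytes by using the Latin code page.
# errors="replace" is needed because some byte values are unassigned in cp1252.
seen_on_english_system = stored_bytes.decode("cp1252", errors="replace")

# Decoding with the code page that wrote the bytes restores the text.
seen_on_japanese_system = stored_bytes.decode("cp932")

print("English system shows: ", seen_on_english_system)
print("Japanese system shows:", seen_on_japanese_system)
```

The bytes are identical in both cases. Only the table used to interpret them differs, which is why the text is unintelligible on the English system.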
The universal character set provided by Unicode eliminates this problem. Unicode was developed to create a universal character set that can accommodate most known scripts. Unicode assigns every character its own unique code point and uses a multi-byte encoding, so in contrast to code pages, a given code point always represents the same character. For example, the Unicode code point of Greek lowercase zeta (ζ) is the hexadecimal value 03B6, and that of Cyrillic lowercase zhe (ж) is 0436.
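The two code points mentioned above can be checked with Python's built-in Unicode support. This is a small sketch for illustration; the variable names are arbitrary, and UTF-16 is used only as one example of a multi-byte Unicode encoding.

```python
import unicodedata

zeta = "\u03b6"  # Greek lowercase zeta
zhe = "\u0436"   # Cyrillic lowercase zhe

for char in (zeta, zhe):
    # ord() returns the code point; unicodedata.name() returns the standard name.
    print(f"{char}  U+{ord(char):04X}  {unicodedata.name(char)}")

# Characters from different scripts can share one string, because each
# has its own code point regardless of the system code page.
mixed = zeta + zhe + "A"
utf16 = mixed.encode("utf-16-le")
print(len(mixed), "characters stored in", len(utf16), "bytes of UTF-16")
```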
Microsoft Office Outlook 2003 and Outlook 2007 are fully capable of using Unicode. The code page system of representing text also exists in Outlook.
However, Unicode mode is recommended and is the default mode if the configurations of your profile, Exchange account, and administrator settings allow it. Also, the mode is automatically determined by Outlook based on these settings and cannot be changed manually.
Running Outlook in Unicode mode with an Exchange account ensures that by default, the Offline Folder files (.ost) (Offline Folder file: The file on your hard disk that contains offline folders. The offline folder file has an .ost extension. You can create it automatically when you set up Outlook or when you first make a folder available offline.) and Personal Folders files (.pst) (Personal Folders file (.pst): Data file that stores your messages and other items on your computer. You can assign a .pst file to be the default delivery location for e-mail messages. You can use a .pst to organize and back up items for safekeeping.) used for the profile can store multilingual Unicode data and offer greater storage capacity for items and folders. If Outlook is running in non-Unicode mode with an Exchange account, and you would like to switch to Unicode mode, contact your administrator.
Impact of running in non-Unicode mode with Exchange accounts
If you do not share messages and items (item: An item is the basic element that holds information in Outlook (similar to a file in other programs). Items include e-mail messages, appointments, contacts, tasks, journal entries, notes, posted items, and documents.) with people who use Outlook on systems that run in other languages, you can run Outlook in Unicode or non-Unicode mode with an Exchange account.
If you work in a multinational organization or share messages and items with people who use Outlook on systems that run in other languages, Outlook should run in Unicode mode with an Exchange account. To switch to Unicode mode, contact your administrator.
When Outlook runs in non-Unicode mode with an Exchange account, the code page-based (code page: A table that relates the binary character codes used by a program to keys on the keyboard or to the appearance of characters on the display. Code pages are a means of providing support for the languages used in different countries/regions.) system is used for character mapping. In a code page-based system, a character entered in one language may not map to the same character in another language. Therefore, you are likely to see incorrect characters, including question marks.
For example, consider two people — one is running the English version of the Microsoft Windows XP operating system with the Latin code page and the second person is running the Japanese version of the Microsoft Windows XP operating system with the Japanese code page. The second person creates a meeting request in the Japanese version of Outlook 2002 and sends it to the first person. When the person using the English version of Outlook 2002 opens the meeting request, the code points of the Japanese code page are mapped to unexpected or nonexistent characters in the Latin script, and the resulting text is unintelligible.
Note Since Outlook 2000, the body of Outlook items is Unicode, and the body of the item can be read irrespective of the language in which the item was created. However, all the other item properties such as the To, Location, and Subject lines of messages and meeting items and the ContactName and BusinessTelephoneNumber properties of contact items will be unintelligible in versions earlier than Outlook 2003.
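The question marks typically appear when text is converted to a code page that has no entry for a character. The following Python sketch imitates that conversion. The subject text is invented, cp1252 is assumed as the Latin code page, and errors="replace" is Python's way of substituting a question mark, which may differ in detail from what Outlook does internally.

```python
# Hypothetical subject line that mixes Latin, Greek, and Cyrillic characters.
subject = "Review ζ ж"

# Converting to the Latin code page loses every character it cannot represent.
as_latin = subject.encode("cp1252", errors="replace")
after_code_page = as_latin.decode("cp1252")

# A Unicode encoding keeps every character intact.
after_unicode = subject.encode("utf-16-le").decode("utf-16-le")

print("Through cp1252: ", after_code_page)
print("Through Unicode:", after_unicode)
```

Once a character has been replaced in this way, the original cannot be recovered from the stored data.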
About the universal font for Unicode
Arial Unicode MS font is a full Unicode font. It contains all of the characters, ideographs, and symbols defined in the Unicode 2.1 standard. This universal font is automatically installed if you are using Windows Vista or Microsoft Windows XP.
Because of its considerable size and the typographic compromises required to make such a font, Arial Unicode MS should be used only when you can't use multiple fonts tuned for different writing systems. For example, if you have multilingual data from many different writing systems in Microsoft Office Access, you can use Arial Unicode MS as the font to display the data tables, because Access can't accept many different fonts.
Create a Unicode Personal Folders file (.pst)
- On the File menu, point to New, and then click Outlook Data File.
- To create a Personal Folders File (.pst) that offers greater storage capacity for items and folders and supports multilingual Unicode data, click the Personal Folders File for your version of Outlook, and click OK.
- In the File name box, type a name for the file, and then click OK.
- In the Name box, type a display name for the .pst folder.
- Select any other options you want, and then click OK.
The name of the folder associated with the data file appears in the Folder List (Folder List: Displays the folders available in your mailbox. To view subfolders, click the plus sign (+) next to the folder. If the Folder List is not visible, on the Go menu, click Folder List.). To view the Folder List, on the Go menu, click Folder List. By default, the folder will be called Personal Folders.
Troubleshoot Unicode-related problems
If you are experiencing any problems using Unicode, check the following list for some solutions.
I upgraded to Outlook 2003 or Outlook 2007, but Outlook isn't running in Unicode mode with an Exchange account
There could be several reasons why Outlook is still not running in Unicode mode.
If none of the above helped you switch to Unicode mode, contact your Exchange administrator.
I upgraded to Outlook 2003 or Outlook 2007, but my POP3 account still doesn't support multilingual Unicode data
If your profile was configured to deliver to a Personal Folders file (.pst) before you upgraded to Microsoft Office Outlook 2003 or Outlook 2007, you are still using the old Personal Folders file (.pst) that does not support Unicode for storing items delivered from the POP3 (POP3: A common protocol that is used to retrieve e-mail messages from an Internet e-mail server.) account. To resolve this, you should change the delivery location to a Personal Folders file (.pst) that supports multilingual Unicode data.
The Offline Folders file I selected caused Outlook to switch to non-Unicode mode, and now some items display '?' characters and are unreadable
When Outlook runs in non-Unicode mode with an Exchange account, the code page-based (code page: A table that relates the binary character codes used by a program to keys on the keyboard or to the appearance of characters on the display. Code pages are a means of providing support for the languages used in different countries/regions.) system is used for character mapping. In a code page-based system, a character entered in one language may not map to the same character in another language. Therefore, you are likely to see incorrect characters, including question marks.
For example, consider two people — one is running the English version of the Microsoft Windows XP operating system with the Latin code page and the second person is running the Japanese version of the Microsoft Windows XP operating system with the Japanese code page. The second person creates a meeting request in the Japanese version of Outlook 2002 and sends it to the first person. When the person using the English version of Outlook 2002 opens the meeting request, the code points of the Japanese code page are mapped to unexpected or nonexistent characters in the Latin script, and the resulting text is unintelligible. Therefore, in multilingual environments, we recommend that Outlook run in Unicode mode with an Exchange account.
To resolve this, disable offline folders, close and restart Outlook, and then create a new Offline Folder file and synchronize the data.
How will using a non-Unicode data file or running Outlook in non-Unicode mode with an Exchange account affect me?
If you do not share messages and items with people who use Outlook on computers that run in other languages, you can run Outlook in Unicode or non-Unicode mode with an Exchange account. A disadvantage of running in non-Unicode mode is that the Offline Folder file used for the profile is created in the format that does not offer greater storage capacity for items and folders. Therefore, if the size limit of the Offline Folder file is a concern for you, run Outlook in Unicode mode with an Exchange account.
However, if you work in a multinational organization or share messages and items with people who use Outlook on systems that run in other languages, Outlook should run in Unicode mode with an Exchange account. This also ensures that the .pst files used for the profile can store multilingual Unicode data. To switch to Unicode mode, see the "I upgraded to Outlook 2003 or Outlook 2007, but Outlook isn't running in Unicode mode with an Exchange account" section above.
When Outlook runs in non-Unicode mode with an Exchange account, the code page-based (code page: A table that relates the binary character codes used by a program to keys on the keyboard or to the appearance of characters on the display. Code pages are a means of providing support for the languages used in different countries/regions.) system is used for character mapping. In a code page-based system, a character entered in one language may not map to the same character in another language. Therefore, you are likely to see incorrect characters, including question marks.
Note Since Outlook 2000, the body of Outlook items (item: An item is the basic element that holds information in Outlook (similar to a file in other programs). Items include e-mail messages, appointments, contacts, tasks, journal entries, notes, posted items, and documents.) has been Unicode, and the body can be read irrespective of the language in which the item was created. However, Outlook data, such as the To and Subject lines of messages and the ContactName and BusinessTelephoneNumber properties of contact items, will be limited to characters defined by your code page if Outlook runs in non-Unicode mode with an Exchange account.
I get an error message when I try to add a Personal Folders file (.pst) as the default delivery location for items. Outlook says the format of the specified Personal Folders file (.pst) doesn't match the Unicode Offline Folder file I'm using
In Outlook 2003 and Outlook 2007, the format of the Offline Folder file and the data file used as the delivery location need to match. To resolve this issue, you can specify a Personal Folders file (.pst) that supports Unicode as the default delivery location for items or disable the use of offline folders.