NLS_LANG is used on the client application machines (web server or application servers) for interacting with the database and providing globalization support that enables the end users to interact with the database in their language. The default value of this parameter is automatically chosen based on the locale setting of the operating system. On the Oracle database, the three parameters in v$nls_parameters that you should be concerned about are: NLS_CHARACTERSET, NLS_LANGUAGE and NLS_TERRITORY. On the client machines: NLS_LANG.
The NLS_LANG parameter uses the following format:
NLS_LANG = LANGUAGE_TERRITORY.CHARACTER_SET
where:
- LANGUAGE Specifies the language and conventions for displaying messages, day name, and month name (American, English being some examples)
- TERRITORY Specifies the conventions for calculating week and day numbers, currency, decimal/comma separator etc. (America, UnitedKingdom being some examples)
- CHARACTER_SET Controls the character set used for displaying messages (UTF8, AL32UTF being some examples).
NOTE: If you do not care about the display part and the currency/decimal-comma separators, you can leave the language and territory to be American and America respectively for the US based installs. Character set is the important one.
The NLS_LANG character (third part) set should reflect the setting of the operating system client. For example, if the database character set is UTF8 and the client has a Windows operating system, you should not set UTF8 as the client character set because there are no UTF8 WIN32 clients. Instead the NLS_LANG setting should reflect the code page of the client.
NLS_LANG is set as a local environment variable on UNIX platforms. NLS_LANG is set in the registry on Windows platforms. For example, on an American English Windows client, the code page is WE8MSWIN1252. An appropriate setting for NLS_LANG is AMERICAN_AMERICA.WE8MSWIN1252.
Setting NLS_LANG correctly allows proper conversion from the client operating system code page to the database character set. When these settings are the same,
Oracle assumes that the data being sent or received is encoded in the same character set as the database character set, so no validation or conversion is performed. This can lead to corrupt data if the client code page and the database character set are different and conversions are necessary.
So, bottomline:
DB Server: the nls_characterset should be either UTF8 or AL32UTF8 (the nls_language and nls_territory need to be set on a per deployment basis as needed).
Client machines (Windows) : Based on the code page of the client. Oracle has a matrix that details what those values should be based on the operating system locale.
On Windows application box: The NLS_LANG parameter is stored in the registry under the HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\HOME ID\NLS_LANG subkey, where ID is the unique number identifying the Oracle home.
On Unix application box: Set the local environment variable accordingly.
A point of caution: NLS_LANG settings could be different (as compared to the above rule) when you run in MS-DOS or batch mode i.e. when using Oracle utilities like SQL *LDR, SQL *PLUS, Import/Export, data-pump etc.. That is because the MS-DOS mode uses (there are a few exceptions) a different character set (or a code page) from Windows (ANSI code page) and the default entry in the registry for Windows is always set to the appropriate Windows code page. If this is not done, then when running in MS-DOS or batch mode, data corruption can occur very easily. You can set it to it’s correct value by using the “SET NLS_LANG” command at the start of the batch. Look at Section E in the Oracle globalization guide to get a list of the Oracle character sets for Operating System Locales and make the changes as needed.