Language Section

Overview

The Language Section contains properties of interest to international users. It may also be of interest to users who want to customize sentences for use with Second Site via TMG's language facility.

If you produce sites (or other output) in multiple languages, you should also read the International Sites page, and you may be interested in the language features in TMG Utility.

TMG users who want to learn more about using TMG's language facility should visit Terry Reigel's web site. Terry is a well-known TMG user who was one of the contributors to Lee Hoffman's Getting the Most Out of The Master Genealogist. He has also published articles related to TMG in user group publications and on his web site. His section on TMG's Language Capability includes multiple articles about TMG's language features.

Site Language

The Site Language pull-down menu controls which version of certain language-specific words and phrases is used during site creation. Second Site uses the value of Site Language in the Format definitions and when translating tag names using the <language>.ini files in the Defaults folder.

Note: Always set Site Language to a "real" language even if you are using a custom Sentence Language.

If you change Site Language, and the language you select has an associated defaults file, Second Site will ask you if you want to load the text strings and other values from that defaults file.

For example, loading the default strings helps if you want to make one site in English and another in German. Make the English site, then use File > Save As... to save the SDF file with a different name. In the copy, change Site Language to German, and Second Site will update the Strings section for you. This is not a complete solution; you will still have to translate many strings manually, such as Site.Title and all the content of Custom Pages.

Sentence Language

Second Site uses TMG's sentence rules when the Body Tags.Detail Format property is set to Use Sentence. If you have defined sentences in a language other than English, choose the appropriate language in the Sentence Language pull-down menu.

The choices include all the languages defined in the standard versions of TMG v4+. If you have created a custom language definition, you may add it to the menu by modifying the list of languages via the 2ndsite.ini file, but it is more convenient to use one of the pre-defined language names used by Second Site: SS1, SS2, or SS3.

The Site Language and Sentence Language are usually the same, but they may be set to different values. A common example is when custom sentence languages are defined in the TMG dataset. In that case, Sentence Language might be set to "Alternate" or "SS1" and the Site Language to "English (U.S.)" or "Dutch", for example.

Dutch users have to be careful to choose the proper entry in the Sentence Language pull-down menu. If you are using TMG v5 or greater, choose Dutch. If you are using TMG v4, choose Nederlands. Dutch is the default when you create a new SDF file using the "Dutch - Standard" Defaults file.

Default to English (U.S.)

If Default to English (U.S.) is checked, Second Site will use the English (U.S.) version of the sentence if the tag type does not have a sentence defined in the language specified by the Sentence Language pull-down menu. This makes Second Site mimic TMG's behavior more closely. The default is unchecked.

Some TMG users have sentences stored in a mix of the English (U.S.) and English (UK) languages because they began using TMG before English (UK) was available. Default to English (U.S.) is primarily intended for those users.
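The fallback rule can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not Second Site's actual code: the function name pick_sentence, the dictionary layout, and the sample sentence text are all hypothetical.

```python
FALLBACK_LANGUAGE = "English (U.S.)"

def pick_sentence(sentences, sentence_language, default_to_english_us=False):
    """Choose the sentence for one tag type.

    sentences maps a language name to the sentence defined for that
    language (a hypothetical layout used only for this sketch).
    Returns None when no sentence applies.
    """
    sentence = sentences.get(sentence_language)
    if sentence is None and default_to_english_us:
        # Mimic TMG: fall back to the English (U.S.) sentence.
        sentence = sentences.get(FALLBACK_LANGUAGE)
    return sentence

# A tag type whose sentence was only ever defined in English (U.S.).
birth_sentences = {"English (U.S.)": "[P] was born <[D]> <[L]>"}

print(pick_sentence(birth_sentences, "English (UK)"))
print(pick_sentence(birth_sentences, "English (UK)", default_to_english_us=True))
```

With the option off, the lookup finds nothing for English (UK); with it on, the English (U.S.) sentence is used instead. What Second Site does when no sentence is found at all is not covered by this sketch.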

Site Language or Sentence Language?

Area	Relevant Property
Tag translation and selection	Site Language
Format-specific words and phrases	Site Language
Relationship sentences	Site Language
Tag sentences	Sentence Language and Default to English (U.S.)

Sort Sequence

Select one of the choices from the Sort Sequence menu to set the sort sequence for the indexes created by Second Site.

The standard entries in the Sort Sequence pull-down menu are:

ASCII: In the ASCII sequence, accented characters such as ü sort after most other characters, including symbols. This follows a common definition of ASCII, though it actually includes characters that are outside the ANSI standard definition of ASCII.
ASCII - Anglicized: In the ASCII - Anglicized sequence, accented characters such as ü sort as if they were the character they most resemble in English. So, ü is mixed with u. For sorting purposes, there is no difference between the two characters.
ASCII - Extended: In the ASCII - Extended sequence, accented characters such as ü sort immediately after the character they most resemble in English. So, ü follows u.
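The difference between the three standard sequences is easiest to see with sort keys. The Python sketch below approximates each sequence as described above; it is not the script Second Site uses, and the function names are made up for the example. It assumes that "the character they most resemble" can be found by stripping Unicode combining marks, which works for ü but not for every accented letter.

```python
import unicodedata

def strip_accents(text):
    """Replace accented letters with their base letters (ü -> u)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def ascii_key(name):
    # Plain code-point order: ü (U+00FC) sorts after every ASCII character.
    return name

def anglicized_key(name):
    # ü is treated exactly like u, so the two cannot be told apart.
    return strip_accents(name)

def extended_key(name):
    # Compare character by character: base letter first, then a flag
    # that places the accented form immediately after the plain one.
    return [(strip_accents(c), c != strip_accents(c)) for c in name]

names = ["Lutz", "Lüss", "Luss", "Lvov"]
for label, key in [("ASCII", ascii_key),
                   ("ASCII - Anglicized", anglicized_key),
                   ("ASCII - Extended", extended_key)]:
    print(label, sorted(names, key=key))
```

In the ASCII result Lüss comes last, after Lvov. In the Anglicized result Lüss and Luss are equal, so they keep their input order. In the Extended result Lüss follows every name that begins with Lu but precedes Lvov.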

The remaining items in the menu are locale-specific sort sequences. The rules follow the known collating rules as closely as possible for the given locale. In German, for example, the single character ü sorts as if it were the multi-character sequence ue. For sorting purposes, there is no difference between ü and ue. This has an impact on the name index. For example, the surnames Luess and Lüss sort the same. If "John Luess" and "William Lüss" are both in the dataset, the name index would be as follows:

Luess, Lüss
   John (1858-1936)
   William (1828-1901)

When German names are anglicized, the ü character is often replaced by ue, and thus Luess and Lüss are essentially the same.
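The grouping in the index above can be reproduced with a sort key that expands ü to ue before comparing. The following Python sketch is an approximation, not Second Site's German sort script. Only the ü rule comes from this page; the expansions for ä, ö, and ß follow the usual German phone-book convention and are an assumption here, as are the function names.

```python
from itertools import groupby

# ü -> ue is described above; the other expansions are assumed.
GERMAN_EXPANSIONS = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})

def german_key(surname):
    """Sort key under which Lüss and Luess compare as equal."""
    return surname.translate(GERMAN_EXPANSIONS).lower()

def index_headings(surnames):
    """Merge surnames that sort the same into one index heading."""
    # The surname itself is a tie-breaker so each heading has a stable order.
    ordered = sorted(set(surnames), key=lambda s: (german_key(s), s))
    return [", ".join(group) for _, group in groupby(ordered, key=german_key)]

print(index_headings(["Lüss", "Luess", "Lutz"]))
```

Because Luess and Lüss produce the same key, they share a single heading, while Lutz gets its own.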

The Sort Sequence pull-down menu includes all the sort sequences defined in the Sorts folder, which is located in the program folder. If you are familiar with collating sequences, you may add or modify the scripts.

HTML Character Set

The HTML Character Set property sets the value of the Content Type META tag. The default is "utf-8".

When the HTML Character Set is utf-8, Second Site will write the HTML files in UTF-8 format. When set to other values, Second Site writes the text files using the default character encoding. The default varies according to the configuration of MS Windows, but it is typically Windows-1252.
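The practical effect of the setting is that the same character is stored as different bytes. The Python sketch below illustrates this. The function encode_html and the exact META tag markup are hypothetical, and cp1252 (Python's name for Windows-1252) stands in for "the default character encoding", which on a real system depends on the Windows configuration.

```python
def encode_html(body, charset="utf-8", system_default="cp1252"):
    """Build a minimal page and encode it the way the text above describes.

    utf-8 is written as UTF-8; any other charset value is written with
    the system default encoding (assumed here to be Windows-1252).
    """
    html = (
        "<html><head>"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">'
        "</head><body>" + body + "</body></html>"
    )
    encoding = "utf-8" if charset.lower() == "utf-8" else system_default
    return html.encode(encoding)

utf8_page = encode_html("<p>Lüss</p>", "utf-8")
ansi_page = encode_html("<p>Lüss</p>", "windows-1252")

# ü takes two bytes (C3 BC) in UTF-8 but one byte (FC) in Windows-1252.
print(b"\xc3\xbc" in utf8_page, b"\xfc" in ansi_page)
```

If the charset named in the META tag does not match the bytes actually written, browsers may display accented characters incorrectly, which is why the two are tied together by a single property.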

Write Byte Order Mark

When the HTML Character Set is utf-8, the Write Byte Order Mark (BOM) property is enabled. If checked, Second Site will add a Byte Order Mark to the beginning of files written with the utf-8 encoding. A Byte Order Mark may or may not be required by your web host. Some text editors for Windows require a BOM or they may not recognize the file format as utf-8. Notepad will detect a BOM. If no BOM is present, Notepad will attempt to determine the character set by looking at the file contents.
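For readers who want to check whether a file has a BOM, here is a short Python sketch. The UTF-8 BOM is the three bytes EF BB BF at the very start of the file. The helper names are invented for this example; codecs.BOM_UTF8 and the utf-8-sig codec are part of Python's standard library.

```python
import codecs

def encode_page(html, write_bom):
    """Encode text as UTF-8, optionally prefixed with a Byte Order Mark."""
    data = html.encode("utf-8")
    return codecs.BOM_UTF8 + data if write_bom else data

def has_bom(data):
    """Report whether the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

with_bom = encode_page("<p>Lüss</p>", write_bom=True)
without_bom = encode_page("<p>Lüss</p>", write_bom=False)

print(has_bom(with_bom), has_bom(without_bom))

# The utf-8-sig codec accepts both forms and strips the BOM if present.
print(with_bom.decode("utf-8-sig") == without_bom.decode("utf-8-sig"))
```

The two files differ only in those first three bytes; the page content that follows is identical.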

Unicode Resources

A detailed description of Unicode, utf-8, and BOM is beyond the scope of this document. For detailed technical information about Unicode, see the Unicode Home Page. The UTF-8, UTF-16, UTF-32 & BOM FAQ has a lot of useful information.