The StringTools object provides several utility methods for encoding and decoding strings. For example, you can use a StringTools object to Base64-encode a chunk of data, or to decode a UTF-8 encoded message header.

You can obtain a StringTools object using the DOpusFactory.StringTools method.

Method Name Arguments Return Type Description

Decode

Arguments: <Blob:source> or <string:source>, <string:format>

Return type: string or Blob

Decodes an encoded string or data.

You can provide either a Blob object or a string as the source to decode. Depending on the value of the format argument, either a string or a Blob is returned. Valid formats are:

base64: The source will be Base64-decoded, and a Blob is returned.
quoted: The source will be Quoted-printable-decoded, and a Blob is returned.
utf-8: The source will be converted from UTF-8 to a native string.
utf-16, utf-16-le: The source will be converted from UTF-16 Little Endian to a native string.
utf-16-be: The source will be converted from UTF-16 Big Endian to a native string.
auto: Special handling is used to detect and decode a MIME-encoded email subject (e.g. one beginning with =?); if one is identified, a string is returned. UTF-8 or UTF-16 encoded data is also detected if it has a BOM at the beginning.

If decoding UTF-8 or UTF-16 (via "auto" or "utf-8", etc.), any byte-order-mark (BOM) will be skipped if one exists at the beginning of the input data.

If format is not specified, the default is auto. Otherwise, format must be set to one of the above keywords, a valid code-page name (e.g. "gb2312", "utf-8"), or a Windows code-page ID (e.g. 936, 65001). When a code page is given, the source is decoded using that code page and a string is returned.
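To make the Base64 and BOM-skipping behaviour described above concrete, here is a minimal standalone sketch in JavaScript for Node.js. It does not call Opus at all: `decodeBase64` and `decodeUtf8SkippingBom` are hypothetical helper names, and Node's `Buffer` stands in for the Blob object. It illustrates the idea only and is not how Decode is implemented.

```javascript
// Standalone Node.js sketch; helper names are hypothetical, not Opus APIs.
const UTF8_BOM = [0xEF, 0xBB, 0xBF];

// Base64-decode text into raw bytes (Buffer plays the role of a Blob here).
function decodeBase64(text) {
  return Buffer.from(text, "base64");
}

// Convert UTF-8 bytes to a native string, skipping a leading BOM if present.
function decodeUtf8SkippingBom(bytes) {
  const hasBom =
    bytes.length >= 3 && UTF8_BOM.every((b, i) => bytes[i] === b);
  const body = hasBom ? bytes.subarray(3) : bytes;
  return Buffer.from(body).toString("utf8");
}

// "77u/aGVsbG8=" is the Base64 form of the bytes EF BB BF + "hello".
const raw = decodeBase64("77u/aGVsbG8=");
console.log(decodeUtf8SkippingBom(raw)); // prints "hello"
```

Without the BOM check, the decoded string would begin with the invisible character U+FEFF, which is why the skipping step matters for data read from files.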

Encode

Arguments: <Blob:source> or <string:source>, <string:format>

Return type: string or Blob

Encodes a string or data.

You can provide either a Blob object or a string as the source to encode. Depending on the value of the format argument, either a string or a Blob is returned. Valid formats are:

base64: The source will be Base64-encoded, and a string is returned.
quoted: The source will be Quoted-printable-encoded, and a string is returned.
utf-8: The source will be converted to UTF-8 without a byte-order-mark (BOM).
utf-8 bom: The source will be converted to UTF-8 with a BOM at the start.
utf-16, utf-16-le: The source will be converted to UTF-16 Little Endian without a BOM.
utf-16 bom, utf-16-le bom: The source will be converted to UTF-16 Little Endian with a BOM.
utf-16-be: The source will be converted to UTF-16 Big Endian without a BOM.
utf-16-be bom: The source will be converted to UTF-16 Big Endian with a BOM.

Otherwise, format must be set to a valid code-page name (e.g. "gb2312", "utf-8"), or a Windows code-page ID (e.g. 936, 65001). The source will be encoded using the specified code-page and a Blob is returned.
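The difference between the plain and "bom" variants is easiest to see in the resulting bytes. The following standalone Node.js sketch shows them under the standard BOM values (EF BB BF for UTF-8, FF FE for UTF-16 Little Endian, FE FF for UTF-16 Big Endian). `encodeUtf8` and `encodeUtf16` are hypothetical helpers written for this illustration, and `Buffer` again stands in for a Blob; none of this is Opus's own code.

```javascript
// Standalone Node.js sketch; helper names are hypothetical, not Opus APIs.

// UTF-8 encode, optionally prefixing the 3-byte BOM (EF BB BF).
function encodeUtf8(text, withBom) {
  const body = Buffer.from(text, "utf8");
  return withBom
    ? Buffer.concat([Buffer.from([0xEF, 0xBB, 0xBF]), body])
    : body;
}

// UTF-16 encode as little or big endian, optionally prefixing the BOM
// (FF FE for little endian, FE FF for big endian).
function encodeUtf16(text, bigEndian, withBom) {
  const body = Buffer.from(text, "utf16le");
  if (bigEndian) body.swap16(); // swap each byte pair in place
  const bom = Buffer.from(bigEndian ? [0xFE, 0xFF] : [0xFF, 0xFE]);
  return withBom ? Buffer.concat([bom, body]) : body;
}

console.log(encodeUtf8("A", true).toString("hex"));           // efbbbf41
console.log(encodeUtf16("A", false, true).toString("hex"));   // fffe4100
console.log(encodeUtf16("A", true, true).toString("hex"));    // feff0041
console.log(Buffer.from("hello", "utf8").toString("base64")); // aGVsbG8=
```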

IsASCII

Arguments: <string:input>

Return type: bool

Tests the input string to see if it only contains characters that can be represented in ASCII.

If the result is false, the string is not safe to save into a text file unless you use a Unicode format such as UTF-8.

This check is not affected by locales or codepages. Instead, it tests whether the string consists of only 7-bit ASCII characters, such that no characters will be lost or modified if you save the string to a text file and then load it back on any other computer.
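The 7-bit test described above amounts to checking that every character code is 0x7F or lower. A minimal sketch of that rule in plain JavaScript follows; `isSevenBitAscii` is a hypothetical name used for illustration, and how the real method treats edge cases such as control characters is not stated here.

```javascript
// Standalone sketch of a 7-bit ASCII test; not Opus's implementation.
function isSevenBitAscii(text) {
  for (var i = 0; i < text.length; i++) {
    // Any UTF-16 code unit above 0x7F falls outside 7-bit ASCII.
    if (text.charCodeAt(i) > 0x7F) {
      return false;
    }
  }
  return true;
}

console.log(isSevenBitAscii("report_2024.txt")); // true
console.log(isSevenBitAscii("r\u00e9sum\u00e9.txt")); // false
```

A script might use a check like this to decide whether a file can be written as plain text or needs a Unicode format such as UTF-8.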

LanguageStr

Arguments: <string:name> or <int:id>

Return type: string

Returns a translated string in the currently selected language. Mainly needed for internal use.

The currently defined strings are:

ID               English language string
FavoritesBar     Favorites Bar
FindResults      Find Results
CopySelection    Copy Selection
CopyAll          Copy All

MakeLegal

Arguments: <string:name> [<string:flags>]

Return type: string

Strips any illegal filename characters from the supplied string.

The optional flags are:

f: forward slashes (convert separators to / instead of \)
n: name instead of path (replace separators with _; implies s)
s: subdirectory mode (replace : with ; and remove \\ from UNC paths)

RemoveDiacritics

Arguments: <string:input>

Return type: string

Returns a copy of the input string with any diacritics (accent symbols) removed. For example, "á" would be converted to "a".

This function uses the same rules that are used by the "ignore diacritics" options for pattern matching throughout Opus.
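For readers unfamiliar with the idea, one common way to strip diacritics is to decompose the string into base letters plus combining marks and then drop the marks. The sketch below uses that approach in standalone JavaScript. It is an assumption-laden illustration: `stripDiacritics` is a hypothetical helper, and the rules Opus actually applies may differ from Unicode decomposition.

```javascript
// Standalone sketch using Unicode decomposition; Opus's own rules may differ.
function stripDiacritics(text) {
  return text
    .normalize("NFD") // split "á" into "a" + combining acute accent
    .replace(/[\u0300-\u036f]/g, ""); // drop the combining marks
}

console.log(stripDiacritics("\u00e1")); // a
console.log(stripDiacritics("Cr\u00e8me br\u00fbl\u00e9e")); // Creme brulee
```

Letters that have no decomposed form (for example "ø") pass through this sketch unchanged.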

Truncate

Arguments: <string:input> or <object:Path>, <int:length> [<int:type>]

Return type: string

Truncates the specified input string to the requested number of characters.

The optional type argument specifies the truncation type. Valid values are:

0: truncate on the right
1: truncate on the left
2: truncate in the middle

If not specified, the default is 2 if input is a Path object, otherwise the default is 0.

If the input value is a Path and middle truncation is selected, the function takes path separators into account correctly.