VCF::UnicodeString Class Reference

The UnicodeString class is a thin wrapper around the std::basic_string class. It wraps rather than derives because std::basic_string has no virtual destructor and so is not safe to use as a public base class.

#include <vcf/FoundationKit/VCFString.h>


Public Types

typedef char AnsiChar
typedef VCF::WideChar UniChar
typedef std::basic_string< UniChar > StringData
typedef StringData::size_type size_type
typedef StringData::traits_type traits_type
typedef StringData::allocator_type allocator_type
typedef UniChar char_type
typedef StringData::difference_type difference_type
typedef StringData::pointer pointer
typedef StringData::const_pointer const_pointer
typedef StringData::reference reference
typedef StringData::const_reference const_reference
typedef StringData::value_type value_type
typedef StringData::iterator iterator
typedef StringData::const_iterator const_iterator
typedef StringData::reverse_iterator reverse_iterator
typedef StringData::const_reverse_iterator const_reverse_iterator
enum  LanguageEncoding {
  leUnknown = -1, leDefault = 0, leIBM037 = 100, leIBM437,
  leIBM500, leArabic708, leArabic449, leArabicTransparent,
  leDOSArabic, leGreek, leBaltic, leLatin1,
  leLatin2, leCyrillic, leTurkish, leMultilingualLatin1,
  lePortuguese, leIcelandic, leHebrew, leFrenchCanadian,
  leArabic864, leNordic, leRussianCyrillic, leModernGreek,
  leEBCDICLatin2, leThai, leEBCDICGreekModern, leShiftJIS,
  leSimplifiedChinese, leKorean, leChineseTraditionalBig5, leEBCDICTurkish,
  leEBCDICLatin1, leEBCDICUSCanada, leEBCDICGermany, leEBCDICDenmarkNorway,
  leEBCDICFinlandSweden, leEBCDICItaly, leEBCDICLatinAmericaSpain, leEBCDICUnitedKingdom,
  leEBCDICFrance, leEBCDICInternational, leEBCDICIcelandic, leUTF16LittleEndianByteOrder,
  leUTF16BigEndianByteOrder, leANSICentralEuropean, leANSICyrillic, leANSILatin1,
  leANSIGreek, leANSITurkish, leANSIHebrew, leANSIArabic,
  leANSIBaltic, leANSIVietnamese, leJohabKorean, leMacRoman,
  leMacJapanese, leMacTraditionalChineseBig5, leMacKorean, leMacArabic,
  leMacHebrew, leMacGreek, leMacCyrillic, leMacSimplifiedChinese,
  leMacRomanian, leMacUkrainian, leMacThai, leMacLatin2,
  leMacIcelandic, leMacTurkish, leMacCroatian, leUTF32LittleEndianByteOrder,
  leUTF32BigEndianByteOrder, leCNSTaiwan, leTCATaiwan, leEtenTaiwan,
  leIBM5550Taiwan, leTeleTextTaiwan, leWangTaiwan, leIA5WesternEuropean,
  leIA5German, leIA5Swedish, leIA5Norwegian, leUSASCII,
  leT61, leISO6937, leIBM273Germany, leIBM277DenmarkNorway,
  leIBM278FinlandSweden, leIBM280Italy, leIBM284LatinAmericaSpain, leIBM285UnitedKingdom,
  leIBM290JapaneseKatakanaExt, leIBM297France, leIBM420Arabic, leIBM423Greek,
  leIBM424Hebrew, leIBMKoreanExtended, leIBMThai, leRussianKOI8R,
  leIBM871Icelandic, leIBM880CyrillicRussian, leIBM905Turkish, leIBM00924Latin1,
  leEUCJapaneseJIS, leSimplifiedChineseGB2312, leKoreanWansung, leEBCDICCyrillicSerbianBulgarian,
  leUkrainianKOI8U, leISO88591Latin1, leISO88592CentralEuropean, leISO88593Latin3,
  leISO88594Baltic, leISO88595Cyrillic, leISO88596Arabic, leISO88597Greek,
  leISO88598HebrewVisual, leISO88599Turkish, leISO885913Estonian, leISO885915Latin9,
  leEuropa3, leISO88598HebrewLogical, leISO2022JapaneseNoHalfwidthKatakana, leISO2022JapaneseWithHalfwidthKatakana,
  leISO2022JapaneseAllow1ByteKana, leISO2022Korean, leISO2022SimplifiedChinese, leISO2022TraditionalChinese,
  leEBCDICJapaneseExt, leEBCDICUSCanadaAndJapanese, leEBCDICKoreanExtAndKorean, leEBCDICSimplifiedChineseExtSimplifiedChinese,
  leEBCDICSimplifiedChinese, leEBCDICUSCanadaAndTraditionalChinese, leEBCDICJapaneseLatinExtAndJapanese, leEUCJapanese,
  leEUCSimplifiedChinese, leEUCKorean, leEUCTraditionalChinese, leHZGB2312SimplifiedChinese,
  leGB18030SimplifiedChinese, leISCIIDevanagari, leISCIIBengali, leISCIITamil,
  leISCIITelugu, leISCIIAssamese, leISCIIOriya, leISCIIKannada,
  leISCIIMalayalam, leISCIIGujarati, leISCIIPunjabi, leUTF7,
  leUTF8
}
 Code page values for the locale.
enum  {
  npos = (unsigned int)-1, UTF8BOMSize = sizeof(uchar) * 3, UTF16BOMSize = sizeof(ushort), UTF32BOMSize = sizeof(uint32),
  UTF8BOM = 0xEFBBBF, UTF16LittleEndianBOM = 0xFFFE, UTF16BigEndianBOM = 0xFEFF, UTF32LittleEndianBOM = 0xFFFE0000,
  UTF32BigEndianBOM = 0x0000FEFF
}
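
The BOM constants above describe the byte order marks that can prefix encoded text. As a minimal sketch, and not the VCF implementation of adjustForBOMMarker, the following detects a BOM from the leading bytes of a buffer using the byte sequences defined by the Unicode standard. The names BomKind and detectBom are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration; not part of VCF.
enum BomKind { bomNone, bomUTF8, bomUTF16LE, bomUTF16BE, bomUTF32LE, bomUTF32BE };

// Inspects the first bytes of buf and reports which BOM (if any) is present.
// bomSize receives the number of bytes the BOM occupies (0 when none is found).
inline BomKind detectBom(const unsigned char* buf, std::size_t len, std::size_t& bomSize) {
    assert(buf != nullptr || len == 0);
    bomSize = 0;
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        bomSize = 3;
        return bomUTF8;
    }
    // UTF-32 LE is tested before UTF-16 LE because both begin with FF FE.
    if (len >= 4 && buf[0] == 0xFF && buf[1] == 0xFE && buf[2] == 0x00 && buf[3] == 0x00) {
        bomSize = 4;
        return bomUTF32LE;
    }
    if (len >= 4 && buf[0] == 0x00 && buf[1] == 0x00 && buf[2] == 0xFE && buf[3] == 0xFF) {
        bomSize = 4;
        return bomUTF32BE;
    }
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        bomSize = 2;
        return bomUTF16LE;
    }
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        bomSize = 2;
        return bomUTF16BE;
    }
    return bomNone;
}
```

A caller would skip bomSize bytes before converting the remaining data. Note that UTF-16 little-endian text whose first character is U+0000 is indistinguishable from a UTF-32 little-endian BOM under this check.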

Public Member Functions

 ~UnicodeString ()
 UnicodeString ()
 UnicodeString (const UnicodeString &rhs)
 UnicodeString (const StringData &rhs)
 UnicodeString (const std::string &rhs, LanguageEncoding encoding=leDefault)
 UnicodeString (const AnsiChar *string, size_type stringLength, LanguageEncoding encoding=leDefault)
 UnicodeString (const UniChar *string, size_type stringLength)
 UnicodeString (const AnsiChar *string, LanguageEncoding encoding=leDefault)
 UnicodeString (const UniChar *string)
 UnicodeString (size_type n, AnsiChar c, LanguageEncoding encoding=leDefault)
 UnicodeString (size_type n, UniChar c)
void decode_ansi (TextCodec *codec, AnsiChar *str, size_type &strSize, LanguageEncoding encoding=leDefault) const
 Decodes the unicode data in this string and places the data in the ansi string buffer passed in.
UnicodeString decode (TextCodec *codec, LanguageEncoding encoding=leDefault) const
 Decodes the unicode data in the string and returns a new string with the decoded data as determined by the text codec.
void encode (TextCodec *codec, const AnsiChar *str, size_type n, LanguageEncoding encoding=leDefault)
 This encodes the ansi string into unicode, according to the algorithms in the codec, and replaces the data in the string's data_ value.
void encode (TextCodec *codec, const UnicodeString &str, LanguageEncoding encoding=leDefault)
 This encodes the data in the unicode string into another unicode string, according to the algorithms in the codec, and replaces the data in the string's data_ value.
const AnsiChar * ansi_c_str (LanguageEncoding encoding=leDefault) const
 Returns a const char* pointer.
 operator const StringData & () const
 This is a convenience function to get at the string's underlying data as a const std::basic_string.
 operator StringData & ()
 This is a convenience function to get at the string's underlying data as a non-const std::basic_string.
 operator AnsiString () const
 This is a convenience function that converts the string's data from unicode to ansi, and returns a std::basic_string<AnsiChar> (also known as std::string).
UnicodeString & operator= (const UnicodeString &rhs)
UnicodeString & operator= (const AnsiString &s)
UnicodeString & operator= (const AnsiChar *s)
UnicodeString & operator= (const UniChar *s)
UnicodeString & operator= (AnsiChar c)
UnicodeString & operator= (UniChar c)
bool operator== (const StringData &rhs) const
bool operator== (const UniChar *rhs) const
bool operator!= (const StringData &rhs) const
bool operator!= (const UniChar *rhs) const
bool operator< (const StringData &rhs) const
bool operator< (const UniChar *rhs) const
bool operator<= (const StringData &rhs) const
bool operator<= (const UniChar *rhs) const
bool operator> (const StringData &rhs) const
bool operator> (const UniChar *rhs) const
bool operator>= (const StringData &rhs) const
bool operator>= (const UniChar *rhs) const
bool operator== (const AnsiChar *rhs) const
bool operator!= (const AnsiChar *rhs) const
bool operator> (const AnsiChar *rhs) const
bool operator>= (const AnsiChar *rhs) const
bool operator< (const AnsiChar *rhs) const
bool operator<= (const AnsiChar *rhs) const
iterator begin ()
const_iterator begin () const
iterator end ()
const_iterator end () const
reverse_iterator rbegin ()
const_reverse_iterator rbegin () const
reverse_iterator rend ()
const_reverse_iterator rend () const
const_reference at (size_type pos) const
reference at (size_type pos)
const_reference operator[] (size_type pos) const
reference operator[] (size_type pos)
const UniChar * c_str () const
const UniChar * data () const
size_type length () const
size_type size () const
 Returns the number of characters in the string.
size_type size_in_bytes () const
 The number of bytes that make up this string.
size_type max_size () const
void resize (size_type n, UniChar c=UniChar())
size_type capacity () const
void reserve (size_type n=0)
bool empty () const
UnicodeString & operator+= (const UnicodeString &rhs)
UnicodeString & operator+= (const UniChar *s)
UnicodeString & operator+= (UniChar c)
UnicodeString & operator+= (AnsiChar c)
UnicodeString & operator+= (const AnsiChar *rhs)
UnicodeString & append (const UnicodeString &str)
UnicodeString & append (const UnicodeString &str, size_type pos, size_type n)
UnicodeString & append (const UniChar *s, size_type n)
UnicodeString & append (const AnsiChar *s, size_type n)
UnicodeString & append (const UniChar *s)
UnicodeString & append (const AnsiChar *s)
UnicodeString & append (size_type n, UniChar c)
UnicodeString & append (size_type n, AnsiChar c)
UnicodeString & append (const_iterator first, const_iterator last)
UnicodeString & assign (const UnicodeString &str)
UnicodeString & assign (const UnicodeString &str, size_type pos, size_type n)
UnicodeString & assign (const UniChar *s, size_type n)
UnicodeString & assign (const UniChar *s)
UnicodeString & assign (size_type n, UniChar c)
UnicodeString & assign (const AnsiChar *s, size_type n)
UnicodeString & assign (const AnsiChar *s)
UnicodeString & assign (size_type n, AnsiChar c)
UnicodeString & assign (const_iterator first, const_iterator last)
UnicodeString & insert (size_type p0, const UnicodeString &str)
UnicodeString & insert (size_type p0, const UnicodeString &str, size_type pos, size_type n)
UnicodeString & insert (size_type p0, const AnsiChar *s, size_type n)
UnicodeString & insert (size_type p0, const UniChar *s, size_type n)
UnicodeString & insert (size_type p0, const AnsiChar *s)
UnicodeString & insert (size_type p0, const UniChar *s)
UnicodeString & insert (size_type p0, size_type n, AnsiChar c)
UnicodeString & insert (size_type p0, size_type n, UniChar c)
iterator insert (iterator it, AnsiChar c)
iterator insert (iterator it, UniChar c)
void insert (iterator it, size_type n, AnsiChar c)
void insert (iterator it, size_type n, UniChar c)
void insert (iterator it, const_iterator first, const_iterator last)
UnicodeString & erase (size_type p0=0, size_type n=npos)
iterator erase (iterator it)
iterator erase (iterator first, iterator last)
UnicodeString & replace (size_type p0, size_type n0, const UnicodeString &str)
UnicodeString & replace (size_type p0, size_type n0, const UnicodeString &str, size_type pos, size_type n)
UnicodeString & replace (size_type p0, size_type n0, const AnsiChar *s, size_type n)
UnicodeString & replace (size_type p0, size_type n0, const UniChar *s, size_type n)
UnicodeString & replace (size_type p0, size_type n0, const AnsiChar *s)
UnicodeString & replace (size_type p0, size_type n0, const UniChar *s)
UnicodeString & replace (size_type p0, size_type n0, size_type n, AnsiChar c)
UnicodeString & replace (size_type p0, size_type n0, size_type n, UniChar c)
UnicodeString & replace (iterator first0, iterator last0, const UnicodeString &str)
UnicodeString & replace (iterator first0, iterator last0, const AnsiChar *s, size_type n)
UnicodeString & replace (iterator first0, iterator last0, const UniChar *s, size_type n)
UnicodeString & replace (iterator first0, iterator last0, const AnsiChar *s)
UnicodeString & replace (iterator first0, iterator last0, const UniChar *s)
UnicodeString & replace (iterator first0, iterator last0, size_type n, AnsiChar c)
UnicodeString & replace (iterator first0, iterator last0, size_type n, UniChar c)
UnicodeString & replace (iterator first0, iterator last0, const_iterator first, const_iterator last)
size_type copy (AnsiChar *s, size_type n, size_type pos=0) const
size_type copy (UniChar *s, size_type n, size_type pos=0) const
void swap (UnicodeString &str)
size_type find (const UnicodeString &str, size_type pos=0) const
size_type find (const AnsiChar *s, size_type pos, size_type n) const
size_type find (const UniChar *s, size_type pos, size_type n) const
size_type find (const AnsiChar *s, size_type pos=0) const
size_type find (const UniChar *s, size_type pos=0) const
size_type find (AnsiChar c, size_type pos=0) const
size_type find (UniChar c, size_type pos=0) const
size_type rfind (const UnicodeString &str, size_type pos=npos) const
size_type rfind (const AnsiChar *s, size_type pos, size_type n=npos) const
size_type rfind (const UniChar *s, size_type pos, size_type n=npos) const
size_type rfind (const AnsiChar *s, size_type pos=npos) const
size_type rfind (const UniChar *s, size_type pos=npos) const
size_type rfind (AnsiChar c, size_type pos=npos) const
size_type rfind (UniChar c, size_type pos=npos) const
size_type find_first_of (const UnicodeString &str, size_type pos=0) const
size_type find_first_of (const AnsiChar *s, size_type pos, size_type n) const
size_type find_first_of (const UniChar *s, size_type pos, size_type n) const
size_type find_first_of (const AnsiChar *s, size_type pos=0) const
size_type find_first_of (const UniChar *s, size_type pos=0) const
size_type find_first_of (AnsiChar c, size_type pos=0) const
size_type find_first_of (UniChar c, size_type pos=0) const
size_type find_last_of (const UnicodeString &str, size_type pos=npos) const
size_type find_last_of (const AnsiChar *s, size_type pos, size_type n=npos) const
size_type find_last_of (const UniChar *s, size_type pos, size_type n=npos) const
size_type find_last_of (const AnsiChar *s, size_type pos=npos) const
size_type find_last_of (const UniChar *s, size_type pos=npos) const
size_type find_last_of (AnsiChar c, size_type pos=npos) const
size_type find_last_of (UniChar c, size_type pos=npos) const
size_type find_first_not_of (const UnicodeString &str, size_type pos=0) const
size_type find_first_not_of (const AnsiChar *s, size_type pos, size_type n) const
size_type find_first_not_of (const UniChar *s, size_type pos, size_type n) const
size_type find_first_not_of (const AnsiChar *s, size_type pos=0) const
size_type find_first_not_of (const UniChar *s, size_type pos=0) const
size_type find_first_not_of (AnsiChar c, size_type pos=0) const
size_type find_first_not_of (UniChar c, size_type pos=0) const
size_type find_last_not_of (const UnicodeString &str, size_type pos=npos) const
size_type find_last_not_of (const AnsiChar *s, size_type pos, size_type n) const
size_type find_last_not_of (const UniChar *s, size_type pos, size_type n) const
size_type find_last_not_of (const AnsiChar *s, size_type pos=npos) const
size_type find_last_not_of (const UniChar *s, size_type pos=npos) const
size_type find_last_not_of (AnsiChar c, size_type pos=npos) const
size_type find_last_not_of (UniChar c, size_type pos=npos) const
UnicodeString substr (size_type pos=0, size_type n=npos) const
int compare (const UnicodeString &str) const
int compare (size_type p0, size_type n0, const UnicodeString &str)
int compare (size_type p0, size_type n0, const UnicodeString &str, size_type pos, size_type n)
int compare (const AnsiChar *s) const
int compare (const UniChar *s) const
int compare (size_type p0, size_type n0, const AnsiChar *s) const
int compare (size_type p0, size_type n0, const UniChar *s) const
int compare (size_type p0, size_type n0, const AnsiChar *s, size_type pos) const
int compare (size_type p0, size_type n0, const UniChar *s, size_type pos) const
uint64 sizeOf () const

Static Public Member Functions

static void transformAnsiToUnicode (const AnsiChar *str, size_type stringLength, StringData &newStr, LanguageEncoding encoding=leDefault)
static AnsiChar * transformUnicodeToAnsi (const UnicodeString &str, LanguageEncoding encoding=leDefault)
static UniChar transformAnsiCharToUnicodeChar (AnsiChar c, LanguageEncoding encoding=leDefault)
static AnsiChar transformUnicodeCharToAnsiChar (UniChar c, LanguageEncoding encoding=leDefault)
static int adjustForBOMMarker (AnsiChar *&stringPtr, uint32 &len)

Protected Member Functions

void modified ()

Protected Attributes

StringData data_
AnsiChar * ansiDataBuffer_

Friends

bool operator== (const UnicodeString &lhs, const UnicodeString &rhs)
bool operator!= (const UnicodeString &lhs, const UnicodeString &rhs)
bool operator< (const UnicodeString &lhs, const UnicodeString &rhs)
bool operator<= (const UnicodeString &lhs, const UnicodeString &rhs)
bool operator> (const UnicodeString &lhs, const UnicodeString &rhs)
bool operator>= (const UnicodeString &lhs, const UnicodeString &rhs)


Detailed Description

The UnicodeString class is a thin wrapper around the std::basic_string class. It wraps rather than derives because std::basic_string has no virtual destructor and so is not safe to use as a public base class.
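
The wrapping approach can be sketched roughly as follows. This is a hypothetical stand-in, not the VCF source: the class name WrappedString and its small set of members are assumptions made for illustration. It holds a std::basic_string by value and forwards calls to it, which avoids inheriting from a class without a virtual destructor.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for illustration; this is not the VCF source.
class WrappedString {
public:
    typedef std::basic_string<wchar_t> StringData;
    typedef StringData::size_type size_type;

    WrappedString() {}
    WrappedString(const wchar_t* s) {
        assert(s != nullptr);
        data_ = s;
    }

    // Forwarding members mirror the std::basic_string interface.
    size_type size() const { return data_.size(); }
    bool empty() const { return data_.empty(); }
    const wchar_t* c_str() const { return data_.c_str(); }

    // Returns *this so calls can be chained, as std::basic_string does.
    WrappedString& append(const wchar_t* s) {
        assert(s != nullptr);
        data_.append(s);
        return *this;
    }

    // Conversion operator exposing the wrapped data to code that expects
    // a const std::basic_string reference.
    operator const StringData&() const { return data_; }

private:
    StringData data_;  // held by value, so no base-class destructor is involved
};
```

Because the wrapper owns its data by value, deleting it never goes through a std::basic_string base pointer, which is the situation the missing virtual destructor makes unsafe.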

The wrapped type, StringData, is a std::basic_string of wide characters (std::basic_string<UniChar>), meaning that the class maintains Unicode data internally.

The main purpose of the String class is to provide a drop-in replacement for std::basic_string<wchar_t>, with an interface that is 100% compatible. In addition, it adds a few extra functions of its own.

These extra functions make it seamless to use with existing code that uses old C-style strings and/or std::string/std::wstring instances. For complete documentation of std::basic_string, please see SGI's documentation.
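
To give a feel for what interoperating with narrow strings involves, here is a minimal sketch that converts between char and wchar_t text. It is restricted to 7-bit ASCII, which is an assumption made to keep the example self-contained; the class's own conversions take a LanguageEncoding argument and are not reproduced here. The helper names widenAscii and narrowAscii are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: widens 7-bit ASCII text to wide characters.
// Bytes outside the ASCII range are replaced with '?'.
inline std::wstring widenAscii(const char* s) {
    assert(s != nullptr);
    std::wstring out;
    for (; *s != '\0'; ++s) {
        const unsigned char c = static_cast<unsigned char>(*s);
        out.push_back(c < 0x80 ? static_cast<wchar_t>(c) : L'?');
    }
    return out;
}

// Hypothetical helper: narrows wide text to 7-bit ASCII.
// Characters outside the ASCII range are replaced with '?'.
inline std::string narrowAscii(const std::wstring& w) {
    std::string out;
    for (std::wstring::size_type i = 0; i < w.size(); ++i) {
        // Cast to unsigned so the range check also rejects negative values
        // on platforms where wchar_t is signed.
        const unsigned long c = static_cast<unsigned long>(w[i]);
        out.push_back(c < 0x80UL ? static_cast<char>(c) : '?');
    }
    return out;
}
```

A real conversion has to consult the code page for every byte above 0x7F, which is why the overloads in this class that accept AnsiChar data also accept an encoding.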

Another set of functions encodes or decodes text using a particular text codec instance, as specified by the TextCodec class. These encoding/decoding methods are decode_ansi(), decode(), and the two encode() overloads.
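
As a rough illustration of how decode() and encode() differ according to their descriptions, the sketch below uses a hypothetical codec interface. SketchCodec, Rot13Codec and SketchString are assumptions for illustration only, and the real VCF::TextCodec interface may look quite different. The point shown is that decode() returns a new string and leaves the original untouched, while encode() replaces the string's own data.

```cpp
#include <cassert>
#include <string>

// Hypothetical codec interface; the real VCF::TextCodec API may differ.
struct SketchCodec {
    virtual ~SketchCodec() {}
    virtual std::wstring convert(const std::wstring& in) const = 0;
};

// Toy codec that rotates ASCII letters by 13 places, so applying it
// twice restores the input.
struct Rot13Codec : SketchCodec {
    std::wstring convert(const std::wstring& in) const {
        std::wstring out(in);
        for (std::wstring::size_type i = 0; i < out.size(); ++i) {
            const wchar_t c = out[i];
            if (c >= L'a' && c <= L'z') {
                out[i] = static_cast<wchar_t>(L'a' + (c - L'a' + 13) % 26);
            } else if (c >= L'A' && c <= L'Z') {
                out[i] = static_cast<wchar_t>(L'A' + (c - L'A' + 13) % 26);
            }
        }
        return out;
    }
};

// Hypothetical string type showing the two calling conventions.
class SketchString {
public:
    explicit SketchString(const std::wstring& s) : data_(s) {}

    // Returns a new string; this object is not modified.
    SketchString decode(const SketchCodec* codec) const {
        assert(codec != nullptr);
        return SketchString(codec->convert(data_));
    }

    // Replaces this object's data with the converted form of src.
    void encode(const SketchCodec* codec, const SketchString& src) {
        assert(codec != nullptr);
        data_ = codec->convert(src.data_);
    }

    const std::wstring& data() const { return data_; }

private:
    std::wstring data_;
};
```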

In addition, there is a series of typedefs, again solely to make the class compatible with the std::basic_string class.
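
The value of these typedefs shows up in generic code. The sketch below is a hypothetical template, countChar, that relies only on the nested names size_type, value_type and const_iterator plus begin() and end(). It is demonstrated here with the standard string types; any class that publishes the same typedefs and iterator functions, as this class reference lists for UnicodeString, should fit the same template.

```cpp
#include <string>

// Hypothetical generic helper: counts occurrences of ch in any string-like
// type that provides the std::basic_string nested typedefs and iterators.
template <typename S>
typename S::size_type countChar(const S& s, typename S::value_type ch) {
    typename S::size_type n = 0;
    for (typename S::const_iterator it = s.begin(); it != s.end(); ++it) {
        if (*it == ch) {
            ++n;
        }
    }
    return n;
}
```

For example, countChar(std::wstring(L"banana"), L'a') counts the wide character 'a' in a wide string, and the same template accepts a std::string with a char argument.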


Member Typedef Documentation

typedef StringData::allocator_type VCF::UnicodeString::allocator_type
 

typedef char VCF::UnicodeString::AnsiChar