/*
 * Copyright 2011, Haiku, Inc. All Rights Reserved.
 * Distributed under the terms of the MIT License.
 *
 * Authors:
 *		Axel Dörfler, axeld@pinc-software.de
 *		Adrien Destugues <pulkomandy@pulkomandy.ath.cx>
 *		John Scipione, jscipione@gmail.com
 *
 * Corresponds to:
 *		/trunk/headers/os/locale/Collator.h	 rev 42274
 *		/trunk/src/kits/locale/Collator.cpp	 rev 42274
 */


/*!
	\file Collator.h
	\brief Provides the BCollator class.
*/


/*!
	\class BCollator
	\ingroup locale
	\brief Class for handling locale-aware collation (sorting) of strings.

	BCollator is designed to handle collation (sorting) of strings. Unlike
	string sorting using strcmp() or similar functions that compare raw
	bytes, the collation is done using a set of rules that change from one
	locale to another. For example, in Spanish, 'ch' is considered to be a
	letter and is sorted between 'c' and 'd'. This class is also able to
	perform natural number sorting so that 2 is sorted before 10, unlike
	byte-based sorting.

	\warning This class is not multithread-safe, as Compare() changes the
		ICUCollator (the strength). So if you want to use a BCollator from
		more than one thread you need to protect it with a lock.
*/


/*!
	\fn BCollator::BCollator()
	\brief Construct a collator with the default locale and strength.

	\attention The default collator should be constructed by the BLocale
		instead since it is aware of the currently defined locale.

	This constructor uses \c B_COLLATE_PRIMARY strength.
*/


/*!
	\fn BCollator::BCollator(const char* locale,
		int8 strength = B_COLLATE_PRIMARY, bool ignorePunctuation = false)
	\brief Construct a collator for the given \a locale and \a strength.

	This constructor loads the data for the given locale. You can also
	set the \a strength and choose if the collator should take
	punctuation into account or not.

	\param locale The \a locale to build the collator for.
	\param strength The collator class provides four levels of \a strength:
		\li \c B_COLLATE_PRIMARY doesn't differentiate e from é,
		\li \c B_COLLATE_SECONDARY takes letter accents into account,
		\li \c B_COLLATE_TERTIARY is case sensitive,
		\li \c B_COLLATE_QUATERNARY is very strict. Most of the time you
			shouldn't need to go this far.
	\param ignorePunctuation Whether to ignore punctuation during sorting.
*/


/*!
	\fn BCollator::BCollator(BMessage* archive)
	\brief Unarchive a collator from a message.

	\param archive The message to unarchive the BCollator object from.
*/


/*!
	\fn BCollator::BCollator(const BCollator& other)
	\brief Copy constructor.

	Copies a BCollator object from another BCollator object.

	\param other The BCollator to copy from.
*/


/*!
	\fn BCollator::~BCollator()
	\brief Destructor.

	Deletes the BCollator object, freeing the resources it consumes.
*/


/*!
	\fn BCollator& BCollator::operator=(const BCollator& other)
	\brief Assignment operator.

	\param other The BCollator object to assign from.
*/


/*!
	\fn void BCollator::SetDefaultStrength(int8 strength)
	\brief Set the \a strength of the collator.

	Note that the \a strength can also be chosen on a case-by-case basis
	when calling other methods.

	\param strength The collator class provides four levels of \a strength:
		\li \c B_COLLATE_PRIMARY doesn't differentiate e from é,
		\li \c B_COLLATE_SECONDARY takes letter accents into account,
		\li \c B_COLLATE_TERTIARY is case sensitive,
		\li \c B_COLLATE_QUATERNARY is very strict. Most of the time you
			shouldn't need to go this far.
*/


/*!
	\fn int8 BCollator::DefaultStrength() const
	\brief Get the current strength of this collator.

	\returns The current strength of the collator.
*/


/*!
	\fn void BCollator::SetIgnorePunctuation(bool ignore)
	\brief Enable or disable punctuation handling.

	This function enables or disables the handling of punctuation.

	\param ignore Boolean indicating whether or not punctuation should
		be ignored.
*/


/*!
	\fn bool BCollator::IgnorePunctuation() const
	\brief Gets the behavior of the collator with regard to punctuation.

	\returns \c true if the collator ignores punctuation when sorting,
		\c false otherwise.
*/


/*!
	\fn status_t BCollator::GetSortKey(const char* string, BString* key,
		int8 strength) const
	\brief Compute the sortkey of a \a string.

	The sortkey is a modified version of the input \a string that you can use
	to perform faster comparisons with other sortkeys using strcmp() or a
	similar comparison function. If you need to compare one string with
	others many times, storing the sortkey will allow you to perform the
	comparisons faster.

	\param string String from which to compute the sortkey.
	\param key The resulting sortkey.
	\param strength The \a strength to use to compute the sortkey.

	\retval B_OK if everything went well.
	\retval B_ERROR if an error occurred generating the sortkey.
*/


/*!
	\fn int BCollator::Compare(const char* s1, const char* s2,
		int8 strength) const
	\brief Returns the difference between the two strings according to the
		collation defined by the \a strength parameter.

	This method should be used in place of the strcmp() function to perform
	locale-aware comparisons.

	\param s1 The first string to compare.
	\param s2 The second string to compare.
	\param strength The \a strength to use for the string comparison.

	\retval 0 if the strings are equal.
	\retval <0 if \a s1 is less than \a s2.
	\retval >0 if \a s1 is greater than \a s2.
*/


/*!
	\fn bool BCollator::Equal(const char* s1, const char* s2,
		int8 strength) const
	\brief Compares two strings for equality.

	Note that strings that are not byte-by-byte identical may end up being
	treated as equal by this method. For example, two strings may be
	considered equal if the only differences between them are in case and
	punctuation, depending on the \a strength used. Using
	\c B_COLLATE_QUATERNARY will force this method to return \c true only
	if the strings are byte-for-byte identical.

	\param s1 The first string to compare.
	\param s2 The second string to compare.
	\param strength The \a strength to use for the string comparison.

	\returns \c true if the strings are considered equal at the given
		\a strength, \c false otherwise.
*/


/*!
	\fn bool BCollator::Greater(const char* s1, const char* s2,
		int8 strength) const
	\brief Determine if a string is greater than another.

	\note !Greater(s1, s2) is the same as GreaterOrEqual(s2, s1).

	\param s1 The first string to compare.
	\param s2 The second string to compare.
	\param strength The \a strength to use for the string comparison.

	\returns \c true if \a s1 is greater than, but not equal to, \a s2.
*/


/*!
	\fn bool BCollator::GreaterOrEqual(const char* s1, const char* s2,
		int8 strength) const
	\brief Determines if one string is greater than or equal to another.

	\note !GreaterOrEqual(s1, s2) is the same as Greater(s2, s1).

	\param s1 The first string to compare.
	\param s2 The second string to compare.
	\param strength The \a strength to use for the string comparison.

	\returns \c true if \a s1 is greater than or equal to \a s2.
*/


/*!
	\fn static BArchivable* BCollator::Instantiate(BMessage* archive)
	\brief Unarchive the collator.

	This method allows you to restore a collator that you previously
	archived. It is faster to archive and unarchive a collator than it is
	to create a new one each time you need a BCollator object with the
	same settings.

	\param archive The message to restore the collator from.

	\returns A pointer to a BArchivable object containing the BCollator or
		\c NULL if an error occurred restoring the \a archive.
*/