/*
 * Copyright 2011 Haiku, Inc. All rights reserved.
 * Distributed under the terms of the MIT License.
 *
 * Authors:
 *		Axel Dörfler, axeld@pinc-software.de
 *		Adrien Destugues <pulkomandy@pulkomandy.ath.cx>
 *		John Scipione, jscipione@gmail.com
 *
 * Corresponds to:
 *		headers/os/locale/Collator.h	rev 42274
 *		src/kits/locale/Collator.cpp	rev 42274
 */


/*!
	\file Collator.h
	\ingroup locale
	\ingroup libbe
	\brief Provides the BCollator class.
*/


/*!
	\class BCollator
	\ingroup locale
	\ingroup libbe
	\brief Class for handling locale-aware collation (sorting) of strings.

	BCollator is designed to handle collation (sorting) of strings. Unlike
	string sorting using strcmp() or similar functions that compare raw
	bytes, the collation is done using a set of rules that changes from one
	locale to another. For example, in traditional Spanish collation, 'ch'
	is considered to be a single letter and is sorted between 'c' and 'd'.
	This class is also able to perform natural number sorting so that 2 is
	sorted before 10, unlike byte-based sorting.
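
	As a minimal sketch of the difference between byte-based and natural
	number ordering, the following plain C++ code does not use BCollator at
	all. ByteCompareLess() and NaturalCompare() are hypothetical helpers
	written for this illustration only; they are not how BCollator or ICU
	implement numeric sorting, and they only handle ASCII digits.

```cpp
#include <cctype>
#include <cstring>
#include <string>

// Byte-based ordering, as strcmp() would sort the strings.
bool
ByteCompareLess(const char* a, const char* b)
{
	return std::strcmp(a, b) < 0;
}

// Hypothetical helper, for illustration only: compares two strings while
// treating each run of ASCII digits as a number.
// Returns <0, 0 or >0 like strcmp().
int
NaturalCompare(const std::string& a, const std::string& b)
{
	size_t i = 0;
	size_t j = 0;
	while (i < a.size() && j < b.size()) {
		unsigned char ca = a[i];
		unsigned char cb = b[j];
		if (std::isdigit(ca) && std::isdigit(cb)) {
			// Skip leading zeros, then find the end of each digit run.
			while (i < a.size() && a[i] == '0')
				i++;
			while (j < b.size() && b[j] == '0')
				j++;
			size_t endA = i;
			size_t endB = j;
			while (endA < a.size() && std::isdigit((unsigned char)a[endA]))
				endA++;
			while (endB < b.size() && std::isdigit((unsigned char)b[endB]))
				endB++;
			// A longer run (without leading zeros) is a larger number.
			if (endA - i != endB - j)
				return (endA - i < endB - j) ? -1 : 1;
			// Same length: the first differing digit decides.
			int diff = a.compare(i, endA - i, b, j, endB - j);
			if (diff != 0)
				return diff;
			i = endA;
			j = endB;
		} else {
			if (ca != cb)
				return ca < cb ? -1 : 1;
			i++;
			j++;
		}
	}
	// Whichever string still has characters left sorts last.
	if (i < a.size())
		return 1;
	if (j < b.size())
		return -1;
	return 0;
}
```

	With these helpers, ByteCompareLess("10", "2") is \c true because the
	byte '1' is smaller than '2', whereas NaturalCompare("2", "10") is
	negative, which places 2 before 10.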

	\warning This class is not thread-safe, because Compare() changes the
		strength of the underlying ICU collator. To use a BCollator from
		more than one thread, protect it with a lock.
*/


/*!
	\fn BCollator::BCollator()
	\brief Construct a collator with the default locale and strength.

	\attention The default collator should be constructed by BLocale
		instead, since it is aware of the currently defined locale.

	This constructor uses \c B_COLLATE_PRIMARY strength.
*/


/*!
	\fn BCollator::BCollator(const char* locale,
		int8 strength = B_COLLATE_PRIMARY, bool ignorePunctuation = false)
	\brief Construct a collator for the given \a locale and \a strength.

	This constructor loads the data for the given locale. You can also
	set the \a strength and choose whether the collator should take
	punctuation into account.

	\param locale The \a locale to build the collator for.
	\param strength The collator class provides four levels of \a strength:
		\li \c B_COLLATE_PRIMARY doesn't differentiate e from é,
		\li \c B_COLLATE_SECONDARY takes letter accents into account,
		\li \c B_COLLATE_TERTIARY is case sensitive,
		\li \c B_COLLATE_QUATERNARY is very strict. Most of the time you
			shouldn't need to go this far.
	\param ignorePunctuation Whether to ignore punctuation during sorting.
*/


/*!
	\fn BCollator::BCollator(BMessage* archive)
	\brief Unarchive a collator from a message.

	\param archive The message to unarchive the BCollator object from.
*/


/*!
	\fn BCollator::BCollator(const BCollator& other)
	\brief Copy constructor.

	Copies a BCollator object from another BCollator object.

	\param other The BCollator to copy from.
*/


/*!
	\fn BCollator::~BCollator()
	\brief Destructor method.

	Deletes the BCollator object, freeing the resources it consumes.
*/


/*!
	\fn BCollator& BCollator::operator=(const BCollator& other)
	\brief Assignment operator.

	\param other The BCollator object to assign from.
*/


/*!
	\fn void BCollator::SetDefaultStrength(int8 strength)
	\brief Set the \a strength of the collator.

	Note that the \a strength can also be chosen on a case-by-case basis
	when calling other methods.

	\param strength The collator class provides four levels of \a strength:
		\li \c B_COLLATE_PRIMARY doesn't differentiate e from é,
		\li \c B_COLLATE_SECONDARY takes letter accents into account,
		\li \c B_COLLATE_TERTIARY is case sensitive,
		\li \c B_COLLATE_QUATERNARY is very strict. Most of the time you
			shouldn't need to go this far.
*/


/*!
	\fn int8 BCollator::DefaultStrength() const
	\brief Get the current strength of this collator.

	\returns The current strength of the collator.
*/


/*!
	\fn void BCollator::SetIgnorePunctuation(bool ignore)
	\brief Enable or disable punctuation handling.

	This function enables or disables the handling of punctuation.
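
	As a rough illustration of what ignoring punctuation means for a
	comparison, the sketch below uses plain C++ rather than BCollator.
	StripPunctuation() and EqualIgnoringPunctuation() are hypothetical
	helpers written for this example. The real collator delegates
	punctuation handling to ICU, whose rules are more involved than
	dropping ASCII punctuation.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of the input without ASCII
// punctuation characters.
std::string
StripPunctuation(const std::string& input)
{
	std::string result;
	for (unsigned char c : input) {
		if (!std::ispunct(c))
			result += static_cast<char>(c);
	}
	return result;
}

// Hypothetical helper: byte-wise equality once punctuation is removed.
bool
EqualIgnoringPunctuation(const std::string& a, const std::string& b)
{
	return StripPunctuation(a) == StripPunctuation(b);
}
```

	Under this simplified model, "co-op" and "coop" compare as equal when
	punctuation is ignored and as different when it is not.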

	\param ignore Boolean indicating whether or not punctuation should
		be ignored.
*/


/*!
	\fn bool BCollator::IgnorePunctuation() const
	\brief Gets the behavior of the collator with regard to punctuation.

	\returns \c true if the collator ignores punctuation when sorting,
		\c false otherwise.
*/


/*!
	\fn status_t BCollator::GetSortKey(const char* string, BString* key,
		int8 strength) const
	\brief Compute the sortkey of a \a string.

	The sortkey is a modified version of the input \a string that you can use
	to perform faster comparisons with other sortkeys using strcmp() or a
	similar comparison function. If you need to compare one string with
	others many times, storing the sortkey will allow you to perform the
	comparisons faster.

	\param string String from which to compute the sortkey.
	\param key The resulting sortkey.
	\param strength The \a strength to use to compute the sortkey.

	\retval B_OK if everything went well.
	\retval B_ERROR if an error occurred generating the sortkey.
*/


/*!
	\fn int BCollator::Compare(const char* s1, const char* s2,
		int8 strength) const
	\brief Returns the difference between the two strings according to the
		collation defined by the \a strength parameter.

	This method should be used in place of the strcmp() function to perform
	locale-aware comparisons.

	\param s1 The first string to compare.
	\param s2 The second string to compare.
	\param strength The \a strength to use for the string comparison.

	\retval 0 if the strings are equal.
	\retval <0 if s1 is less than s2.
	\retval >0 if s1 is greater than s2.
*/


/*!
	\fn bool BCollator::Equal(const char* s1, const char* s2,
		int8 strength) const
	\brief Compares two strings for equality.

	Note that strings that are not byte-by-byte identical may be treated
	as equal by this method. For example, two strings may be considered
	equal if the only differences between them are in case and
	punctuation, depending on the \a strength used. Using
	\c B_COLLATE_QUATERNARY will force this method to return \c true only
	if the strings are byte-for-byte identical.

	\param s1 The first string to compare.
	\param s2 The second string to compare.
	\param strength The \a strength to use for the string comparison.

	\returns \c true if the strings are equal at the given \a strength,
		\c false otherwise.
*/


/*!
	\fn bool BCollator::Greater(const char* s1, const char* s2,
		int8 strength) const
	\brief Determine if a string is greater than another.

	\note !Greater(s1, s2) is the same as GreaterOrEqual(s2, s1). This means
		there is no need for Lesser(s1, s2) and LesserOrEqual(s1, s2) methods.

	\param s1 The first string to compare.
	\param s2 The second string to compare.
	\param strength The \a strength to use for the string comparison.

	\returns \c true if s1 is greater than, but not equal to, s2.
*/


/*!
	\fn bool BCollator::GreaterOrEqual(const char* s1, const char* s2,
		int8 strength) const
	\brief Determines if one string is greater than or equal to another.

	\note !GreaterOrEqual(s1, s2) is the same as Greater(s2, s1).

	\param s1 The first string to compare.
	\param s2 The second string to compare.
	\param strength The \a strength to use for the string comparison.

	\returns \c true if s1 is greater than or equal to s2.
*/


/*!
	\fn static BArchivable* BCollator::Instantiate(BMessage* archive)
	\brief Unarchive the collator.

	This method allows you to restore a collator that you previously
	archived. It is faster to archive and unarchive a collator than it is
	to create a new one each time you need a BCollator object with the
	same settings.

	\param archive The message to restore the collator from.

	\returns A pointer to a BArchivable object containing the BCollator or
		\c NULL if an error occurred restoring the \a archive.
*/