annotate lib/unigbrk.in.h @ 19273:638b6d1fdf36 ueno/unicode-9.0.0

libunistring: update to Unicode 9.0.0

* lib/gen-uni-tables.c (fill_properties): Recognize Sentence_Terminal and Prepended_Concatenation_Mark.
(is_property_default_ignorable_code_point): Exclude U+08E2.
(fill_arabicshaping): Allow missing whitespace when parsing; recognize "AFRICAN FEH", "AFRICAN QAF", and "AFRICAN MOON".
(output_blocks): Increase the element size of the level1 table to accommodate more blocks.
(get_lbp): Recognize ZWJ, E_Base, and E_Modifier characters; Update each class according to the standard.
(get_wbp): Recognize ZWJ, E_Base, E_Modifier, Glue_After_Zwj, and E_Base_GAZ characters.
(output_gbp_table): Recognize ZWJ, E_Base, E_Modifier, Glue_After_Zwj, and E_Base_GAZ characters.
* lib/unictype.in.h (UC_JOINING_GROUP_AFRICAN_FEH, UC_JOINING_GROUP_AFRICAN_QAF, UC_JOINING_GROUP_AFRICAN_MOON): New enum value.
* lib/unilbrk/lbrktables.h (LBP_ZWJ, LBP_EB, LBP_EM): New enum value.
* lib/unilbrk/lbrktables.c (unilbrk_table): Extend the table with LBP_ZWJ, LBP_EB, and LBP_EM.
* lib/uniwbrk.in.h (WBP_ZWJ, WBP_EB, WBP_EM, WBP_GAZ, WBP_EBG): New enum value.
* lib/uniwbrk/u-wordbreaks.h: Implement WB3c, WB15, and WB16.
* lib/uniwbrk/wbrktable.h (uniwbrk_prop_index): New variable declaration.
* lib/uniwbrk/wbrktable.c (uniwbrk_prop_index): New variable.
(uniwbrk_table): Implement WB14.
* tests/uniwbrk/test-uc-wordbreaks.c (wordbreakproperty_to_string): Check WBP_ZWJ, WBP_EB, WBP_EM, WBP_GAZ, and WBP_EBG.
* modules/unigbrk/u{32,16,8}-grapheme-breaks: No longer depend on uc-is-grapheme-break.
* modules/unigbrk/uc-grapheme-breaks: New module.
* modules/unigbrk/uc-grapheme-breaks-tests: New module.
* lib/unigbrk.in.h (GBP_ZWJ, GBP_EB, GBP_EM, GBP_GAZ, GBP_EBG): New enum value.
(uc_grapheme_breaks): New function, replacing uc_is_grapheme_break.
* lib/unigbrk/u-grapheme-breaks.h: New file.
* lib/unigbrk/u{32,16,8}-grapheme-breaks.c: Rewrite using u-grapheme-breaks.h instead of uc_is_grapheme_break.
* lib/unigbrk/uc-grapheme-breaks.c: New file.
* lib/unigbrk/uc-is-grapheme-break.c: Partially update to TR29 rev 29.
* tests/unigbrk/test-uc-gbrk-prop.c (graphemebreakproperty_to_string): Check GBP_ZWJ, GBP_EB, GBP_EM, GBP_GAZ, and GBP_EBG.
* tests/unigbrk/test-uc-grapheme-breaks.c: New test.
* tests/unigbrk/test-uc-is-grapheme-break.c (graphemebreakproperty_to_string): Check GBP_ZWJ, GBP_EB, GBP_EM, GBP_GAZ, and GBP_EBG.
(main): Skip unsupported rules involving 3 or more characters, namely GB10, GB12, and GB13.
* lib/uniwidth/width.c (nonspacing_table_data): Update.
author Daiki Ueno <ueno@gnu.org>
date Wed, 12 Oct 2016 17:40:37 +0200
parents 9759915b2aca
children 10eb9086bea0
rev   line source
14049
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
1 /* Grapheme cluster breaks in Unicode strings.
18626
12df2165ec1c version-etc: new year
Paul Eggert <eggert@cs.ucla.edu>
parents: 18189
diff changeset
2 Copyright (C) 2010-2017 Free Software Foundation, Inc.
14049
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
3 Written by Ben Pfaff <blp@cs.stanford.edu>, 2010.
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
4
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
5 This program is free software: you can redistribute it and/or modify it
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
6 under the terms of the GNU Lesser General Public License as published
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
7 by the Free Software Foundation; either version 3 of the License, or
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
8 (at your option) any later version.
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
9
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
10 This program is distributed in the hope that it will be useful,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
11 but WITHOUT ANY WARRANTY; without even the implied warranty of
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
12 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
13 Lesser General Public License for more details.
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
14
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
15 You should have received a copy of the GNU Lesser General Public License
19190
9759915b2aca all: prefer https: URLs
Paul Eggert <eggert@cs.ucla.edu>
parents: 18626
diff changeset
16 along with this program. If not, see <https://www.gnu.org/licenses/>. */
14049
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
17
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
18 #ifndef _UNIGBRK_H
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
19 #define _UNIGBRK_H
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
20
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
21 /* Get bool. */
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
22 #include <stdbool.h>
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
23
14076
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
24 /* Get size_t. */
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
25 #include <stddef.h>
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
26
14049
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
27 #include "unitypes.h"
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
28
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
29 #ifdef __cplusplus
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
30 extern "C" {
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
31 #endif
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
32
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
33 /* ========================================================================= */
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
34
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
35 /* Property defined in Unicode Standard Annex #29, section "Grapheme Cluster
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
36 Boundaries"
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
37 <http://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries> */
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
38
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
39 /* Possible values of the Grapheme_Cluster_Break property.
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
40 This enumeration may be extended in the future. */
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
41 enum
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
42 {
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
43 GBP_OTHER = 0,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
44 GBP_CR = 1,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
45 GBP_LF = 2,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
46 GBP_CONTROL = 3,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
47 GBP_EXTEND = 4,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
48 GBP_PREPEND = 5,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
49 GBP_SPACINGMARK = 6,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
50 GBP_L = 7,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
51 GBP_V = 8,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
52 GBP_T = 9,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
53 GBP_LV = 10,
17872
ab67a6ee6dc8 libunistring: update to Unicode 6.2.0
Daiki Ueno <ueno@gnu.org>
parents: 17848
diff changeset
54 GBP_LVT = 11,
19273
638b6d1fdf36 libunistring: update to Unicode 9.0.0
Daiki Ueno <ueno@gnu.org>
parents: 19190
diff changeset
55 GBP_RI = 12,
638b6d1fdf36 libunistring: update to Unicode 9.0.0
Daiki Ueno <ueno@gnu.org>
parents: 19190
diff changeset
56 GBP_ZWJ = 13,
638b6d1fdf36 libunistring: update to Unicode 9.0.0
Daiki Ueno <ueno@gnu.org>
parents: 19190
diff changeset
57 GBP_EB = 14,
638b6d1fdf36 libunistring: update to Unicode 9.0.0
Daiki Ueno <ueno@gnu.org>
parents: 19190
diff changeset
58 GBP_EM = 15,
638b6d1fdf36 libunistring: update to Unicode 9.0.0
Daiki Ueno <ueno@gnu.org>
parents: 19190
diff changeset
59 GBP_GAZ = 16,
638b6d1fdf36 libunistring: update to Unicode 9.0.0
Daiki Ueno <ueno@gnu.org>
parents: 19190
diff changeset
60 GBP_EBG = 17
14049
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
61 };
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
62
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
63 /* Return the Grapheme_Cluster_Break property of a Unicode character. */
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
64 extern int
16716
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
65 uc_graphemeclusterbreak_property (ucs4_t uc)
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
66 _UC_ATTRIBUTE_CONST;
14049
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
67
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
68 /* ========================================================================= */
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
69
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
70 /* Grapheme cluster breaks. */
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
71
14774
70d101744577 maint: correct misuse of "a" and "an"
Jim Meyering <meyering@redhat.com>
parents: 14082
diff changeset
72 /* Returns true if there is a grapheme cluster boundary between Unicode code
14049
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
73 points A and B. A "grapheme cluster" is an approximation to a
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
74 user-perceived character, which sometimes corresponds to multiple code
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
75 points. For example, an English letter followed by an acute accent can be
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
76 expressed as two consecutive Unicode code points, but it is perceived by the
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
77 user as only a single character and therefore constitutes a single grapheme
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
78 cluster.
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
79
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
80 Implements extended (not legacy) grapheme cluster rules, because UAX #29
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
81 indicates that they are preferred.
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
82
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
83 Use A == 0 or B == 0 to indicate start of text or end of text,
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
84 respectively. */
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
85 extern bool
16716
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
86 uc_is_grapheme_break (ucs4_t a, ucs4_t b)
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
87 _UC_ATTRIBUTE_CONST;
14049
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
88
14076
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
89 /* Returns the start of the next grapheme cluster following S, or NULL if the
14080
87819eefffc5 unigbrk.in.h: Fix typo: "ben" => "been".
Ben Pfaff <blp@cs.stanford.edu>
parents: 14079
diff changeset
90 end of the string has been reached. */
14076
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
91 extern const uint8_t *
16716
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
92 u8_grapheme_next (const uint8_t *s, const uint8_t *end)
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
93 _UC_ATTRIBUTE_PURE;
14076
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
94 extern const uint16_t *
16716
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
95 u16_grapheme_next (const uint16_t *s, const uint16_t *end)
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
96 _UC_ATTRIBUTE_PURE;
14076
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
97 extern const uint32_t *
16716
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
98 u32_grapheme_next (const uint32_t *s, const uint32_t *end)
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
99 _UC_ATTRIBUTE_PURE;
14076
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
100
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
101 /* Returns the start of the previous grapheme cluster before S, or NULL if the
14080
87819eefffc5 unigbrk.in.h: Fix typo: "ben" => "been".
Ben Pfaff <blp@cs.stanford.edu>
parents: 14079
diff changeset
102 start of the string has been reached. */
14076
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
103 extern const uint8_t *
16716
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
104 u8_grapheme_prev (const uint8_t *s, const uint8_t *start)
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
105 _UC_ATTRIBUTE_PURE;
14076
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
106 extern const uint16_t *
16716
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
107 u16_grapheme_prev (const uint16_t *s, const uint16_t *start)
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
108 _UC_ATTRIBUTE_PURE;
14076
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
109 extern const uint32_t *
16716
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
110 u32_grapheme_prev (const uint32_t *s, const uint32_t *start)
079fb1e73828 Enable common subexpression optimization in GCC.
Bruno Haible <bruno@clisp.org>
parents: 16201
diff changeset
111 _UC_ATTRIBUTE_PURE;
14076
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
112
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
113 /* Determine the grapheme cluster boundaries in S, and store the result at
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
114 p[0..n-1]. p[i] = 1 means that a new grapheme cluster begins at s[i]. p[i]
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
115 = 0 means that s[i-1] and s[i] are part of the same grapheme cluster. p[0]
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
116 will always be 1.
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
117 */
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
118 extern void
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
119 u8_grapheme_breaks (const uint8_t *s, size_t n, char *p);
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
120 extern void
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
121 u16_grapheme_breaks (const uint16_t *s, size_t n, char *p);
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
122 extern void
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
123 u32_grapheme_breaks (const uint32_t *s, size_t n, char *p);
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
124 extern void
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
125 ulc_grapheme_breaks (const char *s, size_t n, char *p);
19273
638b6d1fdf36 libunistring: update to Unicode 9.0.0
Daiki Ueno <ueno@gnu.org>
parents: 19190
diff changeset
126 extern void
638b6d1fdf36 libunistring: update to Unicode 9.0.0
Daiki Ueno <ueno@gnu.org>
parents: 19190
diff changeset
127 uc_grapheme_breaks (const ucs4_t *s, size_t n, char *p);
14076
bf75753bb6d8 unigbrk: New modules for grapheme clusters.
Ben Pfaff <blp@cs.stanford.edu>
parents: 14049
diff changeset
128
14049
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
129 /* ========================================================================= */
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
130
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
131 #ifdef __cplusplus
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
132 }
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
133 #endif
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
134
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
135
fc0c7d9c14f8 New modules for grapheme cluster breaking.
Ben Pfaff <blp@cs.stanford.edu>
parents:
diff changeset
136 #endif /* _UNIGBRK_H */
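
The comment on the *_grapheme_breaks functions describes the output array p: p[i] = 1 where a new grapheme cluster begins at s[i], and 0 where s[i] continues the cluster of s[i-1]. The following is a minimal sketch of how a caller might consume such an array. It does not call the library; count_clusters and cluster_length are hypothetical helper names introduced here for illustration and are not part of unigbrk.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of unigbrk.h): count the grapheme
   clusters described by a break array p[0..n-1], where p[i] != 0 marks
   the start of a new cluster at unit i.  */
static size_t
count_clusters (const char *p, size_t n)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (p[i])
      count++;
  return count;
}

/* Hypothetical helper: return the length, in units, of the cluster that
   starts at index i.  The caller must pass an index where a cluster
   starts, i.e. i < n and p[i] != 0.  */
static size_t
cluster_length (const char *p, size_t n, size_t i)
{
  assert (i < n && p[i]);
  size_t j = i + 1;
  /* Skip the units that continue the current cluster.  */
  while (j < n && !p[j])
    j++;
  return j - i;
}
```

As an illustration, assume the UTF-8 input is "e" followed by U+0301 (bytes 0xCC 0x81) and then "x", and that the break array for it is {1, 0, 0, 1}, which is what the documented convention implies. Under that assumption count_clusters reports 2 clusters, and cluster_length reports 3 units for the cluster starting at index 0.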