public class Text extends BinaryComparable implements WritableComparable<BinaryComparable>
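An implementation of Writable and WritableComparable.
Text stores a string internally as a byte array in UTF-8 encoding, and provides methods for serialization, deserialization, and byte-level string comparison.
In addition, Text provides a rich set of string-handling functions.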
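
As a rough illustration of what UTF-8 byte storage and byte-level comparison imply, the sketch below uses only the standard JDK and does not call Text itself. The class and method names (`Utf8StorageSketch`, `utf8ByteLength`, `compareUtf8Bytes`) are hypothetical, and treating bytes as unsigned during comparison is an assumption rather than documented behavior of Text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Text API.
final class Utf8StorageSketch {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Lexicographic comparison of the UTF-8 bytes of two strings.
    // Assumption: each byte is compared as an unsigned value (0..255).
    static int compareUtf8Bytes(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int diff = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        // All shared bytes are equal, so the shorter sequence sorts first.
        return x.length - y.length;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo"; // U+00E9 takes two bytes in UTF-8
        System.out.println("chars: " + s.length());
        System.out.println("bytes: " + utf8ByteLength(s));
        System.out.println("abc < abd: " + (compareUtf8Bytes("abc", "abd") < 0));
    }
}
```

The point of the sketch is that lengths and positions measured in bytes can differ from those measured in Java `char` units as soon as the content is not pure ASCII.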
| Modifier and Type | Class and Description |
|---|---|
| static class | Text.Comparator: A WritableComparator for Text objects that implements their natural (ascending) order. |

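| Constructor and Description |
|---|
| Text(): Constructs a Text holding an empty string. |
| Text(byte[] utf8): Constructs a Text from the given byte array. |
| Text(String string): Constructs a Text from the given string. |
| Text(Text utf8): Constructs a Text from another Text object; the content is copied. |
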
| Modifier and Type | Method and Description |
|---|---|
| void | append(byte[] utf8, int start, int len): Appends content to the string. |
| static int | bytesToCodePoint(ByteBuffer bytes): Returns the next code point at the current position in the buffer. |
| int | charAt(int position): Returns the 32-bit Unicode scalar value at the given position. |
| void | clear(): Clears the string; afterwards getLength() returns 0. |
| static String | decode(byte[] utf8): Converts the provided byte array to a String using the UTF-8 encoding. |
| static String | decode(byte[] utf8, int start, int length) |
| static String | decode(byte[] utf8, int start, int length, boolean replace): Converts the provided byte array to a String using the UTF-8 encoding. |
| static ByteBuffer | encode(String string): Converts the provided String to bytes using the UTF-8 encoding. |
| static ByteBuffer | encode(String string, boolean replace): Converts the provided String to bytes using the UTF-8 encoding. |
| boolean | equals(Object o): Returns true iff o is a Text with the same contents. |
| int | find(String what): Finds the position at which the substring occurs. |
| int | find(String what, int start): Finds the position at which the substring occurs, searching from the given start position. |
| byte[] | getBytes(): Returns the byte array that stores the string content. Note that only the first getLength() bytes of the array, starting at index 0, are valid. |
| int | getLength(): Returns the length of the content, in bytes. |
| int | hashCode(): Computes the hash value of the binary content. |
| void | readFields(ByteBuffer bf) |
| void | readFields(DataInput in): Deserializes from the given DataInput in. |
| static String | readString(DataInput in): Reads a UTF-8 encoded string from in. |
| void | set(byte[] utf8): Sets the string content from the given UTF-8 encoded byte array. |
| void | set(byte[] utf8, int start, int len): Sets the string content. |
| void | set(String string): Sets the string content from the given String. |
| void | set(Text other): Copies the content of another Text. |
| static void | skip(DataInput in): Skips over one Text in the input. |
| String | toString(): Converts the text back to a String. |
| static int | utf8Length(String string): For the given string, returns the number of UTF-8 bytes required to encode the string. |
| static void | validateUTF8(byte[] utf8): Checks whether a byte array contains valid UTF-8. |
| static void | validateUTF8(byte[] utf8, int start, int len): Checks whether a range of a byte array is valid UTF-8. |
| void | write(DataOutput out): Serializes to the given DataOutput out. |
| static int | writeString(DataOutput out, String s): Writes a UTF-8 encoded string to out. |

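
The getBytes() entry in the table above warns that only the first getLength() bytes are valid. The following sketch shows one way such a situation can arise, using a hypothetical reusable buffer (`ValidRangeSketch`) built on the standard JDK. Whether Text reuses its backing array in exactly this way is an assumption; the source only states which range is valid.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a reusable byte buffer, not the Text class itself.
final class ValidRangeSketch {
    private byte[] bytes = new byte[0]; // backing array, may hold stale bytes
    private int length = 0;             // number of valid bytes

    void set(String s) {
        byte[] src = s.getBytes(StandardCharsets.UTF_8);
        if (src.length > bytes.length) {
            bytes = new byte[src.length]; // grow only when needed, never shrink
        }
        System.arraycopy(src, 0, bytes, 0, src.length);
        length = src.length;
    }

    byte[] getBytes() { return bytes; }

    int getLength() { return length; }

    // Decodes only the valid range [0, length).
    String safeString() {
        return new String(bytes, 0, length, StandardCharsets.UTF_8);
    }

    // Decodes the whole backing array, including leftover bytes.
    String unsafeString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ValidRangeSketch buf = new ValidRangeSketch();
        buf.set("hello world");
        buf.set("hi"); // the array still has 11 slots, only 2 are valid
        System.out.println(buf.safeString());
        System.out.println(buf.unsafeString());
    }
}
```

In this sketch the second `set` leaves bytes from the earlier, longer value in the array, which is why callers should always pair the array with the reported length.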
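
Methods inherited from class BinaryComparable: compareTo, compareTo
Methods inherited from interface Comparable: compareTo

public Text()

public Text(String string)
Parameters: string - the string to construct the Text from

public Text(Text utf8)
Parameters: utf8 - the Text whose content is copied

public Text(byte[] utf8)
Parameters: utf8 - the byte array to construct the Text from

public byte[] getBytes()
Returns the byte array that stores the string content.
Note: only the first getLength() bytes of the array, starting at index 0, are valid. Using the whole byte array directly may read invalid content.
Overrides: getBytes in class BinaryComparable
See also: getLength()

public int getLength()
Overrides: getLength in class BinaryComparable
See also: getBytes()

public int charAt(int position)
Text stores the string internally as a byte array in UTF-8 encoding. charAt converts the UTF-8 character at position in the byte array to its 32-bit Unicode scalar value and returns it.
Returns: the Unicode scalar value, or -1 if position is out of range (see getLength())
See also: getBytes(), getLength()

public int find(String what)
Equivalent to find(what, 0).

public int find(String what, int start)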
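
To make the byte-position semantics described for charAt above more concrete, here is a minimal sketch that decodes one UTF-8 sequence from a byte array. It relies only on the standard JDK; `CodePointAtSketch` and `codePointAt` are hypothetical names, the input is assumed to be valid UTF-8, and returning -1 for a position that falls inside a multi-byte sequence is a choice made for this sketch, not documented behavior of Text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the implementation used by Text.
final class CodePointAtSketch {
    // Returns the Unicode scalar value of the UTF-8 sequence starting at byte
    // offset `position`, or -1 if `position` is outside [0, length).
    static int codePointAt(byte[] utf8, int length, int position) {
        if (position < 0 || position >= length) {
            return -1;
        }
        int lead = utf8[position] & 0xFF;
        int extra; // number of continuation bytes that follow the lead byte
        int cp;    // code point bits accumulated so far
        if (lead < 0x80) {
            extra = 0; cp = lead;            // 1-byte sequence (ASCII)
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1; cp = lead & 0x1F;     // 2-byte sequence
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2; cp = lead & 0x0F;     // 3-byte sequence
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3; cp = lead & 0x07;     // 4-byte sequence
        } else {
            return -1;                       // continuation byte, not a lead byte
        }
        if (position + extra >= length) {
            return -1;                       // sequence is cut off by the valid range
        }
        for (int i = 1; i <= extra; i++) {
            cp = (cp << 6) | (utf8[position + i] & 0x3F);
        }
        return cp;
    }

    public static void main(String[] args) {
        // 'a' (1 byte), U+4E2D (3 bytes), U+1F600 (4 bytes)
        byte[] b = "a\u4e2d\uD83D\uDE00".getBytes(StandardCharsets.UTF_8);
        System.out.println(Integer.toHexString(codePointAt(b, b.length, 0)));
        System.out.println(Integer.toHexString(codePointAt(b, b.length, 1)));
        System.out.println(Integer.toHexString(codePointAt(b, b.length, 4)));
        System.out.println(codePointAt(b, b.length, b.length));
    }
}
```

Note that the positions passed in (0, 1, 4) are byte offsets, so they do not match the `char` indices of the original Java String.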
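
public void set(String string)
Parameters: string - the String to set as the content

public void set(byte[] utf8)
Parameters: utf8 - the UTF-8 encoded byte array to set as the content

public void set(Text other)
Parameters: other - the Text whose content is copied

public void set(byte[] utf8, int start, int len)
Parameters:
utf8 - the byte array holding the content to copy
start - the start position of the content to copy
len - the length of the content to copy

public void append(byte[] utf8, int start, int len)
Parameters:
utf8 - the byte array holding the content to append
start - the start position of the content to append
len - the length of the content to append

public void clear()
Clears the string; afterwards getLength() returns 0.

public String toString()
Overrides: toString in class Object
See also: Object.toString()

public void readFields(DataInput in) throws IOException
Deserializes from the given DataInput in.
Specified by: readFields in interface Writable
Throws: IOException

public void readFields(ByteBuffer bf)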

public static void skip(DataInput in) throws IOException
Throws: IOException

public void write(DataOutput out) throws IOException
Serializes to the given DataOutput out.
Specified by: write in interface Writable
Throws: IOException

public boolean equals(Object o)
Returns true iff o is a Text with the same contents.
Overrides: equals in class BinaryComparable

public int hashCode()
Computed directly with WritableComparator.hashBytes(byte[], int).
Overrides: hashCode in class BinaryComparable

public static String decode(byte[] utf8) throws CharacterCodingException
Throws: CharacterCodingException

public static String decode(byte[] utf8, int start, int length) throws CharacterCodingException
Throws: CharacterCodingException

public static String decode(byte[] utf8, int start, int length, boolean replace) throws CharacterCodingException
If replace is true, then malformed input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a MalformedInputException.
Throws: CharacterCodingException

public static ByteBuffer encode(String string) throws CharacterCodingException
Throws: CharacterCodingException

public static ByteBuffer encode(String string, boolean replace) throws CharacterCodingException
If replace is true, then malformed input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a MalformedInputException.
Throws: CharacterCodingException

public static String readString(DataInput in) throws IOException
Throws: IOException

public static int writeString(DataOutput out, String s) throws IOException
Throws: IOException

public static void validateUTF8(byte[] utf8) throws MalformedInputException
Parameters: utf8 - byte array
Throws: MalformedInputException - if the byte array contains invalid UTF-8

public static void validateUTF8(byte[] utf8, int start, int len) throws MalformedInputException
Parameters:
utf8 - the array of bytes
start - the offset of the first byte in the array
len - the length of the byte sequence
Throws: MalformedInputException - if the byte array contains invalid bytes

public static int bytesToCodePoint(ByteBuffer bytes)
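
The replace flag of decode and the MalformedInputException of validateUTF8 described above can be approximated with the standard JDK `CharsetDecoder`. The sketch below is not the implementation used by Text; `DecodeReplaceSketch` and its `decode` method are hypothetical names that mirror the documented signature for illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper built on the JDK, not the Text class itself.
final class DecodeReplaceSketch {
    // Decodes utf8[start, start + length) as UTF-8.
    // replace == true:  malformed input becomes U+FFFD.
    // replace == false: malformed input raises MalformedInputException,
    //                   a subclass of CharacterCodingException.
    static String decode(byte[] utf8, int start, int length, boolean replace)
            throws CharacterCodingException {
        CodingErrorAction action =
                replace ? CodingErrorAction.REPLACE : CodingErrorAction.REPORT;
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action)
                .decode(ByteBuffer.wrap(utf8, start, length))
                .toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] bad = {'o', 'k', (byte) 0xFF}; // 0xFF never appears in valid UTF-8
        System.out.println(decode(bad, 0, bad.length, true));
        try {
            decode(bad, 0, bad.length, false);
        } catch (CharacterCodingException e) {
            System.out.println("rejected: " + e);
        }
    }
}
```

Calling the strict variant and catching the exception is also a simple way to check whether a byte range is valid UTF-8 without keeping the decoded result.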

public static int utf8Length(String string)
Parameters: string - text to encode

Copyright © 2023 Alibaba Cloud Computing. All rights reserved.