



public class String {
    private int m_stringLength;
    private char m_firstChar;
}


internal extern static String FastAllocateString(int length);

internal static unsafe void wstrcpy(char* dmem, char* smem, int charCount) {
    // memory copy...
}

private static unsafe void FillStringChecked(String dest, int destPos, String src) {
    if (src.Length > dest.Length - destPos) {
        throw new IndexOutOfRangeException();
    }

    fixed (char* pDest = &dest.m_firstChar)
    fixed (char* pSrc = &src.m_firstChar) {
        wstrcpy(pDest + destPos, pSrc, src.Length);
    }
}

public static String Concat(String str0, String str1) {
    // return String.Empty if both null or empty

    int str0Length = str0.Length;

    String result = FastAllocateString(str0Length + str1.Length);

    FillStringChecked(result, 0, str0);
    FillStringChecked(result, str0Length, str1);

    return result;
}



private unsafe string InternalSubString(int startIndex, int length, bool fAlwaysCopy) {
    if (startIndex == 0 && length == this.Length && !fAlwaysCopy) {
        return this;
    }

    String result = FastAllocateString(length);

    fixed (char* dest = &result.m_firstChar)
    fixed (char* src = &this.m_firstChar) {
        wstrcpy(dest, src + startIndex, length);
    }

    return result;
}




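Strictly speaking, "Java" is a standard that does not mandate any particular implementation; what we analyze here is OpenJDK, the most widely used one. In OpenJDK 7, for example, the String type is defined like this: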

public final class String {
    /** The value is used for character storage. */
    private final char value[];

    /** The offset is the first index of the storage that is used. */
    private final int offset;

    /** The count is the number of characters in the String. */
    private final int count;
}

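There is also a hash field, so the hash code of a given string only has to be computed once. Here we can see how OpenJDK 7 differs from .NET: the latter stores the character sequence directly inside the string object, while the former keeps a reference to a character array and records the starting offset and the character count. The biggest benefit of this design is that the substring method does not need to copy any memory and can simply reuse the internal character array: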

// Package private constructor which shares value array for speed.
String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}

public String substring(int beginIndex, int endIndex) {
    // throw IndexOutOfBoundsException if necessary

    return ((beginIndex == 0) && (endIndex == count)) ? this :
        new String(offset + beginIndex, endIndex - beginIndex, value);
}
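
To make the sharing concrete, here is a minimal sketch of the array-plus-offset layout. `SharedString` is a hypothetical class written purely for illustration, not the JDK's code; it mirrors the three fields above and shows that a substring built this way points at the very same array as the original.

```java
public class SharedStringDemo {
    // Hypothetical stand-in for the OpenJDK 7 layout: array + offset + count.
    static final class SharedString {
        final char[] value;
        final int offset;
        final int count;

        SharedString(char[] value, int offset, int count) {
            this.value = value;   // no copy: the array is shared
            this.offset = offset;
            this.count = count;
        }

        SharedString substring(int beginIndex, int endIndex) {
            if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
                throw new StringIndexOutOfBoundsException();
            }
            // Only the window moves; the characters themselves are not copied.
            return new SharedString(value, offset + beginIndex, endIndex - beginIndex);
        }

        char charAt(int index) {
            if (index < 0 || index >= count) {
                throw new StringIndexOutOfBoundsException(index);
            }
            return value[offset + index];
        }

        @Override
        public String toString() {
            return new String(value, offset, count);
        }
    }

    public static void main(String[] args) {
        SharedString whole = new SharedString("hello, world".toCharArray(), 0, 12);
        SharedString part = whole.substring(7, 12);
        System.out.println(part);                       // world
        System.out.println(part.value == whole.value);  // true: same backing array
    }
}
```

Taking a substring costs one small object allocation regardless of length, which is exactly the appeal of this design.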


public String(char value[], int offset, int count) {
    // throw StringIndexOutOfBoundsException if necessary

    this.offset = 0;
    this.count = count;
    this.value = Arrays.copyOfRange(value, offset, offset + count);
}
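
The copying behavior of this public constructor can be observed from the outside. The sketch below uses only the real `String(char[], int, int)` constructor; the class name `CopyingConstructorDemo` and the helper `snapshot` are made up for the example.

```java
public class CopyingConstructorDemo {
    // Hypothetical helper: wraps the public constructor that copies the range.
    static String snapshot(char[] buffer, int offset, int count) {
        return new String(buffer, offset, count);
    }

    public static void main(String[] args) {
        char[] buffer = "hello, world".toCharArray();

        String s = snapshot(buffer, 7, 5);

        // Mutating the source array afterwards does not affect the string,
        // because the constructor copied the requested range.
        buffer[7] = 'W';

        System.out.println(s);                   // world
        System.out.println(new String(buffer));  // hello, World
    }
}
```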


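The advantage of sharing the character array is obvious; the downside is that it became one of the most common causes of memory leaks in Java programs. As it happens, the first program I wrote after joining "十八摸" (as the company is nicknamed) ran into exactly this problem: it received a long string of data from the server, passed it through an internal parsing library to extract a small fragment (perhaps nothing more than an ID), and kept that fragment in memory. I later noticed that memory usage climbed quickly and, once it stabilized, was far higher than expected. Memory profiling showed that the small fragment was still holding on to the entire original content. Once the cause is known the fix is easy: build a fresh string with the following constructor: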

public String(String original) {
    int size = original.count;
    char[] originalValue = original.value;
    char[] v;
    if (originalValue.length > size) {
        // The array representing the String is bigger than the new
        // String itself.  Perhaps this constructor is being called
        // in order to trim the baggage, so make a copy of the array.
        int off = original.offset;
        v = Arrays.copyOfRange(originalValue, off, off+size);
    } else {
        // The array representing the String is the same
        // size as the String, so no point in making a copy.
        v = originalValue;
    }
    this.offset = 0;
    this.count = size;
    this.value = v;
}
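
The effect of that "trim the baggage" branch can be sketched as follows. `Slice`, `compact` and `retainedChars` are hypothetical names invented for this example, and the one-million-character array merely stands in for a large server response; the point is only how many characters each object keeps alive.

```java
import java.util.Arrays;

public class TrimBaggageDemo {
    // Hypothetical stand-in for an OpenJDK 7 style string (array + offset + count).
    static final class Slice {
        final char[] value;
        final int offset;
        final int count;

        Slice(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        // Same idea as the String(String) constructor above: copy only when
        // the backing array is larger than the visible characters.
        Slice compact() {
            if (value.length > count) {
                return new Slice(Arrays.copyOfRange(value, offset, offset + count), 0, count);
            }
            return this;
        }

        // Number of characters kept reachable by this object.
        int retainedChars() {
            return value.length;
        }
    }

    public static void main(String[] args) {
        char[] response = new char[1_000_000];   // pretend this is a large server response
        Arrays.fill(response, 'x');
        "id=42".getChars(0, 5, response, 100);   // the interesting part starts at index 100

        Slice id = new Slice(response, 103, 2);  // "42", but it shares the whole array
        Slice trimmed = id.compact();

        System.out.println(id.retainedChars());         // 1000000
        System.out.println(trimmed.retainedChars());    // 2
        System.out.println(new String(trimmed.value));  // 42
    }
}
```

As long as `id` is reachable, the whole response array stays in memory; after `compact()` only the two characters that matter are retained and the large array can be collected.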

Interestingly, in the upcoming OpenJDK 8 the String class behaves differently in this respect:

public final class String {
    /** The value is used for character storage. */
    private final char value[];
}

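OpenJDK 8 abandons a design that had been kept for nearly twenty years and gives every String object its own independent character array, just as .NET has always done. Related methods such as substring have changed accordingly: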

public String substring(int beginIndex, int endIndex) {
    // throw StringIndexOutOfBoundsException if necessary

    int subLen = endIndex - beginIndex;
    if (subLen < 0) {
        throw new StringIndexOutOfBoundsException(subLen);
    }
    return ((beginIndex == 0) && (endIndex == value.length)) ? this
            : new String(value, beginIndex, subLen);
}

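What gets called here is the public constructor listed earlier, the one that copies the contents of the character array. So "Java" really is just a standard, and it leaves room for different implementations. Seen from the outside, the String class in OpenJDK 8 behaves no differently than before.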




Author: jeffz@live.com (老赵)