JDK中String类的源码分析(二)

时间:2023-03-09 18:04:12
JDK中String类的源码分析(二)

1、startsWith(String prefix, int toffset)方法

包括startsWith(*),endsWith(*)方法,都是调用上述一个方法

     public boolean startsWith(String prefix, int toffset) {
char ta[] = value;
int to = toffset;
char pa[] = prefix.value;
int po = 0;
int pc = prefix.value.length;
// Note: toffset might be near -1>>>1.
if ((toffset < 0) || (toffset > value.length - pc)) {
return false;
}
while (--pc >= 0) {
if (ta[to++] != pa[po++]) {
return false;
}
}
return true;
}

上述算法的时间复杂度,最差的情况下为O(n)(取决于匹配子串的长度),最理想的情况下为O(1);

2、indexOf方法

有多个重载的方法,参数可以为字符,也可以为字符串

     static int indexOf(char[] source, int sourceOffset, int sourceCount,
char[] target, int targetOffset, int targetCount,
int fromIndex) {
if (fromIndex >= sourceCount) {
return (targetCount == 0 ? sourceCount : -1);
}
if (fromIndex < 0) {
fromIndex = 0;
}
if (targetCount == 0) {
return fromIndex;
} char first = target[targetOffset];
int max = sourceOffset + (sourceCount - targetCount); for (int i = sourceOffset + fromIndex; i <= max; i++) {
/* Look for first character. */
if (source[i] != first) {
while (++i <= max && source[i] != first);
} /* Found first character, now look at the rest of v2 */
if (i <= max) {
int j = i + 1;
int end = j + targetCount - 1;
for (int k = targetOffset + 1; j < end && source[j]
== target[k]; j++, k++); if (j == end) {
/* Found whole string. */
return i - sourceOffset;
}
}
}
return -1;
}

这个匹配子串的方法比较复杂,值得深入研究

3、substring方法

在JDK1.7之前的代码中,substring存在严重内存泄露问题,然而,这个问题在JDK1.7之后的版本中都有了改善;

因为JDK1.7中修改了构造方法,调用Arrays.copyOfRange()方法,只是复制出数组的一部分;

关于String类的构造方法,可以参看: JDK中String类的源码分析(一)

4、concat方法

连接两个字符串

     public String concat(String str) {
int otherLen = str.length();
if (otherLen == 0) {
return this;
}
int len = value.length;
char buf[] = Arrays.copyOf(value, len + otherLen);
str.getChars(buf, len);
return new String(buf, true);
}

5、split方法,切割字符串

     public String[] split(String regex, int limit) {

         char ch = 0;
if (((regex.value.length == 1 &&
".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
(regex.length() == 2 &&
regex.charAt(0) == '\\' &&
(((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
((ch-'a')|('z'-ch)) < 0 &&
((ch-'A')|('Z'-ch)) < 0)) &&
(ch < Character.MIN_HIGH_SURROGATE ||
ch > Character.MAX_LOW_SURROGATE))
{
int off = 0;
int next = 0;
boolean limited = limit > 0;
ArrayList<String> list = new ArrayList<>();
while ((next = indexOf(ch, off)) != -1) {
if (!limited || list.size() < limit - 1) {
list.add(substring(off, next));
off = next + 1;
} else { // last one
//assert (list.size() == limit - 1);
list.add(substring(off, value.length));
off = value.length;
break;
}
}
// If no match was found, return this
if (off == 0)
return new String[]{this}; // Add remaining segment
if (!limited || list.size() < limit)
list.add(substring(off, value.length)); // Construct result
int resultSize = list.size();
if (limit == 0)
while (resultSize > 0 && list.get(resultSize - 1).length() == 0)
resultSize--;
String[] result = new String[resultSize];
return list.subList(0, resultSize).toArray(result);
}
return Pattern.compile(regex).split(this, limit);
}

6、trim方法,去除字符串两端的空格

     public String trim() {
int len = value.length;
int st = 0;
char[] val = value; /* avoid getfield opcode */ while ((st < len) && (val[st] <= ' ')) {
st++;
}
while ((st < len) && (val[len - 1] <= ' ')) {
len--;
}
return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
}

算法时间复杂度在O(log(n))左右,截取,创建一个新的字符串;

总结:

在String类中的大多数方法,都存在new对象的操作,因为String的不可变性,如果大量的调用这些方法,在内存中会产生大量的String对象;

这样对GC的压力很非常大,很可能会出现内存溢出;