What is a good way to split strings here?

I have the following string:
A:B:1111;domain:80;a;b
The A is optional, so B:1111;domain:80;a;b is also valid input.
The :80 is optional as well, so B:1111;domain;a;b or :1111;domain;a;b are also valid input.
What I want is to end up with a String[] that has:

s[0] = "A";
s[1] = "B";
s[2] = "1111";
s[3] = "domain:80";
s[4] = "a";
s[5] = "b";

I did this as follows:

List<String> tokens = new ArrayList<String>();  
String[] values = s.split(";");  
String[] actions = values[0].split(":");   

for(String a:actions){  
    tokens.add(a);  
}  
//Start from 1 to skip A:B:1111
for(int i = 1; i < values.length; i++){  
    tokens.add(values[i]);  
}  
// toArray() without an argument returns Object[], so pass a typed array
String[] finalResult = tokens.toArray(new String[0]);

Is there a better way to do this? How could I do it more efficiently?


Efficiency is not much of a concern here: everything I see runs in linear time.

Anyway, you could either use a regular expression or a manual tokenizer.

You can avoid the list. You know the lengths of values and actions, so you can allocate the result array directly:

String[] values = s.split(";");  
String[] actions = values[0].split(":");
String[] result = new String[actions.length + values.length - 1];
System.arraycopy(actions, 0, result, 0, actions.length);
System.arraycopy(values, 1, result, actions.length, values.length - 1);
return result;

It should be reasonably efficient, unless you insist on implementing split yourself.
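
To make the behavior on the optional parts concrete, here is a minimal runnable sketch of the same split-and-arraycopy idea. The class name SplitDemo and the method name splitTokens are made up for illustration; the three inputs are the ones from the question.

```java
import java.util.Arrays;

public class SplitDemo {
    // Hypothetical helper: split on ';', then split only the first field on ':'.
    static String[] splitTokens(String s) {
        String[] values = s.split(";");
        String[] actions = values[0].split(":");
        String[] result = new String[actions.length + values.length - 1];
        // First the pieces of the first field, then the remaining ';' fields.
        System.arraycopy(actions, 0, result, 0, actions.length);
        System.arraycopy(values, 1, result, actions.length, values.length - 1);
        return result;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "A:B:1111;domain:80;a;b",
            "B:1111;domain:80;a;b",
            ":1111;domain;a;b"
        };
        for (String input : inputs) {
            System.out.println(Arrays.toString(splitTokens(input)));
        }
    }
}
```

Two things to keep in mind: for :1111;domain;a;b the first element is an empty string (the missing B), and String.split drops trailing empty strings, so an input ending in ; yields fewer elements than you might expect.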

Untested low-level approach (make sure to unit test and benchmark before use):

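// Separator characters, as char, not String.
final static char s1 = ':';
final static char s2 = ';';
// End of the first ';'-delimited component; ':' only separates before this point.
int headEnd = s.indexOf(s2);
if (headEnd < 0) headEnd = s.length();
// Compute required size: one component, plus one per separator.
int components = 1;
for (int p = s.indexOf(s1); p > -1 && p < headEnd; p = s.indexOf(s1, p + 1)) {
    components++;
}
for (int p = s.indexOf(s2); p > -1; p = s.indexOf(s2, p + 1)) {
    components++;
}
String[] result = new String[components];
// Build result
int in = 0, i = 0;
int out = s.indexOf(s1);
if (out < 0 || out > headEnd) out = s.indexOf(s2);
while (out > -1) {
    result[i++] = s.substring(in, out);
    in = out + 1;
    // Look for ':' only while still inside the first component.
    out = (in <= headEnd) ? s.indexOf(s1, in) : -1;
    if (out < 0 || out > headEnd) out = s.indexOf(s2, in);
}
// The last component has no trailing separator, so it is added after the loop.
assert i == result.length - 1;
result[i] = s.substring(in);
return result;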

Note: this code is optimized in a slightly crazy way: it treats a : as a separator only in the first component. Handling the last component is a bit tricky, because out has the value -1 at that point.

I would usually not use this last approach unless performance and memory are extremely critical. Most likely there are still some bugs in it, and the code is fairly unreadable, in particular compared to the one above.


With some assumptions about acceptable characters, this regex provides validation as well as splitting into the groups you desire.

Pattern p = Pattern.compile("^((.+):)?(.+):(\\d+);(.+):(\\d+);(.+);(.+)$");
Matcher m = p.matcher("A:B:1111;domain:80;a;b");
if(m.matches())
{
    for(int i = 0; i <= m.groupCount(); i++)
        System.out.println(m.group(i));
}
m = p.matcher("B:1111;domain:80;a;b");
if(m.matches())
{
    for(int i = 0; i <= m.groupCount(); i++)
        System.out.println(m.group(i));
}

Gives:

A:B:1111;domain:80;a;b // ignore this
A: // ignore this
A // This is the optional A, check for null
B
1111
domain
80
a
b

And

B:1111;domain:80;a;b // ignore this
null // ignore this
null // This is the optional A, check for null
B
1111
domain
80
a
b
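
The pattern above requires the :80 part, which the question describes as optional, and it returns domain and 80 as separate groups. As a rough sketch only, a variant could make the port optional and keep domain:80 together. The class name RegexSplitDemo, the method name parse, and the assumed grammar ([A:]B:digits;host[:digits];a;b, with no : or ; inside the individual fields) are my assumptions, not something the question spells out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexSplitDemo {
    // Assumed grammar: [A:]B:digits;host[:digits];a;b
    private static final Pattern P = Pattern.compile(
        "^(?:([^:;]*):)?([^:;]*):(\\d+);([^:;]+(?::\\d+)?);([^;]+);([^;]+)$");

    // Hypothetical helper: validates the input and returns the captured fields.
    static String[] parse(String s) {
        Matcher m = P.matcher(s);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected format: " + s);
        }
        List<String> tokens = new ArrayList<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            // Group 1 is null when the optional "A:" prefix is absent.
            if (m.group(i) != null) {
                tokens.add(m.group(i));
            }
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(String.join(" | ", parse("A:B:1111;domain:80;a;b")));
        System.out.println(String.join(" | ", parse("B:1111;domain;a;b")));
    }
}
```

Unlike the plain split, this rejects input that does not fit the assumed shape, which may or may not be what you want.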

You could do something like this:

String str = "A:B:1111;domain:80;a;b";
String[] temp;

/* delimiter */
String delimiter = ";";
/* given string will be split by the argument delimiter provided. */
temp = str.split(delimiter);
/* print substrings */
for (int i = 0; i < temp.length; i++)
    System.out.println(temp[i]);