Java - Split String by Number and Letters

喜你入骨 submitted on 2019-11-27 14:52:34

Question


I have, for example, a string such as C3H20IO.

What I want to do is split this string so I get the following:

Array1 = {C,H,I,O}
Array2 = {3,20,1,1}

The 1 as the third element of Array2 indicates that the formula contains a single I atom. The same goes for O. That is the part I am struggling with.

This is a chemical formula, so I need to separate the element symbols from the number of atoms of each element.


Answer 1:


You could try this approach:

String formula = "C3H20IO";

// insert "1" at each atom-atom boundary and after a trailing atom
formula = formula.replaceAll("(?<=[A-Z])(?=[A-Z])|(?<=[a-z])(?=[A-Z])|(?<=\\D)$", "1");

// split at each letter-digit or digit-letter boundary
String regex = "(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)";
String[] atoms = formula.split(regex);

Output:

atoms: [C, 3, H, 20, I, 1, O, 1]

Now the even indices (0, 2, 4, ...) hold the atoms and the odd indices hold the associated counts:

String[] a = new String[ atoms.length/2 ];
int[] n = new int[ atoms.length/2 ];

for(int i = 0 ; i < a.length ; i++) {
    a[i] = atoms[i*2];
    n[i] = Integer.parseInt(atoms[i*2+1]);
}

Output:

a: [C, H, I, O]
n: [3, 20, 1, 1]
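The two regex steps above can be wrapped in one method, which makes it easy to try other inputs. This is only a minimal sketch: the class name FormulaSplit and the method tokenize are hypothetical names chosen for illustration, and it assumes well-formed input in which every element symbol starts with an uppercase letter.

```java
import java.util.Arrays;

public class FormulaSplit {

    // Hypothetical helper: returns alternating symbol/count tokens.
    // Assumes each element symbol starts with an uppercase letter.
    static String[] tokenize(String formula) {
        // Step 1: insert "1" wherever an atom has no explicit count.
        String padded = formula.replaceAll(
                "(?<=[A-Z])(?=[A-Z])|(?<=[a-z])(?=[A-Z])|(?<=\\D)$", "1");
        // Step 2: split at each letter-digit or digit-letter boundary.
        return padded.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("C3H20IO"))); // [C, 3, H, 20, I, 1, O, 1]
        System.out.println(Arrays.toString(tokenize("ZnO2")));    // [Zn, 1, O, 2]
    }
}
```

The second call suggests that two-letter symbols such as Zn survive the split, because the boundary pattern only fires before an uppercase letter.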




Answer 2:


You can use a regular expression to slide over your input using the Matcher.find() method.

Here is a rough example of what it may look like:

    String input = "C3H20IO";

    List<String> array1 = new ArrayList<String>();
    List<Integer> array2 = new ArrayList<Integer>();

    Pattern pattern = Pattern.compile("([A-Z][a-z]*)([0-9]*)");
    Matcher matcher = pattern.matcher(input);               
    while(matcher.find()){
        array1.add(matcher.group(1));

        String atomAmount = matcher.group(2);
        int atomAmountInt = 1;
        if((atomAmount != null) && (!atomAmount.isEmpty())){
            atomAmountInt = Integer.valueOf(atomAmount);
        }
        array2.add(atomAmountInt);
    }

I know, the conversion from List to Array is missing, but it should give you an idea of how to approach your problem.
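For the missing List-to-array step, one possible sketch is shown below. The class name FormulaArrays and the methods symbols and counts are hypothetical. It reuses the pattern from the answer and relies only on the standard List.toArray and stream APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FormulaArrays {

    // Same pattern as in the answer: a symbol followed by optional digits.
    static final Pattern ATOM = Pattern.compile("([A-Z][a-z]*)([0-9]*)");

    // Hypothetical helper: element symbols as a String[].
    static String[] symbols(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = ATOM.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out.toArray(new String[0]);
    }

    // Hypothetical helper: atom counts as an int[], defaulting to 1.
    static int[] counts(String input) {
        List<Integer> out = new ArrayList<>();
        Matcher m = ATOM.matcher(input);
        while (m.find()) {
            String digits = m.group(2);
            out.add(digits.isEmpty() ? 1 : Integer.parseInt(digits));
        }
        // Unbox the List<Integer> into a primitive int[].
        return out.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(symbols("C3H20IO"))); // [C, H, I, O]
        System.out.println(Arrays.toString(counts("C3H20IO")));  // [3, 20, 1, 1]
    }
}
```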




Answer 3:


An approach without regex, with the data stored in ArrayLists:

String s = "C3H20IO";

// Note: every letter is treated as its own element, so this handles
// one-letter symbols only (e.g. "Zn" would be read as Z and n).
char chem = '-';
String val = "";
boolean isFirst = true;
List<Character> chemList = new ArrayList<Character>();
List<Integer> weightList = new ArrayList<Integer>();
for (char c : s.toCharArray()) {
    if (Character.isLetter(c)) {
        if (!isFirst) {
            // a new element starts: store the previous one and its count
            chemList.add(chem);
            weightList.add(Integer.valueOf(val.equals("") ? "1" : val));
            val = "";
        }
        chem = c;
    } else if (Character.isDigit(c)) {
        val += c;
    }
    isFirst = false;
}
// store the last element
chemList.add(chem);
weightList.add(Integer.valueOf(val.equals("") ? "1" : val));

System.out.println(chemList);
System.out.println(weightList);

OUTPUT:

[C, H, I, O]
[3, 20, 1, 1]



Answer 4:


This works assuming each element starts with a capital letter, i.e. "Fe" is not written as "FE" in the string. Basically, you split the string before each capital letter, then split each resulting piece into letters and numbers, adding "1" if the piece contains no number.

        String s = "C3H20IO";
        List<String> letters = new ArrayList<>();
        List<String> numbers = new ArrayList<>();

        String[] arr = s.split("(?=\\p{Upper})");  // [C3, H20, I, O]
        for (String str : arr) {  //[C, 3]:[H, 20]:[I]:[O]
            String[] temp = str.split("(?=\\d)", 2);
            letters.add(temp[0]);
            if (temp.length == 1) {
                numbers.add("1");
            } else {
                numbers.add(temp[1]);
            }
        }
        System.out.println(letters); // [C, H, I, O]
        System.out.println(numbers); // [3, 20, 1, 1]



Answer 5:


Loop over the characters of the input and apply the following conditions:

for (char c : input.toCharArray()) {
    if (Character.isDigit(c)) {
        // add it to the number array
    } else if (Character.isLetter(c)) {
        // add it to the character array
    }
}
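A plain character loop still has to group multi-digit numbers and supply the implicit 1. The sketch below shows one way that could look. The class name FormulaLoop and the method parse are hypothetical, and it assumes each element symbol is one uppercase letter optionally followed by lowercase letters.

```java
import java.util.ArrayList;
import java.util.List;

public class FormulaLoop {

    // Hypothetical helper: fills the two lists from a formula string.
    // Assumes a symbol is one uppercase letter plus optional lowercase letters.
    static void parse(String formula, List<String> symbols, List<Integer> counts) {
        int i = 0;
        while (i < formula.length()) {
            if (!Character.isUpperCase(formula.charAt(i))) {
                i++; // skip anything that cannot start a symbol
                continue;
            }
            int start = i++;
            // consume the lowercase tail of the symbol, e.g. the "n" in "Zn"
            while (i < formula.length() && Character.isLowerCase(formula.charAt(i))) {
                i++;
            }
            symbols.add(formula.substring(start, i));

            int digitsStart = i;
            // consume all digits of the count, e.g. "20"
            while (i < formula.length() && Character.isDigit(formula.charAt(i))) {
                i++;
            }
            // no digits means a single atom
            counts.add(i > digitsStart
                    ? Integer.parseInt(formula.substring(digitsStart, i))
                    : 1);
        }
    }

    public static void main(String[] args) {
        List<String> symbols = new ArrayList<>();
        List<Integer> counts = new ArrayList<>();
        parse("C3H20IO", symbols, counts);
        System.out.println(symbols); // [C, H, I, O]
        System.out.println(counts);  // [3, 20, 1, 1]
    }
}
```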



Answer 6:


I suggest splitting before each uppercase letter using a zero-width lookahead regex (to extract items like C12, O2, Si), then splitting each item into the element and its numeric weight:

List<String> elements = new ArrayList<>();
List<Integer> weights = new ArrayList<>();

String[] items = "C6H12Si6OH".split("(?=[A-Z])");  // [C6, H12, Si6, O, H]
for (String item : items) {
    String[] pair = item.split("(?=[0-9])", 2);    // e.g. H12 => [H, 12], O => [O]
    elements.add(pair[0]);
    weights.add(pair.length > 1 ? Integer.parseInt(pair[1]) : 1);
}
System.out.println(elements);  // [C, H, Si, O, H]
System.out.println(weights);   // [6, 12, 6, 1, 1]
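In the example above H appears twice. If a per-element total is wanted instead of two parallel lists, the same two splits could feed a map. This is a sketch under that assumption, which goes beyond what the question asks for. The class name FormulaTotals and the method totals are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FormulaTotals {

    // Hypothetical helper: sums the counts of repeated elements.
    // LinkedHashMap keeps the elements in order of first appearance.
    static Map<String, Integer> totals(String formula) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String item : formula.split("(?=[A-Z])")) {
            String[] pair = item.split("(?=[0-9])", 2); // e.g. H12 => [H, 12], O => [O]
            int count = pair.length > 1 ? Integer.parseInt(pair[1]) : 1;
            result.merge(pair[0], count, Integer::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(totals("C6H12Si6OH")); // {C=6, H=13, Si=6, O=1}
    }
}
```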



Answer 7:


Is this good? (Not using split)

String line = "C3H20ZnO2ABCD";
String pattern = "([A-Z][a-z]*)(((?=[A-Z][a-z]*|$))|\\d+)";

Pattern r = Pattern.compile(pattern);

Matcher m = r.matcher(line);

while (m.find()) {
    System.out.print(m.group(1));
    if (m.group(2).length() == 0) {
        System.out.println(" 1");
    } else {
        System.out.println(" " + m.group(2));
    }
}




Answer 8:


You can split the string by using a regular expression like (?<=\D)(?=\d). Try this:

String alphanum= "abcd1234";
String[] part = alphanum.split("(?<=\\D)(?=\\d)");
System.out.println(part[0]);
System.out.println(part[1]);

This prints:

abcd
1234




Answer 9:


I did it as follows:

ArrayList<Integer> integerCharacters = new ArrayList<>();
ArrayList<String> stringCharacters = new ArrayList<>();

String value = "C3H20IO"; //Your value 
String[] strSplitted = value.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)"); //Split numeric and strings

for(int i=0; i<strSplitted.length; i++){

    if (Character.isLetter(strSplitted[i].charAt(0))){
        stringCharacters.add(strSplitted[i]); //If string then add to strings array
    }
    else{
        integerCharacters.add(Integer.parseInt(strSplitted[i])); //else add to integer array
    }
}



Answer 10:


You can use two patterns:

  • [0-9]
  • [a-zA-Z]

Split twice by each of them.

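String test = "C3H20IO";

// Runs of letters: [C, H, IO] (adjacent elements without a count stay joined)
List<String> letters = Arrays.stream(test.split("[0-9]+"))
            .filter(s -> !s.isEmpty())
            .collect(Collectors.toList());
// Runs of digits: [3, 20]
List<String> numbers = Arrays.stream(test.split("[a-zA-Z]+"))
            .filter(s -> !s.isEmpty())
            .collect(Collectors.toCollection(ArrayList::new));

// Only covers a single missing count at the end of the input
if (letters.size() != numbers.size()) {
    numbers.add("1");
}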


Source: https://stackoverflow.com/questions/36423633/java-split-string-by-number-and-letters
