Parsing a csv file to populate database

Submitted by 你离开我真会死。 on 2019-12-25 08:16:51

Question


Given that I have a csv file such as this:

str_name,int_points,int_bonus
joe,2,5
Moe,10,15
Carlos,25,60

The csv file can have x columns and y rows, so I am trying to develop a generic method to parse it and populate the data into a DynamoDB table.

In order to populate the DynamoDB table I would do something like this:

    String line = "";
    String cvsSplitBy = ",";

    try (BufferedReader br = new BufferedReader(
                                new InputStreamReader(objectData, "UTF-8"))) {

        br.readLine(); // skip the header row (str_name,int_points,int_bonus)

        while ((line = br.readLine()) != null) {

            // use comma as separator
            String[] elements = line.split(cvsSplitBy);

            try {
                table.putItem(new Item()
                    .withPrimaryKey("name", elements[0])
                    .withInt("points", Integer.parseInt(elements[1]))
                    .withInt("bonus", Integer.parseInt(elements[2]))
                    // .....
                    );

                System.out.println("PutItem succeeded: " + elements[0]);

            } catch (Exception e) {
                System.err.println("Unable to add user: " + elements[0]);
                System.err.println(e.getMessage());
                break;
            }

        }

    } catch (IOException e) {
        e.printStackTrace();
    }

However, I would not always know whether I am inserting an int or a string; it depends on the csv file. So I am a bit lost on how to create a generic function that reads the first line of my csv file and takes advantage of the prefix, which indicates whether a particular column is an int or a string.


Answer 1:


Just store the labels (first row) and then, while iterating over the row values, decide based on the label which method to call. If you are not against bringing in external dependencies, I advise you to use an external csv reader, e.g. SuperCsv. Using this library you can, for example, read each row as a Map (label -> value), then iterate over the entries and, based on each label's prefix, update your db with the correct method. Or just read the header and then do the same, reading each row as a list.
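As a rough illustration of the "row as a Map" idea without any external library, here is a minimal JDK-only sketch. It is not SuperCsv and it does not handle quoted fields or embedded commas; the class name `CsvRowMaps` and the method `read` are made up for this example.

```java
import java.io.BufferedReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
final class CsvRowMaps {

    /** Reads simple comma-separated text; the first line supplies the labels. */
    static List<Map<String, String>> read(BufferedReader reader) {
        List<String> lines = reader.lines().collect(Collectors.toList());
        List<Map<String, String>> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows; // no header, so no rows either
        }
        String[] labels = lines.get(0).split(",");
        for (String line : lines.subList(1, lines.size())) {
            if (line.isEmpty()) {
                continue; // ignore blank lines
            }
            String[] values = line.split(",", -1); // -1 keeps trailing empty cells
            Map<String, String> row = new LinkedHashMap<>(); // keeps column order
            for (int i = 0; i < labels.length && i < values.length; i++) {
                row.put(labels[i], values[i]);
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Each returned map can then be iterated entry by entry, with the label's prefix deciding which database method to call.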

Example:

This is of course very crude, and I would probably refactor it somehow (e.g. have a list of processors for each column instead of the ugly switch), but it shows you the idea:

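        List<String> labels = new ArrayList<>();   // store the first row (header) here
        List<String> elements = new ArrayList<>(); // the currently processed line goes here
        Item item = new Item();
        for (int i = 0; i < elements.size(); i++) {
            String label = labels.get(i);
            // getTypePrefix and getName are helpers you write yourself,
            // e.g. "int_points" -> prefix "int", name "points"
            switch (getTypePrefix(label)) {
                case "int":
                    // csv values are strings, so parse before calling withInt
                    item = item.withInt(getName(label), Integer.parseInt(elements.get(i)));
                    break;
                case "str":
                    item = item.withString(getName(label), elements.get(i));
                    break;
                default:
                    // unknown prefix: handle it somehow
                    break;
            }
        }
        table.putItem(item);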
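The helpers `getTypePrefix` and `getName` are not defined in the snippet, so here is one possible sketch of them, together with the "processor per type" refactoring mentioned earlier in this answer. It assumes labels follow the `prefix_name` pattern from the question, split at the first underscore. The class name `TypedLabels`, the `CONVERTERS` map and `convertRow` are hypothetical, and the result is a plain map rather than a DynamoDB `Item`, so the sketch runs without the AWS SDK.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper class for illustration only.
final class TypedLabels {

    // One converter per type prefix; register further prefixes here.
    private static final Map<String, Function<String, Object>> CONVERTERS = new HashMap<>();
    static {
        CONVERTERS.put("int", s -> Integer.valueOf(s.trim()));
        CONVERTERS.put("str", s -> s);
    }

    /** "int_points" -> "int"; empty string if the label has no underscore. */
    static String getTypePrefix(String label) {
        int idx = label.indexOf('_');
        return idx < 0 ? "" : label.substring(0, idx);
    }

    /** "int_points" -> "points"; labels without an underscore are returned unchanged. */
    static String getName(String label) {
        int idx = label.indexOf('_');
        return idx < 0 ? label : label.substring(idx + 1);
    }

    /** Converts one csv row into attribute name -> typed value. */
    static Map<String, Object> convertRow(String[] labels, String[] elements) {
        Map<String, Object> typed = new LinkedHashMap<>();
        for (int i = 0; i < labels.length && i < elements.length; i++) {
            Function<String, Object> converter = CONVERTERS.get(getTypePrefix(labels[i]));
            if (converter == null) {
                throw new IllegalArgumentException("Unknown type prefix in label: " + labels[i]);
            }
            typed.put(getName(labels[i]), converter.apply(elements[i]));
        }
        return typed;
    }
}
```

With this shape, supporting a new column type means adding one entry to `CONVERTERS` instead of another `case` branch; each typed value can then be handed to the matching `withInt` or `withString` call.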



Answer 2:


OK, I can't post this as a comment, so I wrote a simple example. Note that I'm not familiar with the Amazon API you're using, but you should get the idea of how I'd go about it (I've basically rewritten your code):

        String line = "";
        String cvsSplitBy = ",";

        try (BufferedReader br = new BufferedReader(
                            new InputStreamReader(objectData, "UTF-8"))) {

            // first line just to get the column names
            String[] colNames = br.readLine().split(cvsSplitBy);
            while ((line = br.readLine()) != null) {
                // use comma as separator
                String[] elements = line.split(cvsSplitBy);
                Item item = new Item(); // one item per csv row
                for (int i = 0; i < elements.length; i++) {
                    String currColumnName = colNames[i];
                    try {
                        // values that parse as an int are stored as ints
                        item = item.withInt(currColumnName, Integer.parseInt(elements[i]));
                    } catch (NumberFormatException e) {
                        // everything else is stored as a string
                        item = item.withString(currColumnName, elements[i]);
                    }
                }

                try {
                    table.putItem(item);
                    System.out.println("PutItem succeeded: " + elements[0]);
                } catch (Exception e) {
                    System.err.println("Unable to add user: " + line);
                    System.err.println(e.getMessage());
                    break;
                }
            }

        } catch (IOException e) {
            e.printStackTrace();
        }

This example assumes that your first row contains the column names as stored in the DB. You don't have to write anywhere whether they are an int or a String because there is a check in the program (granted, this is not the most efficient way to do this and you may write something better, perhaps what Molok has suggested).
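One way to make the check cheaper is to test the text with a pattern instead of relying on a thrown exception. The sketch below is only an illustration under a stated assumption: anything with an optional minus sign and at most nine digits counts as an int (so it always fits), and everything else, including longer digit runs, stays a string. `CellTypes`, `looksLikeInt` and `toTypedValue` are invented names.

```java
import java.util.regex.Pattern;

// Hypothetical helper for value-based type detection.
final class CellTypes {

    // Optional minus sign followed by 1 to 9 ASCII digits: always within int range.
    private static final Pattern SMALL_INT = Pattern.compile("-?\\d{1,9}");

    /** True if the cell text can safely be parsed as an int. */
    static boolean looksLikeInt(String value) {
        return SMALL_INT.matcher(value).matches();
    }

    /** Returns an Integer for int-like cells and the original String otherwise. */
    static Object toTypedValue(String value) {
        if (looksLikeInt(value)) {
            return Integer.valueOf(value);
        }
        return value;
    }
}
```

Keep in mind that any value-based check shares one limitation: a text column whose cell happens to contain only digits (say, a name like "123") would be stored as an int, which the header prefixes from the question avoid.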



Source: https://stackoverflow.com/questions/39103094/parsing-a-csv-file-to-populate-database
