Question
I am trying to measure the performance of mass deletion of vertices in OrientDB. I need to prototype deleting anywhere from 10,000 up to a million vertices.
First, I set the lightweight edges property to false when creating my vertices and edges, following "Issue with creating edge in OrientDb with Blueprints / Tinkerpop".
This is the code I use for the deletion:
    private static OrientGraph graph = new OrientGraph(
            "remote:localhost/WorkDBMassDelete2", "admin", "admin");

    private static void removeCompleatedWork() {
        try {
            long startTime = System.currentTimeMillis();
            List params = new ArrayList();
            String deleteQuery = "delete vertex Work where out('status') contains (work-state = 'Not Started')";
            int no = graph.getRawGraph().command(new OCommandSQL(deleteQuery))
                    .execute(params);
            // graph.commit();
            long endTime = System.currentTimeMillis();
            System.out.println("No of activities removed : " + no
                    + " and time taken is : " + (endTime - startTime));
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            graph.shutdown();
        }
    }
The results are good when I delete in the hundreds: 500 activities take ~500 ms. But when I try to delete 2500 or 5000 activities, the numbers are high: 2500 deletions take ~6000 ms.
I also tried creating an index. Which is the better practice: creating an index on the attribute work-state, or creating an index on the edge status? I tried both while creating the vertex and the edge, but neither improved performance much.
((OrientBaseGraph) graph).createKeyIndex("Status", Edge.class);
//or on the vertex
((OrientBaseGraph) graph).createKeyIndex("work-state", Vertex.class);
What is the best practice for deleting data in bulk with a query like the one above? Any help is greatly appreciated.
UPDATE:
I downloaded orientdb-community-1.7-20140416.230539-144-distribution.tar.gz from https://oss.sonatype.org/content/repositories/snapshots/com/orientechnologies/orientdb-community/1.7-SNAPSHOT/.
When I try deleting with the subquery, from Studio or from my program, I get the following error: com.orientechnologies.orient.core.sql.OCommandSQLParsingException: Error on parsing command at position #0: Class 'FROM was not found. I had modified my query like this:
delete vertex from (select in('status') from State where work-state = 'Complete')
To run it from my program, I also updated my Maven dependencies to the 1.7-SNAPSHOT libraries. My old query still produced the same numbers, and the subquery deletion gave errors even in Studio. Please let me know if I am missing anything. Thanks!
Answer 1:
First, please try the exact same code with 1.7-SNAPSHOT. It should be faster.
Also, in 1.7-SNAPSHOT we just added the ability to delete vertices from a sub-query. Why browse all the Work vertices when you could delete just the incoming vertices of the "Not Started" status vertex?
So if you have 1.7-SNAPSHOT, change this query from:
delete vertex Work where out('status') contains (work-state = 'Not Started')
to (assuming the status vertex is called "State"):
delete vertex from (select in('status') from State where work-state = 'Not Started')
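To illustrate why the sub-query form can touch less data, here is a minimal sketch in plain Java. It is a toy in-memory model, not the OrientDB API: the class MassDeleteSketch, its methods, and the map layout are hypothetical names made up for this illustration, and it assumes each Work has exactly one 'status' edge.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model (hypothetical, NOT the OrientDB API): each Work id
// maps to the work-state of the State vertex its 'status' edge points to.
class MassDeleteSketch {

    // Builds `total` Work ids; the first `notStarted` point to "Not Started",
    // the rest to "Complete".
    static Map<String, String> buildWorks(int total, int notStarted) {
        Map<String, String> works = new HashMap<>();
        for (int i = 0; i < total; i++) {
            works.put("work-" + i, i < notStarted ? "Not Started" : "Complete");
        }
        return works;
    }

    // Reverse view: state -> ids of the Work vertices pointing to it,
    // standing in for in('status') on a State vertex.
    static Map<String, List<String>> invert(Map<String, String> workToState) {
        Map<String, List<String>> stateToWorks = new HashMap<>();
        for (Map.Entry<String, String> e : workToState.entrySet()) {
            stateToWorks.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                        .add(e.getKey());
        }
        return stateToWorks;
    }

    // Mirrors "delete vertex Work where out('status') contains (...)":
    // every Work is visited and its outgoing edge is checked.
    static int deleteByScan(Map<String, String> workToState, String state) {
        List<String> doomed = new ArrayList<>();
        for (Map.Entry<String, String> e : workToState.entrySet()) {
            if (e.getValue().equals(state)) {
                doomed.add(e.getKey());
            }
        }
        for (String id : doomed) {
            workToState.remove(id);
        }
        return doomed.size();
    }

    // Mirrors "delete vertex from (select in('status') from State where ...)":
    // start at the single State and follow only its incoming edges.
    static int deleteFromState(Map<String, String> workToState,
                               Map<String, List<String>> stateToWorks,
                               String state) {
        List<String> incoming = stateToWorks.remove(state);
        if (incoming == null) {
            return 0;
        }
        for (String id : incoming) {
            workToState.remove(id);
        }
        return incoming.size();
    }

    public static void main(String[] args) {
        Map<String, String> scanned = buildWorks(10, 4);
        Map<String, String> traversed = buildWorks(10, 4);
        int byScan = deleteByScan(scanned, "Not Started");
        int byState = deleteFromState(traversed, invert(traversed), "Not Started");
        System.out.println("Removed by scan: " + byScan
                + ", removed from State: " + byState);
    }
}
```

Both methods remove the same Work entries. The difference is that the first loops over every Work, while the second only visits the ones attached to the matching State. Whether the real engine behaves this way for your data is something to measure, not something this sketch proves.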
Source: https://stackoverflow.com/questions/23156580/mass-group-delete-orientdb-java