Question
I am working on an HR analytics project using Neo4j and have run into a complicated query (I am also very new to Cypher). Basically, I have a list of employees with features such as location, skill set, and education, and a list of positions with a location and required skills. I check whether the employee's location matches the position's location and set that score to 1 (0 otherwise). With only one open position, the query works fine. With more positions, however, the score gets overwritten with the value from the last position. I thought of storing the node properties in an array, but it doesn't seem that Cypher supports that (deprecated).
https://neo4j.com/docs/rest-docs/current/#rest-api-property-values
This is the query I use, and I would be grateful for help resolving the issue. The final goal is to have a score for every employee for each given position and to fill the position with the highest-scoring employee. Is this even possible in Cypher? Thanks a lot.
MATCH
(e:Employee)-[r:FUTURE_POSITION]-> (p:Position {open_status:1}),
(e)-[h:HAS_DEGREE]-> (d:Degree),
(e)-[s:HAS_SKILL]-> (n:Personal_Skill)
WITH r, e,p,d,n,
CASE WHEN p.position_state = e.home_state THEN 1 ELSE 0 END AS SameStateScore,
CASE WHEN p.position_city = e.home_city THEN 1 ELSE 0 END AS SameCityScore,
CASE WHEN d.name = "College Degree" THEN 1 ELSE 0 END AS HasCollegeDegree,
CASE WHEN n.name = "Management" THEN 1 ELSE 0 END AS HasRequiredSkill
SET e.score = SameStateScore + SameCityScore + HasCollegeDegree + HasRequiredSkill
RETURN DISTINCT e.name,p.name, SameStateScore,SameCityScore,HasCollegeDegree,MAX(HasRequiredSkill) AS HasRequiredSkill, e.score
ORDER BY e.score DESC
Answer 1:
The link you used in your question (https://neo4j.com/docs/rest-docs/current/#rest-api-property-values) is for the deprecated legacy REST API. If you want to make HTTP requests to execute Cypher, you can use the new HTTP API instead.
In order to have a separate employee score per position, you should not be storing the score in the Employee node, but in the FUTURE_POSITION relationship, since there is a separate relationship per position. So, just use r.score instead of e.score:
MATCH
(e:Employee)-[r:FUTURE_POSITION]->(p:Position {open_status:1}),
(e)-[:HAS_DEGREE]-> (d:Degree),
(e)-[:HAS_SKILL]-> (n:Personal_Skill)
WITH r, e, p,
CASE WHEN p.position_state = e.home_state THEN 1 ELSE 0 END AS SameStateScore,
CASE WHEN p.position_city = e.home_city THEN 1 ELSE 0 END AS SameCityScore,
CASE WHEN d.name = "College Degree" THEN 1 ELSE 0 END AS HasCollegeDegree,
CASE WHEN n.name = "Management" THEN 1 ELSE 0 END AS HasRequiredSkill
SET r.score = SameStateScore + SameCityScore + HasCollegeDegree + HasRequiredSkill
RETURN DISTINCT
e.name, p.name, SameStateScore, SameCityScore, HasCollegeDegree,
MAX(HasRequiredSkill) AS HasRequiredSkill, r.score
ORDER BY r.score DESC
Also, the DISTINCT option may not be needed, as aggregation functions (like MAX) already ensure that each combination of non-aggregated values appears in only one result row.
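One further point worth considering: when an employee has several degrees or skills, the MATCH yields one row per degree/skill combination, so SET r.score runs once per row and the stored value depends on whichever row is processed last. The following is a minimal sketch, not a tested solution, that collapses those rows with max() before writing the score and then ranks the candidates for each open position. It assumes the same labels and properties as the query above. The use of OPTIONAL MATCH (so that employees without a recorded degree or skill still get a score) and the output names ranked, bestCandidate, bestScore and allCandidates are assumptions added for illustration.

```cypher
// Sketch under the assumptions stated above; adjust to your data model.
MATCH (e:Employee)-[r:FUTURE_POSITION]->(p:Position {open_status: 1})
// Assumption: keep employees that have no degree or no skill recorded.
OPTIONAL MATCH (e)-[:HAS_DEGREE]->(d:Degree)
OPTIONAL MATCH (e)-[:HAS_SKILL]->(n:Personal_Skill)
// Aggregate first: max() reduces the degree x skill rows to a single row
// per (employee, relationship, position) combination.
WITH e, r, p,
     max(CASE WHEN d.name = "College Degree" THEN 1 ELSE 0 END) AS HasCollegeDegree,
     max(CASE WHEN n.name = "Management" THEN 1 ELSE 0 END) AS HasRequiredSkill
// Location scores depend only on e and p, so compute them after aggregating.
WITH e, r, p, HasCollegeDegree, HasRequiredSkill,
     CASE WHEN p.position_state = e.home_state THEN 1 ELSE 0 END AS SameStateScore,
     CASE WHEN p.position_city = e.home_city THEN 1 ELSE 0 END AS SameCityScore
// One write per relationship, i.e. one score per employee per position.
SET r.score = SameStateScore + SameCityScore + HasCollegeDegree + HasRequiredSkill
// Sort by score, then collect per position so the best candidate comes first.
WITH p, e, r
ORDER BY r.score DESC
WITH p, collect({employee: e.name, score: r.score}) AS ranked
RETURN p.name AS position,
       ranked[0].employee AS bestCandidate,
       ranked[0].score AS bestScore,
       ranked AS allCandidates
ORDER BY position
```

Note that ties are broken arbitrarily here, and the same employee can come out on top for several positions, so actually assigning people to positions would need an extra rule on top of this ranking.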
Source: https://stackoverflow.com/questions/50843464/how-to-store-properties-of-a-neo4j-node-as-an-array