Question
Given a data frame as follows:
V1 V2 V3 V4
Tom A Jim B
Gary A Shirly A
Shirly B Jack B
Tom A Jack B
...
V2 and V4 denote the group that the name in V1 and V3, respectively, belongs to. For example, in the first row Tom belongs to group A and Jim belongs to group B.
I'd like to plot a social network with geom_net, with a line linking two names if they appear in the same row, for instance Tom and Jim. The size of each edge should be proportional to the number of times the name appears in V3, i.e., the edge to Jack should be twice as big as the edges to Jim and Shirly.
I tried
ggplot(df, aes(from_id = V1, to_id = V3)) + geom_net()
But the result looks very bad, and a warning is generated:
In f(..., self = self) :
There are 35 nodes without node information:
#And the below are all the values in V1 and V3
Tom, Shirly, ....
Did you use all=T in merge?
I wonder how to show the result in a proper, good-looking way, with no x-axis or y-axis and with the relationships between the names clearly shown. The colors should represent the groups, meaning that all names in the same group should have the same color.
Hope to get your help! Thanks in advance!
Answer 1:
I struggled with this too until I figured out the data.frame structure that geom_net() (from the geomnet package) expects. Basically, you need a data.frame with two parts. In part 1 you describe the edges (the lines drawn) by providing a FROM and a TO column. Optionally, additional information can be provided in a separate column, e.g., linewidth.
library(ggplot2)
library(geomnet)

# Part 1: one row per edge, plus an optional linewidth column
ans <- read.table(text ="
from to linewidth
Tom Jim 0.1
Gary Shirly 1
Shirly Jack 0.5
Tom Jack 2
", sep = " ", stringsAsFactors = FALSE, header=TRUE)
p <- ggplot(data = ans, aes(from_id = from, to_id = to))
p + geom_net(label = TRUE, vjust=-1)
But you will notice that some of the nodes (vertices) are not labelled. This is where part 2 of the data.frame matters: in part 2 you supply the names of the nodes to be labelled. geom_net only labels the FROM node and not the TO node, so you need to supply, at a minimum, the names of the nodes that never appear as a FROM point.
ans <- read.table(text ="
from to linewidth
Tom Jim 0.1
Gary Shirly 1
Shirly Jack 0.5
Tom Jack 2
Helen Jack 3
Jim NA NA
Jack NA NA
", sep = " ", stringsAsFactors = FALSE, header=TRUE, na.strings = "NA")
p <- ggplot(data = ans, aes(from_id = from, to_id = to, linewidth = linewidth))
p + geom_net(label = TRUE, vjust=-1)
Several things are going on above: 1) I added the rows "Jim NA NA" and "Jack NA NA" to provide labels for the unlabelled nodes, 2) I added na.strings = "NA" to ensure that read.table() interprets the NA values properly, and 3) I added the linewidth aesthetic to aes() so that the column in the data.frame is mapped to the plot.
Also, once you supply names for all the nodes, the warning message "There are XX nodes without node information" goes away.
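For a larger edge list, typing the extra node rows by hand gets tedious. The following is a minimal base R sketch of one way to generate them, assuming the same from/to/linewidth columns as above; the names edges, missing_nodes and node_rows are only illustrative.

```r
# Edge rows from the example above
edges <- data.frame(
  from = c("Tom", "Gary", "Shirly", "Tom", "Helen"),
  to = c("Jim", "Shirly", "Jack", "Jack", "Jack"),
  linewidth = c(0.1, 1, 0.5, 2, 3),
  stringsAsFactors = FALSE
)

# Nodes that appear as a TO point but never as a FROM point
missing_nodes <- setdiff(edges$to, edges$from)

# One NA-padded row per missing node, with the same columns as the edges
node_rows <- data.frame(
  from = missing_nodes,
  to = NA_character_,
  linewidth = NA_real_,
  stringsAsFactors = FALSE
)

# Part 1 (edges) followed by part 2 (extra node rows)
ans <- rbind(edges, node_rows)
```

The resulting ans has the same shape as the hand-written table, so it can be passed to ggplot() in the same way.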
Hope that helps.
Edit: as requested, I added the resulting output. Since geom_net() changes the layout each time it is run, I have included two example images.
Just to complete the whole data.frame building process, I have included below a case where you have two separate data.frames and you need to merge them together: first data.frame is for the lines (edges) and the second is the nodes (vertices).
lines <- read.table(text ="
from to linewidth
Tom Ivy 0.1
Gary Ivy 1
Shirly Ivy 0.5
Tom Helen 2
Helen Ivy 3
", sep = " ", stringsAsFactors = FALSE, header=TRUE, na.strings = "NA")
nodes <- read.table(text ="
name
Tom
Jim
Gary
Shirly
Jack
Helen
Susan
Joel
Ivy
", sep = " ", stringsAsFactors = FALSE, header=TRUE,na.strings = "NA")
df <- merge(lines, nodes, by.x = "from", by.y = "name", all = TRUE)
p <- ggplot(data = df, aes(from_id = from, to_id = to, linewidth = linewidth))
p + geom_net(label = TRUE, vjust=-1)
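To see why all = TRUE matters in the merge, here is a small base R sketch on a cut-down, made-up version of the two tables (the names small_lines, small_nodes, inner and outer are illustrative only). Without all = TRUE, merge() keeps only nodes that occur in the from column, so nodes that are only targets, or that have no edges at all, are silently dropped.

```r
small_lines <- data.frame(
  from = c("Tom", "Gary"),
  to = c("Ivy", "Ivy"),
  linewidth = c(0.1, 1),
  stringsAsFactors = FALSE
)
small_nodes <- data.frame(
  name = c("Tom", "Gary", "Ivy", "Jim"),
  stringsAsFactors = FALSE
)

# Default merge: only rows whose key occurs in both tables survive
inner <- merge(small_lines, small_nodes, by.x = "from", by.y = "name")

# all = TRUE: unmatched nodes are kept as rows with NA in 'to' and 'linewidth'
outer <- merge(small_lines, small_nodes, by.x = "from", by.y = "name", all = TRUE)

# Nodes that exist only as node rows (no outgoing edge)
node_only <- outer$from[is.na(outer$to)]
```

Here inner has two rows while outer has four, and node_only contains Ivy and Jim, which are exactly the rows geom_net needs in order to have node information for them.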
Answer 2:
Maintainer of geomnet here. If possible, please post future questions to github.com/sctyner/geomnet/issues. @hackR has the right idea, and there are several examples of it in the documentation. The idea is: you have an edges data frame that has a from_id and a to_id column (plus additional columns), and you also have a vertices data frame with an id column (plus additional columns). Then you merge them:
network_data <- merge(edges, vertices, by.x = "from_id", by.y = "id", all = TRUE)
Don't forget to include the all = TRUE argument!
Thanks, Sam.
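Applied to the four sample rows from the question, the same idea might look like the sketch below. It is only an illustration under some assumptions: the columns are named V1 to V4, the edge width is taken as the number of times the target name appears in V3, and, because the sample lists Shirly under both group A and group B, the first group seen for a name is kept. The plotting part is guarded because it assumes an installed geomnet version that provides theme_net() and the labelon argument (older versions used label = TRUE, as in the first answer).

```r
df <- data.frame(
  V1 = c("Tom", "Gary", "Shirly", "Tom"),
  V2 = c("A", "A", "B", "A"),
  V3 = c("Jim", "Shirly", "Jack", "Jack"),
  V4 = c("B", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Edges: one row per pair, width = how often the target occurs in V3
counts <- table(df$V3)
edges <- data.frame(from = df$V1, to = df$V3, stringsAsFactors = FALSE)
edges$linewidth <- as.vector(counts[edges$to])

# Vertices: every name with its group (first group seen wins)
vertices <- data.frame(
  name = c(df$V1, df$V3),
  group = c(df$V2, df$V4),
  stringsAsFactors = FALSE
)
vertices <- vertices[!duplicated(vertices$name), ]

# all = TRUE keeps the vertices that never appear in 'from'
net <- merge(edges, vertices, by.x = "from", by.y = "name", all = TRUE)

if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("geomnet", quietly = TRUE)) {
  library(ggplot2)
  library(geomnet)
  p <- ggplot(net, aes(from_id = from, to_id = to)) +
    geom_net(aes(colour = group, linewidth = linewidth),
             labelon = TRUE, vjust = -1) +
    theme_net()  # blank theme: no x-axis or y-axis
  # Call print(p) to draw the network.
}
```

After the merge, net has six rows: the four edges plus one NA-padded row each for Jim and Jack, so every node carries a group that can be mapped to colour.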
Source: https://stackoverflow.com/questions/34976716/inproper-show-when-use-geom-net-in-r