R: Scatter plot of time series data for multiple points, ggplot?, reshape?
I have data in the following format. Column V1 is the genomic location of interest, and columns V4 and V5 are the minor allele frequencies at two different points in time. I would like to make a simple xy scatter plot with a line connecting the allele frequency (plotted on the y-axis) for each specific location from timepoint 1 to timepoint 2. (Note: I actually have hundreds to thousands of data points.)
V1 V2 V3 V4 V5
1 153 1/113 1/115 0.008849558 0.008695652
2 390 0/176 150/152 0.000000000 0.986842105
3 445 1/149 1/152 0.006711409 0.006578947
4 507 0/154 144/146 0.000000000 0.986301370
5 619 1/103 99/101 0.009708738 0.980198020
6 649 0/138 120/123 0.000000000 0.975609756
I feel like I should be able to accomplish this with ggplot, but I am not sure how to go about doing so, as I don't know how to specify two y-values for each genomic position, nor specify a column as a category. I suspect the data needs to be reshaped somehow. Any help or suggestions are greatly appreciated!
Update:
Thanks to all who gave me suggestions. I don't think I was very clear about wanting the time points to be my x-axis as opposed to the genomic position - my apologies. Hopefully this picture clarifies that!
I have successfully generated the plot I wished to make with the following code:
ggplot(dat) + geom_segment(aes(x="timepoint 1", y=V4, xend="timepoint 2", yend=V5))
and this is what the plot looks like with more data points...
I haven't changed the axes titles and played with margins yet, but this is the general idea!
If your example data was in DF, then
ggplot(DF) +
geom_segment(aes(x=V4, y="timepoint 1", xend=V5, yend="timepoint 2"))
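If you would rather reshape the data, as the question suggests, a minimal sketch might look like the following. It stacks V4 and V5 into one column and uses group=V1 so that each genomic location gets its own line, with the time points on the x-axis. The names long, timepoint and freq are made up for illustration, and DF here is just the six example rows from the question with V2 and V3 left out.

```r
library(ggplot2)

# Example rows copied from the question (V2 and V3 omitted for brevity)
DF <- data.frame(
  V1 = c(153, 390, 445, 507, 619, 649),
  V4 = c(0.008849558, 0.000000000, 0.006711409,
         0.000000000, 0.009708738, 0.000000000),
  V5 = c(0.008695652, 0.986842105, 0.006578947,
         0.986301370, 0.980198020, 0.975609756)
)

# Reshape to long format: one row per (location, timepoint) pair.
# "long", "timepoint" and "freq" are illustrative names, not from the question.
long <- data.frame(
  V1        = rep(DF$V1, times = 2),
  timepoint = rep(c("timepoint 1", "timepoint 2"), each = nrow(DF)),
  freq      = c(DF$V4, DF$V5)
)

# group = V1 connects the two observations belonging to the same location
p <- ggplot(long, aes(x = timepoint, y = freq, group = V1)) +
  geom_line() +
  geom_point() +
  xlab("Time point") +
  ylab("Minor allele frequency")
print(p)
```

The long format is more work than geom_segment for two time points, but it should extend to three or more time points without changing the plotting code.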
It's not completely clear from the question, but I think this is what you're after:
ggplot(d, aes(x=V1, y=V4, ymin=V4, ymax=V5)) +
  geom_linerange() +
  xlab('Genomic location') +
  ylab('Minor allele frequency')
Docs: http://docs.ggplot2.org/current/geom_linerange.html
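# Set up an empty plot; the x-axis is drawn separately below
with(dat, plot(x=V1, y=V5, ylim=c(0,1), type='n',
     xaxt="n", ylab="Allele Frequency", xlab="Genomic Location"))
# Label the x-axis at each genomic location
with(dat, axis(1, at=V1, labels=V1, cex.axis=0.7))
# Draw an arrow from the timepoint 1 frequency (V4) to the timepoint 2 frequency (V5)
with(dat, arrows(x0=V1, x1=V1+10, y0=V4, y1=V5))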
You can clean up the labeling and tweak colors and arrowhead features:
?arrows
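If the time points should be on the x-axis instead, as the update to the question describes, the same base-graphics idea can probably be adapted with segments(). This is only a sketch: dat below is rebuilt from the six example rows in the question, and the x positions 1 and 2 are an arbitrary choice standing in for the two time points.

```r
# Example rows copied from the question (only the columns used here)
dat <- data.frame(
  V1 = c(153, 390, 445, 507, 619, 649),
  V4 = c(0.008849558, 0.000000000, 0.006711409,
         0.000000000, 0.009708738, 0.000000000),
  V5 = c(0.008695652, 0.986842105, 0.006578947,
         0.986301370, 0.980198020, 0.975609756)
)

# Empty plot with two x positions (1 and 2) standing in for the time points
plot(c(1, 2), c(0, 1), type = "n", xlim = c(0.8, 2.2), ylim = c(0, 1),
     xaxt = "n", xlab = "", ylab = "Allele Frequency")
axis(1, at = c(1, 2), labels = c("timepoint 1", "timepoint 2"))

# One line per genomic location, from its V4 value to its V5 value
segments(x0 = 1, y0 = dat$V4, x1 = 2, y1 = dat$V5)
```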