ggplot2 secondary axis strange output
I am trying to make a double y-axis plot with ggplot2. However, the primary y-axis labels (and limits) are changed, and one of the variables (the "mean" variable) is displayed incorrectly.
Edit: The axis labels for the "mean" variable range from 0.55 to 0.75, making it difficult to see the variability. However, in the original step for that plot (p <- p + geom_line(aes(y = mean_d, colour = "mean")) + geom_point(aes(y = mean_d, colour = "mean"))) they ranged from 0.7757 to 0.7744. The variable should be displayed as in that original step (maybe it has to do with the manipulation of the data within the ggplot calls?).
In addition, is it possible to align the tick labels of the y1 axis with those of the y2 axis so that they are displayed on the same horizontal lines?
# dput(coeff.mean)
coeff.mean <- structure(list(individuals = c(5L, 18L, 31L, 43L, 56L, 69L, 82L,
95L, 108L, 120L, 133L, 146L, 159L, 172L, 185L, 197L, 210L, 223L,
236L, 249L, 262L, 274L, 287L, 300L, 313L, 326L, 339L, 351L, 364L,
377L), mean_d = c(0.775414405190575, 0.774478867355839, 0.774632679560057,
0.774612015422181, 0.774440717600404, 0.774503749029999, 0.774543337328481,
0.774536584528457, 0.774518615875444, 0.774572944896752, 0.774553554507719,
0.774526346948343, 0.774537645238366, 0.774549039219398, 0.774518593880137,
0.77452848368359, 0.774502654364311, 0.774527249259969, 0.774551190425812,
0.774524221826879, 0.774514765537317, 0.774541221078135, 0.774552621147008,
0.774546365564095, 0.774540310535789, 0.774540468208943, 0.774548658706833,
0.77454534219406, 0.774541081476004, 0.774541996470423), var_d = c(0.000438374265308954,
0.000345714068446388, 0.000324909665783972, 0.000318897997146887,
0.000316077108040133, 0.000314032075708385, 0.000310447758209298,
0.000310325171003455, 0.000311927176741998, 0.000309622062319051,
0.000308772480851544, 0.000308388263293765, 0.000306838067001956,
0.000307838047303517, 0.000307737478217495, 0.000306351076037266,
0.000307288393036824, 0.000306717640522594, 0.000306768886331324,
0.000306897320278579, 0.000307154374510682, 0.000306352361061403,
0.000306998606721366, 0.000306434828650204, 0.000305865398401208,
0.000306061994682725, 0.000305934443005304, 0.000305853730364841,
0.000306181262913308, 0.000306820996289535)), .Names = c("individuals",
"mean_d", "var_d"), row.names = c(NA, -30L), class = c("tbl_df",
"tbl", "data.frame"))
library(ggplot2)
p <- ggplot(coeff.mean, aes(x=individuals))
p <- p + geom_line(aes(y = mean_d, colour = "mean")) + geom_point(aes(y = mean_d, colour = "mean"))
p <- p + geom_line(aes(y = var_d*(max(mean_d)/max(var_d)), colour = "var")) + geom_point(aes(y = var_d*(max(mean_d)/max(var_d)), colour = "var"))
p <- p + scale_y_continuous(sec.axis = sec_axis(~.*(max(coeff.mean$var_d)/max(coeff.mean$mean_d)), name = "var"))
p <- p + scale_colour_manual(values = c("black", "grey"))
p <- p + labs(y = "mean", x = "Resampled", colour = "Statistic")
print(p)
I do appreciate any advice.
This shows more clearly what my comment was pointing out: you don't need to scale var_d multiplicatively, you need to add a constant to it.
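library(dplyr)
library(ggplot2)
# Shift var_d up by a constant so that it sits inside the range of mean_d,
# then undo the shift on the secondary axis.
# (In ggplot2 >= 3.5.0 the `trans` argument of sec_axis() is named `transform`.)
coeff.mean %>%
  ggplot(aes(individuals, mean_d)) +
  geom_point(aes(color = "mean_d")) + geom_line(aes(color = "mean_d")) +
  geom_point(aes(individuals, var_d + 0.7745, color = "var_d")) +
  geom_line(aes(individuals, var_d + 0.7745, color = "var_d")) +
  scale_y_continuous(sec.axis = sec_axis(trans = ~ . - 0.7745, name = "var_d"))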
Of course, this figure is problematic for all sorts of reasons. It's hard to interpret for sure.
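The difference between multiplying and adding can be checked numerically. The following is a minimal sketch in base R that uses only the minimum and maximum of mean_d and var_d, copied from the dput output in the question; the variable names are made up for illustration.

```r
# Range (min, max) of each column, copied from the question's data
mean_rng <- c(0.774440717600404, 0.775414405190575)
var_rng  <- c(0.000305853730364841, 0.000438374265308954)

# Multiplicative scaling, as in the question
mult_rng <- var_rng * (max(mean_rng) / max(var_rng))

# Additive shift, as in the code above
add_rng <- var_rng + 0.7745

# Width of the interval each series occupies on the primary axis
diff(mean_rng)
diff(mult_rng)
diff(add_rng)
```

The multiplicatively scaled series reaches down to roughly 0.54, so the shared primary axis has to stretch to cover it and the mean_d line looks flat. The shifted series stays inside the range of mean_d.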
If you want to scale both multiplicatively and additively, you could try scales::rescale: once to map var_d onto the range of mean_d, and then again, in the secondary axis, to map the rescaled values back to the original range of var_d.
# `from` is given explicitly in the secondary-axis transformation because
# ggplot2 passes it the (expanded) axis range, not range(mean_d).
coeff.mean %>%
  mutate(var_rescaled = scales::rescale(var_d, to = range(mean_d))) %>%
  ggplot(aes(individuals, mean_d)) +
  geom_point(aes(color = "mean_d")) + geom_line(aes(color = "mean_d")) +
  geom_point(aes(y = var_rescaled, color = "var_d")) +
  geom_line(aes(y = var_rescaled, color = "var_d")) +
  scale_y_continuous(sec.axis = sec_axis(
    trans = ~ scales::rescale(., to = range(coeff.mean$var_d),
                              from = range(coeff.mean$mean_d)),
    breaks = scales::pretty_breaks(n = 5),
    name = "var_d"))
This one has problems too. In particular, since the highest values of both mean_d and var_d occur at the same individual, the two series overlap at that point.
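What the rescaling does in the two directions is a linear map and its inverse. The sketch below spells this out in base R; `lin_map`, `to_primary` and `to_secondary` are hypothetical helpers written for this illustration (they are not part of ggplot2 or scales), and the ranges are copied from the data in the question. It also touches on the last point of the question: because the map is linear, pushing the primary breaks through it gives the var_d values that sit at the same heights.

```r
# Range (min, max) of each column, copied from the question's data
mean_rng <- c(0.774440717600404, 0.775414405190575)
var_rng  <- c(0.000305853730364841, 0.000438374265308954)

# Hypothetical helper: linear map taking the interval `from` onto `to`
lin_map <- function(x, from, to) {
  (x - from[1]) / diff(from) * diff(to) + to[1]
}

# Forward: var_d values -> positions on the mean_d (primary) axis
to_primary <- function(v) lin_map(v, from = var_rng, to = mean_rng)
# Inverse: primary-axis positions -> var_d values (secondary labels)
to_secondary <- function(y) lin_map(y, from = mean_rng, to = var_rng)

# A round trip should give back the original values
v <- c(0.00031, 0.00035, 0.00040)
all.equal(to_secondary(to_primary(v)), v)

# Breaks on the primary axis and the var_d values at the same heights
primary_breaks   <- pretty(mean_rng, n = 5)
secondary_breaks <- to_secondary(primary_breaks)
```

Under these assumptions, passing `breaks = primary_breaks` to scale_y_continuous() and `breaks = secondary_breaks` to sec_axis() should put the tick labels of both axes on the same horizontal lines, although the secondary labels will then generally not be round numbers.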
Here I show how to use facets as an alternative to a dual-axis plot. I know it does not answer the original question, sorry!
library(ggplot2)
library(tidyr)
# Convert data to long form with tidyr::gather()
long_dat = gather(data=coeff.mean, key="stat", value="stat_value", mean_d, var_d)
head(long_dat)
# A tibble: 6 x 3
# individuals stat stat_value
# <int> <chr> <dbl>
# 1 5 mean_d 0.7754144
# 2 18 mean_d 0.7744789
# 3 31 mean_d 0.7746327
# 4 43 mean_d 0.7746120
# 5 56 mean_d 0.7744407
# 6 69 mean_d 0.7745037
p2 = ggplot(long_dat, aes(x=individuals, y=stat_value, colour=stat)) +
  geom_point() +
  geom_line() +
  scale_colour_manual(values=c(mean_d="black", var_d="grey40")) +
  facet_grid(stat ~ ., scales="free_y")
ggsave("faceted_plot.png", plot=p2, height=4, width=6, dpi=150)
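tidyr::gather() still works, but current tidyr releases mark it as superseded in favour of pivot_longer(). Here is a minimal sketch of the equivalent reshaping step, assuming tidyr 1.0.0 or later and using only the first three rows of the data from the question to keep it short (`dat` is a name made up for this example):

```r
library(tidyr)

# First three rows of coeff.mean from the question (illustrative subset)
dat <- data.frame(
  individuals = c(5L, 18L, 31L),
  mean_d = c(0.775414405190575, 0.774478867355839, 0.774632679560057),
  var_d  = c(0.000438374265308954, 0.000345714068446388, 0.000324909665783972)
)

# Long form: one row per individual and statistic
long_dat <- pivot_longer(dat, cols = c(mean_d, var_d),
                         names_to = "stat", values_to = "stat_value")
```

The rows come out in a different order than with gather() (interleaved by individual rather than stacked by statistic), which should not matter for the faceted plot.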