ggplot2 secondary axis strange output

2018-06-30 21:20:16

I am trying to make a double y-axis plot with ggplot2. However, the primary y-axis labels (and limits) change, and one of the variables (the "mean" variable) is displayed incorrectly. Edit: The axis labels for the "mean" variable range from 0.55 to 0.75, which makes the variability difficult to see. However, in the original step for that plot (p <- p + geom_line(aes(y = mean_d, colour = "mean")) + geom_point(aes(y = mean_d, colour = "mean"))) they ranged from 0.7757 to 0.7744. The variable should be displayed as in that original step (maybe this has to do with the manipulation of the data within the ggplot calls?). In addition, is it possible to align the tick labels of the first y-axis with those of the second y-axis so that they are displayed on the same horizontal lines?

# dput(coeff.mean)
coeff.mean <- structure(list(individuals = c(5L, 18L, 31L, 43L, 56L, 69L, 82L, 
95L, 108L, 120L, 133L, 146L, 159L, 172L, 185L, 197L, 210L, 223L, 
236L, 249L, 262L, 274L, 287L, 300L, 313L, 326L, 339L, 351L, 364L, 
377L), mean_d = c(0.775414405190575, 0.774478867355839, 0.774632679560057, 
0.774612015422181, 0.774440717600404, 0.774503749029999, 0.774543337328481, 
0.774536584528457, 0.774518615875444, 0.774572944896752, 0.774553554507719, 
0.774526346948343, 0.774537645238366, 0.774549039219398, 0.774518593880137, 
0.77452848368359, 0.774502654364311, 0.774527249259969, 0.774551190425812, 
0.774524221826879, 0.774514765537317, 0.774541221078135, 0.774552621147008, 
0.774546365564095, 0.774540310535789, 0.774540468208943, 0.774548658706833, 
0.77454534219406, 0.774541081476004, 0.774541996470423), var_d = c(0.000438374265308954, 
0.000345714068446388, 0.000324909665783972, 0.000318897997146887, 
0.000316077108040133, 0.000314032075708385, 0.000310447758209298, 
0.000310325171003455, 0.000311927176741998, 0.000309622062319051, 
0.000308772480851544, 0.000308388263293765, 0.000306838067001956, 
0.000307838047303517, 0.000307737478217495, 0.000306351076037266, 
0.000307288393036824, 0.000306717640522594, 0.000306768886331324, 
0.000306897320278579, 0.000307154374510682, 0.000306352361061403, 
0.000306998606721366, 0.000306434828650204, 0.000305865398401208, 
0.000306061994682725, 0.000305934443005304, 0.000305853730364841, 
0.000306181262913308, 0.000306820996289535)), .Names = c("individuals", 
"mean_d", "var_d"), row.names = c(NA, -30L), class = c("tbl_df", 
"tbl", "data.frame"))

library(ggplot2)

p <- ggplot(coeff.mean, aes(x=individuals))
p <- p + geom_line(aes(y = mean_d, colour = "mean")) + geom_point(aes(y = mean_d, colour = "mean"))
p <- p + geom_line(aes(y = var_d*(max(mean_d)/max(var_d)), colour = "var")) + geom_point(aes(y = var_d*(max(mean_d)/max(var_d)), colour = "var")) 
p <- p + scale_y_continuous(sec.axis = sec_axis(~.*(max(coeff.mean$var_d)/max(coeff.mean$mean_d)), name = "var"))
p <- p + scale_colour_manual(values = c("black", "grey"))
p <- p + labs(y = "mean", x = "Resampled", colour = "Statistic")
print(p)

I do appreciate any advice.


This shows more clearly what my comment was pointing out: you don't need to scale var_d multiplicatively; you need to add to it.

library(dplyr)
library(ggplot2)

coeff.mean %>% 
  ggplot(aes(individuals, mean_d)) +
  geom_point(aes(color = "mean_d")) + geom_line(aes(color = "mean_d")) +
  geom_point(aes(individuals, var_d+0.7745, color = "var_d")) + 
  geom_line(aes(individuals, var_d+0.7745, color = "var_d")) +
  scale_y_continuous(sec.axis = sec_axis(trans = ~ . - 0.7745))


Of course, this figure is problematic for all sorts of reasons. It's hard to interpret for sure.
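For what it's worth, a quick check of the numbers suggests why the multiplicative scaling in the question flattens the mean line. This is only a sketch on four rows copied from the question's data (the rows holding the extremes); the names var_mult and var_add are mine, not from the original code.

```r
# Abridged copy of the question's data (rows 1, 2, 5 and 28), for illustration only
mean_d <- c(0.775414405190575, 0.774478867355839,
            0.774440717600404, 0.77454534219406)
var_d  <- c(0.000438374265308954, 0.000345714068446388,
            0.000316077108040133, 0.000305853730364841)

# Multiplicative scaling as in the question: only the maxima are matched,
# so the scaled minimum lands far below every value of mean_d
var_mult <- var_d * (max(mean_d) / max(var_d))

# Additive shift as in the answer: the spread of var_d is left unchanged
var_add <- var_d + 0.7745

range(mean_d)    # spread of the mean series
range(var_mult)  # much wider, so the shared y-axis has to stretch to fit it
range(var_add)   # stays within a spread comparable to mean_d
```

Because both series share one primary axis, its limits have to cover the widest series, which would explain why the axis in the question ran from about 0.55 to 0.75.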

If you want to scale both multiplicatively and additively, you could try scales::rescale: once to scale var_d to the range of mean_d, and then again to map the rescaled values back to the original range of var_d for the secondary axis.

coeff.mean %>% 
  mutate(var_rescaled = scales::rescale(var_d, to = range(mean_d))) %>% 
  ggplot(aes(individuals, mean_d)) +
  geom_point(aes(color = "mean_d")) + geom_line(aes(color = "mean_d")) +
  geom_point(aes(y = var_rescaled, color = "var_d")) + 
  geom_line(aes(y = var_rescaled, color = "var_d")) +
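  scale_y_continuous(sec.axis = 
    # from = pins the back-transformation to the data range of mean_d,
    # not the (expanded) axis limits that sec_axis passes in
    sec_axis(trans = ~scales::rescale(., to = range(coeff.mean$var_d),
                                      from = range(coeff.mean$mean_d)),
             breaks = function(values) {scales::pretty_breaks(n=5)(values)},
             name = "var_d"))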


This one has problems too. In particular, since the highest values of both mean_d and var_d occur at the same value of individuals, the two series overlap at that point.
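The question also asked whether the tick labels of the two axes can sit on the same horizontal lines. One possible approach, sketched below under my own assumptions, is to write the linear map and its inverse explicitly, choose the primary breaks once, and hand the transformed breaks to sec_axis. The helpers to_mean and to_var, the vector brks, and the abridged data frame dat are hypothetical names introduced only for this illustration.

```r
library(ggplot2)

# Abridged copy of the question's data (rows 1, 2, 5, 16 and 30), for illustration only
dat <- data.frame(
  individuals = c(5L, 18L, 56L, 197L, 377L),
  mean_d = c(0.775414405190575, 0.774478867355839, 0.774440717600404,
             0.77452848368359, 0.774541996470423),
  var_d  = c(0.000438374265308954, 0.000345714068446388, 0.000316077108040133,
             0.000306351076037266, 0.000306820996289535)
)

# Hypothetical helpers: a linear map between the two data ranges and its inverse
rng_mean <- range(dat$mean_d)
rng_var  <- range(dat$var_d)
to_mean <- function(v) (v - rng_var[1]) / diff(rng_var) * diff(rng_mean) + rng_mean[1]
to_var  <- function(m) (m - rng_mean[1]) / diff(rng_mean) * diff(rng_var) + rng_var[1]

# Primary breaks are chosen once; the secondary breaks are the same
# positions expressed in var_d units, so both sets of ticks line up
brks <- pretty(rng_mean, n = 5)

p <- ggplot(dat, aes(individuals)) +
  geom_line(aes(y = mean_d, colour = "mean_d")) +
  geom_line(aes(y = to_mean(var_d), colour = "var_d")) +
  scale_y_continuous(
    breaks = brks,
    sec.axis = sec_axis(~ to_var(.), breaks = to_var(brks),
                        labels = function(b) as.character(signif(b, 3)),
                        name = "var_d")
  )
```

The trade-off is that the secondary labels are no longer round numbers, since they are whatever var_d values happen to fall at the primary tick positions. The transformation is passed to sec_axis positionally because the name of that argument differs between ggplot2 versions.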

Here I show using facets as an alternative to a dual-axis plot. I know it does not answer the original question, sorry!

library(ggplot2)
library(tidyr)

# Convert data to long form with tidyr::gather()
long_dat = gather(data=coeff.mean, key="stat", value="stat_value", mean_d, var_d)

head(long_dat)
# A tibble: 6 x 3
#   individuals   stat stat_value
#         <int>  <chr>      <dbl>
# 1           5 mean_d  0.7754144
# 2          18 mean_d  0.7744789
# 3          31 mean_d  0.7746327
# 4          43 mean_d  0.7746120
# 5          56 mean_d  0.7744407
# 6          69 mean_d  0.7745037

p2 = ggplot(long_dat, aes(x=individuals, y=stat_value, colour=stat)) + 
     geom_point() + 
     geom_line() + 
     scale_colour_manual(values=c(mean_d="black", var_d="grey40")) +
     facet_grid(stat ~ ., scales="free_y")

ggsave("faceted_plot.png", plot=p2, height=4, width=6, dpi=150)
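As a side note, tidyr::gather() has since been superseded by tidyr::pivot_longer(). A minimal sketch of the equivalent reshaping step, assuming a tidyr version that provides pivot_longer() (1.0.0 or later) and using a three-row abridged copy of the data that I named dat for illustration:

```r
library(tidyr)

# Abridged copy of the question's data (first three rows), for illustration only
dat <- data.frame(
  individuals = c(5L, 18L, 31L),
  mean_d = c(0.775414405190575, 0.774478867355839, 0.774632679560057),
  var_d  = c(0.000438374265308954, 0.000345714068446388, 0.000324909665783972)
)

# Same long format as the gather() call: one row per individual and statistic
long_dat <- pivot_longer(dat, cols = c(mean_d, var_d),
                         names_to = "stat", values_to = "stat_value")
```

The rows come out interleaved by individual rather than stacked by statistic as with gather(), which should make no difference to the faceted plot.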

