Specify the colour of a ggpairs plot using a variable without plotting that variable
I have a dataset from the World Bank with some continuous and categorical variables.
> head(nationsCombImputed)
  iso3c iso2c              country year.x life_expect population birth_rate neonat_mortal_rate                     region
1   ABW    AW                Aruba   2014       75.45     103441       10.1                2.4  Latin America & Caribbean
2   AFG    AF          Afghanistan   2014       60.37   31627506       34.2               36.1                 South Asia
3   AGO    AO               Angola   2014       52.27   24227524       45.5               49.6         Sub-Saharan Africa
4   ALB    AL              Albania   2014       77.83    2893654       13.4                6.5      Europe & Central Asia
5   AND    AD              Andorra   2014       70.07      72786       20.9                1.5      Europe & Central Asia
6   ARE    AE United Arab Emirates   2014       77.37    9086139       10.8                3.6 Middle East & North Africa
               income gdp_percap.x  log_pop
1         High income     47008.83 5.014693
2          Low income      1942.48 7.500065
3 Lower middle income      7327.38 7.384309
4 Upper middle income     11307.55 6.461447
5         High income     30482.64 4.862048
6         High income     67239.00 6.958379
I wish to use ggpairs to plot some of the continuous variables (life_expect, birth_rate, neonat_mortal_rate, gdp_percap.x) in a scatter plot but I would like to colour them using the region categorical variable from the data. I have tried a number of different ways but I cannot colour the continuous variables without including the categorical variable.
ggpairs(nationsCombImputed[, c(2, 5, 7, 8, 9, 11)],
        title = "Scatterplot of Variables",
        mapping = ggplot2::aes(color = region),
        labeller = "iso2c")
But I get this error:
Error in stop_if_high_cardinality(data, columns, cardinality_threshold) : Column 'iso2c' has more levels (211) than the threshold (15) allowed. Please remove the column or increase the 'cardinality_threshold' parameter. Increasing the cardinality_threshold may produce long processing times
Ultimately, I would just like a 4x4 scatter plot of the continuous variables, coloured by region, with the data points labelled using the iso2c code in column 2.
Is this possible in ggpairs?
Well, yes, it is possible! Following @Robin Gertenbach's suggestion, I added the columns argument to my code and this worked great; please see below.
ggpairs(nationsCombImputed,
        title = "Scatterplot of Variables",
        columns = c(5, 7, 8, 11),
        mapping = ggplot2::aes(colour = region))
I still wish to add data point labels to the scatter plot using the iso2c column, but I am struggling with this. Any pointers would be greatly appreciated.
As mentioned in the comment, you can get ggpairs to colour by a variable without plotting it by specifying the numeric indices of the columns you do want to plot, with columns = c(5,7,8,11).
To get a text scatter plot, you will need to define a function, e.g. textscatter, that you supply via lower = list(continuous = textscatter) in the ggpairs function call, and specify the labels in the aesthetics.
library(ggplot2)
library(GGally)

# Panel function: draw each observation as text instead of a point
textscatter <- function(data, mapping, ...) {
  ggplot(data, mapping, ...) + geom_text()
}

ggpairs(
  nationsCombImputed,
  title = "Scatterplot of Variables",
  columns = c(5, 7, 8, 11),
  mapping = ggplot2::aes(colour = region, label = iso2c),
  lower = list(continuous = textscatter)
)
Of course, you can also put the label aesthetic definition into textscatter.
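A minimal sketch of that variant, assuming ggpairs hands the full data frame (including the iso2c column) to each panel function, which is my understanding of how it behaves. The label is set inside geom_text, so the global mapping only carries the colour and the label aesthetic stays out of panels that do not use it. The exists() guard is only there so the snippet can be sourced without the question's data loaded.

```r
library(ggplot2)
library(GGally)

# Panel function with the label aesthetic defined locally.
# Assumption: `data` contains an `iso2c` column, as in the question's data frame.
textscatter <- function(data, mapping, ...) {
  ggplot(data, mapping) +
    geom_text(aes(label = iso2c), ...)
}

# Uses the data frame from the question; skipped if it is not in the session.
if (exists("nationsCombImputed")) {
  ggpairs(
    nationsCombImputed,
    title = "Scatterplot of Variables",
    columns = c(5, 7, 8, 11),
    mapping = aes(colour = region),
    lower = list(continuous = textscatter)
  )
}
```

If the labels overlap too much, extra geom_text arguments such as size can probably be passed through with GGally::wrap(textscatter, size = 2), since they end up in the ... of the panel function.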