How to preserve original values in a variable turned into a factor?

Here's some working code to illustrate my question:

# Categorical variable recorded as numeric (integer)
df1 <- data.frame(group = c(1, 2, 3, 9, 3, 2, 9, 1, 9, 3, 2))

I have a categorical variable (group) recorded as integer values. For plots, and to include this variable in models, it would be useful to have it encoded as a factor, mapping each number to a label describing the category. So I create a factor:

# Make it a factor
df1$group_f <- factor(x = df1$group, 
                      levels = c(1, 2, 3, 9), 
                      labels = c("G1", "G2", "G3", "Unknown"))

df1
   group group_f
1      1      G1
2      2      G2
3      3      G3
4      9 Unknown
5      3      G3
6      2      G2
7      9 Unknown
8      1      G1
9      9 Unknown
10     3      G3
11     2      G2

Now, the problem is that eventually I need the original values again, because I have to join tables based on this variable, and the other table has the original numbers for each category (1, 2, 3, 9) and not the labels.

Converting to numeric does not work, because it returns the internal level index rather than the original value (the "Unknown" category gets mapped to 4 instead of 9):

# And back to numeric
df1$group_num <- as.numeric(df1$group_f)

df1

   group group_f group_num
1      1      G1         1
2      2      G2         2
3      3      G3         3
4      9 Unknown         4
5      3      G3         3
6      2      G2         2
7      9 Unknown         4
8      1      G1         1
9      9 Unknown         4
10     3      G3         3
11     2      G2         2

?factor says:

as.numeric applied to a factor is meaningless, and may happen by implicit coercion. To transform a factor f to approximately its original numeric values, as.numeric(levels(f))[f] is recommended and slightly more efficient than as.numeric(as.character(f)).

But as.numeric over the levels does not work either, because the levels are now character strings holding the labels, which cannot be coerced to numeric:

> as.numeric(levels(df1$group_f))
[1] NA NA NA NA
Warning message:
NAs introduced by coercion 
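
For comparison, here is a minimal sketch of the case where the recipe from ?factor does work. When no labels are supplied, the levels are the original numbers stored as character strings, so they can be coerced back. The names f and back are only for illustration.

```r
# Factor built WITHOUT custom labels: the levels are "1", "2", "3", "9"
f <- factor(c(1, 2, 3, 9, 3))

levels(f)
# [1] "1" "2" "3" "9"

# Convert the levels to numeric, then index them by the factor's integer codes
back <- as.numeric(levels(f))[f]
back
# [1] 1 2 3 9 3
```

So the recipe depends on the levels still being the numbers as text, which is exactly what supplying labels replaces.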

Is there a way to create a factor variable so that it preserves the original values (1, 2, 3, 9 in this example)?

Note: the idea is to have one single factor variable that carries the labels describing the categories, with the original numbers underlying it. Although in this example I keep the variable group alongside the newly created factor variable, in my real use case I cannot do that (it is a huge dataset).
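
One possible workaround, sketched here under the assumption that the vector of original codes used for levels = can be kept around: since the factor stores the position of each level, indexing that small vector by the factor's integer codes gives the original values back. The names orig_codes and recovered are hypothetical, and this is not a built-in feature of factor().

```r
# Hypothetical lookup vector: position i holds the original code of level i
orig_codes <- c(1, 2, 3, 9)

group <- c(1, 2, 3, 9, 3, 2, 9, 1, 9, 3, 2)
group_f <- factor(x = group,
                  levels = orig_codes,
                  labels = c("G1", "G2", "G3", "Unknown"))

# as.integer() gives the level positions (1..4); use them to index the codes
recovered <- orig_codes[as.integer(group_f)]
recovered
# [1] 1 2 3 9 3 2 9 1 9 3 2
```

The lookup vector has one entry per level rather than one per row, so it avoids carrying a second full-length column in a large dataset.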
