Multivariate K-S test in R
I can run a K-S test to assess whether two datasets differ in distribution, as outlined here.
So let's take the following data:
set.seed(123)
n <- 1000
var1 <- runif(n, min=0, max=0.5)
var2 <- runif(n, min=0.3, max=0.7)
var3 <- rbinom(n=n, size=1, prob = 0.45)
df <- data.frame(var1, var2, var3)
We can separate it based on the var3 outcome:
df.1 <- subset(df, var3 == 1)
df.2 <- subset(df, var3 == 0)
Now I can run a Kolmogorov–Smirnov test to check for differences in the distribution of each individual variable:
ks.test(jitter(df.1$var1), jitter(df.2$var1))
ks.test(jitter(df.1$var2), jitter(df.2$var2))
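To make explicit what these tests measure: the two-sample K-S statistic is the largest vertical gap between the two empirical CDFs. The following is only a minimal sketch on the same simulated data, and the names x, y, pooled, d_manual and d_builtin are mine, introduced for illustration. It skips jitter() because the runif() draws are continuous and I am assuming they contain no ties.

```r
# Recreate the simulated data from above
set.seed(123)
n <- 1000
var1 <- runif(n, min = 0, max = 0.5)
var2 <- runif(n, min = 0.3, max = 0.7)
var3 <- rbinom(n = n, size = 1, prob = 0.45)

# Split var1 by the var3 outcome
x <- var1[var3 == 1]
y <- var1[var3 == 0]

# Evaluate both empirical CDFs at every observed point and take the largest gap
pooled <- sort(c(x, y))
d_manual <- max(abs(ecdf(x)(pooled) - ecdf(y)(pooled)))

# Statistic reported by ks.test() for the same two samples
d_builtin <- unname(ks.test(x, y)$statistic)

c(manual = d_manual, builtin = d_builtin)
```

The two values should agree up to floating-point error, which is why the statistic is inherently one-dimensional: it relies on ordering the observations along a single axis.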
And, not surprisingly, neither test shows a significant difference, so we can assume the two subsets were drawn from the same distribution. This can be visualised through:
plot(ecdf(df.1$var1), col=2)
lines(ecdf(df.2$var1))
plot(ecdf(df.1$var2), col=3)
lines(ecdf(df.2$var2), col=4)
But I want to consider whether the distributions for var3==0 and var3==1 differ when we consider both var1 and var2 together. Is there an R package to run such a test when we have multiple predictors?
A similar question was posed here, but it has not received any answers.
There appears to be some literature on this: example 1, example 2.
But nothing appears to be linked to R.
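For what it's worth, the closest package-free route I can sketch is a permutation test on the joint (var1, var2) points. To be clear, this is not a multivariate K-S test: it swaps in the energy distance between the two groups as the test statistic and gets a p-value by shuffling the var3 labels. The helper energy_stat and the choice of 199 permutations are my own assumptions for illustration, not something taken from a package or from the literature linked above.

```r
# Same simulated data as above
set.seed(123)
n <- 1000
var1 <- runif(n, min = 0, max = 0.5)
var2 <- runif(n, min = 0.3, max = 0.7)
var3 <- rbinom(n = n, size = 1, prob = 0.45)

# Pairwise Euclidean distances between all (var1, var2) points, computed once
D <- as.matrix(dist(cbind(var1, var2)))

# Energy distance between the two groups flagged by the logical vector g
# (hypothetical helper written for this sketch)
energy_stat <- function(D, g) {
  2 * mean(D[g, !g]) - mean(D[g, g]) - mean(D[!g, !g])
}

g <- var3 == 1
obs <- energy_stat(D, g)

# Permutation null: shuffle the group labels and recompute the statistic
n_perm <- 199
perm <- replicate(n_perm, energy_stat(D, sample(g)))

# One-sided permutation p-value (a large statistic suggests different distributions)
p_value <- (1 + sum(perm >= obs)) / (n_perm + 1)
p_value
```

Because var3 is generated independently of var1 and var2 here, I would expect a large p-value, but a test designed for the multivariate K-S setting is still what I am really after.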