distribution - Multivariate K-S test in R


One can run a K-S test to assess whether there is a difference in the distributions of two datasets, as outlined here.

So let's take the following data:

set.seed(123)
n <- 1000
var1 <- runif(n, min=0, max=0.5)
var2 <- runif(n, min=0.3, max=0.7)
var3 <- rbinom(n=n, size=1, prob = 0.45)

df <- data.frame(var1, var2, var3)

We can separate it based on the var3 outcome:

df.1 <- subset(df, var3 == 1)
df.2 <- subset(df, var3 == 0)

Now we can run a Kolmogorov-Smirnov test to test for differences in the distribution of each individual variable:

ks.test(jitter(df.1$var1), jitter(df.2$var1))
ks.test(jitter(df.1$var2), jitter(df.2$var2))
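For reference, the statistic that ks.test reports is just the largest vertical gap between the two empirical CDFs. Below is a minimal sketch of that computation for var1. The helper name ks_stat is my own, and the data is regenerated so the snippet runs on its own. It assumes there are no ties, which holds in practice for runif draws, so jitter is left out.

```r
# Two-sample K-S statistic computed by hand: the largest vertical gap
# between the two empirical CDFs, evaluated at every pooled sample point.
# ks_stat is a hypothetical helper name, not part of any package.
ks_stat <- function(x, y) {
  grid <- sort(c(x, y))   # pooled sample points
  Fx <- ecdf(x)(grid)     # ECDF of the first sample on the grid
  Fy <- ecdf(y)(grid)     # ECDF of the second sample on the grid
  max(abs(Fx - Fy))
}

# Same setup as in the question, regenerated here for self-containment
set.seed(123)
n <- 1000
var1 <- runif(n, min = 0, max = 0.5)
var2 <- runif(n, min = 0.3, max = 0.7)
var3 <- rbinom(n = n, size = 1, prob = 0.45)

x <- var1[var3 == 1]
y <- var1[var3 == 0]

d_manual  <- ks_stat(x, y)
d_builtin <- unname(ks.test(x, y)$statistic)

# The two values should agree up to floating-point error
print(c(manual = d_manual, builtin = d_builtin))
```

This is only meant to show what the univariate test measures; the p-value still comes from ks.test.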

Not surprisingly, there is no significant difference, so we can assume the two datasets were drawn from the same distribution. This can be visualised with:

plot(ecdf(df.1$var1), col=2)
lines(ecdf(df.2$var1))

plot(ecdf(df.1$var2), col=3)
lines(ecdf(df.2$var2), col=4)

But I want to consider whether the distributions for var3 == 0 and var3 == 1 differ when var1 and var2 are considered together. Is there an R package that runs such a test when there are multiple predictors?

A similar question was posed here, but it has not received any answers.

There appears to be literature on this: example 1, example 2.

But nothing appears to be linked to R.
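To make the question concrete, this is roughly the kind of test I have in mind, sketched in base R. It is not an established method from a package. It uses a simplified quadrant-count statistic, loosely in the spirit of the two-dimensional K-S extensions in the literature, with a permutation p-value obtained by shuffling the var3 labels. The name stat2d, the choice of B = 199 permutations, and the reduced n = 300 (to keep the O(n^2) computation quick) are all my own assumptions.

```r
# Sketch: joint comparison of (var1, var2) between the var3 groups.
# For every observed point taken as an origin, compare the share of each
# group falling in each of the four quadrants around it; the statistic is
# the largest absolute difference. Simplified illustration only.
set.seed(123)
n <- 300   # assumption: smaller than in the question, for speed
var1 <- runif(n, min = 0, max = 0.5)
var2 <- runif(n, min = 0.3, max = 0.7)
var3 <- rbinom(n = n, size = 1, prob = 0.45)

# lx[i, j] is TRUE when point j lies at or to the left of origin i;
# ly[i, j] is TRUE when point j lies at or below origin i.
lx <- outer(var1, var1, ">=")
ly <- outer(var2, var2, ">=")

# Quadrant membership matrices (n x n), converted to 0/1 numeric.
# These depend only on the coordinates, so they are built once.
quads <- list((lx & ly) * 1, (lx & !ly) * 1, ((!lx) & ly) * 1, ((!lx) & (!ly)) * 1)

# stat2d is a hypothetical helper: g is a 0/1 vector of group labels
stat2d <- function(g) {
  # signed weights: +1/n1 for group 1, -1/n0 for group 0
  w <- g / sum(g) - (1 - g) / sum(1 - g)
  # m %*% w gives, per origin, the group-1 share minus the group-0 share
  max(sapply(quads, function(m) max(abs(m %*% w))))
}

obs <- stat2d(var3)

# Permutation null: shuffle the labels, keep the coordinates fixed
B <- 199
perm <- replicate(B, stat2d(sample(var3)))
p_value <- (1 + sum(perm >= obs)) / (B + 1)

print(c(statistic = obs, p_value = p_value))
```

Since var3 is generated independently of var1 and var2 here, I would expect a test like this not to reject, but I do not know whether an existing package implements something along these lines properly.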

