R - Scatter plot for a multiclass dataset with class imbalance and class overlapping


I'm using Weka to develop a classifier for detecting semantic relations. Let's suppose I have a multiclass dataset. The dataset, at first, contains 4 numeric features (there could be more than 4) and a class attribute. The valid class attribute values are "hypernym", "synonym" or "no", i.e., 3 classes. So, some examples of instances would be:

    feat1   feat2   feat3   feat4   class
    ...
    0.32    0.45    0.15      5     no
    0.26    0.48    0.93     20     hyper
    0.65    0.32    0.43     13     no
    0.43    0.19    0.89     45     syn
    ...

This is a typical classification problem. However, I must consider that the dataset is affected by the class imbalance problem (a problem in machine learning where the total number of examples of one class (positive) is far smaller than the total number of examples of another class (negative)) and by class overlapping (examples of different classes have similar characteristics).

The question is: how can I represent each instance in a 2D graph, in a way that lets me visualize the degree of overlap between the classes?

I have found a picture that illustrates a possible example of such a graph, a scatter plot. However, I don't know how to plot this.

Is there an easy way to make a similar figure, in R or using Weka?

You can use multidimensional scaling (MDS) first to reduce the dimension of the data, and then plot it. This method tries to preserve the distances between points when projecting them to a lower dimension.

Here is an example in R with the iris dataset:

    data <- iris
    colors <- as.integer(as.factor(data$Species))   # one color per class
    d <- dist(data[, 1:4])                           # Euclidean distances on the 4 features
    fit <- cmdscale(d, k = 2)                        # k is the resulting dimension
    x <- fit[, 1]
    y <- fit[, 2]
    plot(x, y, xlab = "Coordinate 1", ylab = "Coordinate 2",
         main = "MDS", pch = 19, col = colors)


Or reduce it to 3 dimensions and plot it using the scatterplot3d library:

    library(scatterplot3d)                           # install.packages("scatterplot3d") if needed
    fit <- cmdscale(d, k = 3)                        # k is the resulting dimension
    x <- fit[, 1]
    y <- fit[, 2]
    z <- fit[, 3]
    scatterplot3d(x, y, z, color = colors, pch = 19)
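For data shaped like the table in the question, feat4 is on a much larger scale than the other features, so unscaled Euclidean distances would be dominated by it. Below is a minimal sketch that standardizes the features before MDS. The data frame `relations`, its random values, and the class sizes are made up for illustration and are not the asker's real data.

```r
# Hypothetical stand-in for the question's dataset (random values, assumed class sizes)
set.seed(42)
n <- c(no = 80, hyper = 15, syn = 5)
relations <- data.frame(
  feat1 = runif(sum(n)),
  feat2 = runif(sum(n)),
  feat3 = runif(sum(n)),
  feat4 = sample(1:50, sum(n), replace = TRUE),
  class = factor(rep(names(n), times = n), levels = names(n))
)

# Standardize each feature (mean 0, sd 1) so feat4 does not dominate the distances
scaled <- scale(relations[, c("feat1", "feat2", "feat3", "feat4")])

# Classical MDS down to 2 dimensions, then plot with one color per class
fit <- cmdscale(dist(scaled), k = 2)
plot(fit[, 1], fit[, 2], xlab = "Coordinate 1", ylab = "Coordinate 2",
     main = "MDS on standardized features", pch = 19,
     col = as.integer(relations$class))
legend("topright", legend = levels(relations$class),
       col = seq_along(levels(relations$class)), pch = 19)
```

Whether standardizing is appropriate depends on what the features mean. If the raw scales carry information, the unscaled distances may be the better choice.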


As for the class imbalance problem, I don't know how to represent it in a scatter plot. Maybe by increasing the size of the points of the minority classes.
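The point-size idea could be tried roughly as follows. This is only a sketch: it creates an artificially imbalanced subset of iris (the row selection is arbitrary), and the size formula `1 + log(ratio)` is just one assumed way to enlarge the rarer classes without making them huge.

```r
# Artificially imbalanced subset of iris: 50 setosa, 10 versicolor, 5 virginica
data <- iris[c(1:50, 51:60, 101:105), ]
class_counts <- table(data$Species)

# Ratio of the largest class size to each point's own class size
ratio <- as.numeric(max(class_counts) / class_counts[as.character(data$Species)])

# Majority class gets cex = 1; rarer classes get larger points (log dampens the growth)
point_size <- 1 + log(ratio)

fit <- cmdscale(dist(data[, 1:4]), k = 2)
plot(fit[, 1], fit[, 2], xlab = "Coordinate 1", ylab = "Coordinate 2",
     main = "MDS, point size by class rarity", pch = 19,
     col = as.integer(data$Species), cex = point_size)

# Showing the class counts in the legend makes the imbalance explicit as well
legend("topright",
       legend = paste0(names(class_counts), " (n = ", class_counts, ")"),
       col = seq_along(class_counts), pch = 19)
```

Larger points can hide neighbouring points in the overlap regions, so it may be worth comparing this with a plot that uses a uniform point size.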

