I need to build an SVM in Weka to filter documents using Java
I am an absolute beginner. I have never made a classifier or anything else in Weka using Java, although I have used the GUI before. Basically, I am kind of lost. I've looked at the Filter class in Weka and played around with it a little. My documents are text documents, and I need to separate them into 2 categories.
I'm not sure how to define the categories or how to load the documents into my Java program to be classified.
:-(
Any help/tutorials or pointers would be greatly appreciated.
I found this Java tutorial very helpful, although I have found very few resources available online:
http://www.cs.waikato.ac.nz/ml/weka/index_documentation.html
Hope this helps.
Using Weka for the first time is a pain, but you will need to go through it.
Also, I tried out Weka, but I had to drop it because of JVM out-of-memory exceptions. I wrote my own small clustering algorithm in Ruby, and its performance was much better on my data.
Anyway, here is how to use an SVM in Weka:
You can follow this tutorial on how to use an SVM in Weka: www.stat.nctu.edu.tw/~misg/WekaInC.ppt
Now, you will need your data in ARFF format. I recommend it because, in my experience, the data looks more structured from Weka's perspective. You can produce it with XML2ARFF-Converter, which I wrote for myself. You can modify it to read text files and convert them to ARFF.
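For the two-category case in the question, here is a minimal sketch of what such a conversion could look like in plain Java. It is not the XML2ARFF-Converter. The class name ArffSketch, the method toArff, the relation name and the labels relevant/irrelevant are all hypothetical names chosen for illustration. It assumes each document is already in memory as a String and that one string attribute plus one nominal class attribute is the layout you want.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Weka): builds ARFF text for labelled documents.
final class ArffSketch {

    // Wrap a document in single quotes, escaping characters that would break the ARFF row.
    static String quote(String text) {
        String escaped = text
                .replace("\\", "\\\\")   // escape backslashes first
                .replace("'", "\\'")     // then single quotes
                .replace("\r", " ")      // keep each document on one line
                .replace("\n", " ");
        return "'" + escaped + "'";
    }

    // rows[i][0] is the document text, rows[i][1] is its category label.
    static String toArff(String relation, String[] labels, String[][] rows) {
        List<String> allowed = Arrays.asList(labels);
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        sb.append("@attribute text string\n");
        // The categories are defined here, as the values of a nominal attribute.
        sb.append("@attribute class {").append(String.join(",", labels)).append("}\n\n");
        sb.append("@data\n");
        for (String[] row : rows) {
            if (!allowed.contains(row[1])) {
                throw new IllegalArgumentException("Unknown label: " + row[1]);
            }
            sb.append(quote(row[0])).append(',').append(row[1]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"Quarterly report on sensor data", "relevant"},
            {"Lunch menu for Friday", "irrelevant"}
        };
        System.out.print(toArff("documents", new String[] {"relevant", "irrelevant"}, rows));
    }
}
```

To use it on real files, you would read each file into a String (for example with java.nio.file.Files.readString on Java 11 or later), pair it with its label, and write the result to a .arff file. Weka still needs the text column turned into numeric features before an SVM can train on it. As far as I know the StringToWordVector filter is the usual tool for that step, but check the Weka documentation for the details.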