Alternatives to Restricted Boltzmann Machine for vector data (instead of binary)
I have a very large corpus with each element consisting of a large amount of high dimensional data. Elements are constantly being added to the corpus. Potentially, only a portion of the corpus needs to be considered for each interaction. Elements are labeled, potentially with multiple labels and weights associated with the strength of those labels. The data is not sparse as far as I understand.
The input data is a vector of roughly 10-1000 parameters, each in the range -1...1. This may be somewhat flexible depending on what machine learning method is most appropriate.
I am targeting high end smart phone devices. Ideally the processing could be done on the same device but I'm open to the possibility of transmitting it to a modest server.
What would be an appropriate machine learning approach for this kind of situation?
I've been reading about random forests, restricted Boltzmann machines, deep Boltzmann machines, etc., but I could really use the advice of an experienced hand to direct me towards a few approaches to research that would work well given the conditions.
If my description seems wonky please let me know as I am still getting to grips with the ideas and may be fundamentally misunderstanding some aspect.
Try the simplest approach first: k-nearest neighbors. You can use the Manhattan distance as a cheap distance function (no squaring or square roots), then take a distance-weighted average (for real-valued labels) or a majority vote (for class labels) over the k nearest points.
This is also closely related to kernel regression. For efficient nearest-neighbor queries over a large corpus, store your points in a data structure such as a k-d tree.
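A minimal sketch of the idea in plain Python (brute-force linear scan for clarity; in practice a k-d tree such as `scipy.spatial.cKDTree` would replace the scan; function names here are my own, not a standard API):

```python
import heapq

def manhattan(a, b):
    # L1 distance: sum of absolute coordinate differences (cheap, no sqrt)
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_weighted_average(points, labels, query, k=3):
    """Distance-weighted average of the labels of the k nearest points.

    points: list of feature vectors (values in -1..1, per the question)
    labels: one scalar label per point
    Brute-force O(n) scan; a k-d tree would make lookups sublinear.
    """
    # k smallest (distance, label) pairs via a heap
    nearest = heapq.nsmallest(
        k, ((manhattan(p, query), y) for p, y in zip(points, labels)))
    # an exact match (distance 0) dominates; return its label directly
    for d, y in nearest:
        if d == 0:
            return y
    # inverse-distance weighting: closer neighbors count more
    weights = [1.0 / d for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)
```

For multi-label data with label weights, you would run the same weighted aggregation per label rather than over a single scalar.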