Hi @MrAE, @jbrowne6 and @falkben
Just pinging the people who seem to have touched these specific lines of code.
I know you no longer maintain this code and have moved on, but I have a quick question about what a specific line is doing. Could you give a quick answer (if you happened to write this part) so I can make sure I'm interpreting it correctly? FYI: I have ported the code to Cython, and once this issue is resolved, I think we can safely move on :)
The function in question:

```cpp
inline void randMatTernary(std::vector<weightedFeature>& featuresToTry){
    int rndMtry;
    int rndFeature;
    int rndWeight;
    int mtryDensity = (int)((double)fpSingleton::getSingleton().returnMtry() * fpSingleton::getSingleton().returnMtryMult());
    for (int i = 0; i < mtryDensity; ++i){
        rndMtry = randNum->gen(fpSingleton::getSingleton().returnMtry());
        rndFeature = randNum->gen(fpSingleton::getSingleton().returnNumFeatures());
        featuresToTry[rndMtry].returnFeatures().push_back(rndFeature);
        rndWeight = (randNum->gen(2)%2) ? 1 : -1;
        assert(rndWeight==1 || rndWeight==-1);
        featuresToTry[rndMtry].returnWeights().push_back(rndWeight);
    }
}
```

Are you sampling the feature index without replacement? It looks like
`rndFeature = randNum->gen(fpSingleton::getSingleton().returnNumFeatures());` generates a random feature index, but can the same index be drawn more than once for the same projection?
For example, say you have data with 4 columns; SPORF might then sample a projection of:
indices = [0, 2, 0]
weights = [1, -1, 1]
Note that this is then no longer a sparse linear combination with only +/-1 weights: feature 0 is drawn twice with weight +1, so it effectively gets weight +2, while feature 2 gets weight -1. Or is this function guaranteed not to produce duplicates when it samples the projection matrix?
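
To make the concern concrete, here is a minimal standalone sketch of how I read the sampling. It is not the SPORF code: `Projection`, `sampleWithReplacement` and `effectiveWeights` are hypothetical names, `std::mt19937` stands in for `randNum`, and I am assuming `randNum->gen(n)` returns a uniform integer in `[0, n)`. It also only models a single projection, whereas the real loop spreads its draws across `mtry` projections via `rndMtry`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <vector>

// Hypothetical stand-in for one weightedFeature: parallel index/weight lists.
struct Projection {
    std::vector<int> indices;
    std::vector<int> weights;
};

// Draws each feature index independently and uniformly from [0, numFeatures),
// i.e. with replacement, and pairs it with a random +1/-1 weight.
// Assumption: this mirrors what randNum->gen(n) does in the original loop.
inline Projection sampleWithReplacement(int numFeatures, int numDraws,
                                        std::mt19937& rng) {
    std::uniform_int_distribution<int> featureDist(0, numFeatures - 1);
    std::bernoulli_distribution signDist(0.5);
    Projection p;
    for (int i = 0; i < numDraws; ++i) {
        p.indices.push_back(featureDist(rng));
        p.weights.push_back(signDist(rng) ? 1 : -1);
    }
    return p;
}

// Sums the weights of repeated indices, giving the coefficient each feature
// actually receives in the linear combination.
inline std::map<int, int> effectiveWeights(const Projection& p) {
    assert(p.indices.size() == p.weights.size());
    std::map<int, int> coeff;
    for (std::size_t i = 0; i < p.indices.size(); ++i) {
        coeff[p.indices[i]] += p.weights[i];
    }
    return coeff;
}
```

With `indices = {0, 2, 0}` and `weights = {1, -1, 1}`, `effectiveWeights` yields a coefficient of +2 for feature 0 and -1 for feature 2. Under the same assumptions, drawing more times than there are features must repeat at least one index, and a repeated index with opposite signs would cancel to a coefficient of 0.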
(The snippet above is from `SPORF/packedForest/src/forestTypes/binnedTree/processingNodeBin.h`, lines 99 to 113 at commit a7a3c7e.)