Reorder data for RNN::Train() and other RNN functions #1192
Conversation
Also handle the case where the shuffle is being done in-place.
{
  // Generate ordering.
  arma::uvec ordering = arma::shuffle(arma::linspace<arma::uvec>(0,
      inputPoints.n_cols - 1, inputPoints.n_cols));
  // std::cout << "ordering:\n" << ordering.t();
Looks like we can remove the debug message here.
Oops, fixed.
    inputPoints.n_cols, true);
LabelsType newOutputLabels(inputLabels.n_elem);
for (size_t i = 0; i < inputLabels.n_elem; ++i)
  newOutputLabels[ordering[i]] = inputLabels[i];
Can you use a non-contiguous view here, to avoid the for loop? It's probably the same.
Ah, I didn't think of that---fixed.
@@ -631,6 +631,8 @@ BOOST_AUTO_TEST_CASE(SparseShuffleTest)
    data(0, i) = i;
    labels[i] = i;
  }
  // This appears to be a necessary workaround for an Armadillo 8 bug.
  data *= 1.0;
Interesting; perhaps we should update the matrix build. Maybe we already test against Armadillo 8?
I'm in the process of updating the matrix build, but it's not ready yet. I haven't isolated the Armadillo bug yet, but I'll either open an issue or send a fix to Conrad.
  output.slice(j) = outputSlice;
}

// std::cout << "label:\n" << label << "\noutput:" << output
Looks like another debug output.
for (size_t j = 0; j < output.n_slices; ++j)
{
  arma::mat outputSlice = output.slice(j);
  data::Binarize(outputSlice, outputSlice, 0.5);
Do you think we should add `cube` support? It might be helpful for postprocessing the data. If you agree, this could be a good beginner task.
Yeah, I think it would be a nice beginner task to add. I'll go ahead and write up an issue for it. I think no new code is necessary; the existing `Binarize()` can be adapted for this case, and then this code can be simplified.
Looks ready to me.
This is related to a discussion from long ago, but I am not sure I am finding the right ticket to reference: #513, #831, #541? Anyway, if I am remembering right, there was some discussion about whether to use `mat` or `cube` to represent input to neural networks. It's completely possible my views have changed over time without my realizing it, but to me it now seems the best way is to have `mat` be the input to FFNs (i.e., one column per point, one row per dimension), but `cube` be the input to RNNs (i.e., one column per point, one row per dimension, one slice per time step). This PR implements that support and adds some documentation to the `RNN` class about how the data is expected to be formatted. There are also a couple of other warning fixes that I noticed and handled.

One of the important reasons for making this change is that internally, at each time step, a layer expects a matrix of shape (n_dims x n_points). But RNN data is currently embedded in a matrix of shape ((n_dims * n_timesteps) x n_points), with a single column containing all of the time steps for a given point. This means that when the batch size is greater than 1 (which it typically will be after #1137), we have to extract a non-contiguous submatrix, which Armadillo will do as a copy. If we reorder our data into a cube where each time step is a separate slice, we can avoid that copy. I also think the cube structure is easier for a user to understand, but we can change that around if there are other opinions.
This doesn't solve the problem of how to pass sequences of different lengths to RNNs---you still must either pass them individually, or zero-pad them all into a cube and pass that. That problem can be solved some other day...