
Reorder data for RNN::Train() and other RNN functions #1192

Merged: 10 commits into mlpack:master on Jan 16, 2018

Conversation

@rcurtin (Member) commented Jan 11, 2018:

This is related to a discussion from long ago, though I am not sure I am finding the right ticket to reference: #513, #831, #541? Anyway, if I am remembering right, there was some discussion about whether to use mat or cube to represent input to neural networks. It's entirely possible my views have changed over time without me realizing it, but it now seems to me that the best way is to have mat be the input to FFNs (i.e., one column per point, one row per dimension), but cube be the input to RNNs (i.e., one column per point, one row per dimension, one slice per time step).

This PR implements that support and adds some documentation to the RNN class about how the data is expected to be formatted. There are also a couple of other warning fixes that I noticed and handled.

One of the important reasons for this change is that internally, at each time step, a layer expects a matrix of shape (n_dims x n_points). But RNN data is currently embedded in a matrix of shape ((n_dims * n_timesteps) x n_points), with a single column containing all of the time steps for a given point. This means that when the batch size is greater than 1 (which it typically will be after #1137), we have to extract a non-contiguous submatrix, which Armadillo will do as a copy. If we reorder our data into a cube where each time step is a separate slice, we avoid that copy. I also think the cube structure is easier for a user to understand, but we can change that around if there are other opinions.

This doesn't solve the problem of how to pass sequences of different length to RNNs---you still must either pass them individually, or zero-pad them all into a cube and pass that. That problem can be solved some other day...
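
For illustration, a minimal sketch of the cube layout described above; the variable names and sizes are made up for this example, and only Armadillo itself is assumed:

#include <armadillo>

int main()
{
  // Illustrative sizes: 3 dimensions, 100 points, 10 time steps.
  const size_t nDims = 3, nPoints = 100, nTimesteps = 10;

  // data(i, j, t) holds dimension i of point j at time step t, so
  // slice t is the (n_dims x n_points) matrix for that time step.
  arma::cube data(nDims, nPoints, nTimesteps, arma::fill::randu);

  // A batch of consecutive points at one time step is a contiguous
  // block of the slice; under the old ((n_dims * n_timesteps) x
  // n_points) layout, the same extraction is non-contiguous and
  // Armadillo must copy it.
  arma::mat batch = data.slice(0).cols(0, 31);

  return 0;
}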

{
// Generate ordering.
arma::uvec ordering = arma::shuffle(arma::linspace<arma::uvec>(0,
inputPoints.n_cols - 1, inputPoints.n_cols));
// std::cout << "ordering:\n" << ordering.t();

Member commented:
Looks like we can remove the debug message here.

@rcurtin (Member Author) replied:
Oops, fixed.

inputPoints.n_cols, true);
LabelsType newOutputLabels(inputLabels.n_elem);
for (size_t i = 0; i < inputLabels.n_elem; ++i)
newOutputLabels[ordering[i]] = inputLabels[i];

Member commented:
Can you use a non-contiguous view here to avoid the for loop? It's probably the same performance-wise.

@rcurtin (Member Author) replied:
Ah, I didn't think of that---fixed.
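
For reference, the suggested non-contiguous view boils down to something like the following self-contained sketch; the variable names mirror the snippet above, and Armadillo's .elem() element view is assumed:

#include <armadillo>

int main()
{
  // Ten labels and a random permutation of their indices.
  arma::vec inputLabels = arma::linspace(0, 9, 10);
  arma::uvec ordering = arma::shuffle(
      arma::linspace<arma::uvec>(0, 9, 10));

  // Scatter the labels into their shuffled positions in one shot,
  // replacing the element-by-element for loop.
  arma::vec newOutputLabels(inputLabels.n_elem);
  newOutputLabels.elem(ordering) = inputLabels;

  return 0;
}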

@@ -631,6 +631,8 @@ BOOST_AUTO_TEST_CASE(SparseShuffleTest)
data(0, i) = i;
labels[i] = i;
}
// This appears to be a necessary workaround for an Armadillo 8 bug.
data *= 1.0;

Member commented:
Interesting; perhaps we should update the matrix build. Maybe we already test against Armadillo 8?

@rcurtin (Member Author) replied:
I'm in the process of updating the matrix build, but it's not ready yet. I haven't isolated the Armadillo bug yet, but I'll either open an issue or send a fix to Conrad.

output.slice(j) = outputSlice;
}

// std::cout << "label:\n" << label << "\noutput:" << output

Member commented:
Looks like another debug output.

for (size_t j = 0; j < output.n_slices; ++j)
{
arma::mat outputSlice = output.slice(j);
data::Binarize(outputSlice, outputSlice, 0.5);

Member commented:
Do you think we should add cube support? It might be helpful for postprocessing the data. If you agree, this could be a good beginner task.

@rcurtin (Member Author) replied:
Yeah, I think it would be a nice beginner task to add. I'll go ahead and write up an issue for it. I think no new code is necessary; the existing Binarize() can just be adapted for this case, and then this code can be simplified.
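
As a rough sketch of what that beginner task might look like, a cube overload could simply apply the existing matrix data::Binarize() slice by slice. The overload below is hypothetical and not part of mlpack; only the matrix version of Binarize() exists at this point:

#include <mlpack/core.hpp>

// Hypothetical cube overload of Binarize(): each slice is handled by
// the existing matrix implementation.
template<typename T>
void Binarize(const arma::Cube<T>& input,
              arma::Cube<T>& output,
              const double threshold)
{
  output.set_size(input.n_rows, input.n_cols, input.n_slices);
  for (size_t s = 0; s < input.n_slices; ++s)
  {
    arma::Mat<T> outputSlice;
    mlpack::data::Binarize(input.slice(s), outputSlice, threshold);
    output.slice(s) = outputSlice;
  }
}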

@zoq (Member) left a review:
Looks ready to me.

@rcurtin merged commit fbc59ee into mlpack:master on Jan 16, 2018.