
Reorder data for RNN::Train() and other RNN functions #1192

Merged: 10 commits into mlpack:master on Jan 16, 2018

Conversation

@rcurtin (Member) commented Jan 11, 2018:

This is related to a discussion from long ago, though I am not sure I am finding the right ticket to reference: #513, #831, #541? Anyway, if I am remembering right, there was some discussion about whether to use mat or cube to represent input to neural networks. It's entirely possible my views have changed over time without me realizing it, but it now seems to me that the best way is to have mat be the input to FFNs (i.e., one column per point, one row per dimension), but cube be the input to RNNs (i.e., one column per point, one row per dimension, one slice per time step).

This PR implements that support and adds some documentation to the RNN class about how the data is expected to be formatted. There are also a couple of other warning fixes that I noticed and handled.

One of the important reasons for this change is that internally, at each time step, a layer expects a matrix of shape (n_dims x n_points). But RNN data is currently embedded in a matrix of shape ((n_dims * n_timesteps) x n_points), with a single column containing all of the time steps for a given point. This means that when the batch size is greater than 1 (which it typically will be after #1137), we have to extract a non-contiguous submatrix, which Armadillo will do as a copy. If we reorder our data into a cube where each time step is a separate slice, we avoid that copy. I also think the cube structure is easier for a user to understand, but we can change that around if there are other opinions.

This doesn't solve the problem of how to pass sequences of different length to RNNs---you still must either pass them individually, or zero-pad them all into a cube and pass that. That problem can be solved some other day...
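
For illustration, a minimal sketch of the cube layout described above; the variable names and sizes are made up for this example, and only Armadillo itself is assumed:

#include <armadillo>

int main()
{
  // Illustrative sizes: 3 dimensions, 100 points, 10 time steps.
  const size_t nDims = 3, nPoints = 100, nTimesteps = 10;

  // data(i, j, t) holds dimension i of point j at time step t, so
  // slice t is the (n_dims x n_points) matrix for that time step.
  arma::cube data(nDims, nPoints, nTimesteps, arma::fill::randu);

  // A batch of consecutive points at one time step is a contiguous
  // block of the slice; under the old ((n_dims * n_timesteps) x
  // n_points) layout, the same extraction is non-contiguous and
  // Armadillo must copy it.
  arma::mat batch = data.slice(0).cols(0, 31);

  return 0;
}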

{
// Generate ordering.
arma::uvec ordering = arma::shuffle(arma::linspace<arma::uvec>(0,
inputPoints.n_cols - 1, inputPoints.n_cols));
// std::cout << "ordering:\n" << ordering.t();

Member commented:
Looks like we can remove the debug message here.

@rcurtin (Member Author) replied:
Oops, fixed.

inputPoints.n_cols, true);
LabelsType newOutputLabels(inputLabels.n_elem);
for (size_t i = 0; i < inputLabels.n_elem; ++i)
newOutputLabels[ordering[i]] = inputLabels[i];

Member commented:
Can you use a non-contiguous view here to avoid the for loop? It's probably the same performance-wise.

@rcurtin (Member Author) replied:
Ah, I didn't think of that---fixed.
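
For reference, the suggested non-contiguous view boils down to something like the following self-contained sketch; the variable names mirror the snippet above, and Armadillo's .elem() element view is assumed:

#include <armadillo>

int main()
{
  // Ten labels and a random permutation of their indices.
  arma::vec inputLabels = arma::linspace(0, 9, 10);
  arma::uvec ordering = arma::shuffle(
      arma::linspace<arma::uvec>(0, 9, 10));

  // Scatter the labels into their shuffled positions in one shot,
  // replacing the element-by-element for loop.
  arma::vec newOutputLabels(inputLabels.n_elem);
  newOutputLabels.elem(ordering) = inputLabels;

  return 0;
}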

@@ -631,6 +631,8 @@ BOOST_AUTO_TEST_CASE(SparseShuffleTest)
data(0, i) = i;
labels[i] = i;
}
// This appears to be a necessary workaround for an Armadillo 8 bug.
data *= 1.0;

Member commented:
Interesting; perhaps we should update the matrix build. Maybe we already test against Armadillo 8?

@rcurtin (Member Author) replied:
I'm in the process of updating the matrix build, but it's not ready yet. I haven't isolated the Armadillo bug yet, but I'll either open an issue or send a fix to Conrad.

output.slice(j) = outputSlice;
}

// std::cout << "label:\n" << label << "\noutput:" << output

Member commented:
Looks like another debug output.

for (size_t j = 0; j < output.n_slices; ++j)
{
arma::mat outputSlice = output.slice(j);
data::Binarize(outputSlice, outputSlice, 0.5);

Member commented:
Do you think we should add cube support? It might be helpful for postprocessing the data. If you agree, this could be a good beginner task.

@rcurtin (Member Author) replied:
Yeah, I think it would be a nice beginner task to add. I'll go ahead and write up an issue for it. I think no new code is necessary; the existing Binarize() can just be adapted for this case, and then this code can be simplified.
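
As a rough sketch of what that beginner task might look like, a cube overload could simply apply the existing matrix data::Binarize() slice by slice. The overload below is hypothetical and not part of mlpack; only the matrix version of Binarize() exists at this point:

#include <mlpack/core.hpp>

// Hypothetical cube overload of Binarize(): each slice is handled by
// the existing matrix implementation.
template<typename T>
void Binarize(const arma::Cube<T>& input,
              arma::Cube<T>& output,
              const double threshold)
{
  output.set_size(input.n_rows, input.n_cols, input.n_slices);
  for (size_t s = 0; s < input.n_slices; ++s)
  {
    arma::Mat<T> outputSlice;
    mlpack::data::Binarize(input.slice(s), outputSlice, threshold);
    output.slice(s) = outputSlice;
  }
}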

@zoq (Member) left a review:
Looks ready to me.

@rcurtin merged commit fbc59ee into mlpack:master on Jan 16, 2018.