Right now, tools/preprocess_data.py requires nltk, torch, transformer_engine, and apex.
Installing transformer_engine does not work out of the box; I had to install it manually (on an A100).
Installing apex has similar problems when following https://github.com/NVIDIA/apex?tab=readme-ov-file#linux
Given that the repo does not include sample idx and bin files, one would expect the preprocess_data process to be relatively simple. Could this process be simplified?
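To make the dependency problem concrete, here is a minimal sketch that reports which of the four modules named above cannot be found in the current environment. The helper name `missing_modules` is hypothetical and not part of the repo. The check only locates the modules; it does not confirm that they import cleanly or that their CUDA extensions were built.

```python
import importlib.util

# Modules this issue reports as required by tools/preprocess_data.py.
REQUIRED = ["nltk", "torch", "transformer_engine", "apex"]


def missing_modules(names):
    """Return the names that cannot be located in the current environment.

    Hypothetical helper for illustration; it only looks the modules up
    and never imports them.
    """
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially installed packages.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_modules(REQUIRED))
```

Running this before the preprocessing script would at least show which installs are still outstanding.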
....
....
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Emitting ninja build file /home/megauser/apex/build/temp.linux-x86_64-cpython-310/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF /home/megauser/apex/build/temp.linux-x86_64-cpython-310/csrc/mlp.o.d -pthread -B /home/megauser/.conda/envs/pre/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/megauser/.conda/envs/pre/include -fPIC -O2 -isystem /home/megauser/.conda/envs/pre/include -fPIC -I/home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include -I/home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/TH -I/home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/megauser/.conda/envs/pre/include/python3.10 -c -c /home/megauser/apex/csrc/mlp.cpp -o /home/megauser/apex/build/temp.linux-x86_64-cpython-310/csrc/mlp.o -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=mlp_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17
/home/megauser/apex/csrc/mlp.cpp: In function ‘std::vector<at::Tensor> mlp_forward(int, int, std::vector<at::Tensor>)’:
/home/megauser/apex/csrc/mlp.cpp:57:21: warning: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Wsign-compare]
57 | for (int i = 0; i < num_layers; i++) {
| ~~^~~~~~~~~~~~
/home/megauser/apex/csrc/mlp.cpp:64:77: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
64 | auto out = at::empty({batch_size, output_features.back()}, inputs[0].type());
| ^
In file included from /home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/ATen/core/Tensor.h:3,
from /home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/ATen/Tensor.h:3,
from /home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/function_hook.h:3,
from /home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/cpp_hook.h:2,
from /home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/variable.h:6,
from /home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/torch/csrc/autograd/autograd.h:3,
from /home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/autograd.h:3,
from /home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:7,
from /home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/torch/extension.h:5,
from /home/megauser/apex/csrc/mlp.cpp:1:
/home/megauser/.conda/envs/pre/lib/python3.10/site-packages/torch/include/ATen/core/TensorBody.h:225:30: note: declared here
225 | DeprecatedTypeProperties & type() const {
| ^~~~
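The warning near the top of this output suggests setting os.environ['TORCH_CUDA_ARCH_LIST']. As a sketch only: an A100 has CUDA compute capability 8.0, so restricting the build to that architecture might look like the following. The helper name `set_arch_list` is hypothetical, and the variable has to be set in the process that launches the build (or one of its parents) to have any effect. The lines shown above are warnings rather than errors, so this may not be what makes the install fail.

```python
import os


def set_arch_list(arch="8.0", env=os.environ):
    """Set TORCH_CUDA_ARCH_LIST unless a value is already present.

    Hypothetical helper. "8.0" is the compute capability of an A100;
    use the value that matches your own GPU.
    """
    env.setdefault("TORCH_CUDA_ARCH_LIST", arch)
    return env["TORCH_CUDA_ARCH_LIST"]


if __name__ == "__main__":
    # Affects this process and any child process it starts afterwards.
    print("TORCH_CUDA_ARCH_LIST =", set_arch_list())
```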
Your question
Can tools/preprocess_data.py be simplified?
Installing apex gives the build output shown above.