forked from tmikolov/word2vec
tmikolov committed Aug 1, 2013 (1 parent 9f78f72, commit 25b0cc6)
Showing 1 changed file with 12 additions and 0 deletions.
@@ -0,0 +1,12 @@
+make
+if [ ! -e text8 ]; then
+  wget http://mattmahoney.net/dc/text8.zip -O text8.gz
+  gzip -d text8.gz -f
+fi
+echo ----------------------------------------------------------------------------------------------------------------
+echo Note that the accuracy and coverage of the test set questions is going to be low with this small training corpus
+echo To achieve better accuracy, larger training set is needed
+echo ----------------------------------------------------------------------------------------------------------------
+time ./word2phrase -train text8 -output text8-phrase -threshold 500 -debug 2 -min-count 3
+time ./word2vec -train text8-phrase -output vectors-phrase.bin -cbow 0 -size 300 -window 10 -negative 0 -hs 1 -sample 1e-3 -threads 12 -binary 1 -min-count 3
+./compute-accuracy vectors-phrase.bin <questions-phrases.txt
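
The script passes a fixed `-threads 12` to `word2vec`. On a machine with a different core count, one possible adjustment is to derive the value from the number of online CPUs. The following is a minimal sketch under that assumption, not part of the commit: `detect_threads` and `THREADS` are hypothetical names introduced here for illustration, and `getconf _NPROCESSORS_ONLN` is assumed to be available (it is on common Linux and macOS systems, with a fallback of 1 otherwise).

```shell
#!/bin/sh
# Hypothetical helper, not part of the commit: print a thread count
# that could be substituted for the hard-coded "-threads 12".
detect_threads() {
  # Ask the system for the number of online CPUs; stay quiet on failure.
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=""
  case "$n" in
    ''|*[!0-9]*) n=1 ;;   # missing or non-numeric value: fall back to 1
  esac
  [ "$n" -ge 1 ] || n=1   # guard against a reported value of 0
  echo "$n"
}

THREADS=$(detect_threads)
echo "threads: $THREADS"
```

With this in place, the training line could use `-threads "$THREADS"` in place of `-threads 12`; the other flags would stay as they are in the commit.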