[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build(node): Include binaries in NPM packing #1459

Closed
wants to merge 2 commits into from

Conversation

aaronclong
Copy link
@aaronclong aaronclong commented Feb 25, 2024

https://docs.npmjs.com/cli/v10/configuring-npm/package-json#files
https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/
https://github.com/actions-rs/toolchain
#1403

Background


The native modules don't seem packed and published with the node build. There seem to be missing entries in the package.json file. This PR adds all the permutations from the index.js file. However, this could be cleaned up with future glob patterns, "glob pattern (, **/, and such) will make it so that file is included in the tarball when it's packed".

NPM Pack Results

Using the npm pack command, I could confirm the binaries are packed when running the build locally. Github doesn't allow the sharing of .tgz files in PRs. However, the CLI output has been included below.

PR's Pack Output

npm notice 
npm notice 📦  tokenizers@0.14.0-dev0
npm notice === Tarball Contents === 
npm notice 1.1kB  LICENSE                      
npm notice 2.0kB  README.md                    
npm notice 9.7kB  index.d.ts                   
npm notice 10.9kB index.js                     
npm notice 3.9kB  package.json                 
npm notice 7.7MB  tokenizers.linux-x64-gnu.node
npm notice === Tarball Details === 
npm notice name:          tokenizers                              
npm notice version:       0.14.0-dev0                             
npm notice filename:      tokenizers-0.14.0-dev0.tgz              
npm notice package size:  2.6 MB                                  
npm notice unpacked size: 7.7 MB                                  
npm notice shasum:        bba97857358d036d155ddb44aa1ddd3cd5629c6f
npm notice integrity:     sha512-sEodSqm4VaQXp[...]SoeVMMXnoT83Q==
npm notice total files:   6                                       
npm notice 
tokenizers-0.14.0-dev0.tgz

Main's Pack Output

npm notice 
npm notice 📦  tokenizers@0.14.0-dev0
npm notice === Tarball Contents === 
npm notice 1.1kB  LICENSE     
npm notice 2.0kB  README.md   
npm notice 9.7kB  index.d.ts  
npm notice 10.9kB index.js    
npm notice 3.0kB  package.json
npm notice === Tarball Details === 
npm notice name:          tokenizers                              
npm notice version:       0.14.0-dev0                             
npm notice filename:      tokenizers-0.14.0-dev0.tgz              
npm notice package size:  6.8 kB                                  
npm notice unpacked size: 26.7 kB                                 
npm notice shasum:        c54d1095caf5cb8681d9d06fe0bbe92176de3377
npm notice integrity:     sha512-jiDw2etLo5N9X[...]Bixc8Kgp/1TnA==
npm notice total files:   5                                       
npm notice 
tokenizers-0.14.0-dev0.tgz

Github Action Changes

In addition to all the above, I notice most of your actions for the release are deprecated. Github is switching to node20 for actions, and the legacy version might not fully support it.

One thing I couldn't update was the rust init step. The action, actions-rs/toolchain, being used is deprecated and archived.

@aaronclong
Copy link
Author

@ArthurZucker could I be annoying and ask you to review this, please and thank you? I totally understand if this isn't a priority right now though.

@aaronclong
Copy link
Author
aaronclong commented Feb 28, 2024

@Narsil I see you are also an active maintainer in this repository. Could I bother you to take a look at this PR?

I'm sorry if I am being pushy at all, but would love to fix this problem for my own uses-cases

@anupamamurthi
Copy link

Wow, I seem to be running into the exact same issue -

 Cannot find module 'tokenizers-linux-x64-gnu' from 'index.js'
      
      at Resolver.resolveModule (node_modules/jest-resolve/build/index.js:169:17)
      at Object.<anonymous> (node_modules/tokenizers/index.js:178:31)

Can someone help review this PR and merge it?

@aaronclong
Copy link
Author

@anupamamurthi I'm going to give it a few days, but I might publish a fork until the maintainers start reviewing PRs again.

I'm wonder if this repo is being actively maintained. I've tried several avenues to get someone to review it and the others pending. If you know of anything @anupamamurthi, I'm all ears.

@anupamamurthi
Copy link

hey @aaronclong let me know if u publish a fork, I will be happy to give your node package a try....

@anupamamurthi
Copy link

@aaronclong do u have a node package ready/handy?

@aaronclong
Copy link
Author

@aaronclong do u have a node package ready/handy?

@anupamamurthi I do not at the moment. I was planning to try and work on that this weekend when time allows

@aaronclong
Copy link
Author

Giving this one more shot, @ArthurZucker or @Narsil would it be possible to review this PR? I notice another PR was merged only two days ago. I'm very open to any and all feedback. I just personally want to resolve this issue for my own needs. Thanks again and please let me know.

@loretoparisi
Copy link
loretoparisi commented Mar 15, 2024

Getting the same error on both darwin and linux:

Error: Cannot find module 'tokenizers-darwin-arm64'
Require stack:

@Narsil any update on this issue? 🙏🏾

@aaronclong
Copy link
Author

@anupamamurthi and @loretoparisi it still requires a lot of work, but I was able to publish a fork!!!

As time allows, I'm going to clean up the node code here. My future plan is to pull the tokenizer rust from cargo, and do the napi-rust binding in this repo. Likewise, I would love to try and do a WASM build.

However, you are welcoming to use it as it stands now. I can't speak to how functional it is because just focused on the build fixes.

https://github.com/turingscript/tokenizers
https://www.npmjs.com/package/@turingscript/tokenizers

@github-actions github-actions bot added the Stale label Apr 17, 2024
@github-actions github-actions bot closed this Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants