Remove causal masks cache #412

EricLBuehler · 2024-06-09T09:56:19Z

No description provided.

github-actions · 2024-06-09T09:57:21Z

Code Metrics Report

===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                    9           21           21            0            0
 Python                 27          995          848           29          118
 TOML                   16          430          390            1           39
-------------------------------------------------------------------------------
 Jupyter Notebooks       1            0            0            0            0
 |- Markdown             1           60           30           22            8
 |- Python               1           96           87            1            8
 (Total)                            156          117           23           16
-------------------------------------------------------------------------------
 Markdown               16         1091            0          809          282
 |- BASH                 5          100           97            0            3
 |- Python               6          122          110            0           12
 |- Rust                 2           80           72            3            5
 (Total)                           1393          279          812          302
-------------------------------------------------------------------------------
 Rust                  109        33248        30125          552         2571
 |- Markdown            55          627           13          581           33
 (Total)                          33875        30138         1133         2604
===============================================================================
 Total                 181        36261        31778         1391         3092
===============================================================================

gregszumel · 2024-06-09T14:49:38Z

Looks good to me; I cannot reproduce the problem on this PR.

For context: I observed that memory was not being fully freed during inference, so a long request could take up a large amount of memory and never release it, even when only shorter requests came through afterwards. This PR fixes that. Thanks, @EricLBuehler!
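A minimal sketch of how a cache of causal masks could produce the memory behaviour described above. This is an assumption about the failure mode, not the project's actual code: `MaskCache`, `causal_mask`, and `retained_elems` are hypothetical names, and a plain `Vec<bool>` stands in for a real tensor.

```rust
use std::collections::HashMap;

/// Build a causal mask for `seq_len` tokens as a flat, row-major
/// `seq_len * seq_len` buffer. `true` marks a future position (j > i)
/// that attention must not look at.
fn causal_mask(seq_len: usize) -> Vec<bool> {
    let mut mask = Vec::with_capacity(seq_len * seq_len);
    for i in 0..seq_len {
        for j in 0..seq_len {
            mask.push(j > i);
        }
    }
    mask
}

/// Hypothetical cache keyed by sequence length. Entries are never evicted,
/// so the O(n^2) mask built for one long request stays allocated for the
/// lifetime of the cache.
struct MaskCache {
    masks: HashMap<usize, Vec<bool>>,
}

impl MaskCache {
    fn new() -> Self {
        Self { masks: HashMap::new() }
    }

    /// Return the cached mask for `seq_len`, building it on first use.
    fn get(&mut self, seq_len: usize) -> &Vec<bool> {
        self.masks
            .entry(seq_len)
            .or_insert_with(|| causal_mask(seq_len))
    }

    /// Total number of mask elements the cache is still holding.
    fn retained_elems(&self) -> usize {
        self.masks.values().map(|m| m.len()).sum()
    }
}
```

In this sketch, calling `cache.get(n)` for a large `n` and then only for small lengths leaves `retained_elems()` dominated by the `n * n` entry, whereas calling `causal_mask` directly lets each mask be dropped as soon as the request finishes.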

EricLBuehler · 2024-06-09T16:31:27Z

@gregszumel thank you for confirming!

* Implement gpt2 gguf tokenizer
* Fix unk tok calculation
* Remove normalizer
* Update gguf tokenizer
* Allow adding unk token when found
* Add unk token to builder if provided.
* Improve add_special_tokens
* Use tokenizerx builder
* Add useful comment

Co-authored-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>

* Bump version to 0.1.16 (#404)
* Bump version to 0.1.17
* Fix version bump
* Add and update template READMEs (#405)
* Add readmes
* Fix typos
* Improve Rust docs (#406)
* Expose phi3v loader and remove unused deps (#408)
* Support format for mixtral where experts are in one tensor (#355)
* Normal loading metadata for vision models (#409)
* Phi 3 vision ISQ support (#410)
* ISQ support for phi3v
* Document it
* Remove causal masks cache (#412)
* Fix: use new slice_assign (#415)
* Use new slice_assign
* Fix dead image links
* Fix Phi-3 GGUF (#414)
* Fix kv head usage
* Fix rope weights
* Clippy
* Work on the gpt2 conversion
* Add comment
* Add some tests
* Update readme

---------

Co-authored-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>

Remove causal masks cache

1decc55

EricLBuehler merged commit dfb9dc5 into master Jun 9, 2024
10 of 11 checks passed

EricLBuehler deleted the rm_causal_masks_cache branch June 9, 2024 16:31

EricLBuehler added a commit that referenced this pull request Jun 10, 2024

Remove causal masks cache (#412)

4756196
