Anyone using triton python backend? #633
Comments
Hello @YooSungHyun, as per triton-inference-server/server#5989 (comment), the issue seems to stem from the fact that
@pacman100 thanks! If you need anything, I will help; please just tag me 🤗
If I use it like this:

```python
model = AutoModelForCausalLM.from_pretrained(
    optional_config["model_path"],
    low_cpu_mem_usage=bool(strtobool(optional_config["low_cpu_mem_usage"])),
    load_in_8bit=bool(strtobool(optional_config["load_in_8bit"])),
    torch_dtype=getattr(torch, optional_config["torch_dtype"], None),
    device_map=device_map,
)
self.model = PeftModel.from_pretrained(
    model,
    optional_config["aa_lora_weights"],
    adapter_name=optional_config["aa_lora_name"],
    device_map=device_map,
)
self.model.load_adapter(
    optional_config["bb_lora_weights"],
    adapter_name=optional_config["bb_lora_name"],
    device_map=device_map,
)
# compile the adapter-wrapped model, not the bare base model
self.model = torch.compile(self.model)
self.model.eval()
```

it works fine.
Great! So, it works!
@pacman100 but this is weird... why is the order important?
torch.compile should always come last, after all other preparation of the model. For example, when using DDP, torch.compile is applied after wrapping the model in DDP.
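The rule above (compile last, after every other model-preparation step) can be sketched as follows. This is a minimal illustration with a stand-in `nn.Linear` model; the helper name `prepare_for_inference` is hypothetical, not from this thread.

```python
import torch
import torch.nn as nn


def prepare_for_inference(model: nn.Module) -> nn.Module:
    """Finish all model preparation first, then compile the result."""
    model.eval()                 # switch off dropout / batch-norm updates first
    return torch.compile(model)  # compile last, over the finalized module


base = nn.Linear(4, 2)                  # stand-in for the real model
compiled = prepare_for_inference(base)
# base.training is now False: eval() took effect before compilation,
# so the compiled graph captures the model in its final (inference) state.
```

The same ordering applies with DDP: wrap the module in `DistributedDataParallel` first, then call `torch.compile` on the wrapped model.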
@pacman100 oh, thanks!
@pacman100 ah... how about the order of model.eval()? Which of these is correct?

```python
model.eval()
model = torch.compile(model)
```

or

```python
model = torch.compile(model)
model.eval()
```
@pacman100 hello?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
System Info
python 3.10
tritonserver:23.05
torch 2.0.1
peft 0.3.0
transformers 4.29.2
Who can help?
No response
Information
Tasks
examples
folder
Reproduction
Check triton-inference-server/server#5989.
I want to use PeftModel in tritonserver:23.05. It does not raise any error, but it doesn't actually work...
Is anyone else using it like me?
Please give me some tips 😭
Expected behavior
set_adapter works correctly.