[BugFix] Gemma loading after quantization or LoRA. (#3553)

Taemin Lee 2024-03-22 05:16:57 +09:00 committed by GitHub
parent c188ecb080
commit b7050ca7df

@@ -340,6 +340,10 @@ class GemmaForCausalLM(nn.Module):
     weight_loader(param, loaded_weight, shard_id)
     break
 else:
+    # lm_head is not used in vllm as it is tied with embed_tokens.
+    # To prevent errors, skip loading lm_head.weight.
+    if "lm_head.weight" in name:
+        continue
     # Skip loading extra bias for GPTQ models.
     if name.endswith(".bias") and name not in params_dict:
         continue
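
For context, a minimal sketch of the failure mode this commit guards against, outside vLLM: Gemma ties lm_head to the input embedding, so the model's parameter dict contains no standalone `lm_head.weight`, yet a checkpoint saved after quantization or LoRA merging may still ship that tensor. Looking it up instead of skipping it raises a KeyError during loading. The `TinyTiedLM` class and `load_weights` helper below are illustrative stand-ins, not vLLM's actual classes.

```python
import torch
import torch.nn as nn


class TinyTiedLM(nn.Module):
    """Toy model with weight tying: lm_head shares embed_tokens' weight,
    so named_parameters() yields no separate 'lm_head.weight' entry."""

    def __init__(self, vocab: int = 8, dim: int = 4):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab, dim)
        self.lm_head = nn.Linear(dim, vocab, bias=False)
        self.lm_head.weight = self.embed_tokens.weight  # tie the weights


def load_weights(model: nn.Module, checkpoint: dict) -> None:
    params_dict = dict(model.named_parameters())
    for name, loaded_weight in checkpoint.items():
        # Tied lm_head: the checkpoint may still carry this tensor, but
        # params_dict has no such key -- skip instead of raising KeyError.
        if "lm_head.weight" in name:
            continue
        # Quantized (e.g. GPTQ) exports can add biases the model lacks.
        if name.endswith(".bias") and name not in params_dict:
            continue
        params_dict[name].data.copy_(loaded_weight)


model = TinyTiedLM()
ckpt = {
    "embed_tokens.weight": torch.zeros(8, 4),
    "lm_head.weight": torch.zeros(8, 4),  # present in the file, absent in the model
}
load_weights(model, ckpt)
```

Removing the `lm_head.weight` skip from this sketch reproduces the KeyError that the patched loader avoids.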