I have this function, which loads either a GGUF or a GPTQ model.
from pathlib import Path

from ctransformers import AutoModelForCausalLM  # assuming ctransformers, given the kwargs below

# Load the LLM: prefer the GGUF file if present, otherwise fall back to the GPTQ folder
def load_llm(temperature: float = 0.8,
             top_p: float = 0.95,
             top_k: int = 40,
             max_new_tokens: int = 1000,
             context_length: int = 6000,
             repetition_penalty: float = 1.1):
    model_dir = Path.cwd() / "model"
    model_name_gguf = "openhermes-2.5-mistral-7b.Q4_K_M.gguf"
    model_name_gptq = "Mistral-7B-Instruct-v0.2-DARE-GPTQ"
    # Check if the GGUF model (a single file) exists
    if (model_dir / model_name_gguf).is_file():
        model_name = model_name_gguf
        model_path = model_dir / model_name
    # If not, check if the GPTQ model (a directory) exists;
    # note is_dir() here, since a file check is always False for a folder
    elif (model_dir / model_name_gptq).is_dir():
        model_name = model_name_gptq
        model_path = model_dir / model_name
        print(model_path)
    else:
        raise FileNotFoundError("No valid model found in the model directory")
    model = AutoModelForCausalLM.from_pretrained(
        str(model_path),
        model_type="mistral",
        gpu_layers=50,
        temperature=temperature,  # default is 0.8
        top_p=top_p,
        top_k=top_k,  # default is 40
        max_new_tokens=max_new_tokens,
        context_length=context_length,
        repetition_penalty=repetition_penalty)
    print(f"Using loaded model: {model_name}")
    return model
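For reference, here is roughly how I call the function (the values here are just an example):

# The GGUF branch loads and generates fine; the GPTQ branch is what fails
llm = load_llm(temperature=0.7, max_new_tokens=500)
print(llm("Write one sentence about autumn."))  # the loaded ctransformers model is callable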
The GPTQ model files are in a folder called Mistral-7B-Instruct-v0.2-DARE-GPTQ, whereas the GGUF model is just a single file, openhermes-2.5-mistral-7b.Q4_K_M.gguf. I get an error when I try to load the GPTQ model, and I suspect it is because I'm referring to a folder rather than a model file (see the snippet after the file listing). The GPTQ folder contains:
config.json
model.safetensors
quantize_config.json
special_tokens_map.json
tokenizer_config.json
tokenizer.json
tokenizer.model
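To illustrate the folder-versus-file distinction, here is a quick sanity check I would expect to hold (assuming the paths above, relative to the working directory):

from pathlib import Path

model_dir = Path.cwd() / "model"
# The GGUF model is a single file, so is_file() is True
print((model_dir / "openhermes-2.5-mistral-7b.Q4_K_M.gguf").is_file())  # True
# The GPTQ model is a directory: is_file() is False, is_dir() is True.
# This is why an os.path.isfile() check on the GPTQ path never matched.
print((model_dir / "Mistral-7B-Instruct-v0.2-DARE-GPTQ").is_file())     # False
print((model_dir / "Mistral-7B-Instruct-v0.2-DARE-GPTQ").is_dir())      # True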
Does anyone know how to refer to the GPTQ model correctly? Thanks in advance!