# Using Hugging Face Transformers

Access thousands of open source models through the Hugging Face ecosystem.

## Installation

```bash
pip install transformers torch accelerate
```

## Loading Models

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" places the weights on available devices (requires accelerate)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Move input tensors to the model's device, and unpack the encoding into generate()
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
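Instruct-tuned checkpoints such as Mistral-7B-Instruct expect their own chat prompt format. A minimal sketch using `apply_chat_template` (available in recent transformers releases; the message text is just an example):

```python
# Build the model's expected instruction format from a message list
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(input_ids, max_new_tokens=100)
# Slice off the prompt tokens so only the model's reply is decoded
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```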

## Optimization Techniques

- Use 4-bit quantization with bitsandbytes to cut weight memory roughly 4x
- Enable Flash Attention 2 for faster inference (requires the flash-attn package and a compatible GPU)
- Use model parallelism (e.g. `device_map="auto"`) to shard models too large for a single GPU
- Cache models locally to avoid repeated downloads

A sketch combining these techniques is shown below.
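A minimal sketch, not a definitive configuration: `BitsAndBytesConfig` and `attn_implementation` are part of the transformers API, while the cache directory path here is an arbitrary example (models default to `~/.cache/huggingface`).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "mistralai/Mistral-7B-Instruct-v0.2"

# 4-bit NF4 quantization via bitsandbytes (pip install bitsandbytes)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",  # needs flash-attn and an Ampere+ GPU
    device_map="auto",          # shards layers across available GPUs (model parallelism)
    cache_dir="./model_cache",  # example path for a local cache
)
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir="./model_cache")
```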