Getting Started with Pull-through Caching

This tutorial walks you through setting up a pull-through cache for Hugging Face Hub content. By the end, you'll be able to download models through your Pulp instance with automatic caching.

Prerequisites

  • A running Pulp instance with the pulp_hugging_face plugin installed
  • Access to the Pulp REST API (admin credentials or appropriate permissions)
  • (Optional) A Hugging Face token for accessing private repositories

Create a Hugging Face Remote

First, create a remote that points to Hugging Face Hub. The on_demand policy enables pull-through caching: content is fetched from upstream only when it is first requested.

# Create a remote with pull-through caching enabled
curl -X POST https://your-pulp-instance.com/pulp/api/v3/remotes/hugging_face/hugging-face/ \
  -H "Content-Type: application/json" \
  -u admin:password \
  -d '{
    "name": "hf-remote",
    "url": "https://huggingface.co",
    "policy": "on_demand"
  }'
Example response:

{
  "pulp_href": "/pulp/api/v3/remotes/hugging_face/hugging-face/a1b2c3d4-e5f6-7890-abcd-ef1234567890/",
  "pulp_created": "2024-01-15T10:30:00.000000Z",
  "name": "hf-remote",
  "url": "https://huggingface.co",
  "policy": "on_demand",
  "hf_token": null
}

Remote Configuration Options

  • name: A unique identifier for this remote
  • url: Base URL for Hugging Face Hub (default: https://huggingface.co)
  • policy: Set to on_demand for pull-through caching
  • hf_token: Optional authentication token for private repositories

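The same request can be scripted. A minimal Python sketch, assuming a hypothetical host and credentials (`make_remote_payload` is an illustrative helper, not part of the plugin):

```python
import json

# Hypothetical host and credentials; substitute your own.
PULP_BASE = "https://your-pulp-instance.com"
REMOTES_PATH = "/pulp/api/v3/remotes/hugging_face/hugging-face/"

def make_remote_payload(name, url="https://huggingface.co",
                        policy="on_demand", hf_token=None):
    """Build the JSON body for creating a Hugging Face remote.

    hf_token is omitted entirely when not supplied, matching the
    optional field described above.
    """
    payload = {"name": name, "url": url, "policy": policy}
    if hf_token is not None:
        payload["hf_token"] = hf_token
    return payload

body = json.dumps(make_remote_payload("hf-remote"))
print(body)
# POST it with, e.g.:
#   requests.post(PULP_BASE + REMOTES_PATH, data=body,
#                 headers={"Content-Type": "application/json"},
#                 auth=("admin", "password"))
```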
Create a Distribution

A distribution defines how content is served to clients. When configured with a remote, it enables pull-through caching.

# Get the remote href from the previous step
REMOTE_HREF="/pulp/api/v3/remotes/hugging_face/hugging-face/a1b2c3d4-e5f6-7890-abcd-ef1234567890/"

# Create a distribution with the remote
curl -X POST https://your-pulp-instance.com/pulp/api/v3/distributions/hugging_face/hugging-face/ \
  -H "Content-Type: application/json" \
  -u admin:password \
  -d "{
    \"name\": \"hf-cache\",
    \"base_path\": \"huggingface\",
    \"remote\": \"$REMOTE_HREF\"
  }"
Example response:

{
  "pulp_href": "/pulp/api/v3/distributions/hugging_face/hugging-face/f1e2d3c4-b5a6-7890-abcd-ef0987654321/",
  "pulp_created": "2024-01-15T10:35:00.000000Z",
  "name": "hf-cache",
  "base_path": "huggingface",
  "base_url": "https://your-pulp-instance.com/pulp/content/huggingface/",
  "remote": "/pulp/api/v3/remotes/hugging_face/hugging-face/a1b2c3d4-e5f6-7890-abcd-ef1234567890/"
}
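The base_url in the response is derived from the distribution's base_path. A small sketch of that mapping (hypothetical host; `content_url` is an illustrative helper):

```python
# Pulp serves distribution content under /pulp/content/<base_path>/,
# which is where the base_url field above comes from.
def content_url(host, base_path):
    return f"{host}/pulp/content/{base_path.strip('/')}/"

print(content_url("https://your-pulp-instance.com", "huggingface"))
# → https://your-pulp-instance.com/pulp/content/huggingface/
```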

Access Hugging Face Content

Now you can access Hugging Face content through your Pulp instance. The first request will fetch from Hugging Face Hub; subsequent requests are served from cache.

Download a Model File

# Download a model configuration file
curl -O https://your-pulp-instance.com/pulp/content/huggingface/microsoft/DialoGPT-medium/resolve/main/config.json

The downloaded config.json contains:

{
  "architectures": ["GPT2LMHeadModel"],
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "model_type": "gpt2",
  "n_ctx": 1024,
  "n_embd": 1024,
  "n_head": 16,
  "n_layer": 24,
  "vocab_size": 50257
}
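The download URL follows Hugging Face's `<repo_id>/resolve/<revision>/<filename>` pattern, appended to the distribution's content URL. A sketch of how such URLs compose (`resolve_url` is an illustrative helper, not a plugin API):

```python
def resolve_url(endpoint, repo_id, filename, revision="main"):
    """Compose a pull-through download URL for one file in a repo."""
    return f"{endpoint}/{repo_id}/resolve/{revision}/{filename}"

# Hypothetical Pulp content endpoint
endpoint = "https://your-pulp-instance.com/pulp/content/huggingface"
print(resolve_url(endpoint, "microsoft/DialoGPT-medium", "config.json"))
# → https://your-pulp-instance.com/pulp/content/huggingface/microsoft/DialoGPT-medium/resolve/main/config.json
```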

Query Model Metadata

# Get model metadata via the API proxy
curl https://your-pulp-instance.com/pulp/content/huggingface/api/models/microsoft/DialoGPT-medium

Example response:

{
  "id": "microsoft/DialoGPT-medium",
  "modelId": "microsoft/DialoGPT-medium",
  "author": "microsoft",
  "sha": "8bada3b4...",
  "pipeline_tag": "text-generation",
  "tags": ["pytorch", "gpt2", "text-generation", "conversational"],
  "downloads": 500000,
  "library_name": "transformers"
}

List Repository Files

# List files in a model repository
curl https://your-pulp-instance.com/pulp/content/huggingface/api/models/bert-base-uncased/tree/main

Example response:

[
  {"path": "config.json", "type": "file", "size": 570},
  {"path": "pytorch_model.bin", "type": "file", "size": 440473133},
  {"path": "tokenizer.json", "type": "file", "size": 466062},
  {"path": "tokenizer_config.json", "type": "file", "size": 28},
  {"path": "vocab.txt", "type": "file", "size": 231508}
]
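One use of the tree listing is estimating how much data the first, cache-filling request will pull. Using the sample listing above:

```python
# Sample tree listing from above
tree = [
    {"path": "config.json", "type": "file", "size": 570},
    {"path": "pytorch_model.bin", "type": "file", "size": 440473133},
    {"path": "tokenizer.json", "type": "file", "size": 466062},
    {"path": "tokenizer_config.json", "type": "file", "size": 28},
    {"path": "vocab.txt", "type": "file", "size": 231508},
]

# Total bytes fetched on a cold cache (only "file" entries carry a size)
total = sum(entry["size"] for entry in tree if entry["type"] == "file")
print(f"{total / 1024 ** 2:.1f} MiB")  # → 420.7 MiB
```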

Using with Hugging Face CLI

You can point huggingface-cli (and libraries that honor HF_ENDPOINT, such as transformers) at your Pulp instance:

# Set the HF endpoint to your Pulp distribution
export HF_ENDPOINT="https://your-pulp-instance.com/pulp/content/huggingface"

# Download a model using the HF CLI
huggingface-cli download hf-internal-testing/tiny-random-bert

# Or use it with the transformers library
python -c "
from transformers import AutoModel
model = AutoModel.from_pretrained('bert-base-uncased')
"
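The endpoint can also be set from Python itself. Note that huggingface_hub reads HF_ENDPOINT when it is imported, so set the variable before any import (host is hypothetical):

```python
import os

# Must be set before importing huggingface_hub / transformers,
# since the endpoint is read at import time.
os.environ["HF_ENDPOINT"] = "https://your-pulp-instance.com/pulp/content/huggingface"

# With the endpoint in place, downloads route through the Pulp cache:
#   from huggingface_hub import hf_hub_download
#   path = hf_hub_download("bert-base-uncased", "config.json")
```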

Accessing Private Repositories

For private Hugging Face repositories, create a remote with your HF token:

curl -X POST https://your-pulp-instance.com/pulp/api/v3/remotes/hugging_face/hugging-face/ \
  -H "Content-Type: application/json" \
  -u admin:password \
  -d '{
    "name": "hf-private",
    "url": "https://huggingface.co",
    "policy": "on_demand",
    "hf_token": "hf_YOUR_TOKEN_HERE"
  }'

Security Note

Store your Hugging Face tokens securely. Consider using environment variables or a secrets manager rather than hardcoding tokens in scripts.
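For example, the private-remote request above can read the token from an environment variable instead of embedding it in the script (the token value shown is a placeholder):

```shell
# Placeholder token; in practice, export this from your secrets manager
export HF_TOKEN="hf_example_token"

# Build the request body with the token substituted at runtime
BODY=$(printf '{"name": "hf-private", "url": "https://huggingface.co", "policy": "on_demand", "hf_token": "%s"}' "$HF_TOKEN")

# Then POST it, e.g.:
#   curl -X POST https://your-pulp-instance.com/pulp/api/v3/remotes/hugging_face/hugging-face/ \
#     -H "Content-Type: application/json" -u admin:password -d "$BODY"
echo "$BODY"
```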

Next Steps