# chinchilla
chinchilla is a research toolkit designed to estimate scaling laws and train compute-optimal models for various deep learning tasks.



**Expected Use Cases:**

- Researching the neural scaling law itself
- Scaling compute for:
  - Large Language Models (LLMs)
  - Vision Transformers (ViTs)
  - Reinforcement Learning (RL)
  - Embedding models
  - Knowledge distillation
- Evaluating the compute efficiency of new algorithms & architectures
**Probably Not For:**

- Fine-tuning tasks
- Data-scarce domains

> [!IMPORTANT]
> This work builds upon the scaling law formulation proposed in the original Chinchilla paper by DeepMind (2022),
> with some modifications detailed in ./docs/changes.md.

## Features

- **Scaling Law Estimation**: Fit a loss predictor based on multiple training runs.
- **Compute-Optimal Allocation**: Train the best possible model within a given compute budget.
- **Progressive Scaling**: Iteratively update the scaling law estimate and scale up the compute.
- **Simulation Mode**: Test scaling law estimations in hypothetical scenarios.

## Basics

### Definitions

- **N**: the number of parameters
- **D**: the number of data samples
- **C**: total compute in FLOPs (C ≈ 6ND)
- **L(N, D) = E + A / N^α + B / D^β**: a loss predictor parameterized by E, A, B, α, β
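
These definitions translate directly into code. The following is an illustrative transcription only, not part of chinchilla's API:

```python
# Illustrative transcription of the definitions above; not chinchilla's API.
def flops(N: float, D: float) -> float:
    """Total training compute, approximated as C ≈ 6 * N * D."""
    return 6 * N * D

def loss_predictor(N: float, D: float,
                   E: float, A: float, B: float,
                   alpha: float, beta: float) -> float:
    """Parametric loss predictor: L(N, D) = E + A / N**alpha + B / D**beta."""
    return E + A / N**alpha + B / D**beta
```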

### Compute-Optimal Allocation

1. Optimize the parameters E, A, B, α, β to better predict the losses L_i from (N_i, D_i)
2. Solve argmin_{N, D} L(N, D | C), which can be derived from A, B, α, β (see the sketch below)
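
Step 2 has a closed-form solution under the constraint C ≈ 6ND, following the derivation in the original Chinchilla paper (Hoffmann et al., 2022). A standalone sketch, independent of chinchilla's API:

```python
# Closed-form compute-optimal allocation under C = 6 * N * D, following the
# Chinchilla paper (Hoffmann et al., 2022). Standalone sketch, not chinchilla's API.
def optimal_allocation(C, A, B, alpha, beta):
    G = (alpha * A / (beta * B)) ** (1 / (alpha + beta))
    N_opt = G * (C / 6) ** (beta / (alpha + beta))  # optimal parameter count
    D_opt = (C / 6) / N_opt                         # optimal number of data samples
    return N_opt, D_opt
```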

### chinchilla Procedure

1. **seed**: Sample X training runs (N_i, D_i, L_i), referred to as *seeds*
2. For i = 0 to K:
   1. **fit**: Optimize the scaling law parameters so that L(N, D) fits the training runs (a minimal sketch follows this list)
   2. **scale**: Configure a new model with a scaled compute budget
   3. Evaluate the allocation by training the model
   4. **append**: Add the result to the database of training runs
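
For intuition, here is a minimal sketch of the **fit** step using SciPy. chinchilla's actual optimizer may differ (the `param_grid` in the Usage section below suggests a grid-based initialization), and so may the exact objective; this only illustrates fitting E, A, B, α, β to observed runs:

```python
# Minimal sketch of the "fit" step: least squares in log space over observed
# runs (N_i, D_i, L_i). chinchilla's actual objective and optimizer may differ.
import numpy as np
from scipy.optimize import minimize

def fit_scaling_law(N, D, L, x0=(1.5, 400.0, 400.0, 0.3, 0.3)):
    N, D, L = map(np.asarray, (N, D, L))

    def objective(params):
        E, A, B, alpha, beta = params
        pred = E + A / N**alpha + B / D**beta
        return float(np.mean((np.log(pred) - np.log(L)) ** 2))

    res = minimize(objective, x0=np.asarray(x0, dtype=float),
                   method="L-BFGS-B", bounds=[(1e-9, None)] * 5)
    return dict(zip(("E", "A", "B", "alpha", "beta"), res.x))
```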



## Installation

> [!WARNING]
> chinchilla requires Python >= 3.8

### From Source (Recommended for Customization)

```bash
git clone https://github.com/kyo-takano/chinchilla.git
cd chinchilla
pip install -e .
```

### From PyPI

```bash
pip install -U chinchilla
```

## Usage

Below is an example to get started with chinchilla.

```python
import numpy as np
from chinchilla import Chinchilla

cc = Chinchilla(
    "your_project_dir",
    param_grid=dict(
        E=np.linspace(1.1, 1.5, 5),
        A=np.linspace(200, 1000, 5),
        B=np.linspace(200, 1000, 5),
        alpha=np.linspace(0.1, 0.5, 5),
        beta=np.linspace(0.1, 0.5, 5),
    ),
    seed_ranges=dict(C=(1e15, 1e16), N_to_D=(10, 100)),
    # To search for the model configuration with N closest to the suggestion:
    model_search_config=dict(
        hyperparam_grid=dict(
            hidden_size=list(range(64, 16384 + 1, 64)),
            num_hidden_layers=list(range(1, 50 + 1)),
            num_heads=list(range(1, 40 + 1)),
        ),
        size_estimator=estimate_model_size,  # You must define a function that estimates & returns the model size
    ),
    # Parameters you may pre-set
    num_seeding_steps=100,
    scaling_factor=2.0,
)

# Run the scaling law estimation and training process
for i in range(100 + 5):
    # Sample a new model
    (N, D), model_config = cc.step(num_seeding_steps=100)

    # Define a model
    model = YourModelClass(**model_config)

    # Train & evaluate the allocation C => (N, D)
    loss = train_and_evaluate(model, D)

    # Finally, append the training run to the database
    cc.append(N=N, D=D, loss=loss)
```

Ensure you define functionally equivalent versions of:

- `estimate_model_size`: Estimates and returns the model size (a rough sketch follows this list).
- `YourModelClass`: Your model class definition.
- `train_and_evaluate`: Function to train and evaluate your model.
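
As a reference point, here is a rough `estimate_model_size` for a GPT-style decoder. It assumes the hyperparameter names from `hyperparam_grid` above plus a hypothetical `vocab_size`; substitute an exact parameter count for your architecture:

```python
# Rough parameter-count estimate for a GPT-style decoder. An assumption-laden
# sketch, not a drop-in for every architecture; `vocab_size` is hypothetical.
def estimate_model_size(hidden_size: int, num_hidden_layers: int,
                        num_heads: int, vocab_size: int = 32000) -> int:
    # Per layer: ~4 * h^2 (attention) + ~8 * h^2 (MLP); biases/norms omitted.
    # num_heads changes the attention shape, not this approximate count.
    per_layer = 12 * hidden_size**2
    embeddings = vocab_size * hidden_size
    return num_hidden_layers * per_layer + embeddings
```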

## Simulation

You can also visualize how chinchilla would perform under the given setup and a hypothetical scaling law, optionally with a noise term:

```python
import random

cc.simulate(
    num_seeding_steps=401,
    num_scaling_steps=1,
    scaling_factor=10.0,
    target_params=dict(
        E=1.69337368,
        A=406.401018,
        B=410.722827,
        alpha=0.33917084,
        beta=0.2849083,
    ),
    # Add exponentially distributed noise to the loss, averaging 0.1
    noise_generator=(random.expovariate, (10,)),
)
```
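
Here, `noise_generator` appears to take a sampler and its arguments as a `(callable, args)` tuple; `random.expovariate(10)` draws noise with mean 1/10 = 0.1, matching the comment above.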

Please see the API Reference for more.

## Examples

Find a practical application of chinchilla in the examples directory (more to come):

- Training Compute-Optimal Rubik's Cube Solvers (100 PetaFLOPs)

## Documentation

For a detailed API Reference, tips, differences from the original Chinchilla paper, and more, please browse ./docs.

## Contributing

We welcome your contributions.
Please report bugs and suggest improvements through issues and pull requests.
