Kronos: A Foundation Model for Financial Market Data

Financial markets generate vast amounts of data every second. Prices rise and fall, trading volumes fluctuate, and candlestick charts (K-lines) form a language of their own. For researchers and practitioners, making sense of this noisy and complex data is a continuous challenge.

Kronos is the first open-source foundation model designed specifically for financial candlestick data. It has been trained on datasets collected from more than 45 global exchanges, giving it a unique ability to capture the patterns and structures within market behavior. Instead of relying on general-purpose time series models, Kronos treats market data as a “language,” transforming raw numerical values into structured tokens that a Transformer-based model can process and learn from.

This article gives an overview of Kronos: its design philosophy, model family, installation, forecasting workflow, fine-tuning on custom datasets, and important considerations for real-world applications.

What Makes Kronos Different?

Time series foundation models (TSFMs) are widely used in domains like weather forecasting and sensor monitoring. However, financial data presents unique obstacles:

High Noise: Market data is extremely volatile, with frequent random fluctuations.
Non-Stationary Behavior: Trends shift as economic conditions and investor sentiment change.
Multi-Dimensional Inputs: A candlestick is not just a single number but includes open, high, low, close, volume, and sometimes additional fields like transaction amounts.

Kronos addresses these issues with a two-stage design:

Tokenizer: Transforms continuous multi-dimensional K-line data into discrete tokens, creating a structured representation of financial data.
Transformer: A large autoregressive model that learns from these tokens to capture the “grammar” of financial markets.
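
To make the tokenizer idea concrete, here is a deliberately simplified sketch. It is not Kronos's actual tokenizer, whose quantization scheme is described in the paper. It only shows one way a continuous price series can become discrete token ids: closes are converted to log returns, and each return is assigned to one of a few equal-width bins. The function name, the bin count, and the clipping range are all illustrative assumptions.

```python
import math

def returns_to_tokens(closes, n_bins=8, clip=0.02):
    """Map close-to-close log returns to integer token ids in [0, n_bins - 1].

    Toy illustration only: equal-width bins over [-clip, +clip].
    """
    tokens = []
    for prev, curr in zip(closes, closes[1:]):
        r = math.log(curr / prev)
        r = max(-clip, min(clip, r))       # clip extreme moves into the bin range
        pos = (r + clip) / (2 * clip)      # rescale to [0, 1]
        tokens.append(min(int(pos * n_bins), n_bins - 1))
    return tokens

closes = [100.0, 100.5, 100.5, 99.0, 103.0]
tokens = returns_to_tokens(closes)
print(tokens)
```

A sequence of such ids is what an autoregressive Transformer can then model one token at a time, in the same way a language model predicts the next word.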

By framing financial sequences as language, Kronos offers a unified approach to multiple quantitative tasks, from forecasting to backtesting strategies.

The Kronos Model Family

Kronos is not a single model but a family of models tailored for different computing environments.

Model Name	Tokenizer	Context Length	Parameters	Availability
Kronos-mini	Tokenizer-2k	2048	4.1M	✅ NeoQuasar/Kronos-mini
Kronos-small	Tokenizer-base	512	24.7M	✅ NeoQuasar/Kronos-small
Kronos-base	Tokenizer-base	512	102.3M	✅ NeoQuasar/Kronos-base
Kronos-large	Tokenizer-base	512	499.2M	❌ Not publicly available

Kronos-mini is lightweight and suitable for experimentation or resource-limited setups.
Kronos-small provides a balance of accuracy and performance.
Kronos-base is stronger and suitable for research-grade projects.
Kronos-large exists but is not open-source.
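
If you choose a model programmatically, the table above can be encoded as plain data. The sketch below is only a convenience for illustration: the KronosSpec class and the largest_public_model helper are hypothetical names, not part of the Kronos codebase, and the numbers are copied from the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KronosSpec:
    name: str
    tokenizer: str
    context_length: int
    params_millions: float
    public: bool

# Values copied from the model family table.
MODEL_FAMILY = [
    KronosSpec("Kronos-mini", "Tokenizer-2k", 2048, 4.1, True),
    KronosSpec("Kronos-small", "Tokenizer-base", 512, 24.7, True),
    KronosSpec("Kronos-base", "Tokenizer-base", 512, 102.3, True),
    KronosSpec("Kronos-large", "Tokenizer-base", 512, 499.2, False),
]

def largest_public_model(max_params_millions):
    """Return the largest publicly available model within the budget, or None."""
    candidates = [
        m for m in MODEL_FAMILY
        if m.public and m.params_millions <= max_params_millions
    ]
    return max(candidates, key=lambda m: m.params_millions, default=None)

choice = largest_public_model(50)
print(choice.name, choice.tokenizer, choice.context_length)
```

Keeping the tokenizer and context length next to the model name helps avoid pairing a model with the wrong tokenizer or an unsupported max_context.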

Installation and Setup

Step 1: Install Python and Dependencies

Kronos requires Python 3.10+. After setting up Python, clone the Kronos repository and install its dependencies from the repository root:

pip install -r requirements.txt

Step 2: Load a Pre-Trained Model

Kronos integrates with Hugging Face Hub. You can load both the tokenizer and the model in just a few lines:

from model import Kronos, KronosTokenizer, KronosPredictor

tokenizer = KronosTokenizer.from_pretrained("NeoQuasar/Kronos-Tokenizer-base")
model = Kronos.from_pretrained("NeoQuasar/Kronos-small")

Step 3: Initialize the Predictor

predictor = KronosPredictor(model, tokenizer, device="cuda:0", max_context=512)

Here, max_context=512 is a critical parameter: it is the maximum sequence length the model can handle, and it matches the context length listed for Kronos-small and Kronos-base in the table above. The predictor automatically truncates longer input sequences.
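
If you would rather make the truncation explicit than rely on the predictor, a minimal sketch follows. It assumes the most recent rows are the ones worth keeping; keep_latest is a hypothetical helper, not a Kronos function.

```python
import pandas as pd

def keep_latest(df, timestamps, max_context=512):
    """Keep only the most recent max_context rows of the history and its timestamps."""
    if max_context <= 0:
        raise ValueError("max_context must be positive")
    if len(df) != len(timestamps):
        raise ValueError("df and timestamps must have the same length")
    return df.iloc[-max_context:], timestamps.iloc[-max_context:]

# Synthetic history of 600 bars, longer than the 512-step context.
history = pd.DataFrame({"close": range(600)})
stamps = pd.Series(pd.date_range("2024-01-01", periods=600, freq="5min"))

x_df, x_timestamp = keep_latest(history, stamps, max_context=512)
print(len(x_df), x_df["close"].iloc[0])
```

Trimming up front also makes it obvious which bars the forecast is actually conditioned on.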

Forecasting with Kronos

Forecasting future candlesticks is one of the most practical uses of Kronos. The workflow is broken down into clear steps.

Step 1: Prepare Input Data

The predictor requires three inputs:

Historical data (df): Must include columns for open, high, low, and close. volume and amount are optional.
Timestamps for historical data (x_timestamp)
Timestamps for the forecast horizon (y_timestamp)

Example:

import pandas as pd

df = pd.read_csv("./data/XSHG_5min_600977.csv")
df['timestamps'] = pd.to_datetime(df['timestamps'])

lookback = 400
pred_len = 120

x_df = df.loc[:lookback-1, ['open', 'high', 'low', 'close', 'volume', 'amount']]
x_timestamp = df.loc[:lookback-1, 'timestamps']
y_timestamp = df.loc[lookback:lookback+pred_len-1, 'timestamps']
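
The example above takes y_timestamp from rows that already exist in the file, which is convenient for comparing against ground truth. When forecasting past the end of your data, the future timestamps have to be generated. The sketch below is one way to do that, assuming evenly spaced bars and ignoring session breaks, weekends, and holidays; future_timestamps is a hypothetical helper.

```python
import pandas as pd

def future_timestamps(x_timestamp, pred_len):
    """Extend a regularly spaced timestamp Series by pred_len steps."""
    if len(x_timestamp) < 2:
        raise ValueError("need at least two timestamps to infer the bar interval")
    step = x_timestamp.iloc[-1] - x_timestamp.iloc[-2]   # inferred bar interval
    start = x_timestamp.iloc[-1] + step
    future = pd.date_range(start=start, periods=pred_len, freq=step)
    return pd.Series(future, name="timestamps")

hist = pd.Series(pd.date_range("2024-01-02 09:30", periods=4, freq="5min"))
fut = future_timestamps(hist, pred_len=3)
print(fut.tolist())
```

For real exchange data you would normally replace the fixed step with a trading calendar so that forecasts do not land in closed sessions.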

Step 2: Run Predictions

pred_df = predictor.predict(
    df=x_df,
    x_timestamp=x_timestamp,
    y_timestamp=y_timestamp,
    pred_len=pred_len,
    T=1.0,           # sampling temperature
    top_p=0.9,       # nucleus (top-p) sampling threshold
    sample_count=1   # number of forecast paths to sample
)

The predictor outputs a DataFrame with forecasted values for open, high, low, close, volume, and amount.
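
The T and top_p arguments follow the usual conventions for sampling from an autoregressive model. The sketch below is a generic illustration of temperature scaling and nucleus sampling on toy logits, not Kronos's internal sampling code; all function names here are hypothetical.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most likely tokens whose cumulative probability
    reaches top_p, and renormalize. Returns {token_id: probability}."""
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token_id, p in ranked:
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token_id: p / total for token_id, p in kept}

def sample_token(logits, temperature=1.0, top_p=0.9, rng=random):
    """Draw one token id from the temperature-scaled, top-p-filtered distribution."""
    nucleus = top_p_filter(softmax_with_temperature(logits, temperature), top_p)
    ids = list(nucleus)
    return rng.choices(ids, weights=[nucleus[i] for i in ids], k=1)[0]

logits = [2.0, 1.0, 0.0, -1.0]
wide = top_p_filter(softmax_with_temperature(logits, 1.0), 0.9)
narrow = top_p_filter(softmax_with_temperature(logits, 0.5), 0.9)
print(sorted(wide), sorted(narrow))
```

Lower temperature or lower top_p concentrates sampling on the most likely tokens, which tends to give more conservative forecasts. Higher values give more varied paths.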

Step 3: Visualize Results

The provided example script generates a plot comparing predictions against ground truth, which lets you visually evaluate how well the model captures short-term dynamics.
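
A visual check can be complemented with a couple of simple numbers. The sketch below computes mean absolute error and directional accuracy on plain lists of closing prices. These metrics and helper names are illustrative choices, not part of the Kronos example script.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def _sign(x):
    return (x > 0) - (x < 0)

def directional_accuracy(actual, predicted):
    """Share of bar-to-bar moves whose predicted direction matches the actual one."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    hits = 0
    for i in range(1, len(actual)):
        actual_move = actual[i] - actual[i - 1]
        predicted_move = predicted[i] - predicted[i - 1]
        hits += _sign(actual_move) == _sign(predicted_move)
    return hits / (len(actual) - 1)

actual_close = [10.0, 10.2, 10.1, 10.4]
predicted_close = [10.1, 10.3, 10.3, 10.5]
mae = mean_absolute_error(actual_close, predicted_close)
hit_rate = directional_accuracy(actual_close, predicted_close)
print(round(mae, 4), round(hit_rate, 4))
```

With the variables from the forecasting example, the inputs could be built with pred_df['close'].tolist() and df.loc[lookback:lookback+pred_len-1, 'close'].tolist().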

Fine-Tuning Kronos on Custom Data

While pre-trained models are powerful, they may not perfectly fit all markets. Kronos includes a fine-tuning pipeline to adapt the model to specific datasets, such as the Chinese A-share market.

Fine-Tuning Workflow

Configuration: Adjust paths and hyperparameters in finetune/config.py.
Data Preparation: Process data using Qlib and split it into training, validation, and testing sets.
Model Training: Fine-tune both the tokenizer and the predictor.
Backtesting: Test the fine-tuned model with simulated trading strategies.

Prerequisites

Install dependencies from requirements.txt.
Install pyqlib:

pip install pyqlib

Prepare Qlib-compatible data following the official guide.

Running the Pipeline

Step 1: Preprocess Data

python finetune/qlib_data_preprocess.py

Step 2: Fine-Tune Tokenizer

torchrun --standalone --nproc_per_node=NUM_GPUS finetune/train_tokenizer.py

Step 3: Fine-Tune Predictor

torchrun --standalone --nproc_per_node=NUM_GPUS finetune/train_predictor.py

Step 4: Backtesting

python finetune/qlib_test.py --device cuda:0

Results include detailed performance metrics and return curves.
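
The four steps can also be chained from a single script. The sketch below simply wraps the commands shown above; pipeline_commands and run_pipeline are hypothetical helper names, and the script assumes it is run from the repository root with finetune/config.py already set up.

```python
import subprocess

def pipeline_commands(num_gpus=1, device="cuda:0"):
    """Return the four fine-tuning steps as argument lists, in execution order."""
    torchrun = ["torchrun", "--standalone", f"--nproc_per_node={num_gpus}"]
    return [
        ["python", "finetune/qlib_data_preprocess.py"],
        torchrun + ["finetune/train_tokenizer.py"],
        torchrun + ["finetune/train_predictor.py"],
        ["python", "finetune/qlib_test.py", "--device", device],
    ]

def run_pipeline(num_gpus=1, device="cuda:0"):
    """Run each step in turn and stop at the first failure."""
    for cmd in pipeline_commands(num_gpus, device):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)   # raises CalledProcessError on non-zero exit

# Preview the commands without executing them.
for cmd in pipeline_commands(num_gpus=2):
    print(" ".join(cmd))
```

Calling run_pipeline(num_gpus=2) would execute the steps. Because check=True is set, the predictor is never trained on top of a tokenizer run that failed.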

Key Considerations for Real-World Use

While Kronos demonstrates strong forecasting ability, there are important points to remember:

Raw Signals vs. Tradeable Alpha: Predictions are raw signals, not ready-made trading strategies. For real use, signals must pass through portfolio optimization and risk management processes.
Custom Data Handling: Example datasets assume Qlib format. Different data sources may require custom preprocessing.
Backtesting Complexity: The demo uses a simple top-K strategy. Real strategies require advanced features such as position sizing, stop-loss rules, and transaction cost modeling.
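
To illustrate the gap between a raw signal and a strategy, here is a minimal top-K sketch with a proportional transaction cost. It is not the backtest used in the Kronos demo. The equal weighting, the cost rate, and the function names are assumptions chosen for illustration.

```python
def top_k_portfolio(predicted_returns, k):
    """Pick the k assets with the highest predicted return, equally weighted."""
    if not 0 < k <= len(predicted_returns):
        raise ValueError("k must be between 1 and the number of assets")
    ranked = sorted(predicted_returns, key=predicted_returns.get, reverse=True)
    return {asset: 1.0 / k for asset in ranked[:k]}

def period_return(weights, previous_weights, realized_returns, cost_rate=0.001):
    """Portfolio return for one period, net of a proportional cost on turnover."""
    gross = sum(w * realized_returns[asset] for asset, w in weights.items())
    assets = set(weights) | set(previous_weights)
    turnover = sum(
        abs(weights.get(a, 0.0) - previous_weights.get(a, 0.0)) for a in assets
    )
    return gross - cost_rate * turnover

predicted = {"A": 0.012, "B": -0.004, "C": 0.007, "D": 0.001}
realized = {"A": 0.010, "B": 0.002, "C": -0.004, "D": 0.000}

weights = top_k_portfolio(predicted, k=2)
net = period_return(weights, previous_weights={}, realized_returns=realized)
print(weights, round(net, 6))
```

Even in this toy setup, the cost of entering the positions removes a third of the gross return, which is why cost modeling and position sizing matter before a signal is treated as tradeable.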

Frequently Asked Questions (FAQ)

Who can benefit from Kronos?

Kronos is ideal for quantitative researchers, financial analysts, data scientists, and students exploring time series modeling in finance.

What markets does Kronos cover?

The pre-trained models were trained on data from more than 45 global exchanges, making them broadly applicable across equities, crypto, and other asset classes.

Can I directly trade using Kronos forecasts?

No. The forecasts are raw predictions. For live trading, they must be integrated with risk controls and portfolio optimization techniques.

Why fine-tune the tokenizer separately?

Different markets have different statistical properties. Fine-tuning the tokenizer aligns the token distribution with local market characteristics, improving performance.

Citation

If you use Kronos in academic work, please cite the official paper:

@misc{shi2025kronos,
      title={Kronos: A Foundation Model for the Language of Financial Markets}, 
      author={Yu Shi and Zongliang Fu and Shuo Chen and Bohan Zhao and Wei Xu and Changshui Zhang and Jian Li},
      year={2025},
      eprint={2508.02739},
      archivePrefix={arXiv},
      primaryClass={q-fin.ST},
      url={https://arxiv.org/abs/2508.02739}, 
}

Closing Thoughts

Kronos marks an important step toward specialized foundation models for finance. By viewing financial markets as a language, it offers a new way to model complex, noisy, and multi-dimensional data. For researchers and practitioners, Kronos provides not only pre-trained models but also clear tools for customization and backtesting.

The project lowers the barrier for advanced quantitative research, giving both individuals and organizations a foundation to build upon. Whether for academic exploration or as a stepping stone toward real-world trading systems, Kronos opens a pathway to deeper understanding of financial market dynamics.
