Kronos: A Foundation Model for Financial Market Data
Financial markets generate vast amounts of data every second. Prices rise and fall, trading volumes fluctuate, and candlestick charts (K-lines) form a language of their own. For researchers and practitioners, making sense of this noisy and complex data is a continuous challenge.
Kronos is the first open-source foundation model designed specifically for financial candlestick data. It has been trained on datasets collected from more than 45 global exchanges, giving it a unique ability to capture the patterns and structures within market behavior. Instead of relying on general-purpose time series models, Kronos treats market data as a “language,” transforming raw numerical values into structured tokens that a Transformer-based model can process and learn from.
This article provides a complete overview of Kronos: its design philosophy, model family, installation process, usage for forecasting, fine-tuning on custom datasets, and important considerations for real-world applications. It is written in clear and practical language, making it accessible for readers with a junior college education or higher.
What Makes Kronos Different?
Time series foundation models (TSFMs) are widely used in domains like weather forecasting and sensor monitoring. However, financial data presents unique obstacles:
- High Noise: Market data is extremely volatile, with frequent random fluctuations.
- Non-Stationary Behavior: Trends shift as economic conditions and investor sentiment change.
- Multi-Dimensional Inputs: A candlestick is not just a single number but includes open, high, low, close, volume, and sometimes additional fields like transaction amounts.
Kronos addresses these issues with a two-stage design:
- Tokenizer: Transforms continuous multi-dimensional K-line data into discrete tokens, creating a structured representation of financial data.
- Transformer: A large autoregressive model that learns from these tokens to capture the “grammar” of financial markets.
By framing financial sequences as language, Kronos offers a unified approach to multiple quantitative tasks, from forecasting to backtesting strategies.
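To make the idea of turning prices into tokens concrete, here is a toy sketch that bins log returns of the close price into quantile buckets. It only illustrates discretization in general. It is not Kronos's tokenizer, which is a trained component working on the full multi-dimensional candlestick; the helper name `toy_tokenize` and the bin count are assumptions made for this example.

```python
import numpy as np

def toy_tokenize(close, n_bins=8):
    """Map a close-price series to integer tokens by quantile-binning log returns.

    Hypothetical helper for illustration only; not part of Kronos.
    """
    close = np.asarray(close, dtype=float)
    log_ret = np.diff(np.log(close))  # one return per consecutive pair of bars
    # Interior quantile edges, so each bin holds roughly the same number of returns.
    edges = np.quantile(log_ret, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(log_ret, edges)  # token ids in [0, n_bins - 1]

prices = [100.0, 100.5, 100.2, 101.0, 100.8, 101.5, 101.4, 102.0, 101.1]
tokens = toy_tokenize(prices, n_bins=4)
print(tokens)
```

Each token stands for "a move of roughly this size and direction", which is the kind of discrete vocabulary a sequence model can then be trained on.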
The Kronos Model Family
Kronos is not a single model but a family of models tailored for different computing environments.
Model Name | Tokenizer | Context Length | Parameters | Availability |
---|---|---|---|---|
Kronos-mini | Tokenizer-2k | 2048 | 4.1M | ✅ NeoQuasar/Kronos-mini |
Kronos-small | Tokenizer-base | 512 | 24.7M | ✅ NeoQuasar/Kronos-small |
Kronos-base | Tokenizer-base | 512 | 102.3M | ✅ NeoQuasar/Kronos-base |
Kronos-large | Tokenizer-base | 512 | 499.2M | ❌ Not publicly available |
- Kronos-mini is lightweight and suitable for experimentation or resource-limited setups.
- Kronos-small provides a balance of accuracy and performance.
- Kronos-base is stronger and suitable for research-grade projects.
- Kronos-large exists but is not open-source.
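As a small illustration of how the table can drive model selection, the sketch below picks the largest publicly available model that fits a parameter budget. The numbers are copied from the table above; the dictionary layout and the `pick_model` helper are assumptions of this example, not part of the Kronos codebase.

```python
# Values copied from the model table; the structure itself is hypothetical.
KRONOS_FAMILY = {
    "Kronos-mini":  {"context": 2048, "params_m": 4.1,   "public": True},
    "Kronos-small": {"context": 512,  "params_m": 24.7,  "public": True},
    "Kronos-base":  {"context": 512,  "params_m": 102.3, "public": True},
    "Kronos-large": {"context": 512,  "params_m": 499.2, "public": False},
}

def pick_model(max_params_m, min_context=0):
    """Return the largest public model within a parameter budget (in millions)."""
    candidates = [
        (spec["params_m"], name)
        for name, spec in KRONOS_FAMILY.items()
        if spec["public"]
        and spec["params_m"] <= max_params_m
        and spec["context"] >= min_context
    ]
    return max(candidates)[1] if candidates else None

print(pick_model(50))                      # Kronos-small
print(pick_model(500))                     # Kronos-base (Kronos-large is not public)
print(pick_model(500, min_context=1024))   # Kronos-mini
```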
Installation and Setup
Step 1: Install Python and Dependencies
Kronos requires Python 3.10+. After setting up Python, install the dependencies with:
pip install -r requirements.txt
Step 2: Load a Pre-Trained Model
Kronos integrates with Hugging Face Hub. You can load both the tokenizer and the model in just a few lines:
from model import Kronos, KronosTokenizer, KronosPredictor
tokenizer = KronosTokenizer.from_pretrained("NeoQuasar/Kronos-Tokenizer-base")
model = Kronos.from_pretrained("NeoQuasar/Kronos-small")
Step 3: Initialize the Predictor
predictor = KronosPredictor(model, tokenizer, device="cuda:0", max_context=512)
Here, max_context=512 is a critical parameter: it is the maximum sequence length the model can handle. Input sequences longer than this are automatically truncated by the predictor.
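If you prefer to control exactly which bars the model sees, you can trim the history yourself before calling the predictor. The sketch below assumes that the most recent rows are the ones worth keeping, which the description above does not spell out; `keep_latest` is a hypothetical helper.

```python
import pandas as pd

def keep_latest(df, timestamps, max_context=512):
    """Keep only the most recent `max_context` rows of the history (hypothetical helper)."""
    if len(df) <= max_context:
        return df, timestamps
    return df.iloc[-max_context:], timestamps.iloc[-max_context:]

# Stand-in history of 600 bars, longer than the 512-bar context.
history = pd.DataFrame({"close": range(600)})
stamps = pd.Series(pd.date_range("2024-01-01 09:30", periods=600, freq="5min"))

short_df, short_ts = keep_latest(history, stamps, max_context=512)
print(len(short_df), short_df["close"].iloc[0])
```

Trimming explicitly also keeps the data frame and its timestamps aligned, since both are cut at the same position.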
Forecasting with Kronos
Forecasting future candlesticks is one of the most practical uses of Kronos. The workflow is broken down into clear steps.
Step 1: Prepare Input Data
The predictor requires three inputs:
- Historical data (df): Must include columns for open, high, low, and close. The volume and amount columns are optional.
- Timestamps for historical data (x_timestamp)
- Timestamps for the forecast horizon (y_timestamp)
Example:
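import pandas as pd

# Load 5-minute bars and parse the timestamp column
df = pd.read_csv("./data/XSHG_5min_600977.csv")
df['timestamps'] = pd.to_datetime(df['timestamps'])

lookback = 400  # number of historical bars fed to the model
pred_len = 120  # number of future bars to forecast

# .loc slices by label and includes the end label, so with the default
# integer index :lookback-1 selects exactly `lookback` rows
x_df = df.loc[:lookback-1, ['open', 'high', 'low', 'close', 'volume', 'amount']]
x_timestamp = df.loc[:lookback-1, 'timestamps']
y_timestamp = df.loc[lookback:lookback+pred_len-1, 'timestamps']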
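The example above takes y_timestamp from rows that already exist in the file, which suits evaluation against known data. When forecasting past the end of the data there are no such rows, so the future timestamps have to be generated. Here is a minimal sketch, assuming evenly spaced 5-minute bars and ignoring session breaks, weekends, and holidays; the helper name `future_timestamps` is hypothetical.

```python
import pandas as pd

def future_timestamps(x_timestamp, pred_len, freq="5min"):
    """Generate `pred_len` evenly spaced timestamps after the last historical one."""
    step = pd.Timedelta(freq)
    start = x_timestamp.iloc[-1] + step
    # Returned as a Series to match the type of x_timestamp in the example above.
    return pd.Series(pd.date_range(start=start, periods=pred_len, freq=freq))

x_ts = pd.Series(pd.date_range("2024-01-02 09:35", periods=400, freq="5min"))
y_ts = future_timestamps(x_ts, pred_len=120)
print(y_ts.iloc[0], y_ts.iloc[-1])
```

For real exchange data you would replace the evenly spaced range with the exchange's trading calendar, since bars do not occur overnight or during lunch breaks.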
Step 2: Run Predictions
pred_df = predictor.predict(
    df=x_df,
    x_timestamp=x_timestamp,
    y_timestamp=y_timestamp,
    pred_len=pred_len,
    T=1.0,           # sampling temperature
    top_p=0.9,       # nucleus (top-p) sampling threshold
    sample_count=1   # number of sampled forecast paths
)
The predictor outputs a DataFrame with forecasted values for open, high, low, close, volume, and amount.
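The T and top_p arguments carry the names conventionally used for temperature and nucleus (top-p) sampling in autoregressive models. The sketch below shows how those two controls are commonly applied to a vector of logits. It is a generic illustration rather than Kronos's internal sampling code, and `sample_token` is a hypothetical function.

```python
import numpy as np

def sample_token(logits, T=1.0, top_p=0.9, rng=None):
    """Sample one token id using temperature scaling followed by nucleus filtering."""
    if rng is None:
        rng = np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / T
    probs = np.exp(scaled - scaled.max())  # subtract the max for numerical stability
    probs /= probs.sum()

    order = np.argsort(probs)[::-1]        # token ids, most probable first
    cumulative = np.cumsum(probs[order])
    # Keep the smallest prefix whose cumulative probability reaches top_p.
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    kept = order[:cutoff]

    kept_probs = probs[kept] / probs[kept].sum()  # renormalise over the kept set
    return int(rng.choice(kept, p=kept_probs))

rng = np.random.default_rng(0)
logits = [2.0, 1.0, 0.1, -1.0]
draws = [sample_token(logits, T=1.0, top_p=0.9, rng=rng) for _ in range(200)]
print(sorted(set(draws)))
```

A lower T sharpens the distribution toward the most likely token, while a lower top_p removes more of the low-probability tail before sampling. In this example the least likely token falls outside the 0.9 nucleus and is never drawn.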
Step 3: Visualize Results
The provided example script generates a plot comparing predictions against ground truth, which lets you visually evaluate how well the model captures short-term dynamics.
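A plot is easier to interpret alongside a few numbers. The sketch below computes mean absolute error, mean absolute percentage error, and directional accuracy for a close-price forecast. The helper `forecast_errors` and the stand-in values are assumptions for illustration; in practice the inputs would come from the close column of pred_df and the matching held-out rows of df.

```python
import numpy as np

def forecast_errors(pred_close, true_close, last_close):
    """Return MAE, MAPE (%) and directional accuracy for a close-price forecast."""
    pred = np.asarray(pred_close, dtype=float)
    true = np.asarray(true_close, dtype=float)
    mae = np.mean(np.abs(pred - true))
    mape = np.mean(np.abs((pred - true) / true)) * 100
    # Direction of each bar-to-bar move, starting from the last observed close.
    pred_dir = np.sign(np.diff(np.concatenate([[last_close], pred])))
    true_dir = np.sign(np.diff(np.concatenate([[last_close], true])))
    direction_acc = np.mean(pred_dir == true_dir)
    return {"mae": mae, "mape_pct": mape, "direction_acc": direction_acc}

# Stand-in numbers, not real model output.
truth = [10.0, 10.2, 10.1, 10.4]
pred = [10.1, 10.1, 10.2, 10.3]
errors = forecast_errors(pred, truth, last_close=9.9)
print(errors)
```

Directional accuracy here compares each series against its own previous value, which is one of several reasonable definitions.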
Fine-Tuning Kronos on Custom Data
While pre-trained models are powerful, they may not perfectly fit all markets. Kronos includes a fine-tuning pipeline to adapt the model to specific datasets, such as the Chinese A-share market.
Fine-Tuning Workflow
- Configuration: Adjust paths and hyperparameters in finetune/config.py.
- Data Preparation: Process data using Qlib and split it into training, validation, and testing sets.
- Model Training: Fine-tune both the tokenizer and the predictor.
- Backtesting: Test the fine-tuned model with simulated trading strategies.
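The data preparation step splits the data into training, validation, and testing sets. For time series, such a split is normally chronological, so that the model is never evaluated on data older than what it was trained on. The sketch below shows that general idea with pandas. It does not reproduce the pipeline's own preprocessing, and the fractions and the `chronological_split` helper are assumptions.

```python
import pandas as pd

def chronological_split(df, train_frac=0.7, val_frac=0.15):
    """Split a time-ordered DataFrame into train/validation/test without shuffling."""
    n = len(df)
    train_end = round(n * train_frac)
    val_end = round(n * (train_frac + val_frac))
    return df.iloc[:train_end], df.iloc[train_end:val_end], df.iloc[val_end:]

# Stand-in daily bars; rows must already be sorted by time.
bars = pd.DataFrame({
    "timestamps": pd.date_range("2020-01-01", periods=1000, freq="D"),
    "close": range(1000),
})
train, val, test = chronological_split(bars)
print(len(train), len(val), len(test))
```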
Prerequisites
- Install dependencies from requirements.txt.
- Install pyqlib:
pip install pyqlib
- Prepare Qlib-compatible data following the official guide.
Running the Pipeline
Step 1: Preprocess Data
python finetune/qlib_data_preprocess.py
Step 2: Fine-Tune Tokenizer
torchrun --standalone --nproc_per_node=NUM_GPUS finetune/train_tokenizer.py
Step 3: Fine-Tune Predictor
torchrun --standalone --nproc_per_node=NUM_GPUS finetune/train_predictor.py
Step 4: Backtesting
python finetune/qlib_test.py --device cuda:0
Results include detailed performance metrics and return curves.
Key Considerations for Real-World Use
While Kronos demonstrates strong forecasting ability, there are important points to remember:
- Raw Signals vs. Tradeable Alpha: Predictions are raw signals, not ready-made trading strategies. For real use, signals must pass through portfolio optimization and risk management processes.
- Custom Data Handling: Example datasets assume Qlib format. Different data sources may require custom preprocessing.
- Backtesting Complexity: The demo uses a simple top-K strategy. Real strategies require advanced features such as position sizing, stop-loss rules, and transaction cost modeling.
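To show how much a plain top-K strategy leaves out, here is a minimal sketch of a single rebalance: it buys the K instruments with the highest signals at equal weight and subtracts a proportional cost on turnover. This is a generic illustration under stated assumptions (equal weights, a flat cost rate, hypothetical tickers). It is not the demo's backtest code, and `top_k_return` is a hypothetical helper.

```python
def top_k_return(signals, realized_returns, prev_holdings, k=2, cost_rate=0.001):
    """One rebalance of an equal-weight top-K portfolio, net of proportional costs."""
    # Rank instruments by signal, highest first, and keep the top k.
    names = sorted(signals, key=signals.get, reverse=True)[:k]
    weights = {name: 1.0 / k for name in names}
    gross = sum(w * realized_returns[name] for name, w in weights.items())
    # Turnover: total absolute change in weights versus the previous holdings.
    all_names = set(weights) | set(prev_holdings)
    turnover = sum(
        abs(weights.get(n, 0.0) - prev_holdings.get(n, 0.0)) for n in all_names
    )
    return weights, gross - cost_rate * turnover

signals = {"A": 0.03, "B": -0.01, "C": 0.02, "D": 0.00}   # predicted returns
realized = {"A": 0.01, "B": 0.02, "C": -0.004, "D": 0.0}  # what actually happened
previous = {"A": 0.5, "B": 0.5}                           # weights before rebalancing

weights, net_return = top_k_return(signals, realized, previous, k=2)
print(weights, net_return)
```

Even in this small example, the trading cost removes a third of the gross return, which is why cost modeling and position sizing matter once signals are turned into trades.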
Frequently Asked Questions (FAQ)
Who can benefit from Kronos?
Kronos is ideal for quantitative researchers, financial analysts, data scientists, and students exploring time series modeling in finance.
What markets does Kronos cover?
The pre-trained models are trained on data from more than 45 global exchanges, making them broadly applicable across equities, crypto, and other asset classes.
Can I directly trade using Kronos forecasts?
No. The forecasts are raw predictions. For live trading, they must be integrated with risk controls and portfolio optimization techniques.
Why fine-tune the tokenizer separately?
Different markets have different statistical properties. Fine-tuning the tokenizer aligns the token distribution with local market characteristics, improving performance.
Citation
If you use Kronos in academic work, please cite the official paper:
@misc{shi2025kronos,
  title={Kronos: A Foundation Model for the Language of Financial Markets},
  author={Yu Shi and Zongliang Fu and Shuo Chen and Bohan Zhao and Wei Xu and Changshui Zhang and Jian Li},
  year={2025},
  eprint={2508.02739},
  archivePrefix={arXiv},
  primaryClass={q-fin.ST},
  url={https://arxiv.org/abs/2508.02739},
}
Closing Thoughts
Kronos marks an important step toward specialized foundation models for finance. By viewing financial markets as a language, it offers a new way to model complex, noisy, and multi-dimensional data. For researchers and practitioners, Kronos provides not only pre-trained models but also clear tools for customization and backtesting.
The project lowers the barrier for advanced quantitative research, giving both individuals and organizations a foundation to build upon. Whether for academic exploration or as a stepping stone toward real-world trading systems, Kronos opens a pathway to deeper understanding of financial market dynamics.