I just had a detailed discussion with Gemini Pro 3.1 and realized that right now, a single high-end graphics card is enough to tokenize the past 500 candlesticks along with the market data accumulated within each of them...
Once tokenized, you can train a simple GPT-2-scale model on your own computer, focused solely on price probability analysis.
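Roughly, the tokenization step could look something like this (a minimal sketch in Python; the log-return binning and the 32-bucket vocabulary are my own assumptions, not a fixed recipe):

```python
# Minimal sketch: turn each of the last 500 candlesticks into a discrete token
# by binning its log-return. Bin count and bin edges are illustrative assumptions.
import numpy as np

def candles_to_tokens(closes: np.ndarray, n_bins: int = 32) -> np.ndarray:
    """Map a series of close prices to integer tokens in [0, n_bins)."""
    log_returns = np.diff(np.log(closes))
    # Quantile-based bin edges so every token appears roughly equally often.
    edges = np.quantile(log_returns, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(log_returns, edges)

# Example: 501 closes -> 500 return tokens, one per candlestick.
closes = np.cumprod(1 + 0.01 * np.random.randn(501)) * 30_000
tokens = candles_to_tokens(closes)
print(tokens.shape, tokens.min(), tokens.max())  # (500,) 0 31
```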
Over the past 2-3 years, the drop in training costs for large models doesn't seem to have come from hardware improvements so much as from a steady stream of new cost-cutting algorithms.
Currently, my idea is to try training a simple multimodal model, where price and order book data are compressed into 8 tokens, trading volume into 2 tokens, and the news events during that candlestick into a single bearish-or-bullish signal...
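To make the 8 + 2 + 1 split concrete, here is one possible per-candle token layout (just a sketch; what exactly sits behind each token is my assumption):

```python
# One possible per-candle layout matching the 8 + 2 + 1 split described above.
from dataclasses import dataclass

@dataclass
class CandleTokens:
    price_book: list[int]  # 8 tokens: e.g. binned open/high/low/close + 4 order-book levels
    volume: list[int]      # 2 tokens: e.g. binned total volume + buy/sell imbalance
    news: int              # 1 token: 0 = bearish, 1 = bullish

    def flatten(self) -> list[int]:
        """Concatenate into the 11-token block the model sees per candlestick."""
        assert len(self.price_book) == 8 and len(self.volume) == 2
        return self.price_book + self.volume + [self.news]

# 500 candlesticks * 11 tokens each = a 5,500-token context window.
example = CandleTokens(price_book=[3, 7, 1, 5, 12, 9, 4, 6], volume=[2, 1], news=1)
print(example.flatten())
```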
This way, the model only needs 2-4 Transformer layers and a hidden dimension of 128 to capture most of the patterns...
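For a sense of scale, a 4-layer / 128-dim Transformer in PyTorch lands around 1-2M parameters, which trains comfortably on one high-end GPU (the vocabulary size and context length below are assumptions carried over from the sketches above):

```python
# Rough sizing sketch: a small causal Transformer over the token stream above.
import torch
import torch.nn as nn

class TinyPriceModel(nn.Module):
    def __init__(self, vocab_size: int = 64, d_model: int = 128,
                 n_layers: int = 4, n_heads: int = 4, max_len: int = 5500):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, vocab_size)  # next-token (price-bin) logits

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        seq_len = tokens.size(1)
        positions = torch.arange(seq_len, device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(positions)
        # Causal mask so each position only attends to earlier candles.
        causal = torch.triu(torch.full((seq_len, seq_len), float("-inf"),
                                       device=tokens.device), diagonal=1)
        x = self.encoder(x, mask=causal)
        return self.head(x)

model = TinyPriceModel()
print(sum(p.numel() for p in model.parameters()) / 1e6, "M parameters")  # ~1.5M

# Usage: 50 candles * 11 tokens each -> next-token distributions over price bins.
demo = torch.randint(0, 64, (1, 50 * 11))
print(model(demo).shape)  # (1, 550, 64)
```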
I think this approach is very interesting and more fun than just letting AI write quantitative strategies!