AI Workflows

How to use AI models to generate, iterate, and improve strategies — with prompts, patterns, and guardrails.

Overview

EmidLabs treats AI as a quantitative research tool, not a magic trading signal generator. The infrastructure is explicitly designed to work with AI-generated strategies: the strategy DSL is machine-readable, the JSON format is clean and unambiguous, and the results contain enough diagnostic data for iterative refinement.

Warning: AI does not predict markets. AI assists with hypothesis generation, strategy formalization, and systematic iteration. The edge, if it exists, must be validated by the backtest engine, not assumed from the AI's output.

The AI Workflow Loop

The recommended workflow is a loop of four stages:

Hypothesis — You form a market hypothesis in natural language. What behavior do you believe exists? What drives it?

Strategy generation — You prompt an AI model (GPT-4, Claude, etc.) with your hypothesis and the EmidLabs DSL spec. The AI outputs a valid strategy JSON.

Backtest execution — You submit the generated strategy to the EmidLabs API. The engine runs the backtest server-side.

Analysis & iteration — You analyze the result metrics and diagnostics. You feed the results back to the AI and ask it to refine the strategy.

This loop can be automated. You can build agents that generate strategies, execute backtests via the API, parse results, and iterate — without human intervention between each cycle.

Prompt Engineering

The essential system prompt

The most important thing to include in your system prompt is the full strategy DSL specification. The AI needs to know the exact schema, all available functions, and the serialization requirement.

Key information to include in the system prompt:

The five top-level keys: configuration, inputs, conditions, score, decision.
All built-in functions with their signatures.
The operator list (AND, OR, comparison operators).
The market data variables (close, open, high, low, volume).
The rule that output must be valid JSON only (no markdown, no explanations).
The rule that strategies must not reference future data, randomness, or undefined variables.
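
One way to assemble a system prompt from the items listed above is sketched here. The helper name `build_system_prompt` and its `dsl_spec` argument are hypothetical; the specification text itself (schema, function signatures, operators) is assumed to be copied from the EmidLabs DSL documentation and is not reproduced here.

```python
TOP_LEVEL_KEYS = ["configuration", "inputs", "conditions", "score", "decision"]
MARKET_VARIABLES = ["close", "open", "high", "low", "volume"]


def build_system_prompt(dsl_spec: str) -> str:
    """Append the output rules to the DSL specification text.

    `dsl_spec` is assumed to contain the full specification; this helper
    only adds the short rule list, it does not generate the spec.
    """
    rules = [
        "The strategy uses these five top-level keys: " + ", ".join(TOP_LEVEL_KEYS) + ".",
        "Available market data variables: " + ", ".join(MARKET_VARIABLES) + ".",
        "Output valid JSON only. No markdown, no explanations.",
        "Do not use future data, randomness, or undefined variables.",
    ]
    return dsl_spec.strip() + "\n\nRules:\n" + "\n".join(f"- {rule}" for rule in rules)
```

Keeping the rules in a list makes it easy to add or reword one without touching the specification text.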

Example user prompt structure

Prompt structure

Hypothesis:
I believe BTC tends to trend consistently when the 9-period EMA
is above the 21-period EMA on the 1H chart, and RSI is in a
healthy range (not overbought, not oversold). Volume confirmation
reduces false signals in choppy markets.

Generate a strategy that:
- Uses 1H timeframe
- Captures EMA trend alignment
- Requires RSI between 40 and 65
- Adds volume spike confirmation as a bonus signal
- Sets a score threshold that requires trend + rsi at minimum

Output only the strategy JSON. No markdown. No explanations.

Feeding results back to the AI

After running a backtest, share the diagnostic data with the AI to guide refinement:

Iteration prompt

The strategy produced these results:
- trades: 284, winRate: 0.524, expectancyR: 0.18
- conditionsDistributionPct: { trendUp: 0.41, rsiHealthy: 0.55, volConfirm: 0.72 }
- scoreDistributionPct: { 0: 0.38, 20: 0.21, 45: 0.24, 60: 0.17 }

Issues to address:
1. volConfirm fires 72% of the time — it's not selective enough.
   Increase the volumeSpike threshold to 2.0 or 2.5.
2. winRate is 52.4% but expectancy is only 0.18R — the 1:3 RR
   seems to be underperforming. Try adding a crossover signal
   to improve entry timing.

Revise the strategy and output the updated JSON only.
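
An iteration prompt like the one above can also be rendered from a result dictionary. This is a minimal sketch: `build_iteration_prompt` is a hypothetical helper, and the metric names are taken from the example rather than from a confirmed response schema.

```python
import json


def build_iteration_prompt(metrics: dict, issues: list[str]) -> str:
    """Render backtest metrics and reviewer notes as a refinement prompt."""
    lines = ["The strategy produced these results:"]
    # One line per metric; nested dicts are serialized as compact JSON.
    for key, value in metrics.items():
        rendered = json.dumps(value) if isinstance(value, dict) else str(value)
        lines.append(f"- {key}: {rendered}")
    if issues:
        lines.append("")
        lines.append("Issues to address:")
        # Number the issues so the model can respond to each one.
        lines.extend(f"{n}. {text}" for n, text in enumerate(issues, start=1))
    lines.append("")
    lines.append("Revise the strategy and output the updated JSON only.")
    return "\n".join(lines)
```

The issues list is left to the caller on purpose: deciding what is wrong with a strategy is the analysis step, and the helper only formats it.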

Automated Iteration

The API is designed to support programmatic iteration loops. Here is a basic Python pattern for an automated research cycle:

research_loop.py

import anthropic
import requests
import json
import time

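EMID_API_KEY = "em_YOUR_KEY"
BASE_URL = "https://api.emidlabs.com/api/public/v1"

# Placeholder: replace with the full strategy DSL specification and output rules.
STRATEGY_SYSTEM_PROMPT = "<full EmidLabs DSL spec and output rules>"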

def generate_strategy(hypothesis: str, previous_result: dict = None) -> dict:
    client = anthropic.Anthropic()
    prompt = f"Hypothesis: {hypothesis}"
    if previous_result:
        prompt += f"\n\nPrevious results: {json.dumps(previous_result, indent=2)}"
        prompt += "\n\nRevise the strategy based on these results. Output JSON only."

    msg = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        system=STRATEGY_SYSTEM_PROMPT,  # Include full DSL spec
        messages=[{"role": "user", "content": prompt}]
    )
    return json.loads(msg.content[0].text)

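def run_backtest(strategy: dict, poll_seconds: float = 2.0, max_wait: float = 600.0) -> dict:
    # poll_seconds and max_wait are illustrative defaults; tune them to your backtests.
    headers = {"x-api-key": EMID_API_KEY}
    resp = requests.post(
        f"{BASE_URL}/backtest",
        headers=headers,
        json={
            "assetPair": "BTC-USDC",
            "initialDate": "2023-01-01",
            "finalDate": "2023-12-31",
            # The strategy is sent as a serialized JSON string, not a nested object
            "strategySnapshotJson": json.dumps(strategy)
        },
        timeout=30,
    )
    resp.raise_for_status()  # Fail early on auth or validation errors
    backtest_id = resp.json()["id"]

    # Poll until the backtest reaches a terminal status or the deadline passes
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        poll = requests.get(
            f"{BASE_URL}/backtest/{backtest_id}",
            headers=headers,
            timeout=30,
        )
        poll.raise_for_status()
        result = poll.json()
        if result["status"] in ("Completed", "Failed"):
            return result
        time.sleep(poll_seconds)
    raise TimeoutError(f"Backtest {backtest_id} did not finish within {max_wait} seconds")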

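# Run 5 iterations
hypothesis = "EMA crossover with RSI confirmation on BTC 1H"
result = None
for i in range(5):
    strategy = generate_strategy(hypothesis, result)
    result = run_backtest(strategy)
    if result["status"] == "Failed":
        # A failed run may carry no metrics; the failure is still fed back to the model
        print(f"Iteration {i+1}: backtest failed")
        continue
    print(f"Iteration {i+1}: expectancyR={result['result']['expectancyR']}")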

Best Practices

Avoid overfitting

Do not optimize parameters by running hundreds of variations on the same date range and picking the best. This is curve-fitting, not edge discovery.
Validate promising strategies on a separate out-of-sample date range before drawing conclusions.
Prefer strategies with a clear causal story over those that simply "look good on paper."
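
The out-of-sample check above can be prepared by splitting the available date range once, before any iteration starts. The sketch below is an illustration only: `split_date_range` is a hypothetical helper, and the 70/30 default is an arbitrary choice, not an EmidLabs recommendation.

```python
from datetime import date, timedelta


def split_date_range(start: date, end: date, in_sample_fraction: float = 0.7):
    """Split [start, end] into an in-sample and an out-of-sample range.

    Returns two (first_day, last_day) tuples that do not overlap.
    """
    if not 0.0 < in_sample_fraction < 1.0:
        raise ValueError("in_sample_fraction must be between 0 and 1")
    total_days = (end - start).days
    if total_days < 1:
        raise ValueError("end must be at least one day after start")
    # Last day of the in-sample range
    cut = start + timedelta(days=int(total_days * in_sample_fraction))
    in_sample = (start, cut)
    out_of_sample = (cut + timedelta(days=1), end)
    return in_sample, out_of_sample
```

Calling `isoformat()` on each date gives strings in the `2023-01-01` form used for `initialDate` and `finalDate`. Iterate only on the in-sample range and run the out-of-sample range once, at the end.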

Validate AI output before submission

Confirm that all inputs referenced in conditions are defined in the inputs block.
Confirm that all conditions referenced in score are defined in conditions.
Verify the JSON is valid before serializing it into the API payload.
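
A minimal pre-submission check might look like the sketch below. It covers only JSON validity and the five top-level keys; the reference checks (inputs used in conditions, conditions used in score) depend on the exact DSL shape, which is not shown on this page, so they are left out. `parse_strategy` is a hypothetical helper name.

```python
import json

REQUIRED_KEYS = ("configuration", "inputs", "conditions", "score", "decision")


def parse_strategy(raw_text: str) -> dict:
    """Parse model output and check the five top-level keys.

    Raises ValueError with a readable message so a malformed strategy
    is caught before it is submitted to the API.
    """
    try:
        strategy = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model output is not valid JSON: {exc}") from exc
    if not isinstance(strategy, dict):
        raise ValueError("Strategy must be a JSON object.")
    missing = [key for key in REQUIRED_KEYS if key not in strategy]
    if missing:
        raise ValueError("Missing top-level keys: " + ", ".join(missing))
    return strategy
```

The error message can be sent back to the model as the next prompt, which turns a validation failure into one more iteration instead of a crash.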

Use diagnostics to guide iteration

If a condition fires on more than 70% of candles, it's not selective. Tighten the threshold.
If a condition fires on fewer than 5% of candles, consider whether the sample is too small to distinguish a real signal from noise.
Focus on improving expectancyR, not just winRate. A 40% win rate with 1:3 RR still has positive expectancy: 0.4 × 3R − 0.6 × 1R = +0.6R per trade.
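
These rules of thumb are easy to apply in code. The sketch below uses the 70% and 5% thresholds from the list above. Both function names are hypothetical, and `expectancy_r` assumes a simplified model in which every win earns the full reward and every loss costs exactly 1R.

```python
def flag_conditions(distribution: dict, high: float = 0.70, low: float = 0.05) -> dict:
    """Label each condition by how often it fires (fraction of candles)."""
    flags = {}
    for name, rate in distribution.items():
        if rate > high:
            flags[name] = "not selective"
        elif rate < low:
            flags[name] = "possibly noise"
        else:
            flags[name] = "ok"
    return flags


def expectancy_r(win_rate: float, reward_to_risk: float) -> float:
    """Expected R per trade: wins earn `reward_to_risk` R, losses cost 1R."""
    return win_rate * reward_to_risk - (1.0 - win_rate)
```

Applied to the `conditionsDistributionPct` values from the iteration prompt example, only `volConfirm` (0.72) is flagged as not selective.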

AI limitations to acknowledge

AI does not have access to live market data. It reasons about market dynamics using training knowledge, which may be outdated or wrong.
AI-generated strategies may contain subtle logical errors (undefined references, circular inputs). Always validate.
AI cannot predict which strategies will be profitable. It can only generate and formalize hypotheses.