Artificial Analysis's GDP-Val metric: an aggregate measure combining quality and value. The specific methodology is not publicly documented.
Attempts to capture a cost-adjusted quality measure.
Very few data points (2 scores). Proprietary methodology. Score range (1606–1633) suggests a non-standard scale.
Higher is better. Non-standard scale (observed scores fall between 1606 and 1633).