Model Comparison Papers - 专知

Model Comparison

Improving the Accuracy of Amortized Model Comparison with Self-Consistency

arXiv

0+ reads · May 12

The Relative Instability of Model Comparison with Cross-validation

arXiv

0+ reads · June 4

Zero-Shot Parkinson's Disease Detection from Speech: Comparing Large Audio and Language Models

arXiv

0+ reads · May 24

On the Reliability of Cue Conflict and Beyond

arXiv

0+ reads · June 11

JAMMEval: A Refined Collection of Japanese Benchmarks for Reliable VLM Evaluation

arXiv

0+ reads · April 1

LLM BiasScope: A Real-Time Bias Analysis Platform for Comparative LLM Evaluation

arXiv

0+ reads · March 12

Earth Embeddings as Products: Taxonomy, Ecosystem, and Standardized Access

arXiv

0+ reads · February 24

The Relative Instability of Model Comparison with Cross-validation

arXiv

0+ reads · February 8

Nested Slice Sampling: Vectorized Nested Sampling for GPU-Accelerated Inference

arXiv

0+ reads · January 30

Improving the Accuracy of Amortized Model Comparison with Self-Consistency

arXiv

0+ reads · January 26

Improving the Accuracy of Amortized Model Comparison with Self-Consistency

arXiv

0+ reads · January 20

Improving the Accuracy of Amortized Model Comparison with Self-Consistency

arXiv

0+ reads · January 28

Efficient prior sensitivity analysis for Bayesian model comparison

arXiv

0+ reads · January 21

Earth Embeddings as Products: Taxonomy, Ecosystem, and Standardized Access

arXiv

0+ reads · January 19

Comparative Study of Large Language Models on Chinese Film Script Continuation: An Empirical Analysis Based on GPT-5.2 and Qwen-Max

arXiv

0+ reads · January 21
