The Impact of Input Order Bias on Large Language Models for Software Fault Localization
Md Nakhla Rafi and 3 other authors
Abstract: Large Language Models (LLMs) have shown significant potential in software engineering tasks such as Fault Localization (FL) and Automatic Program Repair (APR). This study investigates how input order and context size influence LLM performance in FL, a crucial step for many downstream software engineering tasks. We evaluate different method orderings using Kendall Tau distances, including “perfect” (where ground truths appear first) and “worst” (where ground truths appear last), across two benchmarks containing Java and Python projects. Our results reveal a strong order bias: in Java projects, Top-1 FL accuracy drops from 57% to 20% when the order is reversed, while in Python projects it decreases from 38% to approximately 3%. However, segmenting inputs into smaller contexts mitigates this bias, reducing the FL performance gap from 22% and 6% to just 1% across both benchmarks. To determine whether this bias stems from data leakage, we replaced method names with semantically meaningful alternatives; the observed trends remained consistent, suggesting that the bias is not caused by memorization from training data but rather by the inherent effect of input order. Additionally, we explored ordering methods based on traditional FL techniques and metrics, finding that DepGraph’s ranking achieves 48% Top-1 accuracy, outperforming simpler approaches such as CallGraph(DFS). These findings highlight the importance of structuring inputs, managing context effectively, and selecting appropriate ordering strategies to enhance LLM performance in FL and other software engineering applications.
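As a rough illustration of the ordering metric the abstract refers to, the sketch below computes a normalized Kendall tau distance between two method orderings, where the “perfect” ordering places ground-truth faulty methods first and the “worst” ordering reverses it. The method names and the helper function are hypothetical, chosen for illustration; they are not taken from the paper’s artifact.

```python
from itertools import combinations

def kendall_tau_distance(order_a, order_b):
    """Normalized Kendall tau distance between two orderings of the
    same items: the fraction of item pairs ranked in opposite relative
    order (0.0 = identical orderings, 1.0 = exact reversal)."""
    pos_a = {m: i for i, m in enumerate(order_a)}
    pos_b = {m: i for i, m in enumerate(order_b)}
    pairs = list(combinations(order_a, 2))
    # A pair is discordant if the two orderings disagree on which
    # of the two methods comes first.
    discordant = sum(
        (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
        for x, y in pairs
    )
    return discordant / len(pairs)

# Hypothetical example: ground-truth faulty method first vs. last.
methods = ["faulty_method", "helper_a", "helper_b", "helper_c"]
perfect = methods                    # ground truth appears first
worst = list(reversed(methods))      # ground truth appears last

print(kendall_tau_distance(perfect, perfect))  # 0.0
print(kendall_tau_distance(perfect, worst))    # 1.0
```

Under this metric, intermediate orderings fall between 0.0 and 1.0, which is how the study can compare orderings at varying distances from the “perfect” one.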
Submission history
From: Md Nakhla Rafi
[v1] Wed, 25 Dec 2024 02:48:53 UTC (1,962 KB)
[v2] Wed, 19 Mar 2025 16:08:36 UTC (3,200 KB)
[v3] Mon, 23 Jun 2025 15:51:16 UTC (1,073 KB)