Skip to content

Commit afff5d5

Browse files
authored
refactor: SortedKeys sweep, deduplicate dedup logic, redistribute misplaced helpers (#41829)
1 parent a443203 commit afff5d5

67 files changed

Lines changed: 351 additions & 725 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.
Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,49 @@
1+
# ADR-41829: Consolidate `sliceutil.SortedKeys` as Canonical Map-Key Utility and Redistribute Misplaced Helpers
2+
3+
**Date**: 2026-06-27
4+
**Status**: Draft
5+
**Deciders**: Unknown (automated refactor by copilot-swe-agent)
6+
7+
---
8+
9+
### Context
10+
11+
The codebase had accumulated a widespread manual pattern for collecting and sorting map keys: `make → append → sort.Strings`. This idiom was duplicated at 40+ sites across `pkg/cli`, `pkg/workflow`, and `pkg/parser`, even though `sliceutil.SortedKeys` already existed and served this exact purpose. Separately, several general-purpose helper functions had drifted into files where they did not belong semantically — for example, `isDescendant` (a generic YAML-indent helper) lived in `codemod_github_app.go`, and safe-output helpers accumulated in `notify_comment.go`. This created maintenance friction: new contributors reinvented the utility, and unrelated helpers were hard to discover in their misplaced locations.
12+
13+
### Decision
14+
15+
We will adopt `sliceutil.SortedKeys` as the single canonical way to collect and sort map keys throughout the codebase, replacing all manual `make → append → sort.Strings` occurrences. Concurrently, we will enforce module cohesion by relocating misplaced helper functions to files whose names reflect their domain — generic YAML helpers to `yaml_frontmatter_utils.go`, engine-to-API-host logic to `engine_api_targets.go`, and safe-output env-var construction to `safe_outputs_env.go`. The duplicate `toEnvVarCase` function is removed in favour of the existing `normalizeJobNameForEnvVar`, which covers the same cases.
16+
17+
### Alternatives Considered
18+
19+
#### Alternative 1: Keep the manual `make → append → sort.Strings` idiom
20+
21+
Every callsite continues to inline the three-step collect-then-sort pattern. This requires no refactor effort and avoids touching 59 files. Rejected because the duplication had already reached 40+ sites and was growing; new callsites were consistently reinventing the pattern rather than discovering the existing utility, confirming the drift would continue unchecked.
22+
23+
#### Alternative 2: Use stdlib `maps.Keys` + `slices.Sort` (Go 1.21+)
24+
25+
Replace the manual idiom with `maps.Keys(m)` followed by `slices.Sort(...)`. This is idiomatic stdlib and requires no project-specific utility. Rejected because `sliceutil.SortedKeys` is already established in the project and composes both steps in one call, reducing boilerplate to a single expression. Migrating to a new stdlib pair would only exchange one two-line idiom for another without adding the readability gain of the single-call form; it would also invalidate any existing usages of `sliceutil.SortedKeys` elsewhere.
26+
27+
#### Alternative 3: Address only the deduplication fix in `cache_integrity.go`; leave the sweep for a later PR
28+
29+
A narrow fix to the inconsistency in `canonicalReposScope` (which hand-rolled dedup while `sliceutil.Deduplicate` was called on the same file) could be merged independently. Rejected because the broader sweep was machine-generated and low-risk; deferring it would leave the 40+ duplicated sites in place with no clear owner to address them later.
30+
31+
### Consequences
32+
33+
#### Positive
34+
- 725 lines deleted, 302 added — net −423 lines of boilerplate across 66 files, improving signal-to-noise ratio in the codebase.
35+
- A single canonical pattern for sorted map iteration eliminates the "which way should I do this?" question for contributors.
36+
- Misplaced helpers are now co-located with semantically related code, making them discoverable and reducing surprise when reading individual files.
37+
- Removing the `toEnvVarCase` duplicate eliminates a potential divergence point if the normalization logic ever needs to change.
38+
39+
#### Negative
40+
- `sliceutil` is now a transitive import dependency for many more packages; any future breaking change to `SortedKeys` (signature, behaviour, or package path) requires updating all 59+ files.
41+
- The sweep is mechanical and broad; reviewers must verify that no callsite had intentional non-lexicographic ordering or special nil-vs-empty semantics that `SortedKeys` does not preserve (e.g., `update_container_pins.go` adds an explicit `nil → []string{}` guard after the replacement).
42+
43+
#### Neutral
44+
- The `import` block in many files changes from `"sort"` to `"github.com/github/gh-aw/pkg/sliceutil"`; tooling (goimports, linters) must be aware of the project-internal package path.
45+
- Function relocation changes which file a `git blame` points to for `isDescendant`, `isExpressionValue`, `getEngineAPIHosts`, `buildSafeOutputJobsEnvVars`, and `systemSafeOutputJobNames` — history is preserved via `git log --follow`.
46+
47+
---
48+
49+
*ADR created by [adr-writer agent]. Review and finalize before changing status from Draft to Accepted.*

pkg/cli/audit_comparison.go

Lines changed: 2 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -8,11 +8,11 @@ import (
88
"os"
99
"path/filepath"
1010
"slices"
11-
"sort"
1211
"strings"
1312

1413
"github.com/github/gh-aw/pkg/constants"
1514
"github.com/github/gh-aw/pkg/logger"
15+
"github.com/github/gh-aw/pkg/sliceutil"
1616
"github.com/github/gh-aw/pkg/workflow"
1717
)
1818

@@ -466,11 +466,7 @@ func collectMCPFailureServers(failures []MCPFailureReport) []string {
466466
serverSet[failure.ServerName] = struct{}{}
467467
}
468468

469-
servers := make([]string, 0, len(serverSet))
470-
for server := range serverSet {
471-
servers = append(servers, server)
472-
}
473-
sort.Strings(servers)
469+
servers := sliceutil.SortedKeys(serverSet)
474470
return servers
475471
}
476472

pkg/cli/audit_cross_run.go

Lines changed: 3 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
11
package cli
22

33
import (
4-
"sort"
54
"time"
65

76
"github.com/github/gh-aw/pkg/logger"
7+
"github.com/github/gh-aw/pkg/sliceutil"
88
"github.com/github/gh-aw/pkg/stats"
99
)
1010

@@ -327,11 +327,7 @@ func buildCrossRunMCPHealth(mcpServerMap map[string]*crossRunMCPServerAgg, total
327327

328328
auditCrossRunLog.Printf("Building MCP health for %d server(s) across %d run(s)", len(mcpServerMap), totalRuns)
329329

330-
sortedServers := make([]string, 0, len(mcpServerMap))
331-
for name := range mcpServerMap {
332-
sortedServers = append(sortedServers, name)
333-
}
334-
sort.Strings(sortedServers)
330+
sortedServers := sliceutil.SortedKeys(mcpServerMap)
335331

336332
health := make([]MCPServerCrossRunHealth, 0, len(sortedServers))
337333
for _, name := range sortedServers {
@@ -370,11 +366,7 @@ func buildCrossRunDomainInventory(
370366
domainMap map[string]*crossRunDomainAgg,
371367
runIDs []int64,
372368
) {
373-
sortedDomains := make([]string, 0, len(domainMap))
374-
for domain := range domainMap {
375-
sortedDomains = append(sortedDomains, domain)
376-
}
377-
sort.Strings(sortedDomains)
369+
sortedDomains := sliceutil.SortedKeys(domainMap)
378370

379371
auditCrossRunLog.Printf("Building domain inventory: %d unique domain(s) across %d run(s)", len(sortedDomains), len(runIDs))
380372

pkg/cli/audit_diff.go

Lines changed: 5 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -6,12 +6,12 @@ import (
66
"fmt"
77
"math"
88
"path/filepath"
9-
"sort"
109
"strings"
1110
"time"
1211

1312
"github.com/github/gh-aw/pkg/constants"
1413
"github.com/github/gh-aw/pkg/logger"
14+
"github.com/github/gh-aw/pkg/sliceutil"
1515
)
1616

1717
var auditDiffLog = logger.New("cli:audit_diff")
@@ -92,11 +92,7 @@ func computeFirewallDiff(run1ID, run2ID int64, run1, run2 *FirewallAnalysis) *Fi
9292
}
9393

9494
// Sorted domain list for deterministic output
95-
sortedDomains := make([]string, 0, len(allDomains))
96-
for domain := range allDomains {
97-
sortedDomains = append(sortedDomains, domain)
98-
}
99-
sort.Strings(sortedDomains)
95+
sortedDomains := sliceutil.SortedKeys(allDomains)
10096

10197
anomalyCount := 0
10298

@@ -444,11 +440,7 @@ func computeMCPToolsDiff(run1, run2 *MCPToolUsageData) *MCPToolsDiff {
444440
allKeys[k] = struct{}{}
445441
}
446442

447-
sortedKeys := make([]string, 0, len(allKeys))
448-
for k := range allKeys {
449-
sortedKeys = append(sortedKeys, k)
450-
}
451-
sort.Strings(sortedKeys)
443+
sortedKeys := sliceutil.SortedKeys(allKeys)
452444

453445
diff := &MCPToolsDiff{}
454446
anomalyCount := 0
@@ -649,11 +641,7 @@ func computeToolCallsDiff(m1, m2 *LogMetrics) *ToolCallsDiff {
649641
allNames[k] = struct{}{}
650642
}
651643

652-
sortedNames := make([]string, 0, len(allNames))
653-
for k := range allNames {
654-
sortedNames = append(sortedNames, k)
655-
}
656-
sort.Strings(sortedNames)
644+
sortedNames := sliceutil.SortedKeys(allNames)
657645

658646
diff := &ToolCallsDiff{}
659647
var run1Total, run2Total int
@@ -756,11 +744,7 @@ func computeBashCommandsDiff(run1Tools, run2Tools map[string]ToolCallInfo) *Bash
756744
return nil
757745
}
758746

759-
sortedNames := make([]string, 0, len(allNames))
760-
for k := range allNames {
761-
sortedNames = append(sortedNames, k)
762-
}
763-
sort.Strings(sortedNames)
747+
sortedNames := sliceutil.SortedKeys(allNames)
764748

765749
bashDiff := &BashCommandsDiff{}
766750
for _, name := range sortedNames {

pkg/cli/audit_expanded.go

Lines changed: 2 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -7,12 +7,12 @@ import (
77
"os"
88
"path/filepath"
99
"slices"
10-
"sort"
1110
"strings"
1211
"time"
1312

1413
"github.com/github/gh-aw/pkg/fileutil"
1514
"github.com/github/gh-aw/pkg/logger"
15+
"github.com/github/gh-aw/pkg/sliceutil"
1616
"github.com/github/gh-aw/pkg/timeutil"
1717
"github.com/github/gh-aw/pkg/workflow"
1818
)
@@ -662,10 +662,6 @@ func extractMCPServerNamesFromAwInfo(logsPath string) ([]string, bool) {
662662
return nil, false
663663
}
664664

665-
names := make([]string, 0, len(servers))
666-
for name := range servers {
667-
names = append(names, name)
668-
}
669-
sort.Strings(names)
665+
names := sliceutil.SortedKeys(servers)
670666
return names, len(names) > 0
671667
}

pkg/cli/audit_report_experiments.go

Lines changed: 4 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -10,11 +10,11 @@ import (
1010
"fmt"
1111
"os"
1212
"path/filepath"
13-
"sort"
1413
"strings"
1514

1615
"github.com/github/gh-aw/pkg/constants"
1716
"github.com/github/gh-aw/pkg/logger"
17+
"github.com/github/gh-aw/pkg/sliceutil"
1818
)
1919

2020
var experimentDataLog = logger.New("cli:audit_report_experiments")
@@ -112,11 +112,7 @@ func extractExperimentData(logsPath string) *ExperimentData {
112112
// Derive this-run assignments: the variant selected on the most-recent run is
113113
// the one with the maximum count (ties resolved by sorted order).
114114
assignments := make(map[string]string, len(state.Counts))
115-
names := make([]string, 0, len(state.Counts))
116-
for name := range state.Counts {
117-
names = append(names, name)
118-
}
119-
sort.Strings(names)
115+
names := sliceutil.SortedKeys(state.Counts)
120116

121117
for _, name := range names {
122118
variantCounts := state.Counts[name]
@@ -145,11 +141,7 @@ func formatExperimentLabel(exp *ExperimentData) string {
145141
return ""
146142
}
147143

148-
names := make([]string, 0, len(exp.Assignments))
149-
for name := range exp.Assignments {
150-
names = append(names, name)
151-
}
152-
sort.Strings(names)
144+
names := sliceutil.SortedKeys(exp.Assignments)
153145

154146
parts := make([]string, 0, len(names))
155147
for _, name := range names {
@@ -196,11 +188,7 @@ func deriveLastSelectedVariant(variantCounts map[string]int) string {
196188
return ""
197189
}
198190

199-
variants := make([]string, 0, len(variantCounts))
200-
for v := range variantCounts {
201-
variants = append(variants, v)
202-
}
203-
sort.Strings(variants)
191+
variants := sliceutil.SortedKeys(variantCounts)
204192

205193
selected := variants[0]
206194
maxCount := variantCounts[selected]

pkg/cli/codemod_dependabot_permissions.go

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@ import (
66
"strings"
77

88
"github.com/github/gh-aw/pkg/logger"
9+
"github.com/github/gh-aw/pkg/sliceutil"
910
"github.com/github/gh-aw/pkg/workflow"
1011
)
1112

@@ -241,10 +242,6 @@ func sortedMissingPermissionKeys(missing map[workflow.PermissionScope]workflow.P
241242
}
242243

243244
func sortedRemainingPermissionKeys(remaining map[string]workflow.PermissionLevel) []string {
244-
keys := make([]string, 0, len(remaining))
245-
for key := range remaining {
246-
keys = append(keys, key)
247-
}
248-
sort.Strings(keys)
245+
keys := sliceutil.SortedKeys(remaining)
249246
return keys
250247
}

pkg/cli/codemod_effective_tokens_to_ai_credits.go

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -137,10 +137,6 @@ func convertEffectiveTokensToAICredits(effectiveTokens int) (string, bool) {
137137
return strconv.Itoa(aiCredits), true
138138
}
139139

140-
func isExpressionValue(value string) bool {
141-
return strings.Contains(value, "${{") && strings.Contains(value, "}}")
142-
}
143-
144140
func rewriteTopLevelScalarLine(line, newKey, normalizedValue string) string {
145141
indent := getIndentation(line)
146142
parts := strings.SplitN(line, ":", 2)

pkg/cli/codemod_github_app.go

Lines changed: 0 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -179,10 +179,3 @@ func renameAppToGitHubApp(lines []string) ([]string, bool) {
179179

180180
return result, modified
181181
}
182-
183-
// isDescendant returns true if childIndent is deeper (more indented) than parentIndent.
184-
// It is used as a "belongs to this block" check — any line more indented than the parent
185-
// is treated as being within the parent's scope.
186-
func isDescendant(childIndent, parentIndent string) bool {
187-
return len(childIndent) > len(parentIndent)
188-
}

pkg/cli/experiments_analyze_statistics.go

Lines changed: 2 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -4,11 +4,11 @@ import (
44
"fmt"
55
"math"
66
"os"
7-
"sort"
87
"strings"
98

109
"github.com/github/gh-aw/pkg/console"
1110
"github.com/github/gh-aw/pkg/logger"
11+
"github.com/github/gh-aw/pkg/sliceutil"
1212
"github.com/github/gh-aw/pkg/workflow"
1313
)
1414

@@ -130,11 +130,7 @@ func computeExperimentAnalysis(exp ExperimentVariantStats, cfg *workflow.Experim
130130
}
131131

132132
// Collect variant names in alphabetical order for deterministic output.
133-
variantNames := make([]string, 0, len(exp.Variants))
134-
for name := range exp.Variants {
135-
variantNames = append(variantNames, name)
136-
}
137-
sort.Strings(variantNames)
133+
variantNames := sliceutil.SortedKeys(exp.Variants)
138134

139135
// Compute expected proportions (weighted or equal split).
140136
expectedPcts := expectedProportions(variantNames, cfg)

0 commit comments

Comments
 (0)