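When an engineering organization grows from a few dozen people to more than a thousand, and its microservices from a dozen to several thousand, software supply chain security can no longer be handled with manual audits or ad hoc scripts. The first concrete challenge we faced was that we had no unified view from which to answer, in real time, a question like "Which of our services are running log4j-core-2.14.1, which has a critical vulnerability?" Chaotic dependency management and stale security information had become a sword of Damocles hanging over us.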
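Our first attempt was to have Jenkins periodically run `npm audit` or `trivy fs` and write the results as JSON files to object storage. This was barely workable with a small number of projects, and it soon showed fatal flaws: no aggregate queries, no way to track the history of vulnerability fixes, and no way to correlate dependencies shared across projects. We needed a real platform: a central system that could store, analyze, and visualize dependency scan data at scale.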
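Technology Selection Trade-offs
Building a platform like this comes down to two things: data storage and front-end presentation.
Data backend: why TiDB rather than traditional MySQL or PostgreSQL?
A single full dependency scan of a medium-sized project can produce thousands of dependency records and hundreds of vulnerability entries. With thousands of projects and a requirement to keep the scan history of every build, the data easily reaches billions or even more than ten billion rows within a few years.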
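- MySQL/PostgreSQL: At the ten-billion-row scale, query performance on a single-node relational database degrades sharply. We would have to adopt a complex sharding scheme, which brings heavy operational cost, complicated cross-shard queries, and painful re-sharding whenever capacity has to grow.
- TiDB: As a distributed NewSQL database, it scales horizontally by design. We can use it much like a single-node MySQL instance, while data sharding, load balancing, and high availability are handled automatically by the storage layer (TiKV) together with the Placement Driver (PD). Our future needs go beyond storing data and include complex online analytical queries (for example, "the distribution trend of high-severity vulnerability types introduced by the front-end team over the past six months"), so TiDB's HTAP (Hybrid Transactional/Analytical Processing) capability is a major plus.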
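To make the scale estimate above concrete, here is a rough calculation. All three inputs are assumed figures chosen for illustration (the names `GrowthAssumptions` and `rowsAfterDays` are hypothetical too), so treat the result as an order-of-magnitude check rather than a measurement.

```typescript
// Back-of-envelope row growth. Every input below is an assumed figure
// for illustration, not a measurement from the platform.
interface GrowthAssumptions {
  projects: number;              // repositories being scanned
  scansPerProjectPerDay: number; // builds that trigger a scan
  rowsPerScan: number;           // dependency rows plus vulnerability rows
}

function rowsAfterDays(a: GrowthAssumptions, days: number): number {
  return a.projects * a.scansPerProjectPerDay * a.rowsPerScan * days;
}

const assumed: GrowthAssumptions = {
  projects: 2000,
  scansPerProjectPerDay: 2,
  rowsPerScan: 2500,
};

const perDay = rowsAfterDays(assumed, 1);           // 10,000,000 rows per day
const threeYears = rowsAfterDays(assumed, 365 * 3); // 10,950,000,000 rows
console.log(perDay, threeYears);
```

Under these assumptions the tables pass ten billion rows in about three years, which is the range where a single-node database stops being comfortable.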
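Front-end visualization: Angular, Storybook, and an unconventional approach to state management
The core front-end challenge is presenting a complex, high-density dependency graph. A single project's graph can contain thousands of nodes and edges, and users need to zoom, pan, search for nodes, and highlight vulnerability paths.
Angular: The team's stack is mainly Angular, and its dependency injection and module system suit a large internal platform that has to be maintained for years.
Storybook: The dependency graph component is itself extremely complex, with a lot of interaction logic and edge cases. Developing it inside the full application would make debugging and testing a nightmare. We decided to use Storybook from the start and to develop, test, and document the graph component as an isolated unit.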
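State management (the Jotai philosophy applied in Angular): The dependency graph's state is highly fragmented yet interrelated: the currently highlighted node, the active filters (such as vulnerability severity), the zoom level, the node layout algorithm, and so on. In conventional Angular we might create one huge `GraphStateService` that manages all of this with a pile of `BehaviorSubject`s. In a real project that quickly turns into an unmaintainable "god service". Inspired by Jotai from the React ecosystem, we decided not to use the library itself but to implement its core idea in Angular: **atomic state**. Each piece of state is an independent "atom", and a component subscribes only to the atoms it actually cares about. This can be done by wrapping an RxJS `BehaviorSubject` in a set of small, independent, injectable services, which avoids over-centralized state management.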
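The atom idea itself fits in a few lines. The sketch below is dependency-free so it can run on its own; it is not the Angular implementation, which wraps `BehaviorSubject` in injectable services. The names `Atom`, `zoomAtom`, and `highlightedNodeAtom` are hypothetical.

```typescript
type Listener<T> = (value: T) => void;

// Hypothetical minimal atom: holds one value and notifies subscribers on change.
class Atom<T> {
  private listeners = new Set<Listener<T>>();

  constructor(private value: T) {}

  get(): T {
    return this.value;
  }

  set(next: T): void {
    if (Object.is(next, this.value)) return; // skip no-op updates
    this.value = next;
    this.listeners.forEach((listener) => listener(next));
  }

  // Emits the current value immediately (as BehaviorSubject does)
  // and returns an unsubscribe function.
  subscribe(listener: Listener<T>): () => void {
    this.listeners.add(listener);
    listener(this.value);
    return () => {
      this.listeners.delete(listener);
    };
  }
}

// Two independent atoms: a change to one never wakes subscribers of the other.
const zoomAtom = new Atom<number>(1.0);
const highlightedNodeAtom = new Atom<string | null>(null);

const zoomLog: number[] = [];
const unsubscribe = zoomAtom.subscribe((z) => zoomLog.push(z));

highlightedNodeAtom.set("lodash"); // the zoom subscriber is not notified
zoomAtom.set(1.5);
unsubscribe();
zoomAtom.set(2.0); // no longer observed

console.log(zoomLog); // [1, 1.5]
```

The point of the exercise is the isolation: a component that only renders the zoom level never re-runs when the highlighted node changes.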
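Database Model and Back-End Implementation
The TiDB table design is the foundation of the whole system. It must let us store and query the relationships among projects, scan records, dependencies, and vulnerabilities efficiently.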
-- `projects`: stores code repository information
CREATE TABLE `projects` (
    `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    `project_name` VARCHAR(255) NOT NULL COMMENT 'Project name, e.g., my-awesome-app',
    `repo_url` VARCHAR(512) NOT NULL COMMENT 'Code repository URL',
    `created_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    `updated_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`),
    UNIQUE KEY `uk_repo_url` (`repo_url`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
-- `scans`: stores one record per scan
CREATE TABLE `scans` (
    `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    `project_id` BIGINT(20) UNSIGNED NOT NULL COMMENT 'ID of the associated project',
    `scan_uuid` VARCHAR(36) NOT NULL COMMENT 'Unique identifier of a single scan',
    `status` ENUM('PENDING', 'RUNNING', 'SUCCESS', 'FAILED') NOT NULL DEFAULT 'PENDING',
    `scanner_version` VARCHAR(50) DEFAULT NULL COMMENT 'Version of the scanner used',
    `scanned_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    `error_message` TEXT DEFAULT NULL,
    PRIMARY KEY (`id`),
    UNIQUE KEY `uk_scan_uuid` (`scan_uuid`),
    KEY `idx_project_id_scanned_at` (`project_id`, `scanned_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
-- `dependencies`: stores the dependency relationships found by a scan (flattened)
-- parent_id is used to rebuild the dependency tree
CREATE TABLE `dependencies` (
    `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    `scan_id` BIGINT(20) UNSIGNED NOT NULL COMMENT 'ID of the associated scan',
    `name` VARCHAR(255) NOT NULL COMMENT 'Dependency package name',
    `version` VARCHAR(100) NOT NULL COMMENT 'Dependency package version',
    `parent_id` BIGINT(20) UNSIGNED DEFAULT NULL COMMENT 'Parent dependency ID, forms the tree structure',
    PRIMARY KEY (`id`),
    KEY `idx_scan_id` (`scan_id`),
    KEY `idx_name_version` (`name`, `version`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
-- `vulnerabilities`: stores the vulnerabilities that were found
CREATE TABLE `vulnerabilities` (
    `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    `dependency_id` BIGINT(20) UNSIGNED NOT NULL COMMENT 'ID of the associated dependency',
    `vuln_id` VARCHAR(100) NOT NULL COMMENT 'Vulnerability identifier, e.g., CVE-2021-44228',
    `severity` ENUM('UNKNOWN', 'LOW', 'MEDIUM', 'HIGH', 'CRITICAL') NOT NULL,
    `title` VARCHAR(512) NOT NULL COMMENT 'Vulnerability title',
    `fixed_version` VARCHAR(100) DEFAULT NULL COMMENT 'Version that contains the fix',
    `data_source` VARCHAR(100) DEFAULT NULL COMMENT 'Vulnerability data source',
    PRIMARY KEY (`id`),
    KEY `idx_dependency_id` (`dependency_id`),
    KEY `idx_vuln_id_severity` (`vuln_id`, `severity`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
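With these four tables, the question that motivated the platform becomes a single join. The sketch below builds that query as a parameterized statement; `buildAffectedProjectsQuery` and `SqlQuery` are hypothetical names, and the package name passed in is an assumption, since the exact string stored in `dependencies.name` depends on how the scanner reports it.

```typescript
interface SqlQuery {
  text: string;
  params: string[];
}

// Hypothetical helper: which projects contain name@version in their
// most recent successful scan?
function buildAffectedProjectsQuery(name: string, version: string): SqlQuery {
  const text = `
SELECT DISTINCT p.project_name, p.repo_url
FROM dependencies AS d
JOIN scans AS s ON s.id = d.scan_id
JOIN projects AS p ON p.id = s.project_id
WHERE d.name = ?
  AND d.version = ?
  AND s.status = 'SUCCESS'
  AND s.scanned_at = (
    SELECT MAX(s2.scanned_at)
    FROM scans AS s2
    WHERE s2.project_id = s.project_id
      AND s2.status = 'SUCCESS'
  )`.trim();
  return { text, params: [name, version] };
}

// Assumed package name; a Maven-aware scanner may report a group-qualified name.
const query = buildAffectedProjectsQuery("log4j-core", "2.14.1");
console.log(query.text);
console.log(query.params);
```

The equality filter on `name` and `version` can use `idx_name_version`, and the correlated subquery can use `idx_project_id_scanned_at`, which is why those two indexes are part of the schema.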
The core of the back-end scan service (written in Go, for example) is a consumer that receives scan tasks from a message queue such as Kafka.
package main

import (
	"context"
	"database/sql"
	"encoding/json"
	"fmt"
	"log"
	"os/exec"
	"time"

	_ "github.com/go-sql-driver/mysql" // TiDB is MySQL protocol compatible
	"github.com/google/uuid"
)

// ScanRequest represents a message from the queue
type ScanRequest struct {
	ProjectID uint64 `json:"project_id"`
	RepoURL   string `json:"repo_url"`
	CommitSHA string `json:"commit_sha"`
}

// TrivyResult represents the parsed JSON output from Trivy scanner
type TrivyResult struct {
	// ... struct fields matching Trivy's JSON output format
	// This part is complex and needs careful mapping.
}

func main() {
	// Database connection setup (DSN for TiDB)
	dsn := "root:@tcp(tidb-host:4000)/supply_chain_db?charset=utf8mb4"
	db, err := sql.Open("mysql", dsn)
	if err != nil {
		log.Fatalf("Invalid TiDB DSN: %v", err)
	}
	defer db.Close()
	db.SetConnMaxLifetime(time.Minute * 3)
	db.SetMaxOpenConns(10)
	db.SetMaxIdleConns(10)

	// sql.Open only validates its arguments; Ping forces a real connection.
	if err := db.Ping(); err != nil {
		log.Fatalf("Failed to connect to TiDB: %v", err)
	}

	// Simplified: process one request. In reality, this would be a loop consuming from a queue.
	req := ScanRequest{ProjectID: 1, RepoURL: "https://github.com/some/repo.git"}
	processScan(context.Background(), db, req)
}

func processScan(ctx context.Context, db *sql.DB, req ScanRequest) {
	scanUUID := uuid.New().String()

	// 1. Create a scan record in 'PENDING' state
	res, err := db.ExecContext(ctx, "INSERT INTO scans (project_id, scan_uuid, status) VALUES (?, ?, 'PENDING')", req.ProjectID, scanUUID)
	if err != nil {
		log.Printf("ERROR: Failed to create scan record for project %d: %v", req.ProjectID, err)
		return
	}
	scanID, err := res.LastInsertId()
	if err != nil {
		log.Printf("ERROR: Failed to read the ID of scan %s: %v", scanUUID, err)
		return
	}

	// markFailed records the failure on the scan row so it never stays in 'RUNNING'.
	markFailed := func(errMsg string) {
		log.Println(errMsg)
		if _, err := db.ExecContext(ctx, "UPDATE scans SET status = 'FAILED', error_message = ? WHERE id = ?", errMsg, scanID); err != nil {
			log.Printf("ERROR: Failed to mark scan %s as FAILED: %v", scanUUID, err)
		}
	}

	// Update status to 'RUNNING'
	if _, err := db.ExecContext(ctx, "UPDATE scans SET status = 'RUNNING' WHERE id = ?", scanID); err != nil {
		markFailed(fmt.Sprintf("Failed to mark scan as RUNNING: %v", err))
		return
	}

	// 2. In a real system: clone the repo to a temporary directory
	//    For this example, we assume the code is in `/tmp/repo`

	// 3. Execute the scanner
	//    The command must output JSON to stdout for parsing
	cmd := exec.CommandContext(ctx, "trivy", "fs", "--format", "json", "/tmp/repo")
	output, err := cmd.Output()
	if err != nil {
		// Handle scanner execution error, update scan record to 'FAILED'
		markFailed(fmt.Sprintf("Scanner execution failed: %v, output: %s", err, string(output)))
		return
	}

	// 4. Parse the result
	var result TrivyResult
	if err := json.Unmarshal(output, &result); err != nil {
		// Handle JSON parsing error
		markFailed(fmt.Sprintf("Failed to parse scanner output: %v", err))
		return
	}

	// 5. Persist results to TiDB within a transaction
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		markFailed(fmt.Sprintf("Failed to begin transaction: %v", err))
		return
	}
	defer tx.Rollback() // Rolls back on early return; a no-op after a successful Commit

	// Here lies the complex logic to traverse the parsed result,
	// and insert into `dependencies` and `vulnerabilities` tables.
	// This logic must handle dependency trees correctly by saving parent IDs.
	// e.g., insertDependencies(tx, scanID, result.Packages)

	// If all successful, commit the transaction
	if err := tx.Commit(); err != nil {
		markFailed(fmt.Sprintf("Failed to commit transaction for scan %s: %v", scanUUID, err))
		return
	}

	// 6. Finalize scan status
	if _, err := db.ExecContext(ctx, "UPDATE scans SET status = 'SUCCESS' WHERE id = ?", scanID); err != nil {
		log.Printf("ERROR: Failed to mark scan %s as SUCCESS: %v", scanUUID, err)
		return
	}
	log.Printf("Successfully completed scan %s for project %d", scanUUID, req.ProjectID)
}
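The key to this back-end service is robustness: thorough error handling and transactional guarantees when writing to the database. No step may leave dirty data behind when it fails.
Front-End Architecture: Atomic State and Isolated Component Development
The heart of the front end is `DependencyGraphComponent`. How its state is managed determines whether it succeeds or fails.
Atomic state services in the Jotai style
Instead of one large `GraphStateService`, we create a series of tiny injectable services, each managing a single state atom.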
// in state/graph-zoom.state.ts
import { Injectable } from '@angular/core';
import { BehaviorSubject } from 'rxjs';

@Injectable({ providedIn: 'root' })
export class GraphZoomState {
    private readonly zoomLevelSubject = new BehaviorSubject<number>(1.0);
    public readonly zoomLevel$ = this.zoomLevelSubject.asObservable();

    setZoom(level: number) {
        // Add validation or constraints if needed
        this.zoomLevelSubject.next(level);
    }

    getCurrentZoom(): number {
        return this.zoomLevelSubject.value;
    }
}

// in state/graph-filter.state.ts
import { Injectable } from '@angular/core';
import { BehaviorSubject } from 'rxjs';

export interface GraphFilter {
    severity: ('HIGH' | 'CRITICAL' | 'MEDIUM' | 'LOW')[];
    textSearch: string;
}

const INITIAL_FILTER: GraphFilter = { severity: ['CRITICAL', 'HIGH'], textSearch: '' };

@Injectable({ providedIn: 'root' })
export class GraphFilterState {
    private readonly filterSubject = new BehaviorSubject<GraphFilter>(INITIAL_FILTER);
    public readonly filter$ = this.filterSubject.asObservable();

    updateFilter(partialFilter: Partial<GraphFilter>) {
        // Emit a new object so subscribers see a changed reference
        const newFilter = { ...this.filterSubject.value, ...partialFilter };
        this.filterSubject.next(newFilter);
    }
}
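Implementing the dependency graph component
The graph component obtains these atomic state services through dependency injection and combines their Observables to drive view updates.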
// in dependency-graph.component.ts
import { Component, OnInit, OnDestroy } from '@angular/core';
import { combineLatest, Subject } from 'rxjs';
import { takeUntil, map } from 'rxjs/operators';
// Hypothetical local wrapper class around a D3 force-directed layout
import { D3ForceDirectedGraph } from './d3-force-directed-graph';
import { GraphDataService } from './services/graph-data.service'; // Fetches graph data from API
import { GraphZoomState } from './state/graph-zoom.state';
import { GraphFilterState, GraphFilter } from './state/graph-filter.state';

@Component({
    selector: 'app-dependency-graph',
    template: `<div id="graph-container"></div>`, // A container for D3.js to render into
    // ... styles
})
export class DependencyGraphComponent implements OnInit, OnDestroy {
    private graph?: D3ForceDirectedGraph; // Created lazily on the first render
    private nodes: any[] = [];
    private links: any[] = [];
    private destroy$ = new Subject<void>();

    constructor(
        private graphDataService: GraphDataService,
        // Injecting atomic state services
        public zoomState: GraphZoomState,
        public filterState: GraphFilterState
    ) {}

    ngOnInit() {
        // Combine raw data stream with filter stream
        const filteredData$ = combineLatest([
            this.graphDataService.getGraphDataForScan('some-scan-uuid'), // This fetches the raw graph
            this.filterState.filter$
        ]).pipe(
            map(([graphData, filter]) => this.applyFilter(graphData, filter))
        );

        // Subscribe to the combined stream to update the graph
        filteredData$.pipe(takeUntil(this.destroy$)).subscribe(({ nodes, links }) => {
            this.nodes = nodes;
            this.links = links;
            this.renderGraph();
        });

        // Subscribe to zoom state separately
        this.zoomState.zoomLevel$.pipe(takeUntil(this.destroy$)).subscribe(zoom => {
            if (this.graph) {
                // Logic to apply zoom to the D3 visualization
            }
        });
    }

    private applyFilter(graphData: any, filter: GraphFilter): { nodes: any[], links: any[] } {
        // Production-level filtering logic is complex.
        // It needs to filter nodes based on severity and text search,
        // and then correctly prune the links and potentially orphaned nodes.
        // This is a common place for performance bottlenecks.
        console.log('Applying filter:', filter);
        const visibleNodes = graphData.nodes.filter((node: any) =>
            (filter.severity.length === 0 || node.vulnerabilities.some((v: any) => filter.severity.includes(v.severity))) &&
            node.name.includes(filter.textSearch)
        );
        const visibleNodeIds = new Set(visibleNodes.map((n: any) => n.id));
        // D3's force layout replaces link.source/target ids with node objects
        // once a simulation has been initialized, so accept both forms.
        const idOf = (end: any) => (typeof end === 'object' ? end.id : end);
        const visibleLinks = graphData.links.filter((link: any) =>
            visibleNodeIds.has(idOf(link.source)) && visibleNodeIds.has(idOf(link.target))
        );
        return { nodes: visibleNodes, links: visibleLinks };
    }

    private renderGraph() {
        // Logic to initialize or update the D3.js force-directed graph
        // with this.nodes and this.links. This is a substantial piece of code
        // involving D3 selections, transitions, and event handlers.
        if (!this.graph) {
            // Initialize graph
        } else {
            // Update graph
        }
        console.log(`Rendering graph with ${this.nodes.length} nodes and ${this.links.length} links.`);
    }

    ngOnDestroy() {
        this.destroy$.next();
        this.destroy$.complete();
    }
}
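The advantage of this architecture is that `DependencyGraphComponent` has a clear responsibility: it is only an orchestrator that consumes state streams and calls the rendering logic. Where state comes from and how it changes is spread across independent, testable atomic state services.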
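One gap in the simple node filter is that it drops every clean ancestor of a vulnerable package, so the path from the application down to the vulnerability disappears. A possible way to keep those paths, sketched here as a pure function outside Angular, is to walk the links upward from each matching node. `nodesOnVulnerablePaths` and `GraphLink` are hypothetical names, and the sketch assumes links still carry plain string ids.

```typescript
interface GraphLink {
  source: string; // id of the dependent (parent) node
  target: string; // id of the dependency (child) node
}

// Returns the matched node ids plus the ids of all their transitive parents,
// i.e. every node lying on a path that leads down to a matched node.
function nodesOnVulnerablePaths(links: GraphLink[], matchedIds: string[]): Set<string> {
  // Reverse adjacency: child id -> ids of its direct parents.
  const parentsOf = new Map<string, string[]>();
  for (const { source, target } of links) {
    const parents = parentsOf.get(target) ?? [];
    parents.push(source);
    parentsOf.set(target, parents);
  }

  const keep = new Set<string>(matchedIds);
  const queue: string[] = matchedIds.slice();
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const parentId of parentsOf.get(current) ?? []) {
      // The visited check also guards against cyclic dependencies.
      if (!keep.has(parentId)) {
        keep.add(parentId);
        queue.push(parentId);
      }
    }
  }
  return keep;
}

const sampleLinks: GraphLink[] = [
  { source: "app", target: "lodash" },
  { source: "app", target: "react" },
  { source: "react", target: "scheduler" },
];

const kept = nodesOnVulnerablePaths(sampleLinks, ["scheduler"]);
console.log(Array.from(kept).sort()); // [ 'app', 'react', 'scheduler' ]
```

Because the function depends on nothing but its arguments, it can be unit-tested without a component, which fits the orchestrator role described above.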
Isolated development with Storybook
In Storybook we can create multiple "stories" for `DependencyGraphComponent`, each representing a specific state or scenario.
// in dependency-graph.stories.ts
import { moduleMetadata } from '@storybook/angular';
import { CommonModule } from '@angular/common';
import { Story, Meta } from '@storybook/angular/types-6-0';
import { of } from 'rxjs';

import { DependencyGraphComponent } from './dependency-graph.component';
import { GraphDataService } from './services/graph-data.service';
import { GraphZoomState } from './state/graph-zoom.state';
import { GraphFilterState } from './state/graph-filter.state';

// Mock data
const mockGraphData = {
    nodes: [
        { id: 'app', name: 'app@<version>', vulnerabilities: [] },
        { id: 'lodash', name: 'lodash@<version>', vulnerabilities: [{ severity: 'MEDIUM' }] },
        { id: 'react', name: 'react@<version>', vulnerabilities: [{ severity: 'CRITICAL' }] }
    ],
    links: [
        { source: 'app', target: 'lodash' },
        { source: 'app', target: 'react' }
    ]
};

// Mock services
class MockGraphDataService {
    getGraphDataForScan(scanId: string) { return of(mockGraphData); }
}

export default {
    title: 'Features/Dependency Graph',
    component: DependencyGraphComponent,
    decorators: [
        moduleMetadata({
            imports: [CommonModule],
            providers: [
                { provide: GraphDataService, useClass: MockGraphDataService },
                GraphZoomState, // Use real state services for interaction testing
                GraphFilterState,
            ],
        }),
    ],
} as Meta;

const Template: Story<DependencyGraphComponent> = (args: DependencyGraphComponent) => ({
    props: args,
});

export const DefaultView = Template.bind({});
DefaultView.args = {};

export const HugeGraph = Template.bind({});
HugeGraph.play = async ({ canvasElement }) => {
    // In a real story, we could use a custom provider to inject
    // a mock service that returns a graph with 10,000 nodes
    // to test rendering performance.
};

export const WithCriticalFilter = Template.bind({});
// `args` carries only the story's inputs, not the component instance, so the
// filter state is pre-seeded through a story-level provider instead.
// Assumption: this provider takes precedence over the GraphFilterState
// registered in the default export's decorators.
WithCriticalFilter.decorators = [
    moduleMetadata({
        providers: [
            {
                provide: GraphFilterState,
                useFactory: () => {
                    const filterState = new GraphFilterState();
                    filterState.updateFilter({ severity: ['CRITICAL'] });
                    return filterState;
                },
            },
        ],
    }),
];
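With Storybook we can verify the graph component's behavior under different data and states without starting the whole Angular application, which greatly improves development efficiency and the component's robustness.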
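For a large-graph story, the fixture itself has to come from somewhere. One option is a small deterministic generator such as the sketch below; `makeMockGraph`, its `fanOut` parameter, and the "every 50th package is CRITICAL" rule are all hypothetical choices made for illustration.

```typescript
interface MockNode {
  id: string;
  name: string;
  vulnerabilities: { severity: "LOW" | "MEDIUM" | "HIGH" | "CRITICAL" }[];
}

interface MockLink {
  source: string;
  target: string;
}

// Hypothetical fixture generator: builds a tree-shaped graph with `nodeCount`
// nodes in which each node has at most `fanOut` direct dependencies.
function makeMockGraph(nodeCount: number, fanOut = 4): { nodes: MockNode[]; links: MockLink[] } {
  const nodes: MockNode[] = [];
  const links: MockLink[] = [];
  for (let i = 0; i < nodeCount; i++) {
    nodes.push({
      id: `pkg-${i}`,
      name: `pkg-${i}`,
      // Deterministic: every 50th package (except the root) has one CRITICAL finding.
      vulnerabilities: i > 0 && i % 50 === 0 ? [{ severity: "CRITICAL" }] : [],
    });
    if (i > 0) {
      // Node i hangs under node floor((i - 1) / fanOut), as in an array-backed tree.
      const parentIndex = Math.floor((i - 1) / fanOut);
      links.push({ source: `pkg-${parentIndex}`, target: `pkg-${i}` });
    }
  }
  return { nodes, links };
}

const hugeGraph = makeMockGraph(10000);
console.log(hugeGraph.nodes.length, hugeGraph.links.length); // 10000 9999
```

A mock data service for that story could then return `of(makeMockGraph(10000))` from `getGraphDataForScan`. Keeping the generator deterministic means two runs of the same story render the same graph, which makes performance comparisons meaningful.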
graph TD
    subgraph Backend
        A[Message Queue: Scan Tasks] --> B{Go Scanner Service};
        B -- Clones Repo & Runs Trivy --> C[FileSystem];
        B -- Parses JSON --> D[Dependency & Vulnerability Data];
        D -- Writes in Transaction --> E[TiDB Cluster];
    end
    subgraph Frontend
        F[Angular App] --> G[API Layer];
        G -- Queries --> E;
        F --> H(DependencyGraphComponent);
        subgraph "Atomic State (Jotai Philosophy)"
            I[GraphFilterState]
            J[GraphZoomState]
            K[SelectedNodeState]
        end
        H -- Injects & Subscribes --> I;
        H -- Injects & Subscribes --> J;
        H -- Injects & Subscribes --> K;
        L[User Interaction] --> M{Filter/Zoom Controls};
        M -- Calls Methods --> I;
        M -- Calls Methods --> J;
    end
    subgraph "Isolated Development"
        N[Storybook] -.-> H;
        O[Mock Services/Data] --> N;
    end
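Limitations and Future Work
This architecture solved the core problems we started with, but it is no silver bullet. The current implementation still has several areas to improve.
First, the back-end scan process is monolithic. A full scan of a monorepo with dozens of sub-projects can take a long time. A future iteration will break scan tasks down further, scan the packages inside a monorepo in parallel, and write the results to TiDB atomically.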
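Second, once the dependency graph exceeds about ten thousand nodes, rendering becomes a bottleneck in the browser. We are exploring WebGL-based rendering (such as `vis-gl`) and virtualized rendering that draws only the nodes and edges inside the viewport, so that very large graphs stay interactive.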
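The viewport idea can be illustrated independently of any renderer. The sketch below assumes node positions have already been computed by the layout; `cullToViewport`, `PositionedNode`, and `Viewport` are hypothetical names, and the rule for edges (keep an edge if at least one endpoint is visible) is a simplification that drops edges crossing the viewport between two off-screen nodes.

```typescript
interface PositionedNode {
  id: string;
  x: number;
  y: number;
}

interface Edge {
  source: string;
  target: string;
}

interface Viewport {
  minX: number;
  minY: number;
  maxX: number;
  maxY: number;
}

// Keeps the nodes inside the viewport (expanded by `margin`) and the edges
// that have at least one visible endpoint.
function cullToViewport(
  nodes: PositionedNode[],
  edges: Edge[],
  view: Viewport,
  margin = 0,
): { nodes: PositionedNode[]; edges: Edge[] } {
  const visibleNodes = nodes.filter(
    (n) =>
      n.x >= view.minX - margin && n.x <= view.maxX + margin &&
      n.y >= view.minY - margin && n.y <= view.maxY + margin,
  );
  const visibleIds = new Set(visibleNodes.map((n) => n.id));
  const visibleEdges = edges.filter(
    (e) => visibleIds.has(e.source) || visibleIds.has(e.target),
  );
  return { nodes: visibleNodes, edges: visibleEdges };
}

const layout: PositionedNode[] = [
  { id: "app", x: 0, y: 0 },
  { id: "lodash", x: 120, y: 40 },
  { id: "react", x: 900, y: 700 },
];
const layoutEdges: Edge[] = [
  { source: "app", target: "lodash" },
  { source: "app", target: "react" },
];

const culled = cullToViewport(layout, layoutEdges, { minX: -50, minY: -50, maxX: 200, maxY: 200 });
console.log(culled.nodes.map((n) => n.id)); // [ 'app', 'lodash' ]
console.log(culled.edges.length); // 2
```

This version scans every node on each call, which is linear in graph size. For graphs well beyond ten thousand nodes a spatial index such as a quadtree would likely be needed so that panning does not revisit off-screen nodes.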
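Finally, the platform currently focuses on "finding" problems. The next step is "fixing" them. This includes generating and managing software bills of materials (SBOMs), integrating with CI/CD to block builds automatically, and using TiDB's HTAP capability to run machine-learning analysis over historical vulnerability data and predict high-risk technology stacks or dependency-introduction patterns.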