
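1. Overview

Structured output is a key technique for building production-grade AI applications. It lets developers parse a model's output into structured data that a program can consume directly (such as JSON) instead of free-form text. This matters most for data extraction, command parsing in intelligent assistants, and content generation and formatting.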

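2. Core Concepts

2.1 Generic Extension Method

GetResponseAsync<T>() is the recommended way to obtain structured output in Microsoft.Extensions.AI.

What it does: generates a JSON Schema from T, configures ResponseFormat, and deserializes the response into a strongly typed object.
Advantages: concise code, type safety, and low maintenance cost.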

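2.2 Response Format (ChatResponseFormat)

Defines the format of the model's response:

ChatResponseFormat.Text: plain text (the default).
ChatResponseFormat.Json: a free-form JSON object.
ChatResponseFormat.ForJsonSchema(...): structured output that conforms to a supplied JSON Schema. This is a static factory method on ChatResponseFormat that returns a ChatResponseFormatJson.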
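
As a rough illustration of how the three formats are selected, the sketch below sets ChatOptions.ResponseFormat for each case. The hand-written schema, the "PersonInfo" schema name, and the ResponseFormatExamples class are assumptions made for the example. It presumes C# 11 raw string literals and a reference to the Microsoft.Extensions.AI package.

```csharp
using System.Text.Json;
using Microsoft.Extensions.AI;

public static class ResponseFormatExamples
{
    // Hand-written schema used only for illustration.
    private const string PersonSchemaJson = """
        {
          "type": "object",
          "properties": {
            "name": { "type": "string" },
            "age": { "type": "integer" }
          },
          "required": ["name", "age"],
          "additionalProperties": false
        }
        """;

    // Plain text response.
    public static ChatOptions PlainText() =>
        new() { ResponseFormat = ChatResponseFormat.Text };

    // Any JSON object; the prompt has to describe the expected shape.
    public static ChatOptions FreeFormJson() =>
        new() { ResponseFormat = ChatResponseFormat.Json };

    // JSON constrained by an explicit schema.
    public static ChatOptions SchemaConstrained()
    {
        using JsonDocument doc = JsonDocument.Parse(PersonSchemaJson);
        JsonElement schema = doc.RootElement.Clone(); // detach from the disposed document
        return new ChatOptions
        {
            ResponseFormat = ChatResponseFormat.ForJsonSchema(schema, "PersonInfo")
        };
    }
}
```

When GetResponseAsync<T> is used, this configuration is applied on the caller's behalf, so setting it by hand is mainly useful for the non-generic calls.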

2.3 JSON Schema

JSON Schema is a standard for describing the structure of JSON data. The generic method derives the schema automatically from the C# type definition.
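
To get a feel for what a schema derived from a C# type looks like, the sketch below uses the JsonSchemaExporter API that ships with System.Text.Json in .NET 9. This is a stand-in for inspection only: the schema that Microsoft.Extensions.AI produces internally may differ in detail. The Book type and the SchemaInspector helper are hypothetical names introduced for the example.

```csharp
using System.Text.Json;
using System.Text.Json.Nodes;
using System.Text.Json.Schema;
using System.Text.Json.Serialization;

// Hypothetical model used only to demonstrate schema export.
public class Book
{
    [JsonPropertyName("title")]
    public string? Title { get; set; }

    [JsonPropertyName("year")]
    public int? Year { get; set; }
}

public static class SchemaInspector
{
    // Returns an indented JSON Schema document for T (requires .NET 9 or later).
    public static string Describe<T>()
    {
        JsonNode schema = JsonSerializerOptions.Default.GetJsonSchemaAsNode(typeof(T));
        return schema.ToJsonString(new JsonSerializerOptions { WriteIndented = true });
    }
}
```

Calling SchemaInspector.Describe<Book>() and printing the result shows how the [JsonPropertyName] values become the property names in the schema.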

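3. Basic Usage

3.1 Defining the Data Model

Start by defining a C# class for the expected output. Attributes from the System.Text.Json.Serialization namespace, such as [JsonPropertyName], give finer control over the JSON shape.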

using System.Text.Json.Serialization;

/// <summary>
/// Data model for personal information.
/// </summary>
public class PersonInfo
{
    [JsonPropertyName("name")]
    public string? Name { get; set; }
    
    [JsonPropertyName("age")]
    public int? Age { get; set; }
    
    [JsonPropertyName("occupation")]
    public string? Occupation { get; set; }
    
    [JsonPropertyName("location")]
    public string? Location { get; set; }
}

3.2 Calling the Generic Method

Call chatClient.GetResponseAsync<T> to get the structured result directly.

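using Microsoft.Extensions.AI;

// chatClient is an IChatClient configured elsewhere.
var messages = new[]
{
    new ChatMessage(ChatRole.System, "You are an information extraction assistant."),
    new ChatMessage(ChatRole.User, "Extract the details: Zhang Wei is a 35-year-old software engineer who currently works in Beijing.")
};

// Schema generation, serialization, and deserialization are handled automatically
var response = await chatClient.GetResponseAsync<PersonInfo>(messages);
var result = response.Result; // the strongly typed PersonInfo object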
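
Reading Result throws when the model's reply cannot be deserialized into the target type. Where that is a realistic possibility, a defensive variant can use TryGetResult instead. The sketch below is one way to wrap the call. The PersonExtractor class and its method name are hypothetical, and PersonInfo is repeated in reduced form so the snippet stands alone.

```csharp
using System.Text.Json.Serialization;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Reduced copy of the model so the snippet is self-contained.
public class PersonInfo
{
    [JsonPropertyName("name")]
    public string? Name { get; set; }

    [JsonPropertyName("age")]
    public int? Age { get; set; }
}

public static class PersonExtractor
{
    // Returns null instead of throwing when the reply is not valid PersonInfo JSON.
    public static async Task<PersonInfo?> ExtractAsync(
        IChatClient chatClient, string text, CancellationToken cancellationToken = default)
    {
        ChatMessage[] messages =
        {
            new(ChatRole.System, "You are an information extraction assistant."),
            new(ChatRole.User, text)
        };

        ChatResponse<PersonInfo> response =
            await chatClient.GetResponseAsync<PersonInfo>(messages, cancellationToken: cancellationToken);

        return response.TryGetResult(out PersonInfo? person) ? person : null;
    }
}
```

Returning null keeps the caller's control flow simple; logging response.Text on failure is a reasonable addition when diagnosing malformed replies.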

4. Advanced Scenarios

4.1 Nested Objects and Enums

Complex structures can be modeled with nested classes and enum types.

public class ProductReviewAnalysis
{
    [JsonPropertyName("product_name")]
    public string? ProductName { get; set; }
    
    [JsonPropertyName("sentiment")]
    public SentimentAnalysis? Sentiment { get; set; } // nested object
    
    [JsonPropertyName("recommendation")]
    public bool Recommendation { get; set; }
}

public class SentimentAnalysis
{
    [JsonPropertyName("sentiment")]
    public string? Sentiment { get; set; }
    
    [JsonPropertyName("confidence")]
    public double Confidence { get; set; }
}
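
The classes above cover nesting but model the sentiment label as a plain string. A possible enum-based variant is sketched below, using JsonStringEnumConverter from System.Text.Json so that the label travels as a string such as "Positive" rather than a number. The SentimentLabel, TypedSentimentAnalysis, and SentimentParsing names are hypothetical and introduced only for this example.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Serialized as "Positive" / "Neutral" / "Negative" instead of 0 / 1 / 2.
[JsonConverter(typeof(JsonStringEnumConverter))]
public enum SentimentLabel { Positive, Neutral, Negative }

public class TypedSentimentAnalysis
{
    [JsonPropertyName("sentiment")]
    public SentimentLabel Sentiment { get; set; }

    [JsonPropertyName("confidence")]
    public double Confidence { get; set; }
}

public static class SentimentParsing
{
    // Deserializes a reply such as {"sentiment":"Negative","confidence":0.8}.
    public static TypedSentimentAnalysis? Parse(string json) =>
        JsonSerializer.Deserialize<TypedSentimentAnalysis>(json);
}
```

With this shape, a label outside the enum fails deserialization with a JsonException rather than silently passing through as an arbitrary string.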

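4.2 List/Array Output

When a list of objects is needed, define a wrapper class with a list property rather than requesting a List type directly. A wrapper keeps the root of the schema an object, which some providers require.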

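public class ContactList
{
    [JsonPropertyName("contacts")]
    public List<ContactInfo>? Contacts { get; set; }
}

public class ContactInfo // illustrative element type; its fields are assumptions
{
    [JsonPropertyName("name")]
    public string? Name { get; set; }
}

// Usage example
var response = await chatClient.GetResponseAsync<ContactList>(messages);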

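4.3 Streaming Output

The generic GetResponseAsync<T> method does not currently support streaming. To stream, collect the response manually and deserialize it yourself:

Call GetStreamingResponseAsync (named CompleteStreamingAsync in early preview releases).
Concatenate the updates into the complete JSON string.
Call JsonSerializer.Deserialize<T> on the result.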
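
The three steps above can be sketched as a small helper. This is a minimal version under a few assumptions: the prompt already describes the expected JSON shape, the model returns bare JSON, and the StreamingStructuredOutput class name is hypothetical.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

public static class StreamingStructuredOutput
{
    // Streams the reply, buffers the text fragments, then deserializes the whole JSON document.
    public static async Task<T?> GetAsync<T>(
        IChatClient chatClient,
        IEnumerable<ChatMessage> messages,
        JsonSerializerOptions? serializerOptions = null,
        CancellationToken cancellationToken = default)
    {
        var options = new ChatOptions { ResponseFormat = ChatResponseFormat.Json };
        var buffer = new StringBuilder();

        await foreach (ChatResponseUpdate update in
            chatClient.GetStreamingResponseAsync(messages, options, cancellationToken))
        {
            buffer.Append(update.Text); // each update carries a fragment of the JSON text
        }

        return JsonSerializer.Deserialize<T>(buffer.ToString(), serializerOptions);
    }
}
```

The fragments are useful for showing progress to the user, but the object itself only becomes available once the stream has finished, because partial JSON cannot be deserialized.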

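5. Adapting to Chinese Models (DeepSeek/Qwen)

Models from Chinese providers such as DeepSeek and Qwen often do not support the standard JSON Schema response format, so the default configuration of the generic method cannot be used as-is.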

5.1 Key Configuration

Parameter: pass useJsonSchemaResponseFormat: false when calling the generic method.
ChatOptions: use ChatResponseFormat.Json rather than ForJsonSchema.

5.2 Prompt Engineering

The system message must describe the JSON format in detail, including a template, type constraints for each field, and an example.
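
One possible way to lay out such a system message, with the template, the constraints, and the example as separate parts, is sketched below. The wording, the ExtractionPrompts class, and the field list are assumptions for illustration; the right phrasing depends on the model in use.

```csharp
public static class ExtractionPrompts
{
    // Template, type constraints, and a worked example, in that order.
    public const string PersonSystemMessage = """
        You are an information extraction assistant. Reply with a single JSON object and nothing else.

        Template:
        {"name": <string or null>, "age": <integer or null>, "occupation": <string or null>, "location": <string or null>}

        Constraints:
        - "age" must be a JSON number, not a quoted string.
        - Use null for any field the text does not mention.
        - Do not wrap the JSON in a Markdown code block or add comments.

        Example:
        {"name": "Zhang Wei", "age": 35, "occupation": "software engineer", "location": "Beijing"}
        """;
}
```

Keeping the prompt in one constant next to the data model makes it easier to update both together when a field changes.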

5.3 Code Example

// 1. Configure ChatOptions
ChatOptions deepseekOptions = new()
{
    ResponseFormat = ChatResponseFormat.Json // note: not ForJsonSchema
};

// 2. Define a system message that spells out the format
//    (age is left unquoted so the model returns a number, which int? can deserialize)
var systemMessage = @"You are an assistant. Return the result strictly in the following JSON format:
{
  ""name"": ""<full name, string>"",
  ""age"": <age, integer>
}";

// 3. Call the generic method; the key is useJsonSchemaResponseFormat: false
var response = await deepseekChatClient.GetResponseAsync<PersonInfo>(
    messages: new[] { new ChatMessage(ChatRole.System, systemMessage), /*...*/ },
    options: deepseekOptions,
    useJsonSchemaResponseFormat: false // required for models without JSON Schema support
);

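6. Best Practices

Prefer the generic method: GetResponseAsync<T> is the most efficient and safest approach.
Model definitions: use clear property names together with [JsonPropertyName], add XML comments to complex fields, and use enums and nullable types where they fit.
Prompt tuning: explain what each field means and give an output example, especially for models without schema support.
Error handling: Chinese models may return non-standard JSON (for example wrapped in a Markdown code block or containing comments), so the string may need preprocessing in code before deserialization.
Performance: keep data models lean, avoid unnecessary nesting, and process large volumes in batches.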
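
For the error-handling point, a minimal preprocessing sketch is shown below. It assumes the payload is a single JSON object, cuts away anything outside the outermost braces (Markdown fences, leading prose), and lets System.Text.Json skip comments and trailing commas. The LenientJson class is a hypothetical helper, not part of Microsoft.Extensions.AI.

```csharp
using System;
using System.Text.Json;

public static class LenientJson
{
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNameCaseInsensitive = true,
        ReadCommentHandling = JsonCommentHandling.Skip, // tolerate // and /* */ comments
        AllowTrailingCommas = true,
    };

    // Keeps only the text between the first '{' and the last '}'.
    public static string ExtractJsonObject(string raw)
    {
        int start = raw.IndexOf('{');
        int end = raw.LastIndexOf('}');
        if (start < 0 || end <= start)
        {
            throw new FormatException("No JSON object found in the model output.");
        }
        return raw.Substring(start, end - start + 1);
    }

    // Cleans the raw model output and deserializes it into T.
    public static T? Parse<T>(string raw) =>
        JsonSerializer.Deserialize<T>(ExtractJsonObject(raw), Options);
}
```

This pairs naturally with a non-generic call that returns the raw text, for example LenientJson.Parse<PersonInfo>(response.Text). A reply whose root is an array would need a different extraction rule.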
