# Building a RAG System in C# with Semantic Kernel

March 18, 2026 · 4 min read

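## Introduction

If you've ever tried to use an LLM to answer questions about your own data (company documents, product specs, an internal knowledge base), you've probably noticed that it either hallucinates or simply says "I don't have information about that." That's because the model only knows what it was trained on.

RAG (Retrieval-Augmented Generation) fixes this. Instead of fine-tuning the model on your data, you retrieve the relevant chunks of your documents at query time and pass them to the LLM as context. The model then generates an answer grounded in your actual data.

In this post, I'll walk through building a complete RAG pipeline in C# with Semantic Kernel.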

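## How RAG Works

The flow is simple:

1. **Ingestion**: Split your documents into chunks, generate an embedding for each chunk, and store them in a vector database.
2. **Query**: When a user asks a question, generate an embedding for the query and search the vector database for similar chunks.
3. **Generation**: Pass the retrieved chunks to the LLM as context, along with the user's question.

That's it. The magic is in the embeddings: they capture the semantic meaning of text as a vector, so you can find relevant content even when the words don't match exactly.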
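To make "searching for similar chunks" concrete, here is a minimal sketch of ranking chunks by cosine similarity, which is one common way vector stores compare embeddings. The three-dimensional vectors and the `CosineSimilarity` helper are made up for illustration; real embeddings come from the model and have far more dimensions.

```csharp
using System;
using System.Linq;

// Cosine similarity: close to 1.0 means the vectors point the same way,
// close to 0.0 means they are unrelated.
static double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same number of dimensions.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Hypothetical hand-written "embeddings" for three chunks.
var chunks = new (string Text, float[] Vector)[]
{
    ("Reset your password from the account settings page.", new[] { 0.9f, 0.1f, 0.0f }),
    ("The app requires 8 GB of RAM.", new[] { 0.1f, 0.9f, 0.1f }),
    ("Contact support if you cannot sign in.", new[] { 0.7f, 0.2f, 0.3f }),
};

// Pretend this is the embedding of "I forgot my login credentials".
var queryVector = new[] { 0.8f, 0.1f, 0.2f };

// Score every chunk against the query and sort best-first.
var ranked = chunks
    .Select(c => (c.Text, Score: CosineSimilarity(queryVector, c.Vector)))
    .OrderByDescending(r => r.Score)
    .ToList();

foreach (var (text, score) in ranked)
    Console.WriteLine($"{score:F3}  {text}");
```

With these toy vectors, the two sign-in related chunks score far higher than the system-requirements chunk, even though none of them shares the word "credentials" with the query.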

## Prerequisites

```bash
dotnet add package Microsoft.SemanticKernel
dotnet add package Microsoft.SemanticKernel.Connectors.AzureOpenAI
dotnet add package Microsoft.Extensions.VectorData.Abstractions
dotnet add package Microsoft.SemanticKernel.Connectors.InMemory
```

In production, you'd swap the in-memory store for Azure AI Search, Qdrant, Pinecone, or another supported vector database. For learning and prototyping, though, in-memory is perfect.

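## Setting Up the Kernel

```csharp
// The embedding APIs are marked experimental in some Semantic Kernel versions;
// suppress the diagnostics if your build reports them.
#pragma warning disable SKEXP0001, SKEXP0010

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.AzureOpenAI;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel.Connectors.InMemory;
using Microsoft.SemanticKernel.Embeddings;

// `config` is assumed to be an IConfiguration instance built elsewhere
// (for example from user secrets or environment variables).
var builder = Kernel.CreateBuilder();

// Chat model: answers the questions
builder.AddAzureOpenAIChatCompletion(
    deploymentName: "gpt-4o",
    endpoint: config["AzureOpenAI:Endpoint"],
    apiKey: config["AzureOpenAI:ApiKey"]);

// Embedding model: converts text into vectors
builder.AddAzureOpenAITextEmbeddingGeneration(
    deploymentName: "text-embedding-3-small",
    endpoint: config["AzureOpenAI:Endpoint"],
    apiKey: config["AzureOpenAI:ApiKey"]);

var kernel = builder.Build();
```

You need two models: one for chat completion (answering questions) and one for generating embeddings (converting text into vectors).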

## Defining the Data Model

We need a class that represents a document chunk in the vector store:

```csharp
using Microsoft.Extensions.VectorData;

public class DocumentChunk
{
    [VectorStoreRecordKey]
    public string Id { get; set; } = Guid.NewGuid().ToString();

    [VectorStoreRecordData]
    public string Content { get; set; } = string.Empty;

    [VectorStoreRecordData]
    public string Source { get; set; } = string.Empty;

    [VectorStoreRecordData]
    public int ChunkIndex { get; set; }

    [VectorStoreRecordVector(1536)]
    public ReadOnlyMemory<float> Embedding { get; set; }
}
```

The `VectorStoreRecordVector(1536)` attribute tells the vector store the dimensionality of the embeddings. The `text-embedding-3-small` model produces 1536-dimensional vectors by default.

## Chunking Documents

Before creating embeddings, you need to split your documents into manageable chunks. Here's a simple text splitter:

```csharp
public static class TextChunker
{
    // Sizes are measured in characters, not tokens. Splitting happens only at
    // paragraph boundaries, so a single paragraph longer than maxChunkSize
    // is kept whole.
    public static List<string> SplitText(string text, int maxChunkSize = 500, int overlap = 50)
    {
        var chunks = new List<string>();
        var paragraphs = text.Split("\n\n", StringSplitOptions.RemoveEmptyEntries);
        var currentChunk = new System.Text.StringBuilder();

        foreach (var paragraph in paragraphs)
        {
            // Flush the current chunk if adding this paragraph would exceed the limit
            if (currentChunk.Length + paragraph.Length > maxChunkSize && currentChunk.Length > 0)
            {
                var finished = currentChunk.ToString().Trim();
                chunks.Add(finished);

                // Keep overlap from the end of the previous chunk
                // (taken from the trimmed text so it is not spent on trailing newlines)
                currentChunk.Clear();
                if (finished.Length > overlap)
                {
                    currentChunk.Append(finished[^overlap..]);
                    currentChunk.Append(' ');
                }
            }

            currentChunk.Append(paragraph);
            currentChunk.Append("\n\n");
        }

        // Emit whatever is left as the final chunk
        if (currentChunk.Length > 0)
        {
            chunks.Add(currentChunk.ToString().Trim());
        }

        return chunks;
    }
}
```

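Overlap matters: it keeps context from being lost at chunk boundaries. When related sentences end up in two different chunks, the overlap carries the tail of one chunk into the start of the next, so the second chunk still has some of the preceding context.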
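To see what overlap does in isolation, here is a deliberately simpler sketch than the paragraph-based splitter above: fixed-size character windows where each window starts a few characters before the previous one ended. `SplitWithOverlap` is a hypothetical helper written only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: fixed-size character windows where each chunk
// starts `overlap` characters before the previous chunk ended.
static List<string> SplitWithOverlap(string text, int chunkSize, int overlap)
{
    if (chunkSize <= 0)
        throw new ArgumentOutOfRangeException(nameof(chunkSize));
    if (overlap < 0 || overlap >= chunkSize)
        throw new ArgumentOutOfRangeException(nameof(overlap), "Overlap must be smaller than the chunk size.");

    var chunks = new List<string>();
    int step = chunkSize - overlap;

    for (int start = 0; start < text.Length; start += step)
    {
        int length = Math.Min(chunkSize, text.Length - start);
        chunks.Add(text.Substring(start, length));

        // Stop once a window has reached the end of the text.
        if (start + length >= text.Length) break;
    }

    return chunks;
}

var sample = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
var windows = SplitWithOverlap(sample, chunkSize: 10, overlap: 3);

foreach (var w in windows)
    Console.WriteLine(w);
// ABCDEFGHIJ
// HIJKLMNOPQ
// OPQRSTUVWX
// VWXYZ
```

Each window repeats the last three characters of the one before it (`HIJ`, `OPQ`, `VWX`), which is the same idea the paragraph-based splitter applies at a larger scale.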

## Ingesting Documents

Now let's put it all together to ingest documents into the vector store:

```csharp
var vectorStore = new InMemoryVectorStore();
var collection = vectorStore.GetCollection<string, DocumentChunk>("documents");
await collection.CreateCollectionIfNotExistsAsync();

var embeddingService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();

async Task IngestDocument(string content, string source)
{
    var chunks = TextChunker.SplitText(content);

    for (int i = 0; i < chunks.Count; i++)
    {
        var embedding = await embeddingService.GenerateEmbeddingAsync(chunks[i]);

        var chunk = new DocumentChunk
        {
            Content = chunks[i],
            Source = source,
            ChunkIndex = i,
            Embedding = embedding
        };

        await collection.UpsertAsync(chunk);
    }

    Console.WriteLine($"✅ Ingested {chunks.Count} chunks from {source}");
}

// Ingest some documents
var doc1 = await File.ReadAllTextAsync("docs/product-guide.md");
var doc2 = await File.ReadAllTextAsync("docs/faq.md");
var doc3 = await File.ReadAllTextAsync("docs/troubleshooting.md");

await IngestDocument(doc1, "product-guide.md");
await IngestDocument(doc2, "faq.md");
await IngestDocument(doc3, "troubleshooting.md");
```
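The `AskAsync` method in the next section calls a `SearchAsync` helper that this post does not show. A minimal sketch of what it might look like follows, reusing `embeddingService`, `collection`, and `DocumentChunk` from the snippets above. It assumes the preview vector store API that matches the attributes used earlier, where a collection exposes `VectorizedSearchAsync` and the options type has a `Top` property. Newer package versions rename these, so check the version you have installed. The `topK` parameter and its default are also assumptions.

```csharp
// Sketch only: depends on `embeddingService` and `collection` defined above,
// and on the preview VectorData API (VectorizedSearchAsync / Top).
async Task<List<DocumentChunk>> SearchAsync(string query, int topK = 5)
{
    // Embed the question with the same model used for the chunks
    var queryEmbedding = await embeddingService.GenerateEmbeddingAsync(query);

    // Ask the vector store for the topK most similar chunks
    var searchResults = await collection.VectorizedSearchAsync(
        queryEmbedding,
        new() { Top = topK });

    // Results arrive as an async stream of (Record, Score) pairs
    var chunks = new List<DocumentChunk>();
    await foreach (var result in searchResults.Results)
    {
        chunks.Add(result.Record);
    }

    return chunks;
}
```

The query has to be embedded with the same model as the documents; vectors from different models are not comparable.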

## Generating Answers with Context

Now for the RAG part: take the retrieved chunks and include them in the prompt as context.

```csharp
using Microsoft.SemanticKernel.ChatCompletion;

var chatService = kernel.GetRequiredService<IChatCompletionService>();

async Task<string> AskAsync(string question)
{
    // Step 1: Retrieve relevant chunks
    var relevantChunks = await SearchAsync(question);

    // Step 2: Build context from chunks
    var context = string.Join("\n\n---\n\n",
        relevantChunks.Select(c => $"[Source: {c.Source}]\n{c.Content}"));

    // Step 3: Generate answer with context
    var history = new ChatHistory();
    history.AddSystemMessage($$"""
        You are a helpful assistant that answers questions based on the provided context.
        Use ONLY the information from the context to answer. If the context doesn't contain
        enough information to answer the question, say "I don't have enough information
        to answer that question."

        Do not make up information. Always cite the source document when possible.

        Context:
        {{context}}
        """);

    history.AddUserMessage(question);

    var response = await chatService.GetChatMessageContentAsync(history);
    return response.Content ?? "No response generated.";
}
```

## Using It

```csharp
// Ask questions about your documents
var answer1 = await AskAsync("How do I reset my password?");
Console.WriteLine($"Q: How do I reset my password?\nA: {answer1}\n");

var answer2 = await AskAsync("What are the system requirements?");
Console.WriteLine($"Q: What are the system requirements?\nA: {answer2}\n");

var answer3 = await AskAsync("What's the capital of France?");
Console.WriteLine($"Q: What's the capital of France?\nA: {answer3}\n");
// Should respond with "I don't have enough information" since it's not in the docs
```

## Moving to Production

The in-memory vector store is great for prototyping, but in production you need a persistent vector database. Semantic Kernel has connectors for several options:

```bash
# Azure AI Search
dotnet add package Microsoft.SemanticKernel.Connectors.AzureAISearch

# Qdrant
dotnet add package Microsoft.SemanticKernel.Connectors.Qdrant

# Redis
dotnet add package Microsoft.SemanticKernel.Connectors.Redis
```

They all implement the same `IVectorStore` interface, so swapping is straightforward:

```csharp
using Azure;
using Microsoft.SemanticKernel.Connectors.AzureAISearch;

// Instead of InMemoryVectorStore, use:
var vectorStore = new AzureAISearchVectorStore(
    new Azure.Search.Documents.Indexes.SearchIndexClient(
        new Uri(config["AzureAISearch:Endpoint"]),
        new AzureKeyCredential(config["AzureAISearch:ApiKey"])));
```

Everything else stays the same. That's the beauty of the abstraction.

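## Tips for Building RAG Systems

A few things I learned the hard way:

- **Chunk size matters a lot.** Too small and you lose context. Too large and you waste tokens on irrelevant content. Start with 500 to 800 tokens and adjust based on your data.
- **Overlap prevents boundary issues.** An overlap of 50 to 100 tokens between chunks is usually enough.
- **Retrieve more than you think you need.** Start with `topK = 5` and reduce it if there's too much noise. It's better to have extra context than to miss a relevant chunk.
- **The system prompt matters a lot.** Be explicit that the model should use only the provided context. Without that instruction, it will happily hallucinate answers "based on its training data."
- **Track your sources.** Always store metadata alongside the chunks so you can cite where an answer came from. Users trust answers more when they can check the source.
- **Re-rank if needed.** Vector similarity isn't perfect. For critical applications, add a re-ranking step with a cross-encoder model to improve precision.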
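One way to act on the "retrieve more, then cut the noise" tip is to over-fetch and then filter before building the prompt. The sketch below is illustrative only: `SelectContext`, the score threshold, and the character budget are assumptions, and it treats higher scores as more similar, which depends on the distance function your vector store uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: keep the best-scoring chunks that clear a minimum
// score and still fit into a character budget for the prompt.
static List<(string Source, string Content, double Score)> SelectContext(
    IEnumerable<(string Source, string Content, double Score)> results,
    double minScore,
    int maxChars)
{
    var selected = new List<(string Source, string Content, double Score)>();
    int used = 0;

    foreach (var item in results.Where(x => x.Score >= minScore).OrderByDescending(x => x.Score))
    {
        // Skip chunks that would push the context over the budget
        if (used + item.Content.Length > maxChars) continue;

        selected.Add(item);
        used += item.Content.Length;
    }

    return selected;
}

// Made-up search results, each carrying its source for citation.
var retrieved = new[]
{
    (Source: "faq.md", Content: "Use the 'Forgot password' link on the sign-in page.", Score: 0.91),
    (Source: "product-guide.md", Content: "Passwords must be at least 12 characters long.", Score: 0.78),
    (Source: "troubleshooting.md", Content: "Restart the service if the dashboard does not load.", Score: 0.42),
};

var context = SelectContext(retrieved, minScore: 0.5, maxChars: 200);

foreach (var c in context)
    Console.WriteLine($"[{c.Source}] {c.Score:F2}");
```

Here the low-scoring troubleshooting chunk is dropped, and the two that remain keep their `Source` so the answer can still cite where it came from.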

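## Conclusion

RAG is one of the most practical patterns in AI right now. It lets you build AI-powered Q&A systems over your own data without fine-tuning, and Semantic Kernel makes it surprisingly clean in C#. Start with the in-memory store, get your chunking and prompts right, then swap in a real vector database when you're ready for production.

Happy coding!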

## Resources

- [Semantic Kernel vector store documentation](https://learn.microsoft.com/en-us/semantic-kernel/concepts/vector-store-connectors/)
- [RAG pattern with Azure AI Search](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview)
- [Text embedding models](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#embeddings)