Large Language Model Plagiarism Detection: A Deep Dive into MDIR Technology

Introduction

The rapid advancement of Large Language Models (LLMs) has brought intellectual property (IP) concerns to the forefront. Developers may copy model weights without authorization and disguise the copy as original work through fine-tuning or continued pretraining. Such practices not only violate IP rights but also expose the copier to legal repercussions.

This article explores Matrix-Driven Instant Review (MDIR), a novel technique for detecting LLM plagiarism through mathematical weight analysis. All content derives from the research paper “Matrix-Driven Instant Review: Confident Detection and Reconstruction of LLM Plagiarism on PC”.


Why Do We Need New Detection Methods?

Limitations of Existing Approaches

Traditional detection methods fall into two categories but suffer critical shortcomings:

Method Type          | Key Issues
---------------------|-----------
Retrieval-based      | Requires vendor-specific keys/prompts; impractical without access to the training data.
Representation-based | Only identifies similarity; lacks statistical significance metrics (e.g., p-values).

MDIR’s Innovations

MDIR leverages matrix analysis and probability theory to:

  • Directly compute weight similarity without vendor data

  • Provide rigorous statistical validation

Core Principles of MDIR

1. Matrix Decomposition Techniques

Singular Value Decomposition (SVD)

Decomposes weight matrices into three components:

A = U * S * V^T

  • U, V: Orthogonal matrices (rotation/reflection)

  • S: Diagonal matrix (contains singular values)
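
As a quick, concrete check of this identity, here is a minimal NumPy sketch (illustrative only, not the paper's code):

```python
# Minimal SVD sketch: decompose a toy "weight matrix" and verify A = U * S * V^T.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))               # stand-in for a weight matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

print(np.allclose(A, U @ np.diag(s) @ Vt))    # True: A = U * S * V^T
print(np.allclose(U.T @ U, np.eye(4)))        # columns of U are orthonormal
print(np.allclose(Vt @ Vt.T, np.eye(4)))      # rows of V^T are orthonormal
```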

Polar Decomposition

Expresses a matrix as the product of a symmetric positive-definite matrix and an orthogonal matrix:

A = P * W  or  A = W * Q

  • P, Q: Symmetric positive-definite matrices (scaling)

  • W: Orthogonal matrix (rotation/reflection)
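
The polar factors can be computed directly from the SVD. The sketch below realizes the right polar form A = W * Q; treating the Ortho(.) operator used later as this orthogonal factor W is an assumption, since the paper may use a different routine:

```python
# Polar decomposition via SVD: A = W @ Q with W orthogonal, Q symmetric PSD.
import numpy as np

def polar(A):
    U, s, Vt = np.linalg.svd(A)
    W = U @ Vt                        # orthogonal factor (rotation/reflection)
    Q = Vt.T @ np.diag(s) @ Vt        # symmetric positive-semidefinite factor
    return W, Q

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
W, Q = polar(A)
print(np.allclose(A, W @ Q))              # True: A = W * Q
print(np.allclose(W @ W.T, np.eye(4)))    # W is orthogonal
print(np.allclose(Q, Q.T))                # Q is symmetric
```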

2. Key Mathematical Tools

Tool                   | Purpose
-----------------------|--------
Large Deviation Theory | Analyzes extreme-event probabilities in random matrices; estimates p-values.
Random Matrix Theory   | Studies statistical distribution patterns in matrix elements.
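
To build intuition for why these tools yield such extreme p-values: for a Haar-random n×n orthogonal matrix W, Tr(W) is approximately standard normal, so a trace anywhere near its maximum value n is an extreme deviation. The simulation below is illustrative only:

```python
# Simulate traces of random orthogonal matrices: they concentrate near 0,
# so a trace close to n would be a deviation of many sigmas -- exactly the
# kind of extreme event whose probability large deviation theory bounds.
import numpy as np

rng = np.random.default_rng(2)
n, trials = 64, 5_000
traces = np.empty(trials)
for i in range(trials):
    Q, R = np.linalg.qr(rng.standard_normal((n, n)))
    Q *= np.sign(np.diag(R))          # sign fix so Q is Haar-distributed
    traces[i] = np.trace(Q)

print(f"mean={traces.mean():+.3f}, std={traces.std():.3f}, max possible={n}")
# Typical output: mean ~ 0, std ~ 1, while a perfectly aligned case gives ~64.
```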

MDIR Workflow Explained

Step 1: Embedding Layer Analysis

Objective: Initial similarity assessment through vocabulary embeddings.

Process:

  1. Extract Embedding Matrices

    • Model A: E ∈ ℝ^(Vocabulary Size × Embedding Dimension)

    • Model B: E' ∈ ℝ^(Vocabulary Size × Embedding Dimension)

  2. Identify Shared Vocabulary

    • Collect overlapping tokens (e.g., ASCII characters, common English words).

  3. Compute Orthogonal Transformation Matrix

    • Apply polar decomposition: U = Ortho(E[Shared Tokens]^T * E'[Shared Tokens]).

  4. Validate Permutation Matrix

    • Find the permutation matrix P that maximizes Tr(P * U^T), revealing the vocabulary mapping (see the sketch after the example below).

Example:

High similarity between embedding layers shows up as a clear diagonal pattern in the recovered permutation heatmap (figure).
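
Here is a minimal sketch of the whole Step 1 procedure. The names E, E_prime, shared_ids_a, and shared_ids_b are hypothetical stand-ins for the two models' embedding tables and shared-token indices; this is one plausible realization, not the paper's released code:

```python
# Step 1 sketch: align two embedding tables and recover the permutation.
import numpy as np
from scipy.optimize import linear_sum_assignment

def ortho(M):
    """Orthogonal polar factor -- the Ortho(.) operator used above."""
    U, _, Vt = np.linalg.svd(M)
    return U @ Vt

def embedding_alignment(E, E_prime, shared_ids_a, shared_ids_b):
    # Steps 1-2: restrict both embedding matrices to the shared vocabulary.
    Es, Es_p = E[shared_ids_a], E_prime[shared_ids_b]
    # Step 3: U = Ortho(E[Shared Tokens]^T * E'[Shared Tokens]).
    U = ortho(Es.T @ Es_p)
    # Step 4: permutation P maximizing Tr(P * U^T), found with the
    # Hungarian algorithm (one entry selected per row and column of U).
    rows, cols = linear_sum_assignment(U, maximize=True)
    score = U[rows, cols].sum()       # equals Tr(P * U^T) at the optimum
    return U, cols, score
```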


Step 2: Attention Module Analysis

Objective: Verify whether the attention parameters of both models derive from the same source weights.

Key Formula:

Q' ≈ U * Q * W_Q  
K' ≈ U * K * W_K  
V' ≈ U * V * W_V  
O' ≈ W_O^{-1} * O * U^{-1}

  • Q, K, V, O: Query/Key/Value/Output matrices of Model A

  • Q', K', V', O': Corresponding matrices of Model B

  • W_Q, W_K, W_V, W_O: Inner transformation matrices

Detection Method:

  1. Layer-wise Transformation Calculation

    • Compute the orthogonal matrices W_Q, W_K, W_V for each layer's attention parameters.

  2. Statistical Significance Check

    • Use Large Deviation Theory to estimate p-values. A p < 2×10^-23 (the 10σ standard) indicates plagiarism (a sketch follows below).
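
A minimal sketch of the per-layer transformation estimate, using the orthogonal Procrustes solution (an assumption on my part; the paper's exact estimator may differ). U is the hidden-space transform from Step 1, and q, q_prime are hypothetical per-layer query projection weights stored with the hidden dimension first:

```python
# Step 2 sketch: estimate the inner rotation W_Q such that Q' ~= U @ Q @ W_Q.
import numpy as np

def inner_transform(q, q_prime, U):
    """Orthogonal Procrustes: best orthogonal W minimizing ||U q W - q'||_F."""
    Uu, _, Vt = np.linalg.svd((U @ q).T @ q_prime)
    return Uu @ Vt

def relative_residual(q, q_prime, U):
    # A small residual means the two projections differ only by the shared
    # hidden-space rotation U plus an inner rotation -- evidence of common origin.
    W_q = inner_transform(q, q_prime, U)
    return np.linalg.norm(U @ q @ W_q - q_prime) / np.linalg.norm(q_prime)
```

The same computation applies to the K and V projections (and, via the inverted form above, to O); the resulting alignment quality is what feeds the large-deviation p-value estimate.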

Step 3: MLP Module Analysis

Objective: Examine Multi-Layer Perceptron (MLP) parameter similarity.

Key Formula:

U_X = Ortho(X^T * U^T * X')  
P = argmax_{P∈Permutation Group} Tr(P * U_Up^T)

  • X ∈ {Gate, Up, Down}: MLP gate/up-projection/down-projection matrices

  • U_Up: Orthogonal alignment matrix computed from the up-projection weights (see the sketch below)
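
A minimal sketch of the Step 3 formulas, with the same caveats as before: the names X, X_prime, U_up are illustrative, and weights are assumed to be hidden-first NumPy arrays:

```python
# Step 3 sketch: U_X = Ortho(X^T * U^T * X') plus neuron-permutation matching.
import numpy as np
from scipy.optimize import linear_sum_assignment

def mlp_alignment(X, X_prime, U):
    """U_X = Ortho(X^T * U^T * X') for X in {Gate, Up, Down}."""
    Uu, _, Vt = np.linalg.svd(X.T @ U.T @ X_prime)
    return Uu @ Vt

def neuron_permutation(U_up):
    # P = argmax over permutations of Tr(P * U_Up^T). For related models,
    # U_Up is close to a permutation of the MLP's hidden neurons.
    rows, cols = linear_sum_assignment(U_up, maximize=True)
    return cols, U_up[rows, cols].sum()
```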

Case Studies & Experimental Results

Case 1: Official Fine-tuned Models

Model Pairs:

  • Qwen2.5-0.5B vs Qwen2.5-0.5B-Instruct

  • Meta-Llama-3.1-8B vs Meta-Llama-3.1-8B-Instruct

Results:

  • Embedding-similarity p-values are astronomically small (on the order of 10^-171,931), confirming a shared origin.

Case 2: Continued Pretraining Models

Model Pairs:

  • Qwen2-7B vs Qwen2.5-7B

  • Llama-3-8B vs Llama-3.1-8B-Instruct

Results:

  • Attention modules show significant similarity, with p-values as small as 10^-1,384,545.

Case 3: Architectural Divergence Verification

Model Pairs:

  • Meta-Llama-3.1-8B vs Qwen3-8B-Base

  • DeepSeek-V3-Base vs Kimi-K2-Instruct

Results:

  • No statistically significant p-values, correctly identifying the models as unrelated.

Frequently Asked Questions (FAQ)

Q1: What types of plagiarism can MDIR detect?

A: MDIR detects copies disguised through fine-tuning, continued pretraining, pruning, architectural transformations, and other obfuscation.

Q2: What computational resources are needed?

A: MDIR runs on a standard PC without a GPU, enabling rapid verification.

Q3: Does it support models with different tokenizers?

A: Yes. Similarity is calculated over the subset of tokens the two vocabularies share.

Q4: How to interpret p-value significance?

A: Use the 10σ standard (p < 2×10^-23) to keep false positives to a minimum.
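
The threshold itself is just the two-sided tail mass of a standard normal distribution at 10σ (assuming that convention; a one-sided tail gives half the value):

```python
# Convert the 10-sigma standard into a p-value threshold.
from scipy.stats import norm

p_two_sided = 2 * norm.sf(10.0)   # sf = survival function = 1 - CDF
print(p_two_sided)                 # ~1.5e-23, i.e. p < 2x10^-23
```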


Technical Limitations

  1. Numerical Precision Issues

    • Matrix decomposition errors may occur, especially with low-precision formats (fp16/bf16); a small demonstration follows below.

  2. Extreme p-value Interpretation

    • Billions of parameters lead to extremely small p-values; the true significance may be somewhat weaker than reported due to computational precision limits.
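
A small illustrative demonstration of the precision caveat: an orthogonal factor that passes the orthogonality check at float64 drifts noticeably once the computation is squeezed into float16 (the numbers are indicative, not from the paper):

```python
# Orthogonality error of W = U @ V^T at full vs. half precision.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((256, 256))
U, _, Vt = np.linalg.svd(A)

W64 = U @ Vt
W16 = (U.astype(np.float16) @ Vt.astype(np.float16)).astype(np.float64)

err64 = np.abs(W64 @ W64.T - np.eye(256)).max()
err16 = np.abs(W16 @ W16.T - np.eye(256)).max()
print(f"float64 error: {err64:.1e}")   # ~1e-14: numerically orthogonal
print(f"float16 error: {err16:.1e}")   # orders of magnitude larger
```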

Future Research Directions

Direction                | Description
-------------------------|------------
Evasion Techniques       | Explore methods, such as training with high learning rates, that might bypass detection.
Semi-Orthogonal p-values | Improve statistical inference for non-square matrices.

Conclusion

MDIR provides a mathematically rigorous framework for efficient LLM plagiarism detection. As models grow larger, such technologies become crucial for maintaining AI ecosystem integrity.