When someone searches for information today, they increasingly turn to AI models like ChatGPT, Perplexity, or Gemini instead of Google. But these models do not return a list of links. They synthesize the answer and cite the sources they trust most.
The question for anyone who runs a blog or content site is: How do you become one of those trusted sources? The answer lies in structured data, specifically JSON-LD knowledge graphs that help AI models understand not only what your content says, but how it connects to everything else you publish.
In this tutorial, you’ll create a PHP function that automatically generates a JSON-LD knowledge graph for each blog post on your site. There are no plugins, no external APIs, and just one function. It will detect entities in your content, map relationships between posts, and output a unified schema that can be parsed by both Google and AI models like ChatGPT as a coherent system.
Why is this important now?
AI search engines are replacing blue links with synthesized answers. When someone asks ChatGPT a question, it doesn’t return a list of URLs. It prepares the answer by referring to its trusted sources.
According to AcuraCast’s research on AI search references, 81% of pages referenced by AI engines use schema markup, with JSON-LD as the dominant format. Pages with structured schema are 3 to 4 times more likely to be cited by ChatGPT or Perplexity than pages without it.
Most JSON-LD tutorials teach you to paste a static script tag with your title and author name. That gets you into Google’s index. But it doesn’t get you cited by AI.
For that, you need a knowledge graph: a system where your entities (author, site, topics, tools, related articles) are connected through persistent identifiers that machines can follow across every page of your site.
I built this system for my own blog. After three months in production with 52 posts in three languages, I asked ChatGPT, Gemini, and Perplexity to audit the resulting schema. ChatGPT scored it 9.1 out of 10 and called it “production-grade graph design.” This article walks you through building the same thing.
Prerequisites
To follow this tutorial, you will need:
- PHP 7.4 or higher running on your server
- A MySQL or MariaDB database with a posts table that stores your blog content (title, slug, content, excerpt, created_at, updated_at)
- Basic knowledge of PHP: variables, arrays, functions, and database queries with PDO
- A working blog where you can edit PHP files and add schema markup to your HTML output
The pipeline
The system works in four stages:
Stage 1: PHP queries MariaDB for the post content, metadata, and related post IDs.
Stage 2: The system scans the content for known topics and tools using keyword matching. No NLP libraries needed. A simple associative array maps keywords to schema entities.
Stage 3: Related posts are fetched and mapped as both navigation links (relatedLink) and knowledge relationships (citation).
Stage 4: Everything gets combined into a single @graph array with five connected entities: WebSite, Organization, Person, WebPage, and BlogPosting. Each entity has a stable @id that machines can reference across pages.
What Static JSON-LD Looks Like (And Why It Falls Short)
Here is what a typical tutorial tells you to add:
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "My Blog Post",
  "author": {
    "@type": "Person",
    "name": "Jane"
  },
  "datePublished": "2026-01-15"
}
Step 1: Define the site-wide entities
function getSchemaAuthor($baseUrl) {
    return [
        '@type' => 'Person',
        '@id' => $baseUrl . '/#author', // stable identifier, identical on every page
        'name' => 'Your Name',
        'description' => 'Your professional description.',
        'url' => $baseUrl . '/about',
        'image' => $baseUrl . '/photo.png',
        'jobTitle' => 'Your Title',
        // Profiles that identify the same person elsewhere on the web
        'sameAs' => [
            '…',
            '…',
            '…'
        ]
    ];
}
function getSchemaOrganization($baseUrl) {
    return [
        '@type' => 'Organization',
        '@id' => $baseUrl . '/#organization',
        'name' => 'Your Site Name',
        'url' => $baseUrl,
        'logo' => [
            '@type' => 'ImageObject',
            'url' => $baseUrl . '/logo.png'
        ]
    ];
}
function getSchemaWebSite($baseUrl, $siteName, $siteDesc, $langCode) {
    return [
        '@type' => 'WebSite',
        '@id' => $baseUrl . '/#website',
        'name' => $siteName,
        'description' => $siteDesc,
        'url' => $baseUrl,
        'inLanguage' => $langCode,
        // Reference the Organization by @id instead of repeating it
        'publisher' => ['@id' => $baseUrl . '/#organization']
    ];
}
The @id values are the most important detail. /#author, /#organization, and /#website are persistent identifiers that remain the same on every page.
When a machine reads your homepage and then reads a blog post, it recognizes the same entity in both places. Without @id, each page creates a new floating entity that machines cannot connect to anything else.
A decision that matters: the publisher should be an Organization, not a Person. AI systems place more trust in content published by organizations than by individuals. Even if you’re a solo creator, set up your site as an organization for publishing purposes and mark yourself as the Person author.
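Whether the @id references actually connect can be checked mechanically. The following is a minimal sketch, not part of the original system: a hypothetical helper, findDanglingIds(), walks a decoded @graph and reports every referenced @id that does not point at an entity defined in the same graph. The example.com URLs are placeholders.

```php
<?php
/**
 * Hypothetical helper (not from the original code): list @id references
 * in a JSON-LD graph that do not resolve to an entity defined in that graph.
 */
function findDanglingIds(array $graph): array
{
    // Collect the @id of every top-level entity in the graph.
    $defined = [];
    foreach ($graph as $node) {
        if (isset($node['@id'])) {
            $defined[$node['@id']] = true;
        }
    }

    // Recursively visit property values and record unresolved @id references.
    $dangling = [];
    $walk = function ($value) use (&$walk, &$dangling, $defined) {
        if (!is_array($value)) {
            return;
        }
        if (isset($value['@id']) && !isset($defined[$value['@id']])) {
            $dangling[$value['@id']] = true;
        }
        foreach ($value as $child) {
            $walk($child);
        }
    };
    foreach ($graph as $node) {
        foreach ($node as $key => $value) {
            if ($key !== '@id') {
                $walk($value);
            }
        }
    }

    return array_keys($dangling);
}

$base = 'https://example.com'; // placeholder domain
$graph = [
    ['@type' => 'Organization', '@id' => $base . '/#organization', 'name' => 'Example'],
    ['@type' => 'WebSite', '@id' => $base . '/#website',
        'publisher' => ['@id' => $base . '/#organization']],
    ['@type' => 'BlogPosting', '@id' => $base . '/post#article',
        'isPartOf' => ['@id' => $base . '/#website'],
        'author' => ['@id' => $base . '/#author']], // no Person entity in this graph
];

print_r(findDanglingIds($graph)); // reports the missing /#author entity
```

References to entities that live on other pages, such as translations of the same post, will also be reported. That is expected; the check is most useful for the site-wide identifiers.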
Step 2: Create a blog posting schema
This function takes a post from your database and the current language code, then builds the base BlogPosting entity.
function generateBlogPostingSchema($post, $langCode) {
    $baseUrl = rtrim(SITE_URL, '/');
    $siteName = getLocalizedSetting('site_name', $langCode);
    $siteDesc = getLocalizedSetting('site_description', $langCode);
    $defaultLang = getDefaultLanguage();
    $postSlug = $post['slug'];
    // The default language lives at the root; other languages get a /{lang}/ prefix
    $postUrl = $langCode === $defaultLang
        ? $baseUrl . '/' . $postSlug
        : $baseUrl . '/' . $langCode . '/' . $postSlug;
    // Fall back to the first 160 characters of the content when there is no excerpt
    $excerpt = $post['excerpt']
        ?: mb_substr(strip_tags($post['content']), 0, 160);
    $blogPosting = [
        '@type' => 'BlogPosting',
        '@id' => $postUrl . '#article',
        'headline' => $post['title'],
        'description' => $excerpt,
        'abstract' => $excerpt,
        'url' => $postUrl,
        'datePublished' => date('c', strtotime($post['created_at'])),
        'dateModified' => date('c', strtotime($post['updated_at'])),
        'author' => [
            '@type' => 'Person',
            '@id' => $baseUrl . '/#author',
            'name' => 'Your Name',
            'url' => $baseUrl . '/about'
        ],
        'publisher' => [
            '@type' => 'Organization',
            '@id' => $baseUrl . '/#organization',
            'name' => 'Your Site Name',
            'logo' => [
                '@type' => 'ImageObject',
                'url' => $baseUrl . '/logo.png'
            ]
        ],
        'isPartOf' => ['@id' => $baseUrl . '/#website'],
        'mainEntityOfPage' => [
            '@type' => 'WebPage',
            '@id' => $postUrl
        ],
        'inLanguage' => $langCode,
        'wordCount' => str_word_count(strip_tags($post['content']))
    ];
    // The function body continues in Steps 3 to 6
Two properties deserve attention.
abstract mirrors the post excerpt. LLMs read the abstract first to decide whether the rest of the page is worth processing. If your excerpt says “In this post I explore some ideas about…”, models may skip you entirely. State it directly: “To implement a knowledge graph you need five connected entities with persistent @id references.” That is something an LLM can assess quickly.
isPartOf links the article to the WebSite entity. It tells machines, “This article belongs to a larger, authoritative source.” Without it, each post looks like an isolated document.
Note that author and publisher each include both an @id and inline properties. The @id links to the full entity in the @graph. The inline properties are a fallback, because some parsers (including Google’s Rich Results Test) don’t always resolve @id references. Including both avoids validation warnings.
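The language-aware URL rule used for $postUrl (no prefix for the default language, a /{lang}/ prefix otherwise) comes back in later steps, so it can be pulled into one place. A minimal sketch, assuming a hypothetical buildPostUrl() helper that is not in the original code; the domain and slug are placeholders:

```php
<?php
/**
 * Hypothetical helper: build the canonical URL of a post for one language.
 * The default language lives at the site root; others get a /{lang}/ prefix.
 */
function buildPostUrl(string $baseUrl, string $slug, string $langCode, string $defaultLang): string
{
    $baseUrl = rtrim($baseUrl, '/');

    return $langCode === $defaultLang
        ? $baseUrl . '/' . $slug
        : $baseUrl . '/' . $langCode . '/' . $slug;
}

// Placeholder values, for illustration only.
echo buildPostUrl('https://example.com/', 'my-post', 'en', 'en'), PHP_EOL;
echo buildPostUrl('https://example.com/', 'my-post', 'es', 'en'), PHP_EOL;
```

Keeping the rule in one function means the @id values built from it cannot drift apart between the post itself, its related links, and its translations.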
Step 3: Add automatic entity detection
This is where static JSON-LD tutorials stop and your knowledge graph begins. Instead of manually tagging each post with its topics, the system scans the content automatically.
    // Search the lowercased content and title together
    $contentLower = strtolower($post['content'] . ' ' . $post['title']);
    $topicMap = [
        'midjourney' => ['name' => 'Midjourney', 'url' => '…'],
        'prompt' => ['name' => 'Prompt Engineering'],
        'fintech' => ['name' => 'Fintech UX Design'],
        'ux design' => ['name' => 'UX Design'],
        'llms.txt' => ['name' => 'llms.txt', 'url' => '…'],
        'knowledge graph' => ['name' => 'Knowledge Graph'],
    ];
    $aboutItems = [];
    $keywordsList = [];
    foreach ($topicMap as $keyword => $meta) {
        if (strpos($contentLower, $keyword) !== false) {
            $item = ['@type' => 'Thing', 'name' => $meta['name']];
            if (isset($meta['url'])) $item['url'] = $meta['url'];
            $aboutItems[] = $item;
            $keywordsList[] = $meta['name'];
        }
    }
    if (!empty($aboutItems)) {
        $blogPosting['about'] = $aboutItems;
    }
The same pattern detects tools mentioned in the content:
    $toolMap = [
        'midjourney' => ['name' => 'Midjourney', 'url' => '…'],
        'claude' => ['name' => 'Claude', 'url' => '…'],
        'chatgpt' => ['name' => 'ChatGPT', 'url' => '…'],
        'figma' => ['name' => 'Figma', 'url' => '…'],
    ];
    $mentionItems = [];
    foreach ($toolMap as $keyword => $meta) {
        if (strpos($contentLower, $keyword) !== false) {
            $mentionItems[] = [
                '@type' => 'Thing',
                'name' => $meta['name'],
                'url' => $meta['url']
            ];
            $keywordsList[] = $meta['name'];
        }
    }
    if (!empty($mentionItems)) {
        $blogPosting['mentions'] = $mentionItems;
    }
    if (!empty($keywordsList)) {
        $blogPosting['keywords'] = array_values(array_unique($keywordsList));
    }
The difference between about and mentions is critical for AI citation. about declares the main topics. mentions declares the tools and references that appear in the content. If a post is a Midjourney tutorial that also mentions Claude, about gets Midjourney and mentions gets Claude.
This distinction helps AI models decide whether to cite your page when someone asks about Midjourney versus Claude.
A question that comes up often: do you need NLP for entity detection? No. A keyword map with strpos handles most cases for a personal blog. NLP adds complexity, latency, and dependencies you don’t need. If your topic map has 20 to 30 entries, keyword matching is fast, predictable, and easy to debug.
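One caveat with plain strpos is that it matches substrings, so the keyword 'prompt' also fires on 'promptly'. If that becomes a problem, a word-boundary regular expression is a small step up that still avoids NLP. The following is a sketch under that assumption; detectEntities() is a hypothetical name, and the map is a shortened copy of the topic map above.

```php
<?php
/**
 * Hypothetical variant of the keyword scan: match whole words only.
 * Returns schema.org "Thing" items for every keyword found in $text.
 */
function detectEntities(string $text, array $map): array
{
    $items = [];
    foreach ($map as $keyword => $meta) {
        // (?<!\w) and (?!\w) behave like word boundaries but also work for
        // keywords that contain punctuation, such as "llms.txt".
        $pattern = '/(?<!\w)' . preg_quote($keyword, '/') . '(?!\w)/iu';
        if (preg_match($pattern, $text) === 1) {
            $item = ['@type' => 'Thing', 'name' => $meta['name']];
            if (isset($meta['url'])) {
                $item['url'] = $meta['url'];
            }
            $items[] = $item;
        }
    }
    return $items;
}

$topicMap = [
    'prompt'          => ['name' => 'Prompt Engineering'],
    'ux design'       => ['name' => 'UX Design'],
    'knowledge graph' => ['name' => 'Knowledge Graph'],
];

// "promptly" no longer triggers the 'prompt' keyword.
$about = detectEntities('She promptly rebuilt the knowledge graph.', $topicMap);
echo json_encode($about), PHP_EOL;
```

The trade-off is that plurals such as "prompts" stop matching too, so each variant you care about needs its own entry in the map.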
Step 4: Map the relationships between posts
Each post connects to related posts through two properties: relatedLink for navigation and citation for knowledge relationships.
    $relatedUrls = getRelatedPostUrls($post['id'], $langCode);
    if (!empty($relatedUrls)) {
        $blogPosting['relatedLink'] = $relatedUrls;
        $blogPosting['citation'] = $relatedUrls;
    }
The helper function queries a post_connections table:
function getRelatedPostUrls($postId, $langCode) {
    $pdo = getDB();
    $baseUrl = rtrim(SITE_URL, '/');
    $defaultLang = getDefaultLanguage();
    $stmt = $pdo->prepare(
        "SELECT connected_post_id FROM post_connections WHERE post_id = ?"
    );
    $stmt->execute([$postId]);
    $connections = $stmt->fetchAll(PDO::FETCH_COLUMN);
    $urls = [];
    foreach ($connections as $connId) {
        // Skip connected posts that have no version in this language
        $slug = getPostSlugForLanguage($connId, $langCode);
        if ($slug) {
            $urls[] = $langCode === $defaultLang
                ? $baseUrl . '/' . $slug
                : $baseUrl . '/' . $langCode . '/' . $slug;
        }
    }
    return $urls;
}
Why use both relatedLink and citation with the same URLs? They signal different things to machines. relatedLink says “Readers may want to see these pages.” citation says “This article builds on the knowledge in these other articles.”
AI models weigh citation more heavily when deciding whether your content is part of a larger knowledge system. Using both tells machines that your related posts aren’t just navigation. They are the sources this article builds on.
Step 5: Add Multilingual Support
If your blog is published in multiple languages, workTranslation links the different language versions of the same article.
    $pdo = getDB(); // needed for the title lookup below
    $languages = getActiveLanguages();
    $translations = [];
    foreach ($languages as $lang) {
        $lc = $lang['code'];
        if ($lc === $langCode) continue; // skip the language being rendered
        $translatedSlug = getPostSlugForLanguage($post['id'], $lc);
        if ($translatedSlug) {
            $translatedUrl = $lc === $defaultLang
                ? $baseUrl . '/' . $translatedSlug
                : $baseUrl . '/' . $lc . '/' . $translatedSlug;
            $stmtT = $pdo->prepare(
                "SELECT title FROM post_translations
                 WHERE post_id = ? AND language_code = ? LIMIT 1"
            );
            $stmtT->execute([$post['id'], $lc]);
            // Fall back to the current title when no translated title exists
            $translatedTitle = $stmtT->fetchColumn() ?: $post['title'];
            $translations[] = [
                '@type' => 'CreativeWork',
                '@id' => $translatedUrl . '#article',
                'headline' => $translatedTitle,
                'url' => $translatedUrl,
                'inLanguage' => $lc
            ];
        }
    }
    if (!empty($translations)) {
        $blogPosting['workTranslation'] = $translations;
    }
Without workTranslation, a blog with 50 posts in three languages looks like 150 independent articles to AI models. With it, the same blog looks like 50 pieces of knowledge with multilingual reach. Authority consolidates rather than fragments.
Use @type: CreativeWork for the translations instead of BlogPosting. This avoids warnings in Google’s Rich Results Test, which would otherwise flag each translation as a separate article that is missing required fields.
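Step 6: Assemble the graph
Bring everything together:
    $webPage = [
        '@type' => 'WebPage',
        '@id' => $postUrl,
        'url' => $postUrl,
        'name' => $post['title'],
        'isPartOf' => ['@id' => $baseUrl . '/#website']
    ];
    $graph = [
        '@context' => 'https://schema.org',
        '@graph' => [
            getSchemaWebSite($baseUrl, $siteName, $siteDesc, $langCode),
            getSchemaOrganization($baseUrl),
            getSchemaAuthor($baseUrl),
            $webPage,
            $blogPosting
        ]
    ];
    // Wrap the encoded graph in a JSON-LD script tag for the page head
    return '<script type="application/ld+json">'
        . json_encode($graph, JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE)
        . '</script>';
}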

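The json_encode flags matter. JSON_UNESCAPED_SLASHES stops every slash in your URLs from being escaped as \/. JSON_UNESCAPED_UNICODE keeps non-ASCII characters readable for multilingual content instead of turning them into \uXXXX sequences. Without the flags the output is still valid JSON, but it is much harder to read and debug. What can silently break an entire JSON-LD block is invalid UTF-8 in a title fetched from the database: json_encode then returns false and the block ends up empty.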
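To see what the two flags change, the sketch below encodes the same small array with and without them. It also adds two things that are not part of the original code and are offered only as assumptions about what a hardened version might use: JSON_THROW_ON_ERROR (available since PHP 7.3), so that invalid UTF-8 raises an exception instead of producing an empty block, and a replacement of "</" so that a stray closing script tag inside a title cannot end the block early. renderJsonLd() is a hypothetical name.

```php
<?php
/**
 * Hypothetical wrapper: encode a graph and wrap it in a JSON-LD script tag.
 */
function renderJsonLd(array $graph): string
{
    $json = json_encode(
        $graph,
        JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR
    );

    // "<\/" is a valid JSON escape for "</", so the data is unchanged, but
    // the HTML parser can no longer see a closing script tag inside it.
    $json = str_replace('</', '<\/', $json);

    return '<script type="application/ld+json">' . $json . '</script>';
}

$data = ['url' => 'https://example.com/ja/post', 'headline' => 'AIと一年'];

echo json_encode($data), PHP_EOL;   // default: slashes and non-ASCII are escaped
echo renderJsonLd($data), PHP_EOL;  // with the flags: both stay readable
```

With JSON_THROW_ON_ERROR the call site needs a try/catch around renderJsonLd() so that one bad row degrades to "no schema on this page" rather than a fatal error.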
What does the output look like in production?
Here is the actual JSON-LD generated for a real post on shinobis.com, a blog about AI tools and UX design:
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "WebSite",
      "@id": "…",
      "name": "Designer in the Age of AI",
      "description": "AI tools and real workflows from a designer who builds with AI.",
      "url": "…",
      "inLanguage": "en",
      "publisher": { "@id": "…" }
    },
    {
      "@type": "Organization",
      "@id": "…",
      "name": "Shinobis",
      "url": "…",
      "logo": { "@type": "ImageObject", "url": "…" }
    },
    {
      "@type": "Person",
      "@id": "…",
      "name": "Shinobis",
      "description": "UX/UI Designer with 10+ years in banking and fintech.",
      "url": "…",
      "jobTitle": "UX/UI Designer",
      "sameAs": [
        "…",
        "…"
      ]
    },
    {
      "@type": "WebPage",
      "@id": "…",
      "url": "…",
      "name": "One Year with AI: Open Letter to Designers",
      "isPartOf": { "@id": "…" }
    },
    {
      "@type": "BlogPosting",
      "@id": "…",
      "headline": "One Year with AI: Open Letter to Designers",
      "description": "One year ago I started this journey. Today I write to all designers who are still doubting, fearing, or ignoring AI.",
      "abstract": "One year ago I started this journey. Today I write to all designers who are still doubting, fearing, or ignoring AI.",
      "url": "…",
      "datePublished": "2026-02-15T09:00:00-05:00",
      "dateModified": "2026-03-20T14:30:00-05:00",
      "inLanguage": "en",
      "wordCount": 1842,
      "author": {
        "@type": "Person",
        "@id": "…",
        "name": "Shinobis",
        "url": "…"
      },
      "publisher": {
        "@type": "Organization",
        "@id": "…",
        "name": "Shinobis",
        "logo": { "@type": "ImageObject", "url": "…" }
      },
      "isPartOf": { "@id": "…" },
      "mainEntityOfPage": {
        "@type": "WebPage",
        "@id": "…"
      },
      "about": [
        { "@type": "Thing", "name": "Midjourney", "url": "…" },
        { "@type": "Thing", "name": "Prompt Engineering" }
      ],
      "mentions": [
        { "@type": "Thing", "name": "Claude", "url": "…" }
      ],
      "relatedLink": [
        "…",
        "…"
      ],
      "citation": [
        "…",
        "…"
      ],
      "keywords": ["Midjourney", "Prompt Engineering", "Claude"],
      "workTranslation": [
        {
          "@type": "CreativeWork",
          "@id": "…",
          "headline": "Un año con IA: carta abierta a los diseñadores",
          "url": "…",
          "inLanguage": "es"
        },
        {
          "@type": "CreativeWork",
          "@id": "…",
          "headline": "AIと一年:デザイナーへの公開書簡",
          "url": "…",
          "inLanguage": "ja"
        }
      ]
    }
  ]
}

Compare this to the static version: a BlogPosting with a headline and an author name. The difference is not cosmetic. It is the difference between “this is an article” and “this is a knowledge node associated with an author with verified profiles, published by an organization, linked by references to related articles, covering specific topics, and available in three languages.”
Testing your implementation
After deploying, validate the output with Google’s Rich Results Test. Paste the URL of any post and look for your BlogPosting with all its properties.
For a deeper audit, copy the JSON-LD script block from your page source and paste it into ChatGPT with this prompt: "Audit this JSON-LD schema for AI reference visibility. Score it 1-10 and tell me what's missing." The feedback is surprisingly specific.
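Before pasting anything into an external tool, a quick local check can confirm that the block parses and that all five entity types are present. This is a sketch with a hypothetical listJsonLdTypes() helper; it uses a simple regular expression rather than a full HTML parser, which assumes the script tag is written with a double-quoted type attribute, as it is in pages you generate yourself.

```php
<?php
/**
 * Hypothetical check: extract JSON-LD blocks from HTML and list the @type
 * of every entity found in them.
 */
function listJsonLdTypes(string $html): array
{
    $types = [];
    $pattern = '#<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>#is';
    preg_match_all($pattern, $html, $matches);

    foreach ($matches[1] as $json) {
        $data = json_decode($json, true);
        if (!is_array($data)) {
            continue; // not valid JSON: skip (or log it while debugging)
        }
        // A block is either a single entity or a wrapper with an @graph.
        $nodes = $data['@graph'] ?? [$data];
        foreach ($nodes as $node) {
            // @type may be a string or a list of strings.
            foreach ((array) ($node['@type'] ?? []) as $type) {
                $types[] = $type;
            }
        }
    }
    return $types;
}

// Stand-in for a rendered page; in practice, fetch your own post's HTML.
$html = '<html><head><script type="application/ld+json">'
    . '{"@context":"https://schema.org","@graph":['
    . '{"@type":"WebSite"},{"@type":"Organization"},{"@type":"Person"},'
    . '{"@type":"WebPage"},{"@type":"BlogPosting"}]}'
    . '</script></head></html>';

$expected = ['WebSite', 'Organization', 'Person', 'WebPage', 'BlogPosting'];
$missing = array_diff($expected, listJsonLdTypes($html));
echo $missing ? 'Missing: ' . implode(', ', $missing) : 'All five entities found', PHP_EOL;
```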
When I did this, ChatGPT identified five improvements that raised the score from 8.7 to 9.1.
What I learned after 3 months in production
I have been running this system since early 2026 on a blog with 52 posts in three languages. The keyword "llms txt" reached position 4 on Google. AI models started referencing my content in answers about JSON-LD implementations.
If I were to start today, I would do three things differently.
First, add the abstract property from day one. I added it three months in and the effect was immediate. LLMs use the abstract as a first filter. Perplexity confirmed that the first 200 characters of a page are critical to whether the AI extracts the content.
Second, use citation alongside relatedLink from the beginning. relatedLink is a navigation signal. citation indicates a knowledge relationship. AI models interpret the links between your posts differently depending on which properties you use.
Third, define the publisher as an Organization early. I started with @type: Person and changed it later. AI systems give more trust to organizational publishers.
The system generates JSON-LD on every page load. At this scale (fewer than 100 posts) the performance impact is negligible. For thousands of posts, generate on publish and cache the output.
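The "generate on publish and cache" idea can be approximated without a publish hook by keying the cache on the post’s updated_at value, so that an edit naturally produces a fresh entry. A minimal sketch, assuming a hypothetical cachedSchema() wrapper and a writable cache directory; the generator is passed in as a callable so the sketch does not depend on the rest of the system.

```php
<?php
/**
 * Hypothetical file cache for generated JSON-LD. The key includes
 * updated_at, so editing a post produces a new cache entry.
 */
function cachedSchema(array $post, string $langCode, callable $generate, string $cacheDir): string
{
    $key  = sha1($post['id'] . '|' . $langCode . '|' . $post['updated_at']);
    $file = rtrim($cacheDir, '/') . '/schema-' . $key . '.html';

    if (is_file($file)) {
        return file_get_contents($file); // cache hit: nothing is regenerated
    }

    $html = $generate($post, $langCode);
    file_put_contents($file, $html, LOCK_EX);

    return $html;
}

// Stand-in generator that counts how often it runs.
$calls = 0;
$generate = function (array $post, string $langCode) use (&$calls): string {
    $calls++;
    return '<script type="application/ld+json">{"headline":"' . $post['title'] . '"}</script>';
};

$cacheDir = sys_get_temp_dir() . '/schema-cache-' . uniqid();
mkdir($cacheDir);

$post = ['id' => 7, 'title' => 'Demo', 'updated_at' => '2026-03-20 14:30:00'];
$first  = cachedSchema($post, 'en', $generate, $cacheDir);
$second = cachedSchema($post, 'en', $generate, $cacheDir);

echo $calls, PHP_EOL; // the generator ran once; the second call hit the cache
```

Two limits of this sketch: a change to a related post or a translation does not touch this post’s updated_at, so those edits need their own invalidation, and old cache files are never cleaned up.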
Wrap up
This system is a layer of what is now called generative engine optimization: structuring content so that AI models refer to you in their responses.
Other layers include an llms.txt file at your domain root (which gives AI crawlers a site-level overview) and written content that AI can extract without needing additional context (direct statements over narrative intros).
The full system is running in production on shinobis.com. Every post uses the exact system described here.
The next SEO battleground is not rankings. It is references. And references start with structure.