I asked ChatGPT to evaluate my site. It said: not credible.
I asked ChatGPT to look at agency.kda.zone and tell me what it found.
The answer was blunt: "I couldn't find any credible trace of agency.kda.zone as a real, established project." It flagged the site as possibly a ghost project, a parked domain, or a crypto bait page. It confused "KDA" with Kadena, a blockchain project that shut down in 2025. It recommended checking five credibility signals — team, product, independent mentions, GitHub footprint, clear value proposition — and implied most would come up empty.
Then it did a content-level teardown. It scored the site as 30% solid fundamentals, 50% vague abstraction, 20% potential hype. It identified missing elements: decision frameworks, prioritisation logic, real case studies, failure analysis. It flagged survivorship bias, vague language, and a lack of feedback loops.
Some of this is pattern-matching nonsense. Some of it is exactly right. Both parts are worth examining.
What ChatGPT considers worthy
Three elements passed its filter — the parts it labelled "high signal, keep":
Bias toward action. Act, learn, adjust. ChatGPT validated this as aligned with real frameworks — Lean Startup, execution-first thinking. The core loop of doing before theorising. This is the strongest signal in the project: the insistence that you move first and refine later, rather than planning your way to certainty.
Systems over motivation. Environment design, repeatable actions, reducing friction. It recognised this as grounded in behavioural science — Atomic Habits territory. Not "try harder" but actual structural change: redesign the environment so the right action becomes the default. The memetic hygiene doc and the choosing your memes entry both push this, and ChatGPT flagged them as the real substance.
Ownership mindset. Taking responsibility for decisions, not outsourcing your direction. ChatGPT linked this to Bandura's self-efficacy research, which gives it academic grounding — the belief that your actions shape your outcomes isn't just motivational talk, it's a measurable psychological construct with decades of evidence behind it. The agency hypothesis is built on this foundation.
The common thread: anything concrete, testable, and tied to established research passed the filter. Anything that a reader could act on tomorrow morning survived.
What ChatGPT considers not worthy
The elements it labelled "low signal — ignore or question":
Vague "leverage" claims. Using words that sound precise but aren't. "Build leverage" without specifying what kind, how much effort, or what the trade-offs are. The word does real work in some contexts — Archimedes had a specific lever in mind — but when it's used as a synonym for "get more output from less input" without mechanism, it's decoration.
"You control everything" framing. Oversimplification of agency. Reality has constraints — market, capital, timing, geography, health — and pretending they don't exist is dangerous for decision-making. The emotional distortion vs structural investment entry actually addresses this distinction, but the broader framing of the project sometimes drifts into implying more control than is real.
No constraints acknowledged. If a framework doesn't say where it breaks, it's philosophy, not a system. ChatGPT specifically flagged the absence of boundary conditions as a credibility problem. A claim that works "everywhere" is unfalsifiable — and unfalsifiable claims are the signature of content that's optimised for inspiration rather than utility.
No measurable process. If you can't answer "what do I measure to know this is working?" then it's not a system. The usefulness framework provides evaluation criteria for ideas, but the project doesn't provide equivalent criteria for the reader's own progress. No metrics, no milestones, no feedback loops.
Survivorship bias. Focusing on families and individuals who rose without examining who had the same traits and still failed. This makes everything look easier and more controllable than it is. The Kresy families succeeded — but how many families with similar orientations didn't? Without that denominator, the case studies are compelling stories, not evidence.
No mechanism depth. Slogans like "create systems" and "own your output" without concrete models, trade-off analysis, or failure cases. ChatGPT compared this unfavourably to strategy work like Rumelt's Good Strategy Bad Strategy, which forces diagnosis and specific action rather than aspiration.
What ChatGPT identifies as critically missing
Not "bad" — absent. And absence is what makes the rest feel thin:
Decision frameworks. How to choose between options, not just "choose well." When a reader faces two career paths, two investment strategies, two environments — what's the protocol? The project says "invest in what's portable" but doesn't provide a systematic way to evaluate portability.
Prioritisation logic. What to do first and why, not just a list of good things. If everything is important, nothing is. The project needs a sequence: start here, then here, then here — and here's why this order matters.
Real case studies. Specific, named, verifiable examples with outcomes. The Hungarian surname study and Kresy families are strong — but they're historical and structural. The project lacks contemporary, individual-level cases where someone applied these principles and the outcome is documented.
Failure analysis. When does this approach break? What does failure look like? What conditions overwhelm individual agency? Without this, the project is arguing a thesis without testing it against its hardest cases.
What ChatGPT got wrong
The "Kadena blockchain" confusion is pure pattern matching — it saw "KDA" and reached for the nearest high-frequency association. The "crypto bait page" hypothesis tells you more about ChatGPT's training data than about the site. The "ghost project" label is a cold-start problem, not a quality problem. The site is two weeks old. Of course it has no backlinks.
The "hidden funnel" warning — that the content is really marketing for a paid community or course — is projection from a pattern the model has seen thousands of times. There is no funnel. There's a research project.
These errors reveal something useful about how AI models evaluate credibility. They don't assess content quality in isolation. They assess signal density — how many independent sources point to you, how many contexts reference you, how thick the web of external validation is. A site with brilliant content and zero external footprint looks identical, to an AI, to a site with no content at all.
What ChatGPT got right
The content criticisms are harder to dismiss.
Vague language masquerading as precision. Words like "agency", "leverage", "systems", and "mindset" feel specific but aren't. If you can't answer "what exactly do I do tomorrow at 9:00?" then the content isn't actionable. The agency docs define concepts clearly, but the log entries sometimes drift into abstraction that sounds deep without giving the reader a next step. ChatGPT called this out as "high-level slogans, low operational depth." That stings because it's partially true.
Missing failure cases. The project documents families and individuals who rose — Kresy families, Hungarian elites. It doesn't systematically document who had the same traits and failed. That's survivorship bias, and it undermines the hypothesis. If agency is a real driver, there should be predictable conditions under which it fails. Those conditions aren't documented yet.
No feedback loops. There are no metrics, no validation steps, no way for a reader to test the ideas against their own experience. The usefulness framework establishes criteria for evaluating ideas, but the project doesn't turn those criteria on itself. That's a gap.
No independent validation. Zero external references. No citations by others. No reviews. No signal that anyone outside the project has engaged with the ideas. For a research project, this is the difference between a working paper and a diary.
What this means — project by project
The evaluation was about agency.kda.zone, but the underlying problem is the same across every site this engine serves. Each project needs to address specific gaps to become legible to both AI and human audiences.
Agency
The core hypothesis — that a transmissible mindset drives social mobility more than systems or institutions — is well-supported by research. The content has real depth. But ChatGPT's criticisms point to three specific gaps:
Add failure cases and boundary conditions. The project needs entries that ask: when does agency fail? What structural conditions overwhelm individual mindset? The Kresy case shows agency working under extreme displacement — but what about contexts where displacement destroys agency entirely? Forced labour camps, totalitarian surveillance states, extreme poverty traps. Documenting where the hypothesis breaks makes it stronger, not weaker.
Make concepts operationally specific. Every abstract claim needs a concrete translation. "Invest in portable human capital" should be followed by: here are three specific investments, here is how to evaluate whether a skill is portable, here is a decision framework for choosing between two career moves. The good-enough-is-rational entry gets close — it describes the "uneconomical bet" — but stops short of giving the reader a specific protocol for making it.
Build external citations. Cross-post the strongest docs to Substack using the distribution strategy. Submit the Hungarian surname study summary to relevant academic discussion forums. Write responses to published papers on intergenerational mobility. Every external reference is a signal that AI models use to assess credibility.
personal-presence-os
This is the project whose entire purpose is to solve the problem ChatGPT just demonstrated. The engine builds sites, generates llms.txt, produces Schema.org structured data, and renders Open Graph tags. And yet, when an AI was asked about one of those sites, the answer was: not credible.
That means the engine is doing its job technically, but the distribution layer isn't activated yet. The distribution strategy doc describes cross-posting to Substack, native LinkedIn posts, and X threads — but none of that has happened. The infrastructure exists. The syndication hasn't started.
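For a sense of what that technical layer produces, here is a minimal sketch of per-page Schema.org JSON-LD and Open Graph output. It is not the engine's actual code; the page fields are illustrative assumptions, not content taken from the site.

```python
# Minimal sketch (not the engine's actual output): the kind of Schema.org
# JSON-LD and Open Graph tags a static site generator might emit per page.
# Field values are illustrative assumptions.
import json

page = {
    "title": "The agency hypothesis",                  # placeholder title
    "url": "https://agency.kda.zone/",                 # real domain from the article
    "description": "A research project on mindset and social mobility.",
    "published": "2026-01-01",                         # placeholder date
}

json_ld = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": page["title"],
    "url": page["url"],
    "description": page["description"],
    "datePublished": page["published"],
}

og_tags = "\n".join(
    f'<meta property="og:{key}" content="{value}">'
    for key, value in {
        "title": page["title"],
        "url": page["url"],
        "description": page["description"],
        "type": "article",
    }.items()
)

print('<script type="application/ld+json">')
print(json.dumps(json_ld, indent=2))
print("</script>")
print(og_tags)
```

Structured data like this is necessary for machine legibility, but, as the evaluation showed, it isn't sufficient on its own.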
Start cross-posting. Pick the three strongest pieces across all sites and syndicate them this week. The distribution doc says to wait for analytics data — but two weeks of near-zero traffic is itself data. The cold-start problem won't solve itself.
Validate llms.txt effectiveness. The engine generates llms.txt files for AI discoverability. But does it work? Test it: ask multiple AI models about topics covered on these sites. If they can't find the content, the format or the crawl frequency isn't sufficient. Measure, then adjust.
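The cheapest first check can be scripted. The sketch below, assuming llms.txt lists absolute URLs, only confirms that the file is reachable and that every URL it references resolves; it says nothing about whether models actually ingest the content, which still has to be tested by asking them directly.

```python
# First-order llms.txt sanity check: is the file reachable, and do the URLs
# it lists resolve? This rules out basic failure modes only; it does not
# prove AI models ingest the content.
import re
import requests

SITE = "https://agency.kda.zone"  # real domain from the article

resp = requests.get(f"{SITE}/llms.txt", timeout=10)
print("llms.txt status:", resp.status_code)

# Pull out anything that looks like an absolute URL and probe each one.
urls = [u.rstrip(").,") for u in re.findall(r"https?://\S+", resp.text)]
for url in urls:
    status = requests.head(url, timeout=10, allow_redirects=True).status_code
    print(f"{status}  {url}")
```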
Track AI referrals. GoatCounter is running. Look for referral patterns from AI-adjacent sources. If AI assistants are citing the content, that validates the engine's core hypothesis. If they're not, the problem is upstream — the content isn't being crawled or isn't being ranked.
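A rough way to look for those patterns, assuming a GoatCounter CSV export with a "Referrer" column; the column name and the domain list are assumptions to adjust against the real export.

```python
# Minimal sketch for spotting AI-assistant referrals in analytics data.
# Assumes a GoatCounter CSV export with a "Referrer" column (an assumption).
import csv
from collections import Counter

AI_REFERRERS = (
    "chatgpt.com", "chat.openai.com", "perplexity.ai",
    "gemini.google.com", "claude.ai", "copilot.microsoft.com",
)

hits = Counter()
with open("goatcounter-export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        referrer = (row.get("Referrer") or "").lower()
        for domain in AI_REFERRERS:
            if domain in referrer:
                hits[domain] += 1

for domain, count in hits.most_common():
    print(f"{count:5d}  {domain}")
print("total AI-referred visits:", sum(hits.values()))
```

Even a handful of hits from these referrers would be evidence for the engine's core hypothesis; zero hits over a sustained period points the diagnosis upstream, to crawling and ranking.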
wardleymaps.com
Wardley Mapping has an existing community and body of work. The site benefits from association with an established practice — but ChatGPT's credibility framework applies here too.
Differentiate from existing resources. The Wardley Mapping community already has documentation, books, and community sites. This site needs to articulate what it adds that doesn't exist elsewhere. If it's a curation and teaching layer, say so explicitly. If it's original research or tooling, make that the lead.
Build inbound links from the community. Engage with existing Wardley Mapping practitioners. Reference their work, contribute to discussions, get mentioned. AI models weight community-embedded sources higher than isolated ones.
Open Source Research
Same cold-start problem, different domain.
Anchor in specific, verifiable claims. Open source economics is well-studied. The project should reference specific papers, specific funding numbers, specific governance failures. Every verifiable claim is a hook that AI models can cross-reference and validate. Vague commentary on "sustainability" adds nothing to the signal landscape.
Produce original analysis. The project description says "commentaries" — but commentary on well-known topics doesn't differentiate. Pick one open source project, do a deep financial and governance analysis that doesn't exist elsewhere, and publish it. One original analysis is worth fifty opinion pieces for credibility.
The overall lessons
1. Credibility is signal density, not content quality
ChatGPT didn't evaluate the ideas on their merits. It evaluated the environment around the ideas: backlinks, mentions, GitHub presence, independent references. Brilliant content with zero external footprint is indistinguishable, to an AI, from a site with no content at all. This is how AI models work — but it's also how humans work. A stranger encountering your site for the first time runs the same credibility checks. The engine handles the content. The distribution strategy handles the footprint. The strategy exists on paper. It's time to execute it.
2. Abstraction without specificity reads as fluff
Every concept that stays at the level of "build systems" or "take ownership" gets classified as vague motivation, regardless of how true it is. The fix isn't to dumb things down — it's to always follow an abstract claim with a concrete translation. "Invest in portable human capital" should be immediately followed by: here are three specific investments a 30-year-old in a corporate job can make this month, here is how to evaluate whether a skill is portable across industries, here is a decision framework for choosing between two career moves. If you can't describe what this looks like at 9am on a Tuesday, you haven't finished the thought.
3. Acknowledging limits makes you more credible, not less
ChatGPT explicitly flagged the absence of failure cases and constraints as a weakness. Frameworks that say "this works in X conditions and breaks in Y conditions" are rated higher than frameworks that claim to work everywhere. Boundary conditions are a credibility signal — they tell the reader you've stress-tested the idea, not just presented it. The Agency project argues that mindset drives mobility. The strongest version of that argument includes the cases where mindset isn't enough.
4. Testability is the dividing line between insight and opinion
The elements ChatGPT rated highest all share one trait: you can verify them. Bias toward action — testable. Systems over motivation — testable. Self-efficacy — measurable. The elements it rated lowest are the ones you can't falsify. "Build leverage" can't be tested because it doesn't specify what counts as leverage, how you'd measure it, or what failure looks like. If a reader can't run an experiment against your claim, it's an opinion wearing a framework's clothes.
5. Pattern matching cuts both ways
ChatGPT confused KDA with Kadena, flagged the site as a potential crypto scam, and applied "hidden funnel" heuristics from its training data. This means content needs to be actively distinct from the patterns AI associates with low-quality sources. If your language, structure, or positioning overlaps with what scammy sites do — vague promises of "freedom" and "leverage," no team page, no verifiable credentials — you'll get classified with them regardless of your intent. The content has to fight against the statistical prior, not just present its case.
The test
ChatGPT's evaluation is itself a test of the personal-presence-os hypothesis. The hypothesis: if you build structured, LLM-readable content on domains you control, AI assistants will surface it when people ask relevant questions. The test result: they don't — yet. Not because the content is bad, but because credibility to an AI is not about content quality. It's about signal density in the broader information environment. Quality content with no external footprint is invisible. That's the gap to close.