A parent in Beirut opens ChatGPT and types: "what are the best private schools in Achrafieh for my 7-year-old?" The assistant returns a confident, well-organised answer. Six schools, a short summary of each, a couple of trade-offs between them. The parent reads it, screenshots it, and moves on.
The interesting thing about that answer is not which six schools it named. It's the 1,620 it didn't.
This note is a write-up of a census I ran in May 2026 against every private K-12 school in Lebanon, plus two tests of what generative AI assistants actually know about those schools. It's not a hot take. The data is on disk, the method is reproducible, and the headline number is robust enough that a rerun should land within a few points of the same answer.
The headline number: about 80% of Lebanese private K-12 schools cannot be recommended by an AI assistant, because the AI does not know they exist.
That sentence sounds dramatic. It isn't. It's just what comes out of the data when you actually count.
Why AI visibility is the question worth asking
The premise of this research is that the channel parents use to choose schools is shifting, and the shift is faster than the education sector is reacting.
For roughly twenty-five years, the discovery question for a school was "does it show up on Google for my city?" That question is still being asked, but it is no longer the only one, and for a growing share of parents it is no longer the first one. The first question is increasingly posed to a conversational AI: ChatGPT, Claude, Gemini, Perplexity, or the AI Overview at the top of a Google search results page.
The mechanism these tools use to answer a question like "best private school in Beirut" is worth understanding before drawing any conclusion about visibility.
A modern AI assistant answering an open question about a real-world institution does roughly the following, in order:
- It checks its pre-training memory. This is the static knowledge baked into the model when it was trained. For a famous institution (a national university, a flagship hospital, a 150-year-old school with a Wikipedia page in three languages), this is enough. For a long-tail entity it is not.
- It runs a live web search. Where the assistant has tool access (which is now the default for the major chat products), it issues a search query and reads the top results in real time.
- It reads the surfaces that surface. Wikipedia entries, structured directory listings, news articles, JSON-LD-annotated pages, the school's own site if it ranks. It synthesises across them and writes a paragraph.
What it does not do, in practice, is hunt down a school's own URL and read it as an authoritative source. The institution's own site matters indirectly: it is the thing that gets the school listed in the directories the AI does read. If the institution does not surface in any of those steps, the assistant has nothing to say. It will either decline, or it will make something up, or it will recommend the schools that do surface and skip the rest.
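The lookup order described above can be sketched as a toy function. This is a minimal illustration under stated assumptions, not how any vendor's assistant is implemented: `pretraining_memory`, `search_index`, and the school names are all hypothetical.

```python
# Toy model of the lookup order described above. All names and data are
# hypothetical; real assistants are far more complex than this.

def answer(school, pretraining_memory, search_index):
    """Return (source, facts) for a school, or (None, None) if it never surfaces."""
    # Step 1: static knowledge baked in at training time.
    if school in pretraining_memory:
        return "memory", pretraining_memory[school]
    # Steps 2 and 3: live search, then read whichever surfaces came back.
    for surface, pages in search_index.items():
        if school in pages:
            return surface, pages[school]
    # Nothing surfaced: the assistant has nothing grounded to say.
    return None, None


pretraining_memory = {"Famous School": "150-year-old school with a Wikipedia page"}
search_index = {"directory": {"Mid-size School": "address, levels, contact"}}

for name in ("Famous School", "Mid-size School", "Unlisted School"):
    print(name, "->", answer(name, pretraining_memory, search_index))
```

The third call is the case this note is about: a school that appears in neither store returns nothing at all.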
This is the bar an institution has to clear today. Not "do we have a website". The bar is: does anything readable about us exist on the surfaces an AI is going to consult.
The field
The Center for Educational Research and Development (CRDP, the official body under Lebanon's Ministry of Education) publishes a live registry of every recognised school in the country. In May 2026 that registry contained 1,626 private K-12 schools: 1,209 fee-paying private schools, 352 free private schools, and 65 UNRWA schools serving Palestinian refugee camps.
I scraped the full registry, translated each school's name from Arabic to its canonical English or French form, ran a structured web search per school, and classified the result into three buckets: own domain, Facebook-only, no discoverable web presence at all.
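A minimal sketch of the three-bucket classification, assuming the search step hands back one best-match absolute URL per school (or nothing). The function name, bucket labels, and example URLs are hypothetical; the actual pipeline may decide differently.

```python
# Hypothetical bucket classifier. Assumes each school has at most one
# best-match absolute URL (with scheme) from the search step.
from collections import Counter
from urllib.parse import urlparse


def classify(best_url):
    """Map a best-match URL to own_domain, facebook_only, or none."""
    if not best_url:
        return "none"
    host = (urlparse(best_url).hostname or "").lower()
    if host == "facebook.com" or host.endswith(".facebook.com"):
        return "facebook_only"
    return "own_domain"


# Illustrative inputs only, not rows from the census.
hits = [
    "https://school.example/",
    "https://www.facebook.com/example-school",
    None,
]
print(Counter(classify(u) for u in hits))
```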
The headline numbers from that pass:
- 326 schools (20.0%) have their own website.
- 176 schools (10.8%) exist only as a Facebook page.
- 1,124 schools (69.1%) have no discoverable web presence at all. Not a website, not a Facebook page, not a directory listing the AI could find.
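The shares above follow directly from the three counts. A short check, using only the numbers reported in this note (the variable names are mine):

```python
# Recompute the headline shares from the reported bucket counts.
counts = {"own_domain": 326, "facebook_only": 176, "none": 1124}
total = sum(counts.values())

shares = {bucket: round(100 * n / total, 1) for bucket, n in counts.items()}
no_site = counts["facebook_only"] + counts["none"]

print(total)                            # 1626
print(shares)                           # {'own_domain': 20.0, 'facebook_only': 10.8, 'none': 69.1}
print(round(100 * no_site / total, 1))  # 80.0
```

The last line is where the "about 80%" headline comes from: the Facebook-only and no-presence buckets together.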
The gap is not evenly spread across the country. Beirut, the capital and the most affluent governorate, sits at 33.9% own-domain share. Baalbek-Hermel, in the north-east, sits at 11.2%. The geographic spread is roughly 3×, but no governorate cracks 35%. Even in the part of the country where private schools cluster most densely, two thirds of them have no website.
That is the field as it stands.
Of the schools that do have a site, how many work
A website that exists is not the same thing as a website that works. I ran a health probe across all 326 own-domain schools: HTTPS resolution, SSL validation, mobile viewport, structured data presence, platform fingerprint, copyright year, SPA-shell detection, and a handful of other cheap signals.
The probe is not a Lighthouse pass. It does not measure performance. It measures whether a 2026-era browser, or a 2026-era crawler, can read the site at all and find anything legible on it.
Of the 326 schools with their own domain:
- 274 (84%) load at all; 52 are dead or broken.
- 255 (78%) support HTTPS; 21 still serve plain HTTP.
- 231 (71%) declare a mobile viewport; 47 are still desktop-only in 2026.
- Only 59 (18%) emit any structured data (JSON-LD).
- 43 (13%) ship as a JavaScript-only SPA shell with no server-rendered content. To a crawler that doesn't execute JavaScript, those pages are effectively empty.
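Three of those signals (mobile viewport, JSON-LD, SPA shell) can be read straight from the raw HTML without executing JavaScript. The sketch below shows one way to do that with Python's standard library. It is an assumption about how such a probe could work, not the probe used here; the 50-character threshold and the sample pages are hypothetical.

```python
# Hypothetical static-HTML probe: looks only at un-rendered markup.
from html.parser import HTMLParser


class SignalParser(HTMLParser):
    """Collects a few cheap signals from raw HTML."""

    def __init__(self):
        super().__init__()
        self.has_viewport = False
        self.has_json_ld = False
        self.visible_chars = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "meta" and a.get("name", "").lower() == "viewport":
            self.has_viewport = True
        if tag == "script" and a.get("type", "").lower() == "application/ld+json":
            self.has_json_ld = True
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.visible_chars += len(data.strip())


def probe_html(html, min_visible_chars=50):
    """Return the three signals; the threshold is an arbitrary assumption."""
    parser = SignalParser()
    parser.feed(html)
    parser.close()
    return {
        "mobile_viewport": parser.has_viewport,
        "json_ld": parser.has_json_ld,
        "spa_shell": parser.visible_chars < min_visible_chars,
    }


SHELL = ('<html><head><title>School</title>'
         '<meta name="viewport" content="width=device-width"></head>'
         '<body><div id="root"></div><script src="/app.js"></script></body></html>')
STATIC = ('<html><head><title>School</title>'
          '<script type="application/ld+json">{"@type": "School"}</script></head>'
          '<body><h1>Example School</h1><p>Founded in 1950, teaching '
          'kindergarten through grade 12 in Arabic and French.</p></body></html>')

print(probe_html(SHELL))
print(probe_html(STATIC))
```

The shell page passes the viewport check and still has almost no text outside its script tags, which is the pattern a non-rendering crawler sees as an empty page.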
Define "modern" as a site that meets a minimal 2026 floor: reachable, HTTPS, valid certificate, mobile viewport, recent, not a SPA-shell, no critical flag. 178 of 326 own-domain sites (54.6%) meet that floor. Map that back to the full field of 1,626 schools, and the funnel collapses fast.
10.9% of Lebanese private K-12 schools have a website that is actually fit for purpose in 2026. The remaining 89% are either invisible, Facebook-only, or running a site that is dead, insecure, or sufficiently outdated that a modern crawler treats it as noise.
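The "modern" floor is a conjunction of the seven conditions listed above, and the funnel is two divisions. A minimal sketch follows; the `Probe` record and its field names are hypothetical, and "recent" is treated as an already-computed boolean because the cutoff is not specified here.

```python
# Hypothetical probe record; field names are illustrative only.
from dataclasses import dataclass


@dataclass
class Probe:
    reachable: bool
    https: bool
    valid_cert: bool
    mobile_viewport: bool
    recent: bool
    spa_shell: bool
    critical_flag: bool


def is_modern(p: Probe) -> bool:
    """True only if every condition of the minimal floor holds."""
    return (p.reachable and p.https and p.valid_cert and p.mobile_viewport
            and p.recent and not p.spa_shell and not p.critical_flag)


# Funnel arithmetic using the counts reported in this note.
TOTAL, OWN_DOMAIN, MODERN = 1626, 326, 178
share_of_sites = round(100 * MODERN / OWN_DOMAIN, 1)
share_of_field = round(100 * MODERN / TOTAL, 1)
print(share_of_sites, share_of_field)  # 54.6 10.9
```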
What AI assistants actually do with all this
I ran two tests against the census to measure the consequence of the gap.
Test A. Across five Lebanese cities, I asked an AI-assistant-style agent the same open question: "what are the best private K-12 schools in <city>?" I collected every school the agent named in its answer (49 schools across the five cities) and cross-referenced each one against the census.
The match was unambiguous. Schools cited by the AI were 3.7× as likely to have their own domain as the field baseline.
Schools without a web presence are roughly 1/5 as likely to surface in an AI recommendation as random chance would suggest. The "no presence" group is not under-cited because the AI dislikes it. It is under-cited because the AI cannot see it.
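For readers unfamiliar with the term, a lift of this kind is a ratio of two shares: the own-domain share among cited schools divided by the own-domain share in the whole field. The sketch below shows the definition only. The inputs are placeholders chosen for easy arithmetic, not the study's counts, since the per-school match table is not reproduced in this note.

```python
def lift(cited_with_domain, cited_total, field_with_domain, field_total):
    """Own-domain share among cited schools divided by the share in the field."""
    cited_share = cited_with_domain / cited_total
    field_share = field_with_domain / field_total
    return cited_share / field_share


# Placeholder inputs, NOT the study's data: half of the cited schools have a
# domain versus a quarter of the field, so cited schools are 2x as likely.
print(lift(1, 2, 1, 4))  # 2.0
```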
Test B. A different angle. I took a stratified sample of 40 schools, 20 with a domain and 20 without, matched on sector, and ran a probe that asked an assistant what it actually knew about each one, scoring the answer for richness.
The contrast is sharp:
- Of 20 schools with a website, 16 (80%) returned rich, confident information. 3 returned a directory listing only. 1 returned nothing.
- Of 20 schools without a website, 0 (zero) returned rich, confident information. 9 returned a directory listing only. 11 (55%) returned nothing at all.
Zero of twenty schools without a website reached rich AI awareness. That is the load-bearing result of this entire study. A school without a website is not under-served by AI. At best it is a bare directory listing; more often it is invisible. In this sample there was no case in which it got a rich answer.
The directory loop
The most counter-intuitive finding sits in the URLs the AI actually cited when it answered. Across all 49 citations from Test A, the AI cited a school's own URL exactly once.
Where the AI actually pulled its citations from, across the 49 schools named in Test A:
- Wikipedia / Grokipedia: 20 (40.8%)
- Edarabia directory: 17 (34.7%)
- Other directories: 9 (18.4%)
- Other: 2 (4.1%)
- School's own website: 1 (2.0%)
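The shares in that breakdown can be recomputed from the five counts, along with the combined share of Wikipedia-style and directory sources. Only the counts come from this note; the variable names are mine.

```python
# Recompute citation shares from the Test A counts.
citations = {
    "Wikipedia / Grokipedia": 20,
    "Edarabia directory": 17,
    "Other directories": 9,
    "Other": 2,
    "School's own website": 1,
}
total = sum(citations.values())
shares = {source: round(100 * n / total, 1) for source, n in citations.items()}

# Encyclopedia and directory sources combined.
intermediaries = (citations["Wikipedia / Grokipedia"]
                  + citations["Edarabia directory"]
                  + citations["Other directories"])

print(total)                                   # 49
print(shares["School's own website"])          # 2.0
print(round(100 * intermediaries / total, 1))  # 93.9
```

Roughly 94% of citations came through an intermediary rather than from the school itself.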
The AI almost never reads schools' own sites. The benefit of having a website is therefore mostly indirect. Schools with sites end up listed on Wikipedia and Edarabia and a handful of niche directories. The AI reads those. The AI does not read the school. The directory is the intermediary that actually carries the visibility.
Which means the chain is: website → Wikipedia or Edarabia listing → AI citation. Schools that skip step 1 do not get to step 3.
The corollary is sobering. The site does not need to be world-class to clear the bar. It just has to exist, be credible enough that a directory will list it, and be readable enough that a directory's editor or crawler can extract a few facts. The 178 "modern" sites are not the only ones that get cited. The threshold for AI awareness is having anything that anchors a directory listing. That floor is low, and 1,300 schools are still below it.
Why this is winnable rather than depressing
I want to flag the asymmetry here, because the natural read of these numbers is fatalism.
If 80% of Lebanese schools are invisible to AI, the landscape is shallow. The competitive pressure inside any given governorate is light. Beirut's 40 own-domain schools compete for AI citations against each other; the other 78 in the city are not in the race. Outside Beirut the picture is even thinner. In Akkar, 19 of 137 schools compete on this surface. In Baalbek-Hermel, 15 of 134.
This is a strange moment in the history of the sector. The institutions that move first will absorb a structurally outsized share of the recommendation surface, not because they are better, but because they are the only options the assistant can see. As more schools fix the gap, the visibility advantage of being early compresses. Right now it is wide open.
The fix is also small relative to most institutional projects. Closing the gap from "no presence" to "AI-aware" does not require a re-platform. It requires a site that exists, that loads, that declares the few facts a directory needs (name, address, levels offered, language of instruction, year founded, contact), and that does not actively repel the directories' crawlers. That bar is not the Lighthouse 95+ bar. It is the anyone-can-read-this bar.
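One common way to declare those facts in machine-readable form is a schema.org JSON-LD block in the page's HTML. The sketch below builds one. Every value is a hypothetical placeholder, and folding levels and language of instruction into `description` is my simplification rather than a recommendation from this study.

```python
# Build a schema.org JSON-LD block for a school. All values are placeholders.
import json

school = {
    "@context": "https://schema.org",
    "@type": "School",
    "name": "Example School",
    "url": "https://school.example",
    "foundingDate": "1950",
    "description": ("Private K-12 school. Levels: kindergarten to grade 12. "
                    "Languages of instruction: Arabic, French."),
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "Beirut",
        "addressCountry": "LB",
    },
    "telephone": "+961-0-000000",
}

payload = json.dumps(school, indent=2, ensure_ascii=False)
snippet = '<script type="application/ld+json">\n' + payload + "\n</script>"
print(snippet)
```

The printed snippet would go inside the page's server-rendered HTML, so that a crawler that does not execute JavaScript can still read it.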
What this isn't
This is a research note, not a recommendation, and I want to be explicit about the things it deliberately does not claim.
It does not claim that AI is the only channel that matters. Word-of-mouth still dominates school selection in Lebanon, and family ties to specific institutions go back generations. AI visibility is a new variable, not a replacement for the existing ones.
It does not claim that the cited schools are the best schools. The AI is doing visibility, not pedagogy. The 80% of schools the AI cannot see include excellent institutions with strong outcomes, deep traditions, and waiting lists. The gap is in the channel, not in the institution.
It does not claim that fixing the website is sufficient. It is the floor. Plenty of schools with mediocre sites still get cited because their Wikipedia entry is rich. Plenty of schools with strong sites get under-cited because they are not in the directories the AI happens to read. The chain has more than one link, and a strategy that ignores the directory layer will under-perform a strategy that includes it.
And it does not claim that this only starts to matter tomorrow. The point of running a census in 2026 is that the conversational interface is already where a meaningful share of these searches start. Whether that share is 20% or 60% by 2028 will not change the conclusion, only the urgency.
Method, caveats, and what's missing
The classification per school is one structured web search per institution plus an automated probe. It is not a manual audit. False negatives (schools that have a site the search agent failed to find under the translated name) are possible. Spot-checks against famous Beirut schools (Notre Dame de Jamhour, IC, ACS, Sagesse) returned the correct domains. On that basis I estimate the false-negative rate at below 5%, skewed toward smaller schools; because the spot-checks covered well-known schools, the rate for the long tail is less certain.
Some schools share a domain. Makassed runs 15+ branches under makassed.org; Sagesse runs 5+ under sagesse.edu.lb. The 326 own-domain rows resolve to 226 unique domains, so the "own a site" group is somewhat over-counted at the school level and somewhat under-counted at the network level.
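Collapsing school rows to unique domains is a small grouping step. A minimal sketch, assuming each own-domain row carries an absolute URL: the two domains come from this note, while the row labels and paths are hypothetical. Note that this groups by hostname, so a network that gives each branch its own subdomain would still count as several sites.

```python
# Group school rows by hostname to count unique domains.
from collections import defaultdict
from urllib.parse import urlparse


def site_domain(url):
    """Lower-cased hostname with a leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


# Hypothetical rows: labels and paths are made up for illustration.
rows = [
    ("Network A, branch 1", "https://www.makassed.org/branch-1"),
    ("Network A, branch 2", "https://makassed.org/branch-2"),
    ("Network B, branch 1", "https://sagesse.edu.lb/branch-1"),
]

by_domain = defaultdict(list)
for school, url in rows:
    by_domain[site_domain(url)].append(school)

print(len(rows), "rows ->", len(by_domain), "unique domains")
```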
Lighthouse and Core Web Vitals were intentionally out of scope. The health probe captures cheap signals. A proper performance pass across 226 unique sites with three runs each under slow-4G throttling is roughly a day of wall-clock time and a separate exercise.
The AI used in Tests A and B is a Claude Code agent doing grounded web search. ChatGPT, Gemini, Perplexity, and Google's AI Overviews will each differ in retrieval and citation behaviour, but they draw from the same public web corpus. The directional finding (websites help, no-website schools are invisible) should therefore hold across that variation, though only one agent was tested here. The exact lift number (3.71×) will not.
The Test B sample is small. With 0 of 20 no-website schools reaching rich awareness, the exact one-sided 95% upper bound on the true rate is roughly 14%, and the two-sided 95% interval runs from 0% to roughly 17%. Even at the upper bound, the gap to the 80% rate for schools with websites is huge.
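The upper bound for a zero-success sample has a closed form: with 0 successes in n trials, the exact (Clopper-Pearson) bound solves (1 - p)^n = alpha. A short sketch of that arithmetic; the function name is mine.

```python
def upper_bound_zero_successes(n, alpha):
    """Exact upper bound on p when 0 of n trials succeed: solves (1 - p)**n = alpha."""
    return 1 - alpha ** (1 / n)


one_sided_95 = upper_bound_zero_successes(20, 0.05)   # all 5% in the upper tail
two_sided_95 = upper_bound_zero_successes(20, 0.025)  # 2.5% in each tail

print(f"{one_sided_95:.1%}")  # 13.9%
print(f"{two_sided_95:.1%}")  # 16.8%
```

Either way the bound sits far below the 80% observed for schools with websites.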
The full dataset (registry scrape, classification, health probe, AI test results) is on disk and reproducible.
The single sentence
If a parent asks ChatGPT, Claude, Gemini, or Perplexity which private school to pick in their neighbourhood in 2026, about 80% of Lebanese private schools cannot be recommended because the AI does not know they exist.
That sentence is the entire study, compressed. The rest of this note is just the receipts.