Canadian researchers asked the paid and free or “economy” versions of four AI models — ChatGPT, Claude, Gemini, and Grok — about Canadian news events to see whether they would credit individual news outlets in their answers.
The answer will probably not surprise you: AI models rarely cite news sources unless they’re specifically asked to, and some are better about it than others.
“These systems have ingested Canadian journalism systematically. The specificity of their knowledge of domestic politics, provincial affairs, and local reporting points clearly to Canadian news sources,” Taylor Owen, Beaverbrook chair in media, ethics, and communication at McGill University and a coauthor of the study, writes on his blog. “And they rarely tell you where the information came from.”
Canada’s CBC, Globe and Mail, Toronto Star, Postmedia, Metroland Media, and The Canadian Press sued OpenAI for copyright infringement in November 2024. The case is the first of its kind in Canada and the lawsuit is ongoing.
Owen, who is also the founding director of the Center for Media, Technology, and Democracy, and Aengus Bridgman, an assistant professor at McGill, explain their work (highlighting mine):
We tested four major AI models on 2,267 real Canadian news stories (English and French) without web search activated and found the same pattern across all of them. All four models showed extensive knowledge of Canadian current events consistent with having ingested Canadian news reporting. Models demonstrated at least partial knowledge in 74% of responses to stories within their training window, but among those knowledgeable responses, 92% provided no source attribution of any kind.

When we enabled web search and tested 140 specific articles via each company’s API, every model produced responses that covered enough of the original reporting that many consumers would rarely need to visit the source. Models often linked to Canadian news sites, with 52% of responses including at least one Canadian URL, but named a Canadian source in the response text only 28% of the time. Links provide a pathway back to the source, but consumers reading the response itself rarely see an indication of whose journalism they are consuming.
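The study’s distinction between linking and naming can be made concrete. Below is a hypothetical sketch, not the researchers’ actual code, of how one might classify a model response along those two dimensions; the outlet list and function name are illustrative assumptions.

```python
# Hypothetical sketch (not the study's code): classify a model response the way
# the audit distinguishes "outlet named in the text" from "outlet linked via URL".

# Illustrative subset of Canadian outlets; the real study covered many more.
CANADIAN_OUTLETS = {
    "CBC": "cbc.ca",
    "The Globe and Mail": "theglobeandmail.com",
    "Toronto Star": "thestar.com",
    "La Presse": "lapresse.ca",
}

def classify_attribution(response_text: str, cited_urls: list[str]) -> dict:
    """Check whether any listed outlet is named in the text and/or linked."""
    text = response_text.lower()
    named = any(name.lower() in text for name in CANADIAN_OUTLETS)
    linked = any(
        domain in url for url in cited_urls for domain in CANADIAN_OUTLETS.values()
    )
    return {"named_in_text": named, "linked": linked}

# A response can link to a source without ever naming it in the body text,
# which is exactly the gap the study highlights (52% linked vs. 28% named).
result = classify_attribution(
    "Recent reporting describes the policy change in detail.",
    ["https://www.cbc.ca/news/some-story"],
)
# → {"named_in_text": False, "linked": True}
```

Real evaluation would also need to judge whether the response covers the article’s distinctive reporting, which the authors did against each source article rather than by string matching.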
With web search enabled, the chart below “shows the default consumer experience: what happens when someone asks a generic topic question without requesting citations. This is how most people use AI models: ‘Tell me about X,’ not ‘What did the Toronto Star report about X?’”

The authors explain:
The blue squares show how often the result covers enough of the article’s distinctive reporting (specific events, named individuals, key findings) that a reader could plausibly get the gist of the story without visiting the news site. These are not complete reproductions: they are partial summaries and paraphrases that cover some of the original article’s distinctive content, though they sometimes contain factual errors or omissions…

We evaluated each response against the source article to determine whether it covered the article’s distinctive reporting, not merely the general topic. The green squares show how often the model credits the source by naming the outlet in the response text or via structured machine-readable citations returned alongside the response.

Coverage rates are high while attribution rates are not. Gemini and Claude covered distinctive reporting in 81% and 72% of responses respectively, but Gemini credited the source only 6% of the time. Grok covered distinctive reporting in 59% of responses while citing the source in only 7% of them. ChatGPT, one of the most widely used models, covered distinctive content in 54% of responses but almost never credited the originating newsroom. Even when models fail to cover the distinctive reporting, they still deliver a topical response that can reduce the consumer’s motivation to visit the source.
ChatGPT was especially unlikely to credit sources when it wasn’t asked to, doing so only 1% of the time for this sample; Claude did so 16% of the time.
All of the AI models did much better when they were explicitly asked for citations — something most users won’t do.
Under the most favorable conditions (directly naming the outlet and explicitly asking for citations), attribution improves substantially across all models. All four named the outlet in a majority of responses: Claude (97%), Gemini (95%), ChatGPT (86%), and Grok (74%). Linking rates were also strong: Grok (91%), Gemini (69%), Claude (64%), and ChatGPT (59%). Meaningful attribution is technically achievable. The gap between the default experience and the best-case scenario is a core finding: most consumers will never explicitly name an outlet or ask for citations, so the generic-condition results reflect the experience that shapes the market for journalism.
When AI models do cite sources, the researchers found, they tend to cite the outlets consumers are already familiar with. Paywalled and smaller regional outlets were cited less often, even for their own original reporting.
From the study:
Among English-language outlets, CBC, CTV, and Global News — all freely accessible — capture the most AI visibility in both categories. The Globe and Mail performs relatively well, but the Toronto Star and Financial Post are marginal despite being important newsrooms. Regional Postmedia papers serving Calgary, Edmonton, Ottawa, and Vancouver are essentially absent. Among French-language outlets, Radio-Canada and La Presse dominate, with Le Devoir a distant third. The Journal de Montréal, one of Quebec’s most widely read papers, received only 48 total mentions across all models.
French-language journalism is “doubly disadvantaged,” the researchers write. “Its content is absorbed into model training data, but the outlets that produced it are almost never acknowledged.”
I emailed the paper’s authors to ask them: If you had to pick which AI model does the most “right” from a journalism POV, which would it be? Bridgman offered an interesting answer that I’m putting here in full because I thought our readers might find it interesting too. Note: An AI model’s “cutoff” is the date through which it’s trained, so “pre-cutoff” stories are those published during the model’s training period, and “post-cutoff” stories are those published after it.
He wrote:
This is a genuinely hard question because each model behaves differently:
- Claude cites Canadian outlets at the highest rate in Track 1 (61% vs. 8% for ChatGPT, 3% for Gemini), and when it doesn’t know something, it says so rather than hallucinating. Only ~37% of its economy-tier responses addressed pre-cutoff stories substantively, but that’s because it refuses rather than guesses. The trade-off is that it still reproduces paywalled content at high rates (68%) when given web access.
- ChatGPT has the best consumer interface for surfacing recent news (inline citations, clickable links). But its economy model is the worst hallucinator (87% of post-cutoff responses generated confident-sounding answers about events it couldn’t possibly know about), and 88% of those were inaccurate. It names sources in 54% of Track 2 responses, which sounds good until you realize it’s also reproducing the reporting well enough to substitute for the original article 54% of the time.
- Gemini is the most responsive and covers the most distinctive reporting with web access (81%), but it almost never names the Canadian source in the response text (2–8%). So, it’s the most effective at replacing the need to visit the source while hiding where the information came from.
- Grok is strongest at surfacing Canadian outlets from training data alone (no web search). But it also hallucinates aggressively on post-cutoff stories (89% addressed topics it shouldn’t know, 84% inaccurate).
What surprised me most was the complexity of the phenomena and the variety of approaches being tried by the companies. Each company has design decisions which cause differential output and behavior that is more or less responsible (e.g. refusal to hallucinate or reproduce direct reporting) and value transferring (better or worse referrals to source and/or treatment of paywalls). These are important differences and point to minimal and incomplete self-governance in the space.
The AI News Audit was published by McGill University’s Center for Media, Technology and Democracy. You can read the full report, which includes suggestions for Canadian public policy around AI, here.