AI Data Index: Proposal to Enhance Accessibility and Readability of Web Content

Hi everyone,

For the past few months, I’ve been working on a project aimed at making website content easier for AI to read, by reducing ambiguity and speeding up information processing.

:wrench: The Core Idea

The concept is to create a parallel website, built entirely in JSON, designed exclusively for AI, not for human users.

This “mirrored” version offers a clean, structured representation of the site’s content, optimized for semantic understanding by language models.

:open_file_folder: Technical Structure

  • The system is built around an index.json file, which links to secondary JSON files containing structured content.
  • For large websites, the architecture supports a hierarchy of subfolders for better organization.
  • I’ve documented the full structure and implementation guidelines at: aidataindex .org.
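
As a rough illustration of the index-plus-secondary-files layout, here is a minimal Python sketch of how a crawler might walk it. The field names (`sections`, `href`, `title`, `body`) and the file paths are hypothetical, chosen only for this example, and are not taken from the documentation at aidataindex .org. The "site" is an in-memory dictionary so the sketch runs without network access.

```python
import json
from posixpath import dirname, join, normpath

# Hypothetical site contents keyed by path. Field names are illustrative only.
SITE = {
    "json/index.json": json.dumps({
        "title": "Example site",
        "sections": [
            {"name": "products", "href": "products/index.json"},
            {"name": "about", "href": "about.json"},
        ],
    }),
    "json/products/index.json": json.dumps({
        "title": "Products",
        "sections": [{"name": "widget", "href": "widget.json"}],
    }),
    "json/products/widget.json": json.dumps(
        {"title": "Widget", "body": "A sample product."}
    ),
    "json/about.json": json.dumps({"title": "About", "body": "Who we are."}),
}


def walk_index(path, fetch=SITE.__getitem__, seen=None):
    """Yield (path, document) for the file at `path` and everything it links to."""
    seen = set() if seen is None else seen
    if path in seen:  # guard against circular links between files
        return
    seen.add(path)
    doc = json.loads(fetch(path))
    yield path, doc
    for entry in doc.get("sections", []):
        # Resolve each href relative to the folder of the current file.
        child = normpath(join(dirname(path), entry["href"]))
        yield from walk_index(child, fetch, seen)


pages = dict(walk_index("json/index.json"))
for page_path, page in pages.items():
    print(page_path, "->", page["title"])
```

Resolving links relative to the current file is what lets a subfolder carry its own index, which is one plausible way to get the hierarchy described for large sites.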

:test_tube: Testing Results

In my testing with ChatGPT:

  • Sometimes the model can read the JSON files properly, and when it does, it provides positive feedback, even calling the system “cutting-edge.”
  • However, in many cases, ChatGPT fails to read the files, returns errors (e.g., 500), or fabricates nonexistent content.
  • Even when the JSON files are correctly referenced in robots.txt or linked in the `<head>` of HTML pages, the model rarely picks them up automatically.
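
To make the discovery step concrete, here is a small Python sketch of how a client could look for the JSON index in both places. The conventions are assumptions made for this example: the post does not say which robots.txt line or which link attributes are used, so the sketch simply accepts any robots.txt line ending in `.json` and any `<link>` tag with `type="application/json"`. The sample robots.txt and HTML are invented.

```python
import re
from html.parser import HTMLParser


class JsonLinkFinder(HTMLParser):
    """Collect href values of <link> tags whose type is application/json."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (
            tag == "link"
            and attributes.get("type") == "application/json"
            and attributes.get("href")
        ):
            self.hrefs.append(attributes["href"])


def find_in_html(html):
    """Return JSON link targets declared in the page markup."""
    finder = JsonLinkFinder()
    finder.feed(html)
    return finder.hrefs


def find_in_robots(robots_txt):
    """Return paths or URLs ending in .json, ignoring comments and directive names."""
    found = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.search(r"(\S+\.json)$", line)
        if match:
            found.append(match.group(1))
    return found


# Invented sample inputs; a real site may declare its index differently.
ROBOTS = """User-agent: *
Allow: /json/index.json
Disallow: /admin/
"""
HTML = (
    '<html><head><link rel="alternate" type="application/json" '
    'href="/json/index.json"></head><body></body></html>'
)

robots_hits = find_in_robots(ROBOTS)
html_hits = find_in_html(HTML)
print(robots_hits, html_hits)
```

A client that checks both sources and merges the results would be a little more robust to sites that declare the index in only one place.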

:paperclip: Real-world Examples

Here are two complex implementations that pull structured data from real databases:

  • pagineaziende .net/json/index.json
  • compradiretto .it/json/index.json

You can open the links to view the JSON structure directly.

:folded_hands: A Request for OpenAI Developers

I’d like to suggest that the team consider enabling automatic reading of JSON files, especially when those files are clearly indicated via robots.txt and HTML links.

:speech_balloon: Open to Feedback

I know this is an experimental approach and far from perfect — but I’ve put a lot of time and energy into it.

I would greatly appreciate any feedback, insights, or even constructive criticism from this community.

Thanks for your attention!

Marco
