Are evals and graders deprecated/not adapted for the Responses API?

I read the docs on evals and graders, and IMHO they look deprecated, or at least not accurate for the Responses API.

Evals look very useful for my use case, but I have the following problems/questions, and I couldn't find answers after two days of experimenting and searching:

  1. Is there a way to pass a vector_store_id to a Responses API eval run and then check, with a grader, whether the file_search tool was actually called? (The first sketch below shows the behavior I want to grade.)
  2. The docs say you can use {{ sample.output_json }} or {{ sample.output_tools }} to access the JSON response or the tools the model used, but the dashboard UI makes this impossible when creating an evaluation: the only option it offers is {{ sample.output_text }}. (The second sketch below is my attempt at an API-side workaround.)
  3. Is there a way to debug the contents of the {{ sample }} item? That alone would be really helpful. (The last snippet in this post is the closest I've found.)
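
For context on point 1, this is the behavior I want the grader to verify. Outside of evals, a plain Responses API call with the file_search tool reports the invocation as a file_search_call item in the output. A minimal sketch (the vector store ID and question are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# Plain Responses API call with file_search attached to a vector store.
# "vs_123" is a placeholder for a real vector store ID.
response = client.responses.create(
    model="gpt-4o-mini",
    input="What does the onboarding document say about remote work?",
    tools=[{"type": "file_search", "vector_store_ids": ["vs_123"]}],
)

# The output is a list of items; a file_search invocation shows up as an
# item with type "file_search_call".
file_search_was_called = any(
    item.type == "file_search_call" for item in response.output
)
print("file_search called:", file_search_was_called)
```

What I can't figure out is how to attach the vector store when the eval run is the one doing the sampling, and how to express this any(...) check as a grader.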

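Regarding point 2, my workaround attempt has been to create the eval through the API instead of the dashboard, since the API appears to accept arbitrary template strings in testing_criteria. A minimal sketch, assuming a custom data source (the item schema and grader are illustrative, and I may be misreading how the sample templates resolve):

```python
from openai import OpenAI

client = OpenAI()

# Creating the eval via the API instead of the dashboard, so the grader
# template is free text rather than a constrained dropdown.
eval_obj = client.evals.create(
    name="responses-template-test",
    data_source_config={
        "type": "custom",
        # Illustrative item schema: each test item carries a question and
        # the expected answer.
        "item_schema": {
            "type": "object",
            "properties": {
                "question": {"type": "string"},
                "expected": {"type": "string"},
            },
            "required": ["question", "expected"],
        },
        # Exposes the {{ sample.* }} namespace to the graders.
        "include_sample_schema": True,
    },
    testing_criteria=[
        {
            "type": "string_check",
            "name": "exact-match",
            # The dashboard only offers {{ sample.output_text }} here; via
            # the API I can at least type {{ sample.output_json }}, though
            # I don't know whether it resolves for Responses runs.
            "input": "{{ sample.output_text }}",
            "operation": "eq",
            "reference": "{{ item.expected }}",
        }
    ],
)
print(eval_obj.id)
```
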
Any information on this would be very helpful; I've been testing for two days and I feel blocked.
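
For point 3, the closest I've found is pulling a run's output items back through the API and dumping them; each item carries a sample object, though I'm not sure it matches exactly what the grader templates see. A sketch, with placeholder IDs:

```python
from openai import OpenAI

client = OpenAI()

EVAL_ID = "eval_123"    # placeholder for a real eval ID
RUN_ID = "evalrun_123"  # placeholder for a real run ID

# Each output item carries a `sample` with the model input and output for
# that test item -- the closest I can get to seeing what {{ sample }} holds.
for item in client.evals.runs.output_items.list(RUN_ID, eval_id=EVAL_ID):
    print(item.model_dump_json(indent=2))
```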

Thanks!