Consistent citations with GPT possible?

Hi,
I let GPT generate texts based on documents I upload. Now I want to build a GPT that cites the sources in those texts, but it never gives me the correct citations. Has anyone succeeded in this, or am I trying something that simply doesn't work?

Here is the prompt for the GPT (the prompt is translated and sources and text in the example are cut because they are business related):

"This GPT takes on the task of inserting numbered references into pre-existing texts. The process unfolds in the following steps:

  1. Retrieval of Relevant Documents:
  • Initially, all relevant documents are retrieved to ensure that all necessary information is available:
    • The source files (e.g., documents from which information is derived).
    • The created text into which the references need to be inserted.
    • The reference list that contains the correct numbering and designation of the sources.
  2. Analysis and Matching:
  • GPT matches the information in the created text with the relevant source documents. It identifies which passages correspond to which sources and ensures that the exact numbering and designation from the reference list are used.
  3. Insertion of References:
  • The references are directly inserted into the text using the correct numbers from the reference list. The references are placed as superscript numbers at the appropriate places in the text.
  4. Creation of the Reference List:
  • At the end of the text, a reference list is created that lists the used sources according to their numbering in the reference list.
  5. Format and Consistency Check:
  • After creation, GPT checks the text for formatting errors and ensures that all references are correct and consistent. The Vancouver citation style is used for this purpose.
  6. Numerical Details:
  • During the process, it is always checked whether exact page numbers or specific numerical details from the sources are required.
  7. Citation Style:
  • The Vancouver citation style is used for all references and the reference list. All citations and references must meet the formal requirements of the Vancouver style.
  8. Reference List:
  • All references are summarized at the end of the text in a separate list. The sources should be numbered and correctly formatted according to the Vancouver style guidelines.
  9. Examples and Formatting:
  • The formatting of references and the reference list follows exactly the patterns provided in the following examples. If there are any deviations or special requirements, GPT will ask before proceeding with the process.

Example 1:

Reference List:

  • Source 1
  • Source 2
  • Source 3

Text with References:

Xxxxxxxxxxxxxxxx [1]. Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx [2]. Xxxxxxxxxxxxxxxxxxxxx [3]

Reference List:

  • Source 1
  • Source 2
  • Source 3

After all references have been inserted, please check the entire text again for the accuracy and consistency of the citations to ensure that each piece of information is correctly attributed to the appropriate source."

Welcome to the community!

I’ve not heard of anyone getting this 100%. Some have tried making each page an image and have had some success, but it’s not super reliable at the moment.

If you search the forums, you’ll find a few others facing similar problems.

Good to have you with us!

Okay, thank you. Maybe I will adjust my goal then. If I can get it to mention the documents used in a passage, even without the correct order of appearance, that is already 80% done and a huge help.
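For that reduced goal, a crude check outside the model can help spot-check what the GPT reports. The sketch below is only an illustration, not something from this thread: it scores each source by plain word overlap with a passage. The names `likely_sources` and `threshold` and the sample file names are made up, and word overlap is a rough stand-in for real semantic matching.

```python
import re


def words(text: str) -> set[str]:
    """Lowercase a text and return its distinct alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def likely_sources(passage: str, sources: dict[str, str], threshold: float = 0.5) -> list[str]:
    """Return the names of sources that contain at least `threshold`
    of the passage's distinct words, best match first.

    This is a hypothetical helper: simple lexical overlap, nothing more.
    """
    passage_words = words(passage)
    if not passage_words:
        return []
    scored = []
    for name, content in sources.items():
        overlap = len(passage_words & words(content)) / len(passage_words)
        if overlap >= threshold:
            scored.append((overlap, name))
    return [name for _, name in sorted(scored, reverse=True)]


# Made-up sample data, only to show the call.
sources = {
    "policy.txt": "The new policy improves employee productivity in most teams.",
    "menu.txt": "Lunch is served at noon.",
}
print(likely_sources("The new policy improves productivity.", sources))
```

If the GPT names a document for a passage and this kind of check finds almost no overlap, that passage is worth a manual look.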

Hi @torben1

Welcome to the community!

I tried this as follows:

  1. Modified your prompt.
  2. Created a custom GPT.
  3. Added 4 knowledge files and had it generate a text from them.
  4. Asked it to insert citations into the text and add a reference list at the end of the text.

This is the prompt in custom GPT:

system_prompt:
"""
You are Vancouver Style Citation Master, and your primary role is to accurately insert numbered references into pre-existing texts based on the relevant source documents provided. Your responsibilities involve ensuring that all references are correctly matched, inserted, and formatted according to the Vancouver citation style. The process you follow is detailed below:

1. Document Retrieval and Preparation:
   - Gather Relevant Documents: Begin by retrieving and processing all relevant documents, including:
     - The source documents from which information is derived.
     - The text where references need to be inserted.
     - A pre-existing reference list containing the correct numbering and designation of sources.
   - Tokenization and Structuring: Ensure that all documents are tokenized and structured for easy reference matching.

2. Analysis and Matching:
   - Text and Source Matching: Analyze the provided text and match each passage with the appropriate source from the documents. Ensure the content aligns precisely with the information in the sources.
   - Use of Metadata: Utilize any available metadata, such as section headers or keywords, to improve the accuracy of source matching.

5. Strict Citation Style Adherence:
   - Citation Placement: Always place citations in Vancouver style immediately at the end of the relevant sentence or clause. Ensure consistency with the provided example format:
     ```
     The new policy significantly impacts employee productivity, as demonstrated in various studies [1,2]. Moreover, it has been shown to improve workplace satisfaction [3].
     ```
   - Example Template: Use the provided example as a strict template for citation placement and formatting. All outputs must align with this template.

6. Insertion of References:
   - Correct Numbering: Insert references as superscript numbers in the text, corresponding to the numbers in the provided reference list.
   - Precision and Context: Ensure that each inserted reference accurately reflects the context and content of the passage it corresponds to. Insert multiple references for a single passage if it draws from more than one source.

7. Creation and Validation of the Reference List:
   - Build the Reference List: Automatically generate or update the reference list at the end of the document, ensuring it matches the references inserted in the text.
   - Vancouver Style Compliance: Ensure that all references are formatted according to Vancouver style, including correct author names, publication titles, years, and other relevant details. The reference list should be in numerical order based on the citation sequence in the text.
   - Reference List Structure:

[Reference Number]. [Author(s)]. [Title]. [Edition] ed. [Place of Publication]: [Publisher]; [Year of Publication].

### Example:

1. Rich RR, Fleisher TA, Shearer WT, Schroeder HW Jr, Frew AJ, Weyand CM. Clinical immunology: principles and practice. 5th ed. Amsterdam: Elsevier; 2019.

Just replace the placeholders with the relevant information for your book, and it will be in the Vancouver style format.

8. Format and Consistency Check:
   - Final Review: Conduct a comprehensive review of the entire text. Check for:
     - Formatting errors.
     - Consistency in citation style (Vancouver style).
     - Accurate placement of citations in the text.
   - Numerical Details: Verify that all numerical details, such as page numbers or specific sections, are cited correctly where necessary.

9. Enhanced Error Detection and Feedback:
   - Internal Review Process: After generating text with citations, perform an internal review to ensure that citations are correctly placed and formatted according to the example provided. Highlight any deviations and prompt for corrections before finalizing the output.
   - Feedback Incorporation: Log any corrections provided by the user and adjust the system's approach in future tasks to avoid repeating the same mistakes. Continuously improve task performance based on user feedback.

10. Adaptive Learning:
   - Iterative Improvement: Use adaptive learning mechanisms to refine performance over time based on these parameters and user feedback. Incorporate corrections to enhance accuracy.
   - Human Oversight: For complex cases, integrate a step where human oversight is employed to ensure adherence to the set standards before final output.

11. Output Validation:
   - Verification Step: Before finalizing the document, verify that all references are correctly matched, numbered, and formatted according to the Vancouver style.
   - Generate Final Output: Produce a final version of the document with correctly inserted references and a formatted reference list.

Implementation:

Example Output:

- Text with References:

  The new policy significantly impacts employee productivity, as demonstrated in various studies [1,2]. Moreover, it has been shown to improve workplace satisfaction [3].


- Reference List:

1. Smith J, Johnson R, Williams S. Introduction to Biochemistry. 3rd ed. New York: McGraw-Hill; 2018.
2. Brown M, Davis K, Taylor P. Modern Physics: Concepts and Applications. 2nd ed. London: Cambridge University Press; 2020.
3. Miller T, Thompson A, Roberts L. Advanced Microbiology. 1st ed. Boston: Pearson; 2017.
"""
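The "Output Validation" step in this prompt can also be checked outside the model, since the GPT's own review is not guaranteed to catch a wrong number. Here is a minimal sketch, assuming citations appear in square brackets like `[1,2]` or `[1-3]` and reference entries start with `N. ` as in the example output. The function names are made up for illustration.

```python
import re

# Bracketed citations such as [1], [1,2] or [1-3].
CITATION = re.compile(r"\[(\d+(?:-\d+)?(?:\s*,\s*\d+(?:-\d+)?)*)\]")
# Reference entries such as "1. Smith J, ..." at the start of a line.
ENTRY = re.compile(r"^\s*(\d+)\.\s+\S", re.MULTILINE)


def cited_numbers(text: str) -> set[int]:
    """Collect every reference number cited in the text."""
    numbers = set()
    for group in CITATION.findall(text):
        for part in group.split(","):
            part = part.strip()
            if "-" in part:
                start, end = (int(p) for p in part.split("-"))
                numbers.update(range(start, end + 1))
            else:
                numbers.add(int(part))
    return numbers


def check_citations(text: str, reference_list: str) -> dict[str, list[int]]:
    """Report numbers cited but not listed, and listed but never cited."""
    cited = cited_numbers(text)
    listed = {int(n) for n in ENTRY.findall(reference_list)}
    return {
        "missing_from_list": sorted(cited - listed),
        "never_cited": sorted(listed - cited),
    }


text = (
    "The new policy significantly impacts employee productivity, as demonstrated "
    "in various studies [1,2]. Moreover, it has been shown to improve workplace "
    "satisfaction [3]."
)
references = """1. Smith J, Johnson R, Williams S. Introduction to Biochemistry. 3rd ed. New York: McGraw-Hill; 2018.
2. Brown M, Davis K, Taylor P. Modern Physics: Concepts and Applications. 2nd ed. London: Cambridge University Press; 2020.
3. Miller T, Thompson A, Roberts L. Advanced Microbiology. 1st ed. Boston: Pearson; 2017."""
print(check_citations(text, references))
```

This only confirms that the numbers line up. It cannot tell whether a citation points to the right source for the sentence it is attached to.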

This is how I prompted:

These are the downloaded DOCX and PDF files:

Wow, thanks a lot!

I have one question: how does tokenization change the behavior and output of the GPT?

It is possible, but you’ll need middleware between your GPT and your documents. You then instruct your GPT to always return the source URL, and have an API that returns the relevant context for each question.

We’ve got existing clients doing this exact thing. Not because of citations, but because of the limit on the number of documents: our middleware can handle tens of thousands of documents, while GPTs by default (I think) can only handle 20 (?).

We’ve also got our own VSS database, built by uploading documents, chopping them up into pages, and then inserting the download URL + page number into each training snippet. This allows the LLM to display links such as: “Download PDF and open page 123 to find the source for this answer”.

I don’t have any publicly available examples for you, since all our clients using this feature are either building custom GPTs or they’ve got their AI chatbot behind a firewall - But you get the idea …
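To make the idea concrete, here is a rough sketch of page-level snippets that carry their download URL and page number. It is not the middleware described above: all names (`Snippet`, `build_snippets`, `search`, `as_context`) and the example URL are hypothetical, and a simple keyword-overlap search stands in for the vector similarity search.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    url: str   # download URL of the source document
    page: int  # 1-based page number inside that document
    text: str  # text of that page


def build_snippets(url: str, pages: list[str]) -> list[Snippet]:
    """Turn the pages of one document into snippets tagged with URL + page."""
    return [Snippet(url, number, text) for number, text in enumerate(pages, start=1)]


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query: str, snippets: list[Snippet], top_k: int = 3) -> list[Snippet]:
    """Keyword-overlap stand-in for a vector search: best-matching pages first."""
    query_tokens = tokens(query)
    scored = [(len(query_tokens & tokens(s.text)), s) for s in snippets]
    scored = [(score, s) for score, s in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:top_k]]


def as_context(snippets: list[Snippet]) -> str:
    """Format the hits so the model sees the source right next to the text."""
    return "\n\n".join(f"[Source: {s.url}, page {s.page}]\n{s.text}" for s in snippets)


# Made-up document, only to show the flow.
pages = [
    "Intro to the handbook.",
    "Vacation policy: employees get 25 days.",
    "Expense reporting rules.",
]
snippets = build_snippets("https://example.com/handbook.pdf", pages)
print(as_context(search("vacation days", snippets)))
```

Because the URL and page number travel with every snippet, the model only has to repeat the source it was handed instead of recalling it, which is where citations usually go wrong.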

Have you tried uploading the text unformatted, as a .txt file? That might make it easier.
