High-Level Vocabulary Words to Improve ChatGPT Responses

Most people use high-school-level and high-frequency collegiate vocabulary in their ChatGPT prompts. Could we be missing an opportunity to write prompts that generate better results?

Could GPT-4 responses improve when a prompt uses vocabulary that is less common in everyday language but more frequent in professional papers, technical writing, and high-quality editorial publications?

My assumption is that during training, the model forms associations between high-quality content and the specific words that content contains.

With this theory in hand, I set off to do some research and I’d like to share my results here.

Researching High-Level Vocabulary

I asked ChatGPT-4 the following…

Please provide a list of 100 common words that have a higher probability to be contained within writing, articles, papers and publications that are professional, technical, high quality, respected, educational, informative and enlightening.

Sure, here are some words that are commonly found in high-quality, professional, educational, and informative writing, articles, papers, and publications:

  1. Abstract
  2. Analysis
  3. Argument
  4. Assumption
  5. Bibliography
  6. Case study
  7. Citation
  8. Conclusion
  9. Context
  10. Data
  11. Debate
  12. Definition
  13. Discussion
  14. Evidence
  15. Experiment
  16. Findings
  17. Framework
  18. Hypothesis
  19. Implications
  20. Introduction
  21. Investigation
  22. Issue
  23. Journal
  24. Literature
  25. Methodology
  26. Objectives
  27. Perspective
  28. Procedure
  29. Publication
  30. Qualitative
  31. Quantitative
  32. Questionnaire
  33. Research
  34. Results
  35. Review
  36. Scope
  37. Statistics
  38. Study
  39. Survey
  40. Thesis
  41. Variables
  42. Insight
  43. Inference
  44. Observations
  45. Theory
  46. Interpretation
  47. Correlation
  48. Examination
  49. Relevance
  50. Inquiry
  51. Phenomenon
  52. Rationale
  53. Proposition
  54. Validation
  55. Limitations
  56. Significance
  57. Subjective
  58. Objective
  59. Elucidation
  60. Parameters
  61. Paradigm
  62. Categorization
  63. Synthesis
  64. References
  65. Parameters
  66. Editorial
  67. Appendix
  68. Scrutiny
  69. Protocol
  70. Replication
  71. Discrepancy
  72. Anomaly
  73. Experimentation
  74. Articulation
  75. Synopsis
  76. Determination
  77. Assertion
  78. Differentiation
  79. Annotation
  80. Proportion
  81. Manipulation
  82. Corroboration
  83. Complication
  84. Dissection
  85. Substantiation
  86. Generalization
  87. Classification
  88. Rectification
  89. Extrapolation
  90. Deduction
  91. Enumeration
  92. Exploration
  93. Permutation
  94. Correlation
  95. Specification
  96. Stipulation
  97. Comprehension
  98. Interpretation
  99. Documentation
  100. Elaboration
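To experiment with the list above programmatically, a prompt can be enriched by sampling a few of these terms and appending a short steering instruction. This is a minimal sketch of my own; the function name, the prompt wording, and the reduced word sample are illustrative, not something produced by ChatGPT.

```python
import random

# A small sample from the word list above; the full 100-word list would
# work the same way.
HIGH_LEVEL_WORDS = [
    "analysis", "synthesis", "methodology", "framework",
    "implications", "corroboration", "extrapolation", "rationale",
]

def enrich_prompt(task, words, k=3, seed=None):
    """Return the task with an added instruction built from k sampled terms."""
    rng = random.Random(seed)          # seeded for reproducible sampling
    chosen = rng.sample(words, k)      # pick k distinct terms
    return (
        f"{task}\n\n"
        f"Where appropriate, frame the answer using the concepts of "
        f"{', '.join(chosen)}."
    )

prompt = enrich_prompt(
    "Explain the impact of IoT on supply chain management.",
    HIGH_LEVEL_WORDS,
    seed=7,
)
print(prompt)
```

Passing a seed makes the word selection repeatable, which is useful when comparing the same enriched prompt across model runs.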

Now to Come Up With Some Useful Prompts

I then asked ChatGPT-4 to come up with some prompt ideas using these words.

Use the list of words to come up with a unique combination of prompts for an AI language model that would generate a valuable piece of content for a business, whether it is research, strategy, copy generation or a novel new idea that combines two high-level concepts to create a prompt for ChatGPT.

Sure, I can provide several prompts using the list of words you’re interested in. Here are ten prompts that you could use:

  1. “Generate a comprehensive analysis on the latest advancements in digital marketing strategies, and discuss their implications for small and medium enterprises.”
  2. “Create a well-researched, in-depth article on the importance of data-driven decision making in contemporary business management. Include case studies to substantiate your points.”
  3. “Elaborate on the theoretical framework behind customer relationship management (CRM) systems, including a detailed review of existing literature and discussion of potential future developments.”
  4. “Draft an insightful business plan for a startup seeking to leverage artificial intelligence in the retail industry. The plan should include market research, competitor analysis, financial projections, and strategic planning.”
  5. “Articulate the significance of sustainable business practices in the context of corporate social responsibility. Provide evidence-based arguments and draw conclusions about future trends.”
  6. “Design a detailed methodology for implementing business process automation in a traditional manufacturing company. Discuss potential obstacles and propose solutions.”
  7. “Create a thorough review of the implications of the Internet of Things (IoT) on supply chain management. Include real-life examples of successful implementations.”
  8. “Generate an insightful and informative white paper on the effects of blockchain technology on the financial industry. Discuss the current state of affairs and extrapolate on potential future developments.”
  9. “Present a comprehensive case study on how data analytics can be used to improve customer service in the hospitality industry.”
  10. “Develop an engaging and educational tutorial on the use of machine learning algorithms for predicting market trends. Include step-by-step procedures and practical examples.”

These prompts can be further tailored to the specific needs of your business, field, or research interests. The AI can then be used to generate a comprehensive, informed, and insightful piece of content based on the prompt.

As you can see, the results are quite interesting, and many seem like excellent leads to follow for creating even better prompts. I have highlighted my favorites above in bold.

I’ll admit I haven’t done extensive testing yet. This is a theory based on my basic understanding of how LLMs are trained.

My assumption is that these words are associated with higher-quality content, and my hypothesis is that using these words in your prompts will produce better-quality output.
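One lightweight way to test this hypothesis would be a blind A/B comparison: build a baseline and a vocabulary-enriched prompt for each task, shuffle each pair so the rater scoring the two model responses can't tell which variant is which, then tally wins. The sketch below is my own illustration; the `enrich` wording is a placeholder, and sending the prompts to a model is left out.

```python
import random

def enrich(task):
    # Illustrative enrichment: append an instruction using high-level terms.
    return (task + " Provide a rigorous analysis, stating your methodology "
            "and the implications of your findings.")

def blind_pairs(tasks, seed=0):
    """Pair each task's baseline and enriched prompt in random order so a
    rater can't tell which variant produced which response."""
    rng = random.Random(seed)
    pairs = []
    for task in tasks:
        pair = [("baseline", task), ("enriched", enrich(task))]
        rng.shuffle(pair)  # hide which variant comes first
        pairs.append(pair)
    return pairs

pairs = blind_pairs([
    "Summarize the effects of blockchain on finance.",
    "Explain CRM systems for small businesses.",
])
```

Each pair would then be sent to the model, and the rater scores the two responses without seeing the labels.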

What are your thoughts on this theory? Have you found using industry-specific terminology or advanced vocabulary has improved your results?


While I commend your effort and will keep it in mind when creating prompts, since learning is desirable, this may come off to some as a one-size-fits-all approach, and I don’t think that is your intent. 🙂

A variation on what you note, which I have found with LLMs such as ChatGPT, is to look for what I call preferred words for a conversation or topic. By that I mean the words the LLM substitutes for similar-meaning words when it rephrases a prompt in its completion; these are the words the model itself uses more often. Reusing these words in later prompts tends to pull the generation closer to the desired outcome.

I originally noted the terminology here

Think of it like a teacher giving you a better way to say something and using a word more closely associated with the area of study.

For example, in computer science the term canonical form is common, and as we know many computer science terms are borrowed from math. In math, however, canonical form is less common, while normal form is much more common. I spent 30 minutes looking for math research related to canonical form until I found a paper that used both canonical form and normal form in context; I then changed my search to use normal form with math, and the results dramatically and instantly improved.

See: Canonical form - Wikipedia


Good example, thanks for your comment. This is one of those cases where your own expertise and domain knowledge come into play for creating improved prompts. If you know your subject and its terminology well, the results will improve.


This is what I always suggest to others. I know for sure it worked with InstructGPT: the answers were more coherent, accurate, and varied. But I’m not sure about GPT-4.

I also tried to write a guide about it but needed more time to make an evaluation.

Here is my repo that might make sense to you.

Very nice contribution. Thanks for sharing it and your experience.

This seems to be some validation for using these types of words.


@MarkFulton I think this is an excellent idea and approach, thanks for sharing. The one potential caution I would imagine is over-optimizing, which could cause the model to seek out edge cases and miss a wider range of potentially relevant completions. But if used sparingly and deliberately, akin to keywords in an article for SEO, this form of prompt engineering optimization (PEO) seems like a fascinating area of research that can be viable for improved outcomes (again, if used properly to avoid over-optimizing).


@shatzakis Very welcome. Agreed. Thanks for your feedback.
