RAG and function calling (Tools) - Langchain4j on Spring

Hello everyone. I’m new here and a newbie on these topics; I’m looking for feedback on some doubts I have while developing a Spring Boot application using langchain4j 0.29.0.

I don’t know if this is a silly question, but my doubts concern RAG together with function calling (Tools).

Let’s suppose I’ve set up a ChatAssistant to use a generic DefaultRetrievalAugmentor with an EmbeddingStoreContentRetriever as retriever.

The same ChatAssistant also has access to some tools through a class that exposes methods annotated with @Tool that solve an equation (just an example, not the real use case).
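Roughly, my setup looks like this (sketching it here; model, embeddingStore and embeddingModel stand for beans I already have configured):

interface ChatAssistant {
    String chat(String userMessage);
}

class EquationTools {

    @Tool("Solves a quadratic equation ax^2 + bx + c = 0")
    String solveQuadratic(double a, double b, double c) {
        double d = b * b - 4 * a * c; // discriminant
        if (d < 0) return "no real roots";
        return "roots: " + ((-b + Math.sqrt(d)) / (2 * a)) + ", " + ((-b - Math.sqrt(d)) / (2 * a));
    }
}

ChatAssistant assistant = AiServices.builder(ChatAssistant.class)
        .chatLanguageModel(model)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .retrievalAugmentor(DefaultRetrievalAugmentor.builder()
                .contentRetriever(EmbeddingStoreContentRetriever.builder()
                        .embeddingStore(embeddingStore)
                        .embeddingModel(embeddingModel)
                        .build())
                .build())
        .tools(new EquationTools())
        .build();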

Now, when I invoke the chat method saying, for example, “solve x^2 + 2x + 1 = 0”, by debugging the underlying calls to DefaultAiServices, I understood that what happens is:

  1. the augmentation is done
  2. the message is sent (chat memory being mandatory)
  3. the response contains a tool execution request
  4. the tool is executed
  5. the tool’s answer is added as a message
  6. stop, or go back to 2

In my use case the tool response is a prompt built after calling some external services to gather information.

That said, RAG is performed on the very first prompt, the one that triggers the function calling, but my question is:

“Wouldn’t it be more useful if RAG were done on the second prompt, the one created via function calling?”

I can’t tell whether, as it is, RAG on the first message has significant utility, or whether it would be better to do RAG “by hand” directly on the second prompt.

What do you think?
Thanks to everyone who has the patience to share their opinion, and maybe explain where I’m going wrong.

Carmelo


I’ve never done tool use before, but probably the RAG is done at the beginning to make sure context information is injected up front, so the model knows from the start what it’s working on. Then probably every future call also sends back the full conversation history, so that same context is always sent. I’m just guessing.

I’m a Spring Boot developer too, by the way, but I’m making HTTP calls directly rather than using langchain4j.


Hi @wclayf, thanks for your feedback. That’s my guess too, honestly.

I just thought that in my real use case scenario it would be more useful to have RAG on the prompt created by the function calling rather than on the first one.

Let’s say my first input is “estimate 1234”.
Then the function calling, through an external ws, gets resource 1234 and creates the prompt “given <content of resource 1234> and <similar resources to 1234 retrieved from the embedding store>, do something.”
I thought it would be more useful to have:

First approach:
1 - “given this context, given <content of resource 1234> and <similar resources to 1234 retrieved from the embedding store>, do something”
2 - AI response

rather than

Second approach:
1 - “given this context, estimate 1234”
2 - “given <content of resource 1234> and <similar resources to 1234 retrieved from the embedding store>, do something”
3 - AI response

(which, by the way, I could do by hand, but then, having the augmentor configured, I would augment the same conversation snippet twice)

But since the full conversation is sent every time, the two approaches would probably end up being similar.

I’ll wait for other feedback, and if nothing comes up I will accept your solution, thanks.

Carmelo


You could check inside the GitHub codebase and see if there’s additional logging you can turn on (like full debug logging) and maybe get langchain4j to print out its full API query JSON too, to see what it’s really doing.

I might try langchain4j. I recently tried the new Java 22 Panama API, which is theoretically a cleaner way to call Python, or other languages, directly from Java in a shared memory/variable space, but I haven’t gotten it to work yet, because I decided to give it only one day of my time so far, and I was getting errors. Langchain4j might be easier to integrate with Java (obviously!), but in my mind the “real” langchain is the Python one.


Hi, could you please share more details about your use case?

Usually, RAG is done on the original query from the user, but this is definitely not a must.
Using tools to retrieve more information on demand is also a good strategy. But without knowing more details, it is hard to advise anything specific.

BTW, AI Services in LangChain4j is a high-level API for building LLM-powered applications, which should be suitable for 80% of use cases. The flow (User → RAG → LLM → Tool → LLM → User) is very common, but might be limiting for some use cases. In the long run we will be making this flow more customizable, but for the moment you could use the low-level API (directly using ChatLanguageModel, ChatMemory, DefaultRetrievalAugmentor, ToolSpecifications, etc.) to build any flow you need, as sketched below.
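For example, a rough sketch of such a custom flow (model, memory and toolSpecifications are assumed to be already configured; executeTool is a placeholder for your own dispatch to the @Tool methods):

memory.add(UserMessage.from("estimate 1234"));

Response<AiMessage> response = model.generate(memory.messages(), toolSpecifications);
AiMessage aiMessage = response.content();
memory.add(aiMessage);

while (aiMessage.hasToolExecutionRequests()) {
    for (ToolExecutionRequest request : aiMessage.toolExecutionRequests()) {
        String toolOutput = executeTool(request); // placeholder: invoke the matching @Tool method
        // this is the point where you are free to run RAG on the tool output before returning it
        memory.add(ToolExecutionResultMessage.from(request, toolOutput));
    }
    response = model.generate(memory.messages(), toolSpecifications);
    aiMessage = response.content();
    memory.add(aiMessage);
}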

If you need to do RAG using the output of the tool as a query, another (maybe not the prettiest) option is to inject the DefaultRetrievalAugmentor (or just the EmbeddingStoreContentRetriever) into the object that has the @Tool method and, before returning from the @Tool method, call it to get more content. Then return the original tool output plus the content retrieved via RAG.
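Something along these lines (a rough sketch; ResourceClient and the method names are illustrative, not real langchain4j classes):

class EstimationTool {

    private final ContentRetriever contentRetriever; // e.g. an EmbeddingStoreContentRetriever
    private final ResourceClient resourceClient;     // illustrative client for the external ws

    EstimationTool(ContentRetriever contentRetriever, ResourceClient resourceClient) {
        this.contentRetriever = contentRetriever;
        this.resourceClient = resourceClient;
    }

    @Tool("Fetches a resource by id, together with similar resources")
    String fetchResource(String resourceId) {
        String resource = resourceClient.fetch(resourceId); // original tool output
        // run retrieval on the tool output instead of on the original user query
        List<Content> similar = contentRetriever.retrieve(Query.from(resource));
        String similarText = similar.stream()
                .map(content -> content.textSegment().text())
                .collect(Collectors.joining("\n"));
        return resource + "\n\nSimilar resources:\n" + similarText;
    }
}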


Edited:

Another option that comes to mind is, instead of using tools, to implement a custom ContentRetriever that retrieves content from the external API and plug it into the DefaultRetrievalAugmentor.
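A minimal sketch (extractResourceId and callExternalApi are placeholders for your own logic):

class ExternalApiContentRetriever implements ContentRetriever {

    @Override
    public List<Content> retrieve(Query query) {
        String resourceId = extractResourceId(query.text()); // placeholder: your parsing logic
        String resource = callExternalApi(resourceId);       // placeholder: your ws call
        return List.of(Content.from(resource));
    }
}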

If you have a limited set of external services, you could implement one ContentRetriever per external service and use a LanguageModelQueryRouter to route the user query to one (or multiple) of them. Here is an example.
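Roughly, the wiring would look like this (externalServiceRetriever and embeddingStoreRetriever stand for your two ContentRetriever instances):

QueryRouter queryRouter = new LanguageModelQueryRouter(chatLanguageModel, Map.of(
        externalServiceRetriever, "resources fetched from the external estimation service",
        embeddingStoreRetriever, "similar resources stored in the embedding store"));

RetrievalAugmentor retrievalAugmentor = DefaultRetrievalAugmentor.builder()
        .queryRouter(queryRouter)
        .build();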

Or, you can implement a single custom ContentRetriever that transforms/routes the user query using an LLM or any other custom logic.
Classification might help here.


Hope this helps. BTW, we have a Discord server where you can get more help.

Indeed, enabling logging should help a lot.
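For example, with the OpenAI module you can enable it on the model builder (the dev.langchain4j logger also needs to be at DEBUG level):

ChatLanguageModel model = OpenAiChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .logRequests(true)   // logs the full request JSON, including tool specifications
        .logResponses(true)  // logs the full response JSON, including tool execution requests
        .build();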

Hi, thank you for your feedback.

My use case is pretty much the same as what I said in a previous message:

Let’s say my first input is “estimate 1234”.
Then the function calling, through an external ws, gets resource 1234 and creates the prompt “given <content of resource 1234> and <similar resources to 1234 retrieved from the embedding store>, do something.”

So my tool execution is responsible for finding information relevant to the question and for generating a prompt that contains the resource fetched from the ws and the similar resources fetched from the embedding store, placed in front of a simple “estimate this” instruction.

I think customizing the flow could be useful.

I will try to leverage the low-level API to achieve what I want and to understand whether it’s a better approach to RAG the tool output or not.

BTW, yeah, I know I can inject the retriever and augment the tool output “manually”, but since this assistant is the generic chat one and RAG is configured on it, both the “automatic” and the “manual” augmentation would be performed, resulting in a lot of useless tokens.

Thanks!
Carmelo

I will have a look at the other option! I’d never heard of it, thank you.

Carmelo

Can you try this with the Tools4AI library? In your situation it would work like this:

<dependency>
    <groupId>io.github.vishalmysore</groupId>
    <artifactId>tools4ai</artifactId>
    <version>0.9.6</version>
</dependency>

AIProcessor processor = new SpringOpenAIProcessor(applicationContext);
String prompt = "I need to add 2 numbers";
Object object = processor.processSingleAction(prompt);
String answer = processor.query(prompt, (String) object);

The actions can be Java methods, POJOs, HTTP REST calls, or shell scripts.

(Disclaimer - I am the developer of Tools4AI project)