When does the model start to process a user's speech (and apply the current context) in a conversation?

Title pretty much says it all. In a conversation, when does the model start processing the user’s speech to generate a response: once they have finished speaking, or earlier? And at which point is the system context read and applied?