We can observe that small children understand certain words - more exactly, the meaning of certain words (in certain situations) - before they are capable of producing words themselves.
Prehistoric people probably already had certain meanings for words before they could speak them. At that time these meanings were represented by other signs, e.g. by gestural and/or facial (mimic) signs, both increasingly combined with phonetic signs (although these phonetic signs could not yet be combined into words).
When they later realized that, due to a physiological change (the lowering of the larynx), they could produce many more phonetic signs than before, the meanings shifted from a system based on the combination of gestural, facial and phonetic signs to a more complex system based on a larger number of phonetic signs that could be combined into words.
In evolutionary terms, the probability of words appearing and the probability of beings with the ability to use words appearing are about equally small. But if one of the two is there, then the other is there as well. Why should a being that can use words, for example, not be able to find corresponding words? Or the other way round: why should words, once they have appeared, not find beings that use them? Do you notice that the answer to your question presupposes the existence of beings that know what words are and, at the same time, the existence of words themselves?
By the way: words alone are not the only meaningful thing in a language. The meaningfulness of a language results from the combination of words into sentences and texts, i.e. from syntax, from grammar, from language in the linguistic sense.
And again, exactly what I just wrote can be observed in small children too: the words they use are soon meant as sentences (one-word sentences), and these one-word sentences are followed by two-word sentences, then three-word sentences, and so on up to multi-word sentences, at which point syntax is largely mastered.
So, (a) in one way the words are there earlier than the sentences (that is how the elders, who have already been educated by the culture, interpret it), but (b) in the other way the sentences are there earlier than the words. With words, one usually does not want to say something merely word-like, but something more: to express a meaning that goes beyond words, i.e. sentences or texts.
Even as long as all this does not yet work linguistically, it already works semiotically. And this, too, is confirmed by observation.