LLM companies like Anthropic have fueled excitement around the idea of longer context windows, where users can provide ever more input to be analyzed or summarized. If an LLM could take an entire document or article as input for its context window, the conventional thinking went, the LLM could provide perfect comprehension of the full scope of that document when asked questions about it. Anthropic just released a new model called Claude 2, which provides a huge 100k-token context window, and said it can enable new use cases such as summarizing long conversations or drafting memos and op-eds.

But the study shows that some assumptions around the context window are flawed when it comes to the LLM’s ability to search and analyze it accurately. The study found that LLMs performed best “when relevant information occurs at the beginning or end of the input context, and significantly degrades when models must access relevant information in the middle of long contexts. Furthermore, performance substantially decreases as the input context grows longer, even for explicitly long-context models.”

Semantic search preferable to document stuffing

Last week, industry insiders like Bob Wiederhold, COO of vector database company Pinecone, cited the study as evidence that stuffing entire documents into a context window for doing things like search and analysis won’t be the panacea many had hoped for. Vector databases like Pinecone help developers increase LLM memory by searching for relevant information to pull into the context window. Wiederhold pointed to the study as evidence that vector databases will remain viable for the foreseeable future, since the study suggests semantic search provided by vector databases is better than document stuffing.

Stanford University’s Nelson Liu, the study’s lead author, agreed that if you try to inject an entire PDF into a language model’s context window and then ask questions about the document, a vector database search will generally be more efficient. “If you’re searching over large amounts of documents, you want to be using something that’s built for search, at least for now,” said Liu.

Language models’ best use case: Generating content

Liu cautioned, however, that the study isn’t necessarily claiming that sticking entire documents into a context window won’t work. Results will depend specifically on the sort of content contained in the documents the LLMs are analyzing. Language models are bad at differentiating between many things that are closely related or which seem relevant, Liu explained. But they are good at finding the one thing that is clearly relevant when most other things are not relevant.

“So I think it’s a bit more nuanced than ‘You should always use a vector database, or you should never use a vector database’,” he said. Liu said his study assumed that most commercial applications are operating in a setting where they use some sort of vector database to help return multiple possible results into a context window.
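The retrieval pattern the article describes can be sketched in a few lines of Python. This is a minimal illustration, not Pinecone's actual API: a toy bag-of-words embedding and cosine similarity stand in for a real embedding model and vector index, and all names here (`embed`, `top_k_chunks`) are hypothetical. The point is only the shape of the technique: rank document chunks by similarity to the query and pull just the top few into the prompt, instead of stuffing every document into the context window.

```python
# Hedged sketch of semantic search feeding a context window.
# A real system would use a learned embedding model and a vector
# database (e.g. Pinecone); Counter-based embeddings are a toy stand-in.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. Real embeddings are dense vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_chunks(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank chunks by similarity to the query; keep only the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Vector databases index embeddings for similarity search.",
    "The recipe calls for two cups of flour.",
    "Long context windows let models read entire documents.",
]
relevant = top_k_chunks("how do vector databases search embeddings", chunks)
# Only the few relevant chunks reach the prompt, not every document.
prompt = "Context:\n" + "\n".join(relevant) + "\n\nQuestion: ..."
```

The design choice this illustrates is exactly the study's point: the model sees a short, mostly relevant context rather than a long one where the answer may be buried in the middle.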