I’ve been trying out this Local Deep Research tool over the past few days. It’s supposed to be a version of OpenAI’s Deep Research that can run locally on your computer instead of in the cloud. You can configure it to use whatever LLMs you want, so if you have a fast GPU installed, you can keep everything on your machine, and you don’t have to pay.
Regardless of how well it works (we’ll come back to that), I don’t think I would recommend this to the average researcher just yet. This project is very new and very much not polished. There is a web interface, but it’s very bare-bones, and it doesn’t support access control at all. That tends to preclude common setups like deploying it on a single GPU box for your entire team, assuming you’re not okay with anyone on the network having access to it. Even if you were, it seems able to run only a single query at a time, so if Alice is over here researching lemurs, Bob is going to have to wait until that finishes before he can research tree frogs. This is definitely designed for a single user.

Furthermore, the tool is still under heavy development, and some basic functionality appears to be broken. I came in this Monday to find that 70 commits had been pushed to the repository over the weekend, completely reorganizing the code base. These changes appeared to break support for searching local documents (which I had to fix myself). Additionally, I have encountered other issues with using certain external LLMs and with using Ollama for generating document embeddings.
To sum it up, if you’re reasonably tech-savvy and you want to give this a shot, go ahead. But if you want to get actual work done, I would hold off.
My Setup
To be fair to this tool, I’m trying to use a configuration that’s fairly different from the default. My workstation doesn’t have a discrete GPU, so I wanted to use UF’s NaviGator service for all the LLM stuff. NaviGator is pretty limited in what LLMs it allows you to use for free, so I chose Llama 3.3 70B, which was the largest model available. It was a bit challenging to get set up, but this was not really LDR’s fault. It was mostly because NaviGator doesn’t actually implement all of the OpenAI API specification, resulting in a flurry of HTTP 422 errors. I had to modify the LDR code to work around this.
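The workaround amounted to something like the sketch below: filtering the request body down to the parameters the backend actually accepts before sending it. To be clear, this is an illustration of the idea, not LDR's actual code, and the parameter list is my assumption about what a partial OpenAI-compatible backend might support.

```python
# Sketch (illustrative, not LDR's actual code): strip request parameters
# that a partial OpenAI-compatible backend rejects with HTTP 422.

# Assumed set of parameters the backend accepts; everything else is dropped.
SUPPORTED_PARAMS = {"model", "messages", "temperature", "max_tokens", "stream"}

def sanitize_request(payload: dict) -> dict:
    """Return a copy of a chat-completion request body with unsupported
    keys removed, so the backend doesn't reject the whole request."""
    return {k: v for k, v in payload.items() if k in SUPPORTED_PARAMS}
```

The advantage of an allow-list over patching individual call sites is that any parameter LDR adds later gets filtered automatically.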

Local Search
One of the LDR features that I was most excited about was its ability to search a collection of local documents saved to your computer. This is really useful if you (like me) already have a painstakingly curated collection of hundreds of high-quality papers. (By default, LDR is capable of searching the scientific literature, but is limited to open access for obvious reasons.)
I quickly ran into trouble trying to actually use this feature, though. First, as I mentioned earlier, the local search feature appears to be broken in the current LDR version. (I intend to submit a PR soon that fixes it.)
Before it can search anything, LDR has to index the documents, which means splitting them into small chunks, embedding the chunks with a language model, and saving the embeddings to a vector database. However, it seems that the developer of this tool did not anticipate my very large (>700 PDFs) collection of documents. By default, it tries to perform the embeddings locally on the CPU, but this takes forever, and it kept crashing for me. Therefore, I edited the configuration to offload the embedding to an Ollama instance I had running on a spare Jetson. This still takes an annoyingly long time to embed (around 1 hour), but at least it completes successfully.
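The chunking step of that pipeline looks roughly like this minimal sketch. The window and overlap sizes here are arbitrary guesses for illustration, not LDR's actual defaults; each chunk would then be sent to the embedding model (local or Ollama) and the resulting vector stored in the database.

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split a document into overlapping character windows for embedding.
    The overlap keeps sentences that straddle a boundary retrievable from
    at least one chunk. Sizes are illustrative, not LDR's defaults."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```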
Sort of.
When I tried to use the generated embeddings, I discovered a problem. All the queries it was doing on the vector database were returning zero results. A few minutes with the debugger, and I learned that the LDR code apparently expects embeddings to be normalized, which Ollama doesn’t seem to do. I modified the code to normalize all the generated embeddings before adding them to the database, and then sat back and waited for it to embed my documents. Again.
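The fix itself is small. Normalizing an embedding just means scaling it to unit length, so that cosine similarity reduces to a plain dot product; something along these lines (a sketch of the idea, not LDR's exact patch):

```python
import math

def normalize(vec: list[float]) -> list[float]:
    """Scale an embedding vector to unit length. Vector stores that
    compute similarity as a dot product assume this; Ollama's embeddings
    came back unnormalized, so every query matched nothing."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return vec  # avoid dividing by zero on a degenerate embedding
    return [x / norm for x in vec]
```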
Results
When I finally got everything running, I decided to try some actual research. I’m writing a literature review right now for this year’s ASABE AIM submission, so I figured this was as good a use-case as any.
LDR has two different options for research, a “Quick Summary” and a “Detailed Report”. I’ve tried both of them, and I suspect that the quick summary option is what most people will use most of the time. The detailed report takes a lot longer, and it does produce far more voluminous output. However, I found that most of that output was only marginally relevant, and I didn’t have time to read it all anyway.
Relevance in general seems to be a bit of an issue for LDR. For instance, I asked it to summarize recent research on SAM, and found that a substantial chunk of the papers it cited didn’t even mention SAM. In fact, many were published before the original SAM paper. Even worse, I’ve seen some cases where it seems to completely hallucinate the content of a paper. It will make some statement in its report and give a citation, and when I actually go and check that citation, I will find that the paper it cited is completely unrelated to the statement it made. In short, if a student turned in one of these reports as a class assignment, I probably wouldn’t give them a passing grade.

Speaking of citations, LDR seems to have a minor but vexing bug where it will spit out the same citation over and over again. You can end up with dozens of duplicate citations in a generated report. I’ll have to see if I can fix that easily.
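If the duplicates are exact (or near-exact) repeats, the fix I have in mind is a simple order-preserving deduplication pass over the citation list before the report is rendered. A sketch, assuming citations arrive as strings:

```python
def dedupe_citations(citations: list[str]) -> list[str]:
    """Drop repeated citations while preserving first-seen order.
    Matching is case- and whitespace-insensitive, so trivial formatting
    differences don't slip duplicates through."""
    seen = set()
    unique = []
    for citation in citations:
        key = citation.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(citation)
    return unique
```

This wouldn't catch the same paper cited under two genuinely different renderings (e.g. with and without a DOI), which would need fuzzier matching.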

At the time I was testing it, LDR was only able to search arXiv for scientific papers. (It looks like the newest version can search Semantic Scholar too.) The problem with arXiv is that anyone and their mother can upload something there. LDR has absolutely no ability to ascertain the quality of a source, and I find that it will often throw in random stuff that hasn’t been published in a journal and hasn’t been highly cited. It would be nice if we could make it aware of some of the signals for a high-quality paper. (This is one of the big reasons I wanted it to search my local paper collection as well.)
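Even a crude heuristic over paper metadata would be a start. Something like the sketch below, where the field names are hypothetical (not taken from any particular search backend's schema) and the citation threshold is an arbitrary knob:

```python
def looks_reputable(paper: dict, min_citations: int = 10) -> bool:
    """Crude quality filter over paper metadata. Field names
    ("citation_count", "venue") are hypothetical: keep a paper if it is
    reasonably cited OR has a publication venue on record."""
    return (paper.get("citation_count", 0) >= min_citations
            or bool(paper.get("venue")))
```

A real version would want more signals (venue reputation, author track record, recency-adjusted citation counts), but even this would filter out the uncited, unpublished preprints.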
That’s not to say that the results were useless. LDR did manage to surface some papers that I had completely missed during my initial literature review. When I asked it about robotic weeding, for example, it found a new dataset that’s very relevant to the project I am working on. I thought I had performed a pretty comprehensive survey of the datasets out there, but I had missed that one. Overall, for each query I performed, I generally found two or three papers in the citations that I ended up reading and including in my literature review.
Future Use
As I used LDR more and more, I began to realize that I was mostly ignoring the reports (which tend to be vague, inane, and plagued with hallucinations) and mostly just looking at the citations. Slowly, I’ve realized that what I really want is a tool that generates an annotated bibliography as its output. Maybe we can add an option to LDR to do that? I would find this very helpful.
I also can’t put too much stock in these first impressions, seeing that LDR is by no means finished and appears to be changing very quickly. Additionally, I’m sure that, over time, I will discover ways to eke out even more performance. I would like to test, for instance, how much the choice of model impacts performance. I believe that OpenAI’s deep research uses a reasoning model, but I am not using one, mainly because NaviGator doesn’t have any available at this juncture. (Also, reasoning would take what is already a slow process and make it positively glacial.) Potentially, though, reasoning could help alleviate some of the hallucination issues I had.
I think that I will continue to use LDR in the future. This is not merely the sunk-cost fallacy talking after all the time I’ve spent configuring it; I do find it genuinely useful. That’s despite the fact that it doesn’t really save me any time at all when doing literature reviews. Most of the time in a literature review is spent not searching for relevant papers, but reading the papers you find. I think LDR is useful when it comes to the searching part, but I still have to do all the reading.
By any chance, are you an academic researcher? Have you tried out any form of Deep Research? If so, what are your thoughts?