Generative AI use in scientific writing, as in many professional domains, doesn’t exactly have a sterling reputation. On the other hand, the argument is a little more nuanced than the “AI will make us as gods but also turn us into paperclips” vs. “AI is a scourge and every time you use it, Sam Altman eats a kitten” factions that tend to arise on social media. I’ve slowly been drifting towards the altogether fairly mundane conclusion that AI is a tool, and, like most tools, it can be used well or used poorly. I think that AI use in scientific writing suffers from a version of the Toupee Fallacy: as much as we love to ridicule egregious examples of AI slop slipping through peer review, the truth is that for every misplaced rat penis there are probably ten subtle uses of AI that aren’t even noticeable.
I myself have been using AI tools in my writing for several years. Initially, LLMs were a convenient way of drawing figures in Python without having to remember the intricacies of Matplotlib’s annoying API. This made it a lot easier to experiment with different figure designs. I’ve also found that LLMs can be useful when faced with that annoying reviewer who insists that everything be written in the passive voice. Even relatively small LLMs do a decent job of rendering your initial prose sufficiently stilted and turgid for scientific publication. (The reverse version of this also suggests a fun new way to retaliate when someone emails you one of those annoying, passive-voice “I’m sorry that you were offended” non-apologies.) Furthermore, I sometimes use AI tools to help with initial research for literature reviews.
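For concreteness, here’s the kind of Matplotlib boilerplate that an LLM spares you from having to remember — a hypothetical annotated plot (the data, labels, and file name here are all made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; no display needed
import matplotlib.pyplot as plt
import numpy as np

# A toy curve with an arrow annotation -- the sort of fiddly annotate()
# call whose argument names I can never remember without looking them up.
x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)

fig, ax = plt.subplots(figsize=(5, 3))
ax.plot(x, y, label="signal")
ax.annotate(
    "peak",
    xy=(np.pi / 2, 1.0),              # the point being annotated
    xytext=(2.5, 0.8),                # where the label text sits
    arrowprops=dict(arrowstyle="->"),
)
ax.legend()
fig.savefig("figure.png", dpi=300)
```

None of this is hard, exactly — it’s just the kind of API trivia that’s faster to ask for than to look up, which makes iterating on figure designs much less painful.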
That being said, it hasn’t all been sunshine and rainbows. I know that a lot of people think that LLMs are good for revising, but I gave up on this a long time ago. Sometimes they correct a superfluous comma or two, but, by and large, I find that (surprise!) I’m usually a better writer than ChatGPT. Then again, I am a native English speaker with a lot of writing experience; I could imagine these tools being far more useful when the author has a shakier grasp of language fundamentals.
The AI Figure Creation Process
The other day, Dr. Li asked me to create a figure for a keynote presentation that he’s giving at the IFAS AI Summit. We’ve been collaborating for a few months with some of his former students on a perspective paper. During the writing process, one of the grad students made some really slick-looking figures. He never told us exactly how he did it, but the implication was that he was using Adobe Illustrator with AI-generated assets. Dr. Li’s main take-away from this process was that he wanted us to be able to do the same thing. Well, specifically, he wanted me.
I did my best to replicate this pipeline. It turns out that AI image generators, when prompted carefully, are actually pretty good at maintaining a consistent style and color scheme, which is frankly more than I expected. I then cleaned up the outputs manually and shoved them all together in Inkscape. He seemed pretty pleased with the results. Of course, it took me, like, two hours to figure all that out. Who says using AI doesn’t save you time.
Asset Generation
First, my figure needed some icons. My normal process for this is to search the internet for something suitable. However, I find this to be sub-optimal because it can be difficult to ensure that everything aligns stylistically. Initially, I attempted to coax various LLMs into outputting SVG code directly. This… sort of worked, but I tended to end up with output that was way more simplistic and abstract than I wanted. Eventually, I tried feeding similar prompts into Google Gemini and allowing it to generate images. Even though this doesn’t produce the editable SVGs that I want, I find that the results that I get from the Nano Banana image generator are typically much more usable. I’m not totally happy with having to rely heavily on a non-UF-hosted model for this, but as long as nobody tells I.T., I think it should be fine.
Here is a typical prompt that I used for making icons:
Please generate an icon representing “agentic AI”. The icon should be in the lineal style with a triadic orange-purple-green color scheme. The icon will be used in a scientific figure.
This is the result:
A couple of things to note: first of all, it mostly nailed the style and color scheme specifications. Better yet, the style stays fairly consistent across multiple generated images, which is awesome. Overall, the icon largely makes sense, apart from a little bit of weird stuff going on with the circular border. That’s not the end of the world, though.
Also, I would prefer that the output not include any text. I find that text in AI-generated images is usually more trouble than it’s worth, as it’s really hard to keep the fonts consistent and match them to the rest of the figure. I might experiment in the future with explicitly asking the model to leave text out of the output.
Cleaning up the Assets
As I mentioned previously, the image assets that I produced with Gemini are not totally usable in their raw form. For one, Gemini appears to be incapable of generating PNGs with transparent backgrounds. Therefore, I manually removed the white backgrounds in Affinity Photo. (You can certainly use Photoshop or GIMP for this too; it’s just a matter of selecting all the white pixels.)
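The “selecting all the white pixels” step can also be scripted if you have a lot of icons to process. Here’s a minimal sketch using Pillow; the threshold is a guess you’d tune per image, and the tiny synthetic “icon” at the bottom is just for demonstration:

```python
from PIL import Image


def remove_white_background(img: Image.Image, threshold: int = 245) -> Image.Image:
    """Make near-white pixels fully transparent.

    The threshold is a hypothetical starting point; real icons with
    anti-aliased edges may need per-image tuning or edge feathering.
    """
    img = img.convert("RGBA")
    img.putdata([
        (r, g, b, 0) if min(r, g, b) > threshold else (r, g, b, a)
        for (r, g, b, a) in img.getdata()
    ])
    return img


# Demonstration on a tiny synthetic image: white background, one orange pixel.
demo = Image.new("RGB", (2, 1), (255, 255, 255))
demo.putpixel((0, 0), (230, 120, 30))
cleaned = remove_white_background(demo)
```

This naive approach will also punch holes through any genuinely white pixels inside the icon itself, which is exactly the kind of case where doing it by hand in Affinity Photo (or Photoshop, or GIMP) is still worth the time.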
Additionally, the results occasionally contained weird asymmetries or extraneous graphic design elements which offended Dr. Li’s delicate sensibilities. Therefore, for many of the images, I also did some manual cleanup work in Affinity Photo. This, of course, takes time, but still less time than drawing them from scratch.
Assembling in Inkscape
The final step was taking these assets and using them to create the complete figure design in Inkscape. I used Inkscape to add any additional text and design elements, but ended up keeping the generated assets as embedded PNGs in the Inkscape document. I did experiment with Inkscape’s tracing functionality, but I couldn’t quite get the results I wanted. The output from Gemini is 1024×1024 pixels, which is pretty generous, so I didn’t have any obvious issues with resolution.
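For a rough sense of why 1024×1024 is generous, here’s a quick back-of-the-envelope check of how large such an asset can print at common resolution targets (300 DPI is a typical minimum that journals ask for raster art, though specs vary):

```python
# How wide can a 1024 px asset print before dropping below a given DPI?
px = 1024
for dpi in (150, 300, 600):
    print(f"{dpi} DPI -> {px / dpi:.2f} inches")
# 150 DPI -> 6.83 inches
# 300 DPI -> 3.41 inches
# 600 DPI -> 1.71 inches
```

Since each icon only occupies a small fraction of the final figure, about 3.4 inches at 300 DPI is plenty.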
A Less Successful Figure
At some point, it’s likely that you will be tempted to try generating an entire figure using AI. Dr. Li tried this for a proposal that we’re writing on turf grass breeding. He emailed me the results, with a request that I “clean them up”:

Well, we’re not exactly in rat penis territory, but clearly, we’re still a ways away from something you can publish in Nature. I don’t know about you, but I personally never leave the house without my Dowrweiling Spectomttter and Data Acquipition Unit. And what’s with the weird leaves on the plants in the “greenghouse”?
As entertaining as these errors are, I would argue that they’re not actually that important. After all, basic misspellings and the like are relatively easy to correct. More insidious are the issues that stem from the model fundamentally not understanding the information it’s trying to convey. The difference between me and the model is that if I were to draw this figure from scratch, I would have some sort of intention, something that I’m trying to communicate. The AI does not have that. At best, it has a scrap of my own intention diluted through a 20-word prompt and the linear projections in a transformer layer.
This lack of understanding leads to the most subtle and difficult-to-correct issues with this diagram. For instance, why is the gantry in the greenhouse growing tentacles that touch each of the plants? That’s not how any of those sensors actually work; they capture data without any direct contact with plant material. And, in the diagram below, why is “time-series input data” shown as an output of the transformer model? That box actually contains the model input. I mean, it even says so in the heading! These are the types of errors that make “cleaning up” this figure so difficult: they require me to spend several hours in Affinity Photo meticulously moving pixels around.
For reference, here is what the figure ended up looking like once I was done with it:

Conclusion: A Tale of Two Workflows

Dr. Li ended up using the first figure that I made in his keynote. It did take me a while to make it, but I think most of that was because I was figuring out the workflow as I went along. Were I to try it again, I could probably do it much faster. Overall, I’m happy with how this figure turned out: it’s high quality, and I produced it much faster than I could have without AI.
On the other hand, the second AI-generated figure, for the proposal, was a lot more onerous. At the start, it looked like this method would be more efficient, since we were delegating far more of the work to the AI. However, although the result seemed superficially decent, a moment’s examination revealed serious issues stemming from the AI being fundamentally incapable of this kind of high-level conceptualization. I think the reason my original workflow worked so much better is that it very deliberately relegated the AI to a domain it is good at: generating simple icons that conform to a detailed, clearly stated specification. Meanwhile, all the high-level conceptualization and design was left to humans, which is where it probably still belongs for the time being.
All told, I’m glad that I had this experience. I think that I learned useful lessons that will come in handy for future papers that require nice-looking figures. Good visual communication is such a useful skill in science that it’s worth figuring out how to do it properly. How AI fits into that process may be somewhat counterintuitive.
