Acad | Integrating AI into Research

date

Sep 22, 2024

slug

integrating-artificial-intelligence-into-research

status

Published

summary

Recent advances in AI, particularly generative large language models, enhance research efficiency through AI-augmented annotations, assisted programming, content-based file management, grammar correction, and idea generation, offering tools like Label Studio, GitHub Copilot, and BotAI for improved workflows.

AI-augmented Annotations

Annotations are time-consuming and generally expensive, especially if you want to shorten the expected completion time of an annotation task. However, scholars in various fields have demonstrated the potential of artificial intelligence in labeling certain types of data. For example, in a PNAS work and our early attempts, we both found that ChatGPT may have equivalent or even better ability to finish annotations tasks.

Among the available tools, I'd like to recommend two that we could leverage for daily annotation tasks: Label Studio and Python.

Label studio is a well-designed collaborative annotation platform that supports AI-assisted annotations.

Automatic annotations using Python and openai

However, while Label Studio is efficient enough for collaborative works, it fails to quickly generates labels for data exploration, allowing you quickly evaluate a hypothesis. Please refer to OpenAI’s official documentation for further information.

Assisted Programing

Many modern code editors now support embedded AI-assisted coding. For instance, GitHub Copilot and Cursor offer such capabilities. In my own research pipeline, I've found two scenarios where these tools are particularly helpful.

Remind you of built-in and library function properties. I often switch between Python and R, which leads to confusion about function names in different packages. For example, you might unconsciously use df.head() in R and head(df) in Python. Additionally, if you frequently use libraries like ggplot2, it's challenging to remember every detail of each plotting component. In these cases, you can simply ask the AI for a quick solution, saving time you'd otherwise spend reading documentation.

Auto-completion for easy tasks. For simple, repetitive tasks—such as writing a for loop—you can use AI-powered inline completion. This saves considerable time and effort.

Content-based File Management

Another good practice of using AI in research is to rename your messy filenames with its content. Here I provide a light-wight tool designed for finishing this simple but really helpful task: Riffo.

Riffo is an AI-driven file management tool that renames your files based on their content. When downloading papers online, you often encounter messy filenames containing DOIs, series numbers, or other random characters. With Riffo, you can rename these files in seconds without opening them or manually inputting anything.

Grammar Correction

While many of us have realized AI's potential for correcting common grammar errors, I bet few of us have a convenient way to do so. Here, I'll share the most convenient method I've found.

BotAI is a quick, light, and highly customizable tool to correct minor errors in your writing across various platforms. When you need to fix a paragraph or a specific chunk of text, simply select it and press a customizable button in BotAI's settings. Within seconds, it will correct and replace the original text.

Of course, you may discover additional tools suitable for various scenarios. For instance, I'm using Notion AI to write this very blog post.

Brainstorming Ideas

Does AI generate new research ideas? In a recently study by Si et al. on arXiv, they find that AI actually generates novel ideas that might be helpful in doing research. In our own experience, in some cases (frankly speaking, definitely not all), ChatGPT can propose some explanations to unexpected findings. Among them, we have found one explanation generated by ChatGPT was correct, after doing a hypothesis testing on the data.

Hey, ChatGPT, can you do research for me? No but I can help.

References

Gilardi, F., Alizadeh, M., & Kubli, M. (2023). ChatGPT outperforms crowd workers for text-annotation tasks. Proceedings of the National Academy of Sciences, 120(30), e2305016120.

Ouyang, R. & Yu, J. (2023). ChatGPT Outperforms Humans in Annotation: A Cross-domain, Bilingual Experiment. PolyMeth 2023.

Si, C., Yang, D., & Hashimoto, T. (2024). Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers. arXiv preprint arXiv:2409.04109.

Wang, W., Ning, H., Zhang, G., Liu, L., & Wang, Y. (2024). Rocks Coding, Not Development: A Human-Centric, Experimental Evaluation of LLM-Supported SE Tasks. Proceedings of the ACM on Software Engineering, 1(FSE), 699-721.