Contact

Director

Prof. Dr. Stefan Debener

+49 (0)441 798-4271

+49 (0)441 798-5522

Administrative Office

Sandra Marienberg (currently on maternity/parental leave)

Lea Hinrichs (substitute for Marienberg, from 01.08.2023)

+49 (0)441 798-5523

+49 (0)441 798-5522

A7 0-035

AI Code Writing Guidelines

Code sharing is an essential part of open science. However, researchers often struggle with the time-consuming process of providing efficient and well-documented code. When used correctly, AI tools can reduce this workload and help make code more efficient, more accessible, and better documented. For this reason, we provide help and tools that support staff and students in using AI correctly. We strongly encourage you to read the University guidelines and to use AI in accordance with them.

What is AI?

History of AI

Artificial intelligence (AI) has evolved through a series of advances, beginning with neural networks in the 1960s. These networks mimicked the structure of the human brain, enabling computers to learn from data and perform tasks such as pattern recognition and classification.

Machine learning emerged in the 1980s, introducing algorithms that could learn from data without explicit programming. This led to significant advances in areas such as natural language processing (NLP) and computer vision.

Deep learning, a subset of machine learning, gained prominence in the 2010s. It uses multi-layered artificial neural networks, enabling more complex tasks and improved performance.

Generative AI

Generative AI takes the next step by enabling computers to create new content. It does this by training models on massive amounts of data, allowing them to learn patterns and generate similar results.

The Transformer, a breakthrough architecture in NLP, paved the way for generative AI by enabling models to capture long-range dependencies in language. Transformer-based deep learning models can generate human-quality text, translate languages, and write various types of creative content.

Artificial General Intelligence (AGI)

Artificial General Intelligence (AGI) refers to human-level intelligence in machines. Unlike today's AI, which excels at specific tasks (like playing chess or recognizing faces), AGI would be able to tackle a wide range of intellectual challenges and learn new things on its own. It is a future direction of AI development, but for now it remains theoretical.

As an intuitive example, current AI can be compared to a super-powered calculator, excelling at number-crunching but lacking versatility. AGI would be more like a human, able to apply its knowledge in many different situations.

That's where generative AI comes in. It's an area of AI that's particularly good at creating new things, like art or music. It's a stepping stone to AGI because it shows that machines can learn and be creative, both of which are crucial for general intelligence. Note that the key term here is ‘creative thinking’ by machines.

So while we don't have AGI yet, the advances in generative AI and other areas are paving the way for its development. It's a work in progress, and as AI becomes more sophisticated, we may be able to work with artificial general intelligence in the future.

Usage of LLM or AI tools for coding

Brainstorming with AI: From Problem to Code

Before diving into the code itself, AI can be a valuable tool in the initial planning stages. By using AI's ability to process information and identify patterns, you can initiate a brainstorming session to translate the problem into an efficient code flow.

This can include:

  1. Function identification: AI can help identify the individual functions needed to solve the problem. By analyzing the problem statement, AI can suggest a breakdown into manageable tasks, each represented by a function.
  2. Flow optimization: Once the functions are identified, AI can help optimize the flow of execution. It can analyze potential interactions between functions and suggest the most efficient order in which to call them for smooth and logical code execution.
  3. Anticipate challenges: AI can be used to analyze the problem and suggest potential roadblocks or areas of complexity that may arise during coding. This forewarning allows you to plan for these challenges and proactively develop solutions.
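As an illustration, the outcome of such a brainstorming session could be a code skeleton like the one below. This is only a minimal sketch: the task (summarizing a list of measurements) and all function names (`load_data`, `remove_outliers`, `summarize`, `run_pipeline`) are hypothetical and chosen purely for this example.

```python
from statistics import mean


def load_data(raw_text: str) -> list[float]:
    """Step 1: parse one measurement per line, skipping blank lines."""
    return [float(line) for line in raw_text.splitlines() if line.strip()]


def remove_outliers(values: list[float], limit: float) -> list[float]:
    """Step 2: keep only values whose absolute size is within `limit`."""
    return [v for v in values if abs(v) <= limit]


def summarize(values: list[float]) -> dict[str, float]:
    """Step 3: reduce the cleaned values to summary statistics.

    Anticipated challenge: mean() raises an error for an empty list,
    so the caller must make sure some values survive the cleaning step.
    """
    return {"n": len(values), "mean": mean(values)}


def run_pipeline(raw_text: str, limit: float = 100.0) -> dict[str, float]:
    """Call the steps in the order agreed on during planning."""
    return summarize(remove_outliers(load_data(raw_text), limit))


if __name__ == "__main__":
    print(run_pipeline("1.0\n2.0\n\n500.0\n3.0\n"))
```

Each planned function covers one task, the pipeline function fixes the order of execution, and the docstring of `summarize` records a challenge identified up front.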

Coding aids

After planning your code, generative AI tools can also assist with the coding itself, especially in the early stages of software development or scripting. Even for experienced researchers and developers, generative AI can speed up processes and assist with more advanced tasks.

Below you can find a list of purposes that generative AI can be used for with respect to coding:

  • Code generation: Generative models can automatically generate code based on natural language descriptions, saving developers time and effort. However, this functionality is still in its infancy, and it is strongly recommended to double-check the generated code. To get the best results, be as specific as possible in your request: mention the programming language and, if known, the software libraries you want to use, describe the order of execution if applicable, etc.
  • Code completion: Models can suggest relevant code snippets or completions, streamlining the coding process.
  • Code documentation: Generative models can generate comprehensive documentation for code, improving its maintainability and clarity. Most tools can document code in the form of a natural language description of the developed workflow.
  • Code Refactoring: Refactoring is a term used in computer science that describes adapting your code to improve its performance and readability, or to make it more generalizable to other domains. Models can help with this task by identifying and suggesting code refactoring options, which improve code structure and efficiency.
  • Debugging: Generative models can analyze code and identify potential bugs or areas for improvement, or can help you understand error messages.
  • Testing: When producing software, it is crucial to test the tool. This includes testing its robustness in various development environments and on different operating systems, and using unit tests to check its stability against unexpected input. Generative AI can support developers in testing their tools.
  • Code Translation: Lastly, AI tools can be used to translate code from one programming language into another. This feature is helpful in the domain of neuroimaging, since the main analysis tools are written in either MATLAB or Python. With the support of generative AI, researchers can translate between those two programming languages more easily. This opens the door for new collaborations and can help to bring together specialists from the different domains.
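To make the documentation, testing, and translation items more concrete, the following sketch shows what a checked result might look like for the MATLAB expression `(x - mean(x)) / std(x)` translated into Python. The function name `zscore` and the chosen error handling are assumptions made for this example; only the Python standard library is used.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize values to zero mean and unit sample standard deviation.

    Mirrors the MATLAB expression (x - mean(x)) / std(x). MATLAB's std
    normalizes by N-1 by default, and statistics.stdev does the same.

    Raises:
        ValueError: if fewer than two values are given or all values are equal.
    """
    if len(values) < 2:
        raise ValueError("zscore needs at least two values")
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        raise ValueError("zscore is undefined for constant input")
    return [(v - centre) / spread for v in values]


if __name__ == "__main__":
    # Minimal checks: one regular case and two unexpected inputs.
    assert zscore([1.0, 2.0, 3.0]) == [-1.0, 0.0, 1.0]
    for bad_input in ([5.0], [2.0, 2.0]):
        try:
            zscore(bad_input)
        except ValueError as error:
            print("rejected as expected:", error)
```

Translations are a typical place for subtle differences: for instance, NumPy's `std` normalizes by N by default (`ddof=0`), whereas MATLAB's `std` normalizes by N-1, so a translated script can run without errors and still produce different numbers.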

Caveats when using AI for code writing

When using AI tools for coding, state in your master's thesis, article, or other work how the AI tool was used (e.g., for setting up the analysis pipeline). Also check that your prompts do not include sensitive information (e.g., participant names) and that you use AI in accordance with the university guidelines mentioned above.
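One possible way to reduce the risk of leaking sensitive information is to replace known sensitive terms before a prompt is sent to any tool. The sketch below is a hypothetical helper (the name `redact`, the placeholder text, and the example participant name are all made up for illustration). It only catches the terms you list yourself, so it does not replace reading the prompt before sending it.

```python
import re


def redact(prompt, sensitive_terms, placeholder="[REDACTED]"):
    """Replace every listed term in the prompt, ignoring upper/lower case."""
    for term in sensitive_terms:
        # re.escape makes sure characters such as "." are matched literally.
        prompt = re.sub(
            re.escape(term),
            lambda _match: placeholder,
            prompt,
            flags=re.IGNORECASE,
        )
    return prompt


if __name__ == "__main__":
    text = "Fix the plot title for participant Jane Doe (file jane_doe_s01.csv)."
    print(redact(text, ["Jane Doe", "jane_doe"]))
```

Note that different spellings of the same name (here with a space and with an underscore) have to be listed separately.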

AI-assisted coding can be of great help, but it will not replace thinking and problem-solving. It is your responsibility to verify the correctness of AI-generated code. Note that debugging AI-generated solutions might take longer than writing the code yourself.

You will at some point notice that the tool generated generic, inconclusive, or even wrong solutions. This could be due to one of the following reasons:

  1. The AI lacks context and misunderstands the implications of your code, so it misinterprets your intent. This can happen, for example, if you provide code that is too complex or too short, or if you include functions from toolboxes without mentioning this in the prompt.
  2. The AI has no understanding of what you want. This can be caused by imprecise prompts. For each tool, there are many blogs that give advice on prompting. You can even ask the AI itself how best to write prompts.
  3. The AI is biased towards the data it was trained on. Check the date when the AI system was trained and, if available, the source of information. If the AI's training data is older than the version of your programming language or toolboxes, it may provide outdated solutions. For example, if function inputs or calculations change with an update of a toolbox, the AI solution might not reflect this.
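Reasons 1 and 3 can partly be addressed by telling the tool exactly which versions you are working with. The following sketch collects the Python version and the versions of selected packages so that they can be pasted into a prompt. The helper name `environment_summary` and the package names in the usage example are assumptions for illustration; `importlib.metadata` is part of the Python standard library (Python 3.8 or newer).

```python
import platform
from importlib import metadata


def environment_summary(packages):
    """Return one line per package with its installed version."""
    lines = [f"Python {platform.python_version()}"]
    for name in packages:
        try:
            lines.append(f"{name} {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"{name} (not installed)")
    return "\n".join(lines)


if __name__ == "__main__":
    # Replace these names with the toolboxes your script actually uses.
    print(environment_summary(["numpy", "scipy"]))
```

Comparing these versions with the training cutoff of the tool gives a first hint as to whether its suggestions might refer to outdated function signatures.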

Common Tools (summarized in October 2024)

ChatGPT

ChatGPT is the best-known LLM-based tool; its name has become a generic term that people use when they refer to LLMs in general. Developed by OpenAI, it is capable of understanding and generating human-like text based on the input it receives. It has been trained on a large undisclosed set of books, websites, articles, online forums, and instruction manuals. It is available in two versions:

  • Free version: Grants access to GPT-4o mini and limited access to GPT-4o. It is proficient in a wide range of topics, but according to its own statement, its training dataset has not been updated since September 2021. However, it has a web-browsing capability that allows it to fetch real-time information from the internet.
  • Paid version ($20/month): Grants access to GPT-4, GPT-4o, and o1, which is currently in the preview stage. The distinguishing feature of o1 compared to previous versions is that it can “think” before answering and solve more complex tasks. The training data of all these models is based on resources available until October 2023. Another interesting feature is the “GPT Store” with user-created GPTs: users can create custom versions of ChatGPT by, for example, uploading documents, thereby tailoring ChatGPT for specific purposes. One of these tools is Consensus, an AI research assistant that searches 200M papers to provide science-based answers.

As of today, there are also tools to incorporate ChatGPT for programming purposes (e.g., in Visual Studio Code).

GitHub Copilot

GitHub Copilot is an AI-based programming assistant developed by GitHub in collaboration with OpenAI. It integrates into an integrated development environment (IDE) such as Visual Studio Code. According to GitHub, it can make coding, on average, 55% faster. It can be used for code completion, generation, organization, and commenting. GitHub Copilot also offers 'Copilot Chat', which relies on GPT-4 for understanding and generating language. Here, you can request code for a specific purpose, and it will generate it for you. Its training set includes all the languages that appear in public repositories; languages that are less well represented in those repositories therefore cannot be programmed reliably with the help of Copilot. It costs $10/month but is free for students.

Llama

Llama is an open-source AI model developed by Meta. The current version, 3.2, is available in different parameter sizes for different devices: 1B/3B parameters for mobile devices and 11B/90B for more advanced computational needs. The main advantage of Llama is data security: when the model is run locally, the data does not leave the device on which it is running.

Google’s AI: Gemini

Gemini, formerly known as Bard, is also trained on a massive undisclosed dataset of text and code; however, it is constantly being updated as new resources are created. It is assumed that its training data extends to the early months of 2024.

  • Gemini: The free version.
  • Gemini Advanced: The paid version (€22/month). This includes a Google One subscription.

Academic Cloud

This service is available free of charge to a few institutes across Germany and to all universities and research facilities located in Lower Saxony (Niedersachsen). You can log in with your university credentials to use the many services it offers. As of today, its Chat AI supports large language models such as GPT-4 and Llama.

Webmaster (last updated: 24.10.2024)