AI Code Generation is not ready for prime time

Colin Bitterfield
15 min read · May 18, 2023
Image by Anna Maria Weaver © 2023

Over the last year, I have been asked about the security concerns with AI-generated source code and "code assistants" like Tabnine, Copilot, and other LLM-based AI Generators. Overall, my initial take on these services is that they are an emerging technology that is not ready to be implemented by enterprise commercial shops. I can see the fear that AI Generators are a competitive "game changer" and that failure to adopt could put you out of business. On the flip side, early adoption could destroy your reputation and put you out of business just as quickly. Just ask Samsung, which now bans the use of all AI Generators.

At a minimum, any of these services should be vetted against ISO/SOC/NIST vendor due diligence requirements. This article presents my findings from looking at AI code generation through a security and compliance lens.

Many of us remember Eliza from the 1980s. The source code had been missing since the 1960s. Perhaps it is being resurrected in a new form.

Eliza image courtesy of Wikipedia

The value in these AI generators could be very high and very profitable for the company that gets it right and for the customers that can leverage it. The image below was generated for this article from a "text to image" prompt. I tried a couple of different sites to create royalty-free art for this article. I can see this aspect really cutting into Adobe, Getty, and other stock art vendors, since it gives people an alternative for self-publishing and licensing issues. The image does not come without specific limitations, however. See the terms of service.

This brings up one of the primary issues: copyright. Per the terms of service, even though I generated the image, it is not mine under copyright. After a lot of research, the reason why became apparent: if someone else uses the same site with more or less the same request, there is a very high likelihood they will get the same image. This is very important, and I will touch on it later in the article.

AI-Generated Robot by Canva image generator

AI Generators have the potential to do a lot of grunt work in programming. But the real danger is that people might ask the "AI" to do their work for them without having the capacity to understand or validate that work. This might create a critical risk of "AI" dependency or introduce unknown security vulnerabilities.

I see these areas as having a high potential for use and misuse:

  1. Software Development / DevOps (code generation for applications and Infrastructure as Code, IaC). This poses some new challenges but also brings back many of the old challenges of outsourcing development to "unknown" programmers working from a specification.
  2. Marketing / Sales: Image generation, customer correspondence creation, and similar tasks.
  3. Legal/HR: Policy and contract creation.
  4. Finance: Analyzing complex financials.
  5. Data Science: Analyzing big data.

The issue with these business functions using this technology comes down to the immaturity of both the technology and the companies selling it.

Things that don’t exist or don’t exist well currently:

  1. Corporate Policies on how, what, where, and when AI Generation should be used.
  2. Rules of Attribution and Copyright
  3. Legal cases and precedent in any or all industries (licenses, copyrights, attribution, training, acceptable uses).
  4. Liability law for the AI Generator (some limit total liability to $100).
  5. Rules of competency inside the AI, a requirement for ISO/SOC/NIST shops.
  6. Best practices by the industry
  7. Non-Disclosure Agreements and other business contracts that include sections covering AI-generated or AI-accessed content.
  8. Ability to purge “accidentally disclosed information” from the AI/LLM model with proof of removal.
  9. Ability to segment each company's "AI". This is a very important concept.
  10. The ability to control "AI lying and hallucinations". Yes, that is a thing.
  11. The ability to attribute the AI’s knowledge source (and all kinds of privacy and non-disclosure issues)
  12. Well-documented and attributable risks and vulnerabilities.
  13. Government regulations on the subject. California and Virginia are leading the way on this.

A pause to remember the rules about robots (aka AIs) from Isaac Asimov

Isaac Asimov’s “Three Laws of Robotics”

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

This is an important takeaway. The AIs (aka LLMs) being marketed don't have anything like baseline ethics programming. The current round of AIs will simply make up results.

“Google CEO Sundar Pichai says ‘hallucination problems’ still plague A.I. tech and he doesn’t know why” [link]

We have all watched or read science fiction. The robot (AI) going insane and killing everyone is a major plot in many of these stories, just like Skynet creating "Terminators". This should give everyone (even Elon Musk tweets about it) pause about how this technology is implemented, deployed, used, and regulated.

This brings me to the point of the article: why AI is not ready for prime time.

The current batch of AI Generators (SaaS vendors) has huge limitations and liabilities. They are used as "slaves" by employees to augment the employee's own work. The largest potential misuse, other than unauthorized disclosure of confidential or proprietary information, is when incompetent people use them "to be competent".

One of the cornerstones of employer liability is "respondeat superior", Latin for "let the superior answer", a doctrine that holds employers responsible for the actions of their employees.

For instance, if a junior developer with little experience asks the AI Generator to write a program (or even a function) that they themselves can’t write, how could they supervise or validate that the code does not contain security vulnerabilities or other risks?
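
To make that concrete, here is a hypothetical snippet of my own (not output from any generator; the table and function names are made up). The first version builds SQL by pasting user input into the query string and is open to injection; the second uses a parameterized query. A developer who could not have written this function unaided is unlikely to spot the difference in review.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES ('alice'), ('bob')")

def get_user_unsafe(username):
    # Vulnerable: user input is pasted straight into the SQL text.
    # A username like "x' OR '1'='1" returns every row in the table.
    return conn.execute(
        f"SELECT id, username FROM users WHERE username = '{username}'"
    ).fetchall()

def get_user_safe(username):
    # Parameterized query: the driver treats the input strictly as data.
    return conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    ).fetchall()

print(get_user_unsafe("x' OR '1'='1"))  # leaks both rows
print(get_user_safe("x' OR '1'='1"))    # returns an empty list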

Apparently, one of the AIs (according to an article) passed the bar exam. That doesn't make it a lawyer. Likewise, just because an AI generates code doesn't make it a programmer.

SOC 2, ISO, and NIST all require that employees be competent to do their work and receive job-based training and security awareness training based on their roles. Furthermore, all members of the workforce must meet the same level of training regardless of status (employee, contractor, or outsourced vendor). This is enforced in a variety of ways: vetting suppliers, checking IDs and credentials of the workforce, and so on.

If we extend this to AI Generators of any kind, we must also validate their competency to work. None of the generators I looked at seems to have figured this out.

Consider general employee/contractor requirements just to work:

  1. Credentials / Education including the verification of them.
  2. Verifying they are who they say they are. (Is the AI Generator outsourced internally to another vendor?)
  3. Training and competency. How much programming time does the AI have? How was it trained?
  4. Ethics requirements: plagiarism, lying, stealing, and other employee handbook violations.
  5. Regulatory training. In many industries, people have to receive and know regulatory requirements like Sarbanes-Oxley or Export Controlled Requirements. It is unclear if these generators are trained on industry or general legal requirements.
  6. In programming, for instance, developers are required to know and practice security by design: for example, not writing code vulnerable to SQL injection or other OWASP Top 10 issues, and being aware of PI, PII, or PCI-type data so as not to use it inappropriately.
  7. Non-Disclosure, Non-Compete contracts to keep employees from selling company secrets.

The current AI Generator vendors are simply too immature in the technology to be leveraged appropriately by Fortune 1000-level companies or companies in highly regulated industries.

For this technology to be readily usable, there are a few things that need to happen:

  1. The AI used by one company needs to have its knowledge base firewalled or anonymized from other users, and that base of knowledge needs to be proprietary to the company using the AI. AIs have a high probability of generating the same answer for multiple clients; it's like sharing a key engineer between two competing companies.
  2. The issue of AI hallucination, or lying, needs to be resolved. The AI should say "I don't know" when it doesn't. I wonder if this behavior is induced by the creator's own ethics or personality.
  3. Copyright law / Patent law regarding the work product of the AI needs to be legally enforceable and defined.
  4. AI Generators need a way to remove knowledge if accidentally disclosed.
  5. AIs need to have some certification process against industry standards to allow companies to perform due diligence on them. For instance: this "AI" is trained not to write code that would violate security-by-design standards, and all of the code used to train this AI was legally obtained and not scraped from odd internet sources. This might even include some testing of the programming output itself: ask the AI to generate a series of code blocks and then test them for accuracy (a rough sketch of such a spot check appears after this list).
  6. AIs need to be able to attribute their sources in work products and maintain proper license control. If the AI utilizes a GPLv3-licensed library that requires attribution, then it should put the proper attribution in the code.
  7. Product liability for AIs needs to be worked out. If you are a consultant to a company you probably need the proper liability insurance.
  8. Logging and attribution. AI vendors need to be able to report on which employees did what with the AI and what they generated with it.
  9. A set of industry standards like CIS for using AI Generation.
  10. AI Vendor safeguards (ability to forget and purge, notification if PII is uploaded, notification or protection if secret keys or credentials are uploaded, etc.)
  11. Style and personality controls. Writing code or documents has a style to it. The AI should conform to corporate cultural norms, especially if generating baseline policy documents.
  12. AIs need to be more interactive and ask clarifying questions rather than just run with a request that is not fully understood. The same is expected of any programmer or member of the workforce.
  13. Industry protocols and norms need to be developed. This includes a review process.
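
On the certification point above (item 5), here is a rough sketch of what spot-checking a generator's output might look like. This is only an illustration; the names (check_generated_function, the slugify stand-in) are mine, not any vendor's API. The idea is simply that a returned function must pass a small table of known inputs and outputs before anyone may commit it.

def check_generated_function(fn, cases):
    # Run a candidate function against known (arguments, expected result) pairs.
    for args, expected in cases:
        try:
            result = fn(*args)
        except Exception as exc:
            print(f"FAIL {args!r}: raised {exc!r}")
            return False
        if result != expected:
            print(f"FAIL {args!r}: got {result!r}, expected {expected!r}")
            return False
    return True

# Stand-in for code a generator might return when asked for a slugify() helper.
def slugify(title):
    return "-".join(title.lower().split())

cases = [
    (("Hello World",), "hello-world"),
    (("  AI  Code  ",), "ai-code"),
]
print("accepted" if check_generated_function(slugify, cases) else "rejected")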

In Summary:

Here is my list of reasons why AI Generators should not currently be used in the enterprise (other than for researching how they work). There is also a general risk of employees using "AI" to improve their work product against company policy; this is a very high-risk aspect.

  1. AI Generators are "shared" employees with no non-disclosure or non-compete agreement. They will likely provide the same (cut and paste) answer to anyone that asks. If you place your proprietary code in for optimization, your competitor might get it out when they ask.
  2. AI Generators are frequently trained on data sets that don’t belong to the vendor training them.

Getty Images sues AI art generator Stable Diffusion in the US for copyright infringement

3. AI Generators can't be validated as "competent" or "trained"; this could cause an issue with certification under industry standards.

4. AI Generators can't provide source attribution or license validation.

5. Work generated by AI for less competent people could slip through without proper review.

6. AI Generators are biased in some way by their creator and may be biased against the organization's culture or value set. (It's not settable at this time.)

7. AI Generators lack sufficient security controls, customer segregation, and reporting.

8. There is a total lack of best practices in the industry.

9. AIs don't understand and are not programmed to comply with ethics or industry regulation.

Next Steps:

My thoughts:

  1. Ban the use of AI Generators in all aspects of corporate work products by policy, and implement security controls like DNS firewalls to remove access.
  2. Implement a well-written policy banning the unauthorized use of the technology.
  3. Implement technical controls to block the use of the technology (Google Workspace blocks on OAuth, DNS firewalls, MDM).
  4. Create a working group internally to create policies, protocols, and procedures for vetting the AI Generators and use them. Including a review process.
  5. Create security awareness training that clearly and simply informs the workforce of the risks.
  6. With proper due diligence, approve conditional web-based access for the working group or interested employees.
  7. Have your privacy, legal, or compliance team monitor state and federal regulations regarding this technology.
  8. Vet AI Vendors very carefully. I would not approve an AI Vendor unless they met at least the following conditions:
  • Certification (ISO, SOC2, NIST)
  • Could demonstrate data deletion (forgetting) of confidential data from the LLM and all datasets and backups.
  • Could provide detailed logs (automatically to my S3 bucket or SIEM) about which users did what.
  • Could provide proof of license attribution regarding source materials.
  • Generated attribution in the code or document.
  • Had clear copyright on generated materials.
  • Could verify that my "answers" are not going to wind up with my competitor somehow. (My own AI has my knowledge base.)
  • Had at least $1M in liability insurance.
  • Was trained on materials that were properly licensed for the purpose of training and can demonstrate this.
  • Had some control to prevent AI hallucinations.
  • Had a method of notification if a Zero-day exploit was propagated by the code generated.
  • Had some internal methods for checking the code provided against OWASP 10/20, industry best practices, normal SDLC checking of code, and was certified not to induce OWASP-based vulnerabilities.
  • Had clear terms of service that my data does not become their IP just because I uploaded it.
  • Prevented my “asks” from being seen by other companies’ employees.

Clear Risks

  1. Copyright and license infringement and liability.
  2. Competition receiving proprietary code.
  3. PII or other confidential data disclosed. (Imagine if restricted financial data were uploaded by the CFO and he asked the AI to analyze and report on them; the unreleased financials would then be available to anyone researching the company.) A sketch of a pre-upload screen for this kind of data appears after this list.
  4. Unauthorized disclosure or transfer of information in all forms.
  5. Regulatory litigation. (Violation of Automated Decision Making, GDPR, Sarbanes-Oxley)
  6. Litigation based on inaccurate or biased information in code or documents.
  7. Product Liability based on AI-generated work product.
  8. Zero-day exploits based on AI hallucinations. I read an article where the AI first hallucinated a code repository; researchers then created that library on GitHub and seeded it with tainted code to see whether malicious code could be injected, and the AI later pulled the repository into its body of knowledge.
  9. Seeing “queries” for research could give a competitor proprietary information.
  10. Over-dependence on the use of AI.
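
On risk 3 above, one partial mitigation a company could build today, without waiting on vendors, is a pre-upload screen that refuses to send a prompt containing obvious credentials or personal data to an outside generator. The patterns below are deliberately crude and purely illustrative (screen_prompt is a name I made up); a real control would use a proper DLP product.

import re

# Crude, illustrative patterns only; a real deployment would use a DLP tool.
BLOCK_PATTERNS = {
    "AWS access key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "Private key block": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "Possible SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "Email address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def screen_prompt(prompt):
    # Return the names of any patterns found; an empty list means the prompt may be sent.
    return [name for name, pattern in BLOCK_PATTERNS.items() if pattern.search(prompt)]

hits = screen_prompt("Please optimize this: key=AKIAABCDEFGHIJKLMNOP")
if hits:
    print("Blocked; prompt appears to contain:", ", ".join(hits))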

A thought that popped into my head on this: what about required legal reporting? What would happen if the AI were under a legal requirement to report criminal information? Back to the CFO uploading financial data to analyze: what if he lies to the SEC, and the AI that can read the report is required to report on the difference between what it knows (the real numbers) and the published SEC filing? Automated whistleblowing.

Research Examples

I performed some research for this article.

The https://codepal.ai/ site worked pretty well at generating some Python code and a Dockerfile. However, there were some key mistakes, and as I changed how I wrote the requirement, the answers changed and were not always correct.

In the Python example, I asked it to write a function that executes some OS commands and presents the results. It did not properly create the function; it was close enough for me to finish, but was by no means a final answer. The second piece of code I had it write was a function to recursively scan a directory, skipping hidden directories, and return the paths of directories with a matching name. The code was pretty close to final on that one.

Directory Function:

import os

def find_directories(directory_path, directory_name):
    """
    This function searches for all directories with a specific name in a given directory path.
    It does not include hidden directories.

    Parameters:
    directory_path (str): The path of the directory to search in
    directory_name (str): The name of the directory to search for

    Returns:
    list: A list of directory paths that match the specified directory name
    """
    try:
        # Check if the directory path exists
        if not os.path.exists(directory_path):
            raise ValueError("Directory path does not exist")

        # Initialize an empty list to store the matching directories
        matching_directories = []

        # Loop through all directories in the directory path
        for dirpath, dirnames, filenames in os.walk(directory_path):
            # Check if the directory is hidden
            if os.path.basename(dirpath).startswith('.'):
                continue

            # Check if the directory name matches the specified name
            if directory_name in dirnames:
                # Add the directory path to the list of matching directories
                matching_directories.append(os.path.join(dirpath, directory_name))

        # Return the list of matching directories
        return matching_directories
    except ValueError as e:
        # Log the error
        print(f"Error: {e}")
        return []
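
A minimal way to exercise the function (the path and directory name here are placeholders I chose, not part of the generated output):

# Hypothetical example call with made-up values.
matches = find_directories("/srv/projects", "build")
for path in matches:
    print(path)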

The Dockerfile omitted key lines, without which the docker build hung indefinitely:

ARG DATE_TIMEZONE=UTC
ARG DEBIAN_FRONTEND=noninteractive

It was also unable to create a proper start script for the service or run commands for the container.

Request:

runs ssd as a service. Copy a file named ubuntu.txt and install all of the packages in the file. Copy a file named graphics-kat.py, make it executable and put it in /usr/local/bin. Add /usr/local/bin to the PATH variable for the system.  Copy a filename requirements.txt and install all of the python libraries contained in the file. Mount three NFS shares named SOURCE, PROJECT, AND FINISHED and provide environment variables to pass the NFS Server mount points. Create a user named “graphics_kat”. Copy a file named authorized_keys to graphics_kat’s .ssh directory and set the correct permissions for login with an ssh key. Create the MAINTAINER TAG with my information. Supports CUDA and has cuda tools installed.  Create a start script that displays system logs and shows docker run instructions on startup. Install sudo and place the user graphics_kat in the sudo group. Create the start script dynamically in the docker file.

Results:

FROM ubuntu:latest

# MAINTAINER TAG
MAINTAINER <Your Name> <Your Email>

# Copy ubuntu.txt and install packages
COPY ubuntu.txt /tmp/
RUN apt-get update && apt-get install -y $(cat /tmp/ubuntu.txt)

# Copy graphics-kat.py, make it executable and put it in /usr/local/bin
COPY graphics-kat.py /usr/local/bin/
RUN chmod +x /usr/local/bin/graphics-kat.py

# Add /usr/local/bin to the PATH variable for the system
ENV PATH="/usr/local/bin:${PATH}"

# Copy requirements.txt and install python libraries
COPY requirements.txt /tmp/
RUN pip install -r /tmp/requirements.txt

# Mount three NFS shares
ENV SOURCE_MOUNT_POINT="<NFS Server mount point for SOURCE>"
ENV PROJECT_MOUNT_POINT="<NFS Server mount point for PROJECT>"
ENV FINISHED_MOUNT_POINT="<NFS Server mount point for FINISHED>"
RUN mount -t nfs $SOURCE_MOUNT_POINT /source
RUN mount -t nfs $PROJECT_MOUNT_POINT /project
RUN mount -t nfs $FINISHED_MOUNT_POINT /finished

# Create a user named “graphics_kat”
RUN useradd -m graphics_kat

# Copy a file named authorized_keys to graphics_kat’s .ssh directory and set the correct permissions for login with an ssh key
COPY authorized_keys /home/graphics_kat/.ssh/
RUN chown -R graphics_kat:graphics_kat /home/graphics_kat/.ssh/
RUN chmod 600 /home/graphics_kat/.ssh/authorized_keys

# Supports CUDA and has cuda tools installed
RUN apt-get install -y nvidia-cuda-toolkit

# Create a start script that displays system logs and shows docker run instructions on startup
RUN echo '#!/bin/bash \n\
echo "Displaying system logs..." \n\
echo "Docker run instructions..." \n\
exec "$@"' > /usr/local/bin/start.sh

# Install sudo and place the user graphics_kat in the sudo group
RUN apt-get install -y sudo
RUN usermod -aG sudo graphics_kat

# Set permissions for start.sh
RUN chmod +x /usr/local/bin/start.sh

# Set entrypoint
ENTRYPOINT ["/usr/local/bin/start.sh"]

# Set default command
CMD ["/bin/bash"]

I would have assumed it could have done a better job on the Ubuntu requirement, which I run across all the time. It should have been able to create a better and more complete run script and give me an example of how to run the container; that part is normal and common.

On the good side, it got the SSH permissions right, and the code is very readable and properly documented.

On the odd side, it knows who I am because I signed up and paid for access, yet when I asked it to add my information to the Dockerfile MAINTAINER tag, it was unable to do so.

I will test the CUDA integration later to see if that works.

Value Proposition

From my limited tests and use of various code generators, they can be very useful as a starting point to solve routine and standard problems that (we) programmers face every day.

For instance, I have created that directory function a number of times, slightly tweaked each time for what I was working on. Not having to store a Gist and modify it as a starting point is useful and speeds up development. Over the last few years I have found that I solve the same issues in each program I write, and I maintain a small repository of useful functions for cut and paste.

I think there is a potential for a lot of value if the vendors can improve on security, privacy, and regulatory issues.

In the course of my daily work, I frequently need a starting point for code or documents. I usually do a Google search for "free" templates or examples as a starting point, which is very time-consuming. Getting an understandable and working template in a minute would save me hours of research.

I think this technology could be huge in the area of code optimization and vulnerability analysis, especially if Microsoft could incorporate the good parts into GitHub Actions (Snyk does some of this now) to provide recommendations during the pull request, like "your code could be optimized if you did …" or "your code could be compromised by abnormal input into this …".
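
As a rough idea of what that pull-request feedback could look like, here is a hypothetical sketch of my own; it is not how Snyk or GitHub Actions actually work, and the pattern list is purely illustrative. A reviewer bot of this kind would scan the changed lines for known-risky constructs and leave a comment.

import re

# Hypothetical risky-pattern list; real tools use full static analysis, not regexes.
RISKY = [
    (re.compile(r"subprocess\.(run|call|Popen)\(.*shell=True"), "shell=True invites command injection"),
    (re.compile(r"execute\(.*(%s|\{)"), "SQL built by string formatting; use parameters"),
    (re.compile(r"verify\s*=\s*False"), "TLS certificate verification disabled"),
]

def review_diff(changed_lines):
    # Return a review comment for every risky pattern found in the changed lines.
    comments = []
    for lineno, line in enumerate(changed_lines, start=1):
        for pattern, message in RISKY:
            if pattern.search(line):
                comments.append(f"line {lineno}: {message}")
    return comments

for comment in review_diff(["requests.get(url, verify=False)"]):
    print(comment)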

Dangers on the Horizon

The risk of less competent people using AI Generators to produce work products under their own name is a strong one. I think this will lead to a requirement for better vetting of people who write work products (developers, lawyers, sales, marketing). Most of the time, people in this category don't know about, abide by, or check for copyrights, licenses, and other related issues.

AI Generators provide a great starting point, but they are not the final solution to any task, and all generated products need to be reviewed by a competent person in the relevant field.

--


Colin Bitterfield

NIST certified Security Professional | 10+ years experience in infrastructure security and compliance | Experienced in creating security programs.