The ongoing development of generative artificial intelligence (AI) text programs such as ChatGPT-4 is being compared to the emergence of the internet, and the massive investments from tech companies indicate these programs will continue to grow and transform society. As these programs become integrated into our daily lives, it’s crucial for security teams to learn how to use them. However, it’s equally important to remain cautious about the risks they pose.
AI’s Storied History
The field of AI is full of common terms that are often misunderstood. A brief review of its history gives a clearer understanding of the implications and consequences of its use.
Though the term artificial intelligence was coined in the mid-1950s, it was not until breakthroughs in neural networks around 2012 that the field was propelled forward. A neural network is a mathematical system that learns skills by finding statistical patterns in enormous amounts of data, for example by reviewing thousands of cat photos to learn to recognize cats.
In 2018, Google, Microsoft, and OpenAI used neural networks trained on vast amounts of text from the internet to create large language models (LLMs). LLMs learn the statistical patterns and structures of language to perform various language tasks such as machine translation, summarization, and question-answering.
Generative AI programs, such as ChatGPT and Bard for text and DALL-E and Midjourney for images, are built upon LLMs and related neural network models and enhanced by reinforcement learning techniques. The result is systems that identify patterns in large quantities of data to create new, original material such as computer code, conversation, and art.
Benefits to Security Professionals
Given the ability of LLMs to generate high-quality natural language text, security companies are understandably eager to leverage their capabilities to find information, synthesize research, analyze large amounts of data, assess risk, and provide mitigation recommendations. The speed at which ChatGPT and other LLMs perform these tasks makes their use even more appealing.
- Reduced Workload: Analysts can use the programs to edit text, summarize or create notes from large amounts of open-source information, and filter information using specific phrases and words.
- Cognitive Load Reduction: The programs can reduce the cognitive load of analysts managing risky or sensitive information by putting a barrier between the original source and the analysts.
- Training and Analysis Development: The programs can apply structured analytic techniques to create threat scenarios and templates for tabletop exercises and other structured training.
Risks for All
Despite their benefits, generative AI programs pose significant risks and challenges for organizations and individuals in a variety of industries. The common thread is that AI systems remain unproven. A lack of human oversight, together with ethical concerns, compounds the risks, as companies often lack the time and financial incentives to enact sophisticated evaluation methods and robust training programs to help team members understand AI.
- Data Privacy and Security: LLMs utilize large amounts of data to refine their models and improve responses. This creates the risk of security breaches and unauthorized access to information if employees enter sensitive data or client information into LLMs. This could also lead to unauthorized access to proprietary information, reputational or legal damage to the client or company, and, in a worst-case scenario, new threats directed at the client. ChatGPT is a cloud-based service, and user inputs and outputs may be temporarily stored on OpenAI’s servers. In short, users do not have full control over the data they enter into ChatGPT.
- Outdated Knowledge: Though they are trained on massive datasets sourced from the internet, most programs do not have access to the most up-to-date information. This may lead to answers which are sourced from outdated knowledge.
- Hallucinations: The programs are designed to present information in a confident manner whether or not it is factually correct, and regardless of any biases in their training datasets. This can lead to plausible but false content, errors, and responses that lack grounding and factuality, often called “hallucinations.”
- Data Poisoning: Threat actors can introduce biased or inaccurate data to intentionally manipulate LLMs. This type of attack, known as “data poisoning,” can indefinitely influence the outputs and behavior of LLMs. For example, malicious actors can buy domains and fill them with text and images, or add sentences to Wikipedia, to alter the entries within the model’s data set.
- Confirmation & Misinformation: Users have no avenue within the program to identify if the information they receive is truthful. This can lead to the spread of misinformation as untrained users rely on the responses without any attempt to fact-check the results.
- Overconfidence in Reasoning: Though the term artificial intelligence often leads users to believe the programs can reason, this is currently not the case. They are trained to learn patterns and structures, which gives the illusion of analysis and can lead to overconfidence in the program. AI tools cannot yet communicate, plan, evaluate, or think deliberatively, the human abilities that allow for critical thinking. As such, AI tools are incapable of performing the sophisticated analysis most valuable to security teams.
Looking Ahead
A number of unresolved questions remain surrounding the development of AI and LLMs. How society attempts to answer these questions will dramatically shape the evolution and use of these emerging technologies.
- AI is Decentralized: Observers have rightly pointed out that AI is not monolithic. It is a mix of different technologies produced by a variety of actors in academic, private sector, and entrepreneurial settings. Who created the AI tool? Will its original purpose align with subsequent uses? How will the data be stored, shared, or sold? Who stands to profit?
- Regulation and Governance: A wave of legislation at the state level is already attempting to regulate AI and other automated systems. The Biden Administration released a Blueprint for an AI Bill of Rights aimed at protecting citizens against threats posed by emerging technologies. However, government responses to complex problems are often slow, uneven, and at times contradictory. Who decides how AI will be regulated? Will technological advancement continue to outpace regulation?
- Expected Improvements: How powerful will AI systems become? And how quickly? In the near term, how fast will the list of companies creating plugins for ChatGPT expand? In the medium term, what jobs are AI systems most likely to replace? In the long term, will researchers succeed in creating artificial general intelligence (AGI) – a machine capable of performing all of the functions of a human brain?
Recommendations – A Cautious Approach Forward
Though the risks are not insignificant, the integration of generative AI programs into our daily lives is inevitable. A measured approach can help balance the need for caution while allowing us to gain access to their significant benefits. Many of the following recommendations were formulated with security companies in mind, but are also applicable to a wide variety of industries.
- Goals. Teams should set monthly goals for the use and learning of generative AI programs. This experimentation will help them to understand the programs’ strengths and weaknesses for their use-cases and allow them to keep up with the rapidly developing technology.
- Honesty. Companies should prioritize honesty when utilizing generative AI. While vendors may not have issues with its use for simple documents or email replies, they may not approve of its use for confidential products or sensitive emails. Open discussions between stakeholders, partners, and vendors can help set expectations and prevent mistakes before they occur.
- Facts. Any information provided by the programs should be fact checked.
- Standards. Companies should develop Standard Operating Procedures for the use of generative AI programs. This will help to ensure the technology is used responsibly and ethically, with appropriate safeguards in place to mitigate the risks of bias, error, and malicious misuse.
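One safeguard such a procedure might include is scrubbing obviously sensitive details from text before it is pasted into a generative AI program. The Python sketch below is only an illustration of that idea under simple assumptions: the `PATTERNS` table, the `redact` function, and the sample prompt are hypothetical and not part of any product. The two patterns (email addresses and US-style phone numbers) are far from a complete definition of sensitive data.

```python
import re

# Illustrative patterns only (assumption): a real procedure would define
# what counts as sensitive for the company and its clients.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact(text):
    """Replace each pattern match with a placeholder such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Hypothetical prompt, invented for this example.
prompt = "Summarize the threat sent to jane.doe@example.com, callback 555-010-1234."
safe_prompt = redact(prompt)
print(safe_prompt)
```

Pattern matching of this kind catches only what it has been told to look for. Names, addresses, and client-specific details would pass through untouched, so a step like this can supplement, but not replace, human review of what is entered.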
Authors: Jonathan Barsness, Concentric’s Senior Intelligence Analyst/Training and Evolution Manager, and Timothy Davis, Intelligence Analyst