AI and ML Security Risks That Data Scientists Rarely Think About

By Boston Institute of Analytics | July 1, 2026 | July 21, 2026

A few months ago I was doing a penetration test for a mid-sized company that had recently deployed a machine learning system for customer churn prediction. The data science team was proud of it, and honestly, they should have been. The model was well built, the training pipeline was clean, and the accuracy metrics were solid. These are exactly the kinds of real-world workflows often discussed in a data science course, where building reliable models is only one part of the process.

But within about forty minutes of starting the external assessment, I was reading their model’s training logs. Within an hour, I had found credentials for the cloud storage bucket where raw customer data was sitting.

Nobody had done anything wrong with the model. The security problem was everywhere around it.

This is the conversation that doesn’t happen enough in data science and ML engineering. Everyone learns how to build the system. Very few people spend time thinking about how someone would attack it and what they’d find when they do. That’s why a modern data science course should also introduce learners to security, data protection, and risk management alongside machine learning concepts.

The Assumption That Gets Teams Into Trouble

There’s a mental model I see constantly in organisations building AI and ML systems: the model is the product, and everything else is just infrastructure. The security team handles the infrastructure. The data team handles the model. Nobody handles the gap between the two.

That gap is where the problems are.

When I approach an ML system during a security assessment, I’m not trying to attack the model directly — at least not at first. I’m looking at it the way I’d look at any web application: what’s exposed, what accepts input, what stores sensitive data, and what happens when I send something unexpected.

The answer is almost always the same. Jupyter notebook servers sitting open on port 8888 with no authentication token. FastAPI inference endpoints deployed to production without rate limiting or input validation. MLflow tracking servers accessible from outside the network perimeter, full of experiment logs that include environment variables — which sometimes include API keys.

None of this requires sophisticated attacks. The first step in any assessment is service enumeration: finding out what’s actually running and reachable. Understanding which TCP and UDP ports expose which services, and what those services accept, is the foundation of that work. And in ML infrastructure, you consistently find things running on non-standard ports that were never supposed to face the internet, run by people who had no idea they were reachable from outside.
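As a rough illustration of what the most basic form of that enumeration looks like, here is a minimal sketch using only the Python standard library. Port 8888 for Jupyter comes from the text above; 5000 for MLflow and 8000 for a uvicorn-served FastAPI app are the usual defaults but are assumptions about any particular deployment, and the function names are made up for this example. Only point it at hosts you are authorised to test.

```python
import socket

# 8888 is the Jupyter port named above. 5000 (MLflow) and 8000 (uvicorn, often
# used to serve FastAPI) are common defaults and are assumptions here.
DEFAULT_PORTS = {8888: "Jupyter", 5000: "MLflow tracking", 8000: "FastAPI/uvicorn"}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def reachable_services(host: str, ports: dict[int, str] = DEFAULT_PORTS) -> dict[int, str]:
    """Map each reachable port to the service one would expect to find on it."""
    return {port: name for port, name in ports.items() if is_port_open(host, port)}
```

Calling `reachable_services("203.0.113.10")` (a documentation-range address used as a placeholder) from outside the network gives a quick first answer to "what can the internet see?". A real assessment would use a proper scanner with service fingerprinting, since an open port alone does not confirm what is listening on it.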

What Penetration Testers Actually Do to ML Systems

Let me walk through what a typical assessment of an ML deployment looks like in practice.

The first stage is reconnaissance. Before touching anything, you want to understand what’s there. For a company with a cloud-deployed ML system, that means looking at what subdomains are exposed, what ports are open on their instances, and what services are responding. The tools used in this phase — network scanners, service fingerprinting, API discovery — are the same toolkit for any penetration test, applied to ML infrastructure the same way they’d be applied to any other system. What you find at this stage usually tells you most of what you need to know.

The second stage is testing the APIs. ML systems serve predictions through HTTP APIs that accept input, process it, and return output, so they are subject to the same classes of vulnerability as any other web API: injection in query parameters; broken object-level authorisation, where modifying a user ID in a request retrieves someone else’s model history or stored features; and server-side request forgery through endpoints that accept URLs for data ingestion. These techniques often work more effectively against ML APIs, because those APIs tend to be built by people whose primary focus was getting the predictions right, not preventing misuse.
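To make the object-level authorisation problem concrete, here is a small sketch of the check that is usually missing. Everything in it is hypothetical: the `Caller` type, the in-memory `PREDICTION_HISTORY` store, and the function name stand in for whatever session handling and feature store a real system uses. The point is only that the identity comes from the verified session, and the ID in the request is compared against it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    """Identity taken from the verified session or token, never from the request body."""
    user_id: str
    is_admin: bool = False


# Hypothetical in-memory store standing in for a feature store or prediction log.
PREDICTION_HISTORY = {
    "user-1": [0.12, 0.40],
    "user-2": [0.91],
}


class Forbidden(Exception):
    """Raised when a caller asks for a record they do not own."""


def get_prediction_history(caller: Caller, requested_user_id: str) -> list[float]:
    """Return stored predictions only if the caller owns them or is an admin."""
    if caller.user_id != requested_user_id and not caller.is_admin:
        # Same error whether or not the record exists, so IDs cannot be enumerated.
        raise Forbidden("not authorised for this record")
    return PREDICTION_HISTORY.get(requested_user_id, [])
```

With this in place, `get_prediction_history(Caller("user-1"), "user-2")` raises `Forbidden` instead of returning another user’s data, which is exactly the behaviour a tester probes for when changing the ID in a request.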

The third stage, and this is where things get specifically interesting for AI systems, is looking at what the system does with its data. Where does training data live? Who can access it? Are the pipelines that pull data into training jobs protected? Can you influence what the model learns by introducing data through a writable endpoint? Can you extract information about training data by querying the model with carefully chosen inputs? These are questions that rarely come up during model development. They come up during security assessments.

The Part About Credentials

Going back to that churn prediction system I mentioned at the start — the credentials I found weren’t in an obvious place. They were in an MLflow experiment log. A data scientist had run a training job that pulled data from S3 using credentials stored in an environment variable. The MLflow tracking server logged the full environment of every experiment run. Nobody had thought to filter sensitive keys out of those logs, because the tracking server wasn’t supposed to be accessible from outside the network.
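One low-effort mitigation is to mask anything that looks like a secret before an environment snapshot is ever logged. The sketch below assumes the logging step passes through a helper you control. The name pattern and the `redact_env` function are illustrative choices for this example, not part of MLflow or any other tracking tool.

```python
import re

# Assumed naming convention: a variable whose name contains one of these words
# is treated as a secret. Adjust the list to match your own environment.
SENSITIVE_NAME = re.compile(r"(KEY|SECRET|TOKEN|PASSWORD|PASSWD|CREDENTIAL)", re.IGNORECASE)


def redact_env(env: dict[str, str], mask: str = "***REDACTED***") -> dict[str, str]:
    """Return a copy of env that is safer to log: sensitive values are replaced by mask."""
    return {
        name: (mask if SENSITIVE_NAME.search(name) else value)
        for name, value in env.items()
    }
```

A typical call would be `redact_env(dict(os.environ))` just before the snapshot is written. Name-based filtering misses secrets stored under unremarkable names, so logging only an explicit allowlist of variables is the stricter option.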

It was accessible from outside the network.

This pattern — credentials ending up in places that weren’t supposed to be reachable, but are — is one of the most consistent findings in ML security assessments. CI/CD pipelines for model training frequently have access to cloud credentials, package registry tokens, and storage permissions that are far broader than necessary. When those pipelines use third-party packages without pinning to specific versions, the attack surface extends to every dependency in the build chain.

The May 2026 TanStack supply chain attack illustrated this at scale. Malware injected into a widely used JavaScript library spread through legitimate CI/CD pipelines to numerous organisations that consumed the compromised packages through automated build workflows — specifically because those pipelines had access to credentials that made them worth targeting. ML infrastructure is not exempt from this class of threat. It’s frequently more exposed to it, because the tooling is newer and the security patterns around it are less established.

Why This Is Specifically a Data Science Problem

I want to be direct about something: this isn’t a criticism of data scientists or ML engineers. Building production AI systems is genuinely hard, and most teams are moving fast with limited resources. Security is often the first thing that gets deferred.

But there’s a knowledge gap worth addressing. The people building these systems often don’t have a background in application security. The concepts — input validation, authentication patterns, least-privilege access, secrets management — aren’t typically part of a data science or ML engineering curriculum. And the security teams that might catch these issues often don’t have enough context about how ML infrastructure works to know what questions to ask.

That gap is where vulnerabilities accumulate. An inference endpoint with no authentication because “it’s internal.” A training job with admin-level cloud credentials because “it’s easier.” Jupyter notebooks accessible to the team and, inadvertently, to anyone with network access.
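Even a simple API-key check on an “internal” inference endpoint closes the first of those gaps. What follows is a minimal sketch under stated assumptions: keys are issued out of band, only their SHA-256 digests are stored, and the function names are invented for this example. It is not a substitute for a proper identity provider, just an illustration of rejecting by default.

```python
import hashlib
import hmac
from typing import Optional


def hash_key(api_key: str) -> bytes:
    """Store only a SHA-256 digest of each issued key, not the key itself."""
    return hashlib.sha256(api_key.encode("utf-8")).digest()


def is_authorised(presented_key: Optional[str], valid_key_hashes: set[bytes]) -> bool:
    """Return True only if the presented key matches one of the stored digests."""
    if not presented_key:
        return False  # a missing or empty key is a rejection, not a default allow
    presented = hash_key(presented_key)
    # compare_digest takes the same time wherever the bytes differ, so response
    # timing does not reveal how much of a guess was correct.
    return any(hmac.compare_digest(presented, known) for known in valid_key_hashes)
```

In a web framework this would sit in a dependency or middleware that reads the key from a request header and returns a 401 when `is_authorised` is false.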

Regulatory pressure is starting to close this gap from the outside. Article 15 of the EU AI Act requires high-risk AI systems to achieve an appropriate level of accuracy, robustness, and cybersecurity, with the high-risk system mandates due to take effect on August 2, 2026 under the current implementation timeline. These aren’t recommendations. They’re enforceable obligations covering risk management, data governance, logging, and cybersecurity across an AI system, including the APIs it exposes. Non-compliance for organisations operating in or serving the EU market carries significant penalties. At some point, “we didn’t think about security during model development” stops being an acceptable explanation.

What a Practical Security Baseline Looks Like

You don’t need a full penetration test to start taking this more seriously. There’s a basic self-assessment that any team deploying an ML system should be able to run.

Map what you’re actually exposing. What ports are open on your cloud instances? What services are running on them? Can they be reached from outside your network? In cloud environments where infrastructure spins up quickly, the answer is often different from what teams expect.

Test your APIs the way an attacker would. Send unexpected input types. Send oversized payloads. Try modifying user identifiers in requests. See what error messages come back, and whether they reveal internal system details that help map the architecture. Most of this requires about an hour and a working knowledge of how web APIs function, not specialist tooling. If you want to approach it systematically, the same toolkit used in any penetration test covers the basics.
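It helps to write down what the endpoint should do with those inputs before sending them. The sketch below is one hypothetical way to validate a prediction request and a short list of the probes described above. The payload shape (`{"features": [...]}`), the limits, and the names are all assumptions for illustration.

```python
MAX_FEATURES = 64      # hypothetical limit for this sketch
MAX_ABS_VALUE = 1e6    # hypothetical bound on any single feature


def validate_features(payload: object) -> list[float]:
    """Return the feature vector from payload, or raise ValueError with a generic message."""
    if not isinstance(payload, dict) or "features" not in payload:
        raise ValueError("invalid request")
    features = payload["features"]
    if not isinstance(features, list) or not 0 < len(features) <= MAX_FEATURES:
        raise ValueError("invalid request")
    clean = []
    for value in features:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("invalid request")
        # Rejects NaN (the comparison is False) as well as infinities and huge values.
        if not abs(value) <= MAX_ABS_VALUE:
            raise ValueError("invalid request")
        clean.append(float(value))
    return clean


# The kinds of probes described above, expressed as test inputs.
PROBES = [
    None,
    "a string",
    {"features": "1,2,3"},
    {"features": []},
    {"features": [1, "2", 3]},
    {"features": [True, 2.0]},
    {"features": [float("nan")]},
    {"features": [0.0] * (MAX_FEATURES + 1)},
]
```

Every entry in `PROBES` should be rejected with the same generic message. If the real endpoint instead returns a stack trace or a framework-specific error for any of them, that is the information leak worth fixing.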

Audit where credentials live. Are API keys or cloud credentials stored in environment variables that get logged? Are they in configuration files that might end up in version control? Do your CI/CD jobs have permissions broader than necessary for the specific tasks they run?
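A first pass at that audit can be automated. The following sketch scans text (a config file, a log, a notebook export) for a few well-known secret shapes. The patterns are illustrative rather than exhaustive: the AWS access key ID format is widely documented, while the generic assignment pattern is an assumption about how secrets tend to be named. Dedicated secret scanners cover far more cases.

```python
import re

# Illustrative patterns only. A miss here does not mean a file is clean.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "generic_assignment": re.compile(
        r"(?:api[_-]?key|secret|token|password)\w*\s*[:=]\s*['\"]?[^\s'\"]{8,}",
        re.IGNORECASE,
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits
```

Running `find_secrets(Path("config.yaml").read_text())` over tracked files and recent logs (the file name is a placeholder) gives a list of lines to review by hand. Expect some false positives, and treat any true hit as a credential to rotate rather than just delete.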

Pin your dependencies. This applies to model training code, pipeline orchestration, and any tooling in the ML stack. A package installed six months ago and not thought about since might have had a vulnerable version published in the meantime. Supply chain attacks targeting ML tooling are an active threat, not a theoretical one.
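Checking for unpinned entries is easy to script. This is a minimal sketch that assumes a pip-style `requirements.txt`. The function name is made up for the example, and the parsing is deliberately simple, so it does not handle every form the requirements format allows (URL and path requirements, for instance, are reported as unpinned).

```python
def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version with '=='."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()     # drop comments and whitespace
        if not line or line.startswith("-"):         # skip blanks and pip options
            continue
        requirement = line.split(";", 1)[0].strip()  # ignore environment markers
        # 'pkg==1.*' still floats across releases, so treat wildcards as unpinned.
        if "==" not in requirement or "*" in requirement:
            unpinned.append(requirement)
    return unpinned
```

Run in CI, a non-empty result can fail the build. Exact pins stop a new release from being picked up silently. Pip’s hash-checking mode (`--require-hashes`) goes a step further by also verifying that the downloaded file is the one you originally vetted.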

Review who can reach your infrastructure. Who can access your MLflow server? Your model registry? Your feature store? If the answer is “anyone with network access to the instance,” that’s worth revisiting before someone else revisits it for you.

The Supply Chain Dimension

Most ML systems depend on a stack of open source libraries: PyTorch, scikit-learn, Hugging Face Transformers, ONNX, and various data processing frameworks. Those libraries have maintainers, release pipelines, and dependency chains of their own. When any part of that chain is compromised, every downstream project that installs an affected version can end up running malicious code during build or installation, without any warning and often with access to every secret and credential the CI/CD environment holds. Following software supply chain attack news is a worthwhile habit for ML teams, since detailed writeups of incidents often surface before the affected packages are fixed.

This isn’t hypothetical. Supply chain attacks targeting developer tooling have been escalating consistently. The pattern is always the same: find a widely used package, compromise its release pipeline, push a malicious version, and wait for the installs to come in. For ML teams, the risk is compounded by the fact that training pipelines often run with broad cloud permissions — because someone needed to access a storage bucket or a model registry quickly, and nobody went back to tighten the scope afterward.

Pinning dependencies to specific versions and auditing your pipeline’s permission scope aren’t glamorous tasks. But they’re the difference between being unaffected when the next supply chain incident hits and spending a week rotating credentials and auditing what was accessed.
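Auditing permission scope can start with something as simple as flagging wildcards. The sketch below assumes the pipeline’s permissions are expressed as an AWS IAM policy document in JSON, which is an assumption based on the S3 example earlier. Other clouds use different formats, and the function names are invented for this example.

```python
import json


def _as_list(value):
    """IAM allows a single value or a list for Statement, Action, and Resource."""
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy_json: str) -> list[dict]:
    """Return Allow statements whose Action contains a wildcard or whose Resource is '*'."""
    policy = json.loads(policy_json)
    findings = []
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        if any("*" in action for action in actions) or any(r == "*" for r in resources):
            findings.append(statement)
    return findings
```

A statement such as `{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}` is flagged, while read access scoped to a single bucket prefix is not. The check is intentionally crude: it is a prompt for a human review of what the training job actually needs, not a complete policy analysis.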

The Honest Assessment

The same patterns that show up in traditional software development show up in ML infrastructure — just faster, and with higher stakes, because the data involved is more sensitive and the systems are making more consequential decisions.

The underlying issue isn’t technical. Most vulnerabilities in ML systems aren’t novel or sophisticated. They’re standard web application vulnerabilities applied to a newer target. The issue is that the people building these systems often don’t know they need to be thinking about them, and the people responsible for security often don’t know enough about ML infrastructure to find them.

That gap is narrowing. The attacks are getting more targeted. The organisations that end up in incident reports are consistently the ones that deferred these conversations until after something went wrong.

Having them earlier is considerably cheaper.
