How do AI Applications Work?

Advanced AI models can be used in clinical trial operations, while keeping data secure.

November 1st, 2025


For clinical trial operations, AI (specifically large language models) has great potential. Why? Clinical trials are complicated by design, full of data (much of it unstructured), and run on complex workflows. Further, they involve a great deal of human interpretation and judgment. AI excels at exactly these kinds of problems, but understanding how it all works is key to unlocking its use.

Pre-AI: The Start of SaaS

In November of 1999 (exactly 23 years before ChatGPT was released), Salesforce launched the first version of their platform. The feature set wasn't new - it was a CRM application - but the delivery was: Salesforce was exclusively cloud software. At the time, nearly all business software was purchased and installed on corporate servers run by teams of IT people who ensured uptime, security, and maintenance. Salesforce's "No Software" marketing campaigns spoke to the business frustrations of the day: namely, that it took too long to evaluate, install, and maintain complex business systems. Salesforce promised instant installation and automatic upgrades – stop worrying about running software and start using it to run your business. There was one major drawback: could you trust Salesforce with your data?

Not only did Salesforce pioneer cloud software, but they (along with the rest of the industry) developed the standards, procedures, and entirely new jargon needed to address data security concerns. Today, most business software is developed and used as SaaS - software as a service. ChatGPT and other AI applications are no exception, but they come with a different set of objectives.

How does ChatGPT work?

This post won't go deep on how large language models (LLMs) work (see this great ft.com article for background). ChatGPT – and others like Claude, Gemini, or Llama – are AI assistant applications. They're built on frontier models, the biggest and most advanced LLMs, which understand semantics and facts and are tuned to follow instructions. To be more user-friendly, they're developed with a personality, and to be responsible they have safeguards to avoid creating harmful content. The "large" in large language model refers to the model's size and, by extension, its nuanced understanding of language. To achieve this, models go through an extensive training process that involves reading vast amounts of data - much of it from the public internet.

Part of what makes an LLM good is the diversity of training data it sees. That means the model companies – OpenAI, Anthropic, Google, Meta, etc – have an incentive to vacuum up as much content as possible. Another key aspect of LLM quality comes from human review and feedback. Legions of people review model output and rate responses, and that data is fed back into the training cycle. Again, this post intentionally ignores many details, but for now, remember that models get good through 1) reading as much data as possible, and 2) human review and feedback.

Knowing those two things, you might start to understand the latest business concern around LLMs and AI assistants: if I use these systems, will my corporate data be used for training or feedback? We'll address these concerns below – for now, there is one more thing to understand about how AI applications work.

Applications vs Models

We described ChatGPT as an AI application above because it's more than just one AI model. The ChatGPT application orchestrates several models together into one app. For example, talking to the ChatGPT app uses speech-to-text models, and when it automatically generates titles for chat conversations it's probably using a small model like GPT-mini. Software vendors who create AI applications can do the same thing. If a developer makes a calorie-counting app, they'd probably start with a model that understands images or video to identify food, then cross-reference that with a database of foods and calorie counts.
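To make that concrete, here's a minimal sketch of how such an app might chain one model together with an ordinary database. The function names, model behavior, and calorie table below are hypothetical placeholders, not any real product's code.

```python
# Hypothetical sketch: a calorie-counting feature that chains one AI model
# with a plain database lookup. Nothing here is a real vendor's code.

CALORIE_DB = {"banana": 105, "apple": 95, "bagel": 245}  # ordinary data, no AI

def identify_food(photo: bytes) -> str:
    """Placeholder for a vision model call: image in, food label out."""
    return "banana"  # a real app would send the photo to a vision model here

def log_meal(photo: bytes) -> dict:
    """The app feature: model output cross-referenced with the database."""
    food = identify_food(photo)         # step 1: AI model identifies the food
    calories = CALORIE_DB.get(food, 0)  # step 2: plain lookup, no AI involved
    return {"food": food, "calories": calories}

print(log_meal(b"<photo bytes>"))  # {'food': 'banana', 'calories': 105}
```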

Think of AI models as standalone programs with very specific inputs and outputs:

Model | Type | Input | Output
GPT-5.1 | Text completion | "Mary had a little …" | "... lamb"
Qwen-VL-Plus | Vision model | Image of a cat | "A photo of a cat"
Phi-4-multimodal | Speech to text | Audio waveform | "Hello, can you hear me?"

Developers can include models in their applications in a way that's similar to other components like databases, third-party libraries, etc. Some models are small enough to fit on a phone, while others require a powerful PC with a GPU. But today, most of the highest-quality models – especially those best suited for complex reasoning – are massive and take multiple powerful servers to run.
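As an illustration of the "model as a component" idea, a small open model can be pulled into a program like any other third-party library. This sketch assumes the Hugging Face transformers package and a small model such as GPT-2; larger frontier models won't run this way on ordinary hardware.

```python
# Sketch: using a small language model as just another library component.
# Assumes `pip install transformers torch`; GPT-2 is small enough to run locally.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # downloads a small model

result = generator("Mary had a little", max_new_tokens=5)
print(result[0]["generated_text"])  # e.g. "Mary had a little lamb ..."
```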

OpenAI's GPT models, Anthropic's Claude, and others are such models. To meet business and enterprise needs, cloud providers like AWS, Microsoft Azure, or Google Cloud run these models on their own infrastructure and provide access in a way that is secure and private. This means developers can build the capabilities of these models into their applications without sharing customer information with the model maker.
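In practice this usually looks like the vendor's backend calling a model endpoint that lives inside its own cloud environment rather than calling the model maker directly. The endpoint URL, headers, and payload below are hypothetical; the point is simply where the request goes.

```python
# Hypothetical sketch: calling a frontier model hosted inside the vendor's own
# cloud environment. The URL and payload shape are illustrative only.
import requests

PRIVATE_MODEL_ENDPOINT = "https://models.internal.example-vendor.com/v1/generate"

def summarize(document: str) -> str:
    """Send text to a privately hosted model; the request stays in the vendor's environment."""
    response = requests.post(
        PRIVATE_MODEL_ENDPOINT,
        json={"prompt": f"Summarize this document:\n{document}"},
        headers={"Authorization": "Bearer <internal service token>"},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["text"]  # assumed response shape
```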

To sum up, software vendors can create AI applications using SaaS best practices for security and privacy. AI can be viewed as a new set of software components and those models can be run as part of the vendors' architecture. This means you get all the benefits of the models without the risk of data leaking to the model maker.

Questions for Vendors

Of course, that's not the whole picture. It's fair game to ask software vendors the same questions you'd ask the model makers or other SaaS applications: where does my data go? Do you use it for training? What control do I have over this?

Even if a software vendor uses LLMs privately as part of their architecture, they can still build features similar to those offered by the AI vendors themselves. For example, you've seen the thumbs-up and thumbs-down icons on chat applications – these collect human feedback that helps rate the quality of responses. Application developers can use this feedback later to improve prompts or to refine how they use an LLM.
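As a sketch of what capturing that feedback might look like on the application side, the snippet below stores thumbs-up/thumbs-down ratings in the application's own database. The schema and table names are hypothetical.

```python
# Illustrative sketch: capturing thumbs-up / thumbs-down feedback in the
# application's own database. Schema and table names are hypothetical.
import sqlite3

conn = sqlite3.connect("feedback.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS response_feedback (
           customer_id TEXT,
           response_id TEXT,
           rating INTEGER  -- +1 thumbs-up, -1 thumbs-down
       )"""
)

def record_feedback(customer_id: str, response_id: str, thumbs_up: bool) -> None:
    """Store the rating for later review when tuning prompts."""
    conn.execute(
        "INSERT INTO response_feedback VALUES (?, ?, ?)",
        (customer_id, response_id, 1 if thumbs_up else -1),
    )
    conn.commit()

record_feedback("customer-a", "resp-123", thumbs_up=True)
```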

It's fair to ask where this data goes and how it helps refine or improve an AI application. For example, if a user rates a chat response, does that rating improve a model that benefits all customers, or just one company? Is there any chance that the text of one customer's conversations refines the model in a way that benefits other customers, or, more concerning, is there any way for one customer to glean the conversations of another?

The answers to these questions depend on the software architecture and, to a large extent, the use case of the application.

HumanTrue's Answers

To wrap this post up, we'll give our answers to the questions we asked above.

Where does my data go? All data uploaded to HumanTrue is encrypted and stored securely in our cloud environment. No data leaves our environment for the purpose of analysis (e.g., to AI model makers).

Do you train AI models on customer data? No. HumanTrue uses off-the-shelf AI frontier models, run privately within our own architecture. Customer data is not used for training, model distillation, or fine-tuning.

What control do I have over my data? You have full control – your data is your data, and you can remove it at any time.

The Future of Software

All software will evolve to include AI. We're committed to building AI-native applications for clinical trial operations, all while keeping customer data secure and private, and never using it for training.

Want to learn more?

We'd love to show you how to get started using AI for clinical operations.

Request a Demo