LLMOps

AI Medical Assistance - HuggingFace, FAISS, Langchain, PyPDF, Groq API, Flask, Docker, Aqua Trivy, AWS

Author:

Kimchann Chon

Date:

Aug 28, 2025

This project introduces an artificial intelligence system designed to provide medical information. It is a chatbot that can answer health-related questions. The goal is to offer a tool that delivers clear and evidence-based responses. The system is built for clarity and reliability, not for providing diagnoses or replacing doctors.

The core of this project is a large language model. This type of AI is trained on a massive amount of text data. It learns patterns in language and can generate human-like text. However, these models have a significant limitation. Their knowledge is fixed at the point of their last training update. They can also sometimes produce incorrect or fabricated information. This is a major concern for any medical application, where accuracy is critical.

To solve this problem, the system uses a specific knowledge source. All its answers are based on the Gale Encyclopedia of Medicine, Second Edition. This is a respected reference work written by medical professionals. The encyclopedia provides a strong foundation of verified information. The AI is not asked to rely on its internal, possibly outdated or generalized knowledge. Instead, it is directed to use only the content from this trusted book.

The process begins with preparing this knowledge. The encyclopedia's PDF is loaded with PyPDF, and its text is converted into a format the AI can understand. This is done using an embedding model from Hugging Face. This model turns sentences and paragraphs into numerical representations, called vectors. These vectors capture the semantic meaning of the text. Similar ideas and terms are placed close together in a mathematical space. This collection of vectors is then stored in a local vector index built with FAISS. This index allows for very fast searching and retrieval of the most relevant text passages.
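A minimal ingestion sketch of this step is shown below. The PDF path, chunk sizes, and the all-MiniLM-L6-v2 embedding model are assumptions for illustration; the actual project may use different values.

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# Load the encyclopedia PDF with PyPDF and split it into overlapping chunks
# so each vector covers a small, coherent passage.
documents = PyPDFLoader("data/gale_encyclopedia.pdf").load()  # hypothetical path
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(documents)

# Embed each chunk with a Hugging Face sentence-transformer model.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Index the vectors in FAISS and persist the index locally so the chatbot
# can load it at startup instead of re-embedding the whole book.
vectorstore = FAISS.from_documents(chunks, embeddings)
vectorstore.save_local("vectorstore/db_faiss")
```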

When a user asks a question, the system does not immediately send it to the large language model. First, it searches the FAISS database. It takes the user's query and finds the most relevant excerpts from the medical encyclopedia. These excerpts are then passed to the large language model, along with the original question. The AI is given a specific instruction. It is told to answer the question using only the provided text from the encyclopedia. It is instructed not to use its own knowledge. This method ensures that every response is grounded in a reputable source. It greatly reduces the chance of the model inventing facts.
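The retrieval step can be sketched as follows, reusing the FAISS index built above. The prompt wording is illustrative, not the project's exact template.

```python
# Find the encyclopedia passages most similar to the user's question.
question = "What are the symptoms of anemia?"
retrieved = vectorstore.similarity_search(question, k=3)
context = "\n\n".join(doc.page_content for doc in retrieved)

# The model is instructed to answer only from the retrieved excerpts,
# not from its own internal knowledge.
prompt = (
    "Use only the following excerpts from the Gale Encyclopedia of Medicine "
    "to answer the question. If the answer is not in the excerpts, say you do not know.\n\n"
    f"Excerpts:\n{context}\n\nQuestion: {question}"
)
```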

The Langchain framework is used to manage this entire process. It connects the different components, handling the retrieval of data and the interaction with the language model. It provides the structure that allows the system to work as intended. For the language model itself, the project uses the Groq API. This API provides extremely fast processing speeds. This speed is important for creating a responsive chatbot that feels natural to use.
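A sketch of how Langchain can wire the retriever and the Groq-hosted model together is below. The model name, chain type, and retrieval depth are assumptions, and the Groq API key is expected in the GROQ_API_KEY environment variable.

```python
from langchain_groq import ChatGroq
from langchain.chains import RetrievalQA

# Groq-hosted LLM; the model name here is an assumption.
llm = ChatGroq(model_name="llama3-70b-8192", temperature=0)

# The chain retrieves the top passages from FAISS, stuffs them into the prompt,
# and asks the model to answer from that context only.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
    return_source_documents=True,  # keep the excerpts so answers can cite their source
)

result = qa_chain.invoke({"query": "What are the symptoms of anemia?"})
print(result["result"])
```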

The application is built as a web service. The backend is developed with Flask, a Python framework for building APIs. This backend handles user requests, runs the retrieval and question-answering process, and sends the answers back. The entire application is then packaged into a Docker container. Containerization ensures that the software runs consistently in any environment, from a developer's laptop to a cloud server.
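A minimal Flask backend might expose the chain like this; the route name and JSON shape are assumptions for illustration, and the chain is the one sketched above.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/ask", methods=["POST"])
def ask():
    # Expect a JSON body like {"question": "..."}.
    question = request.get_json(force=True).get("question", "")
    result = qa_chain.invoke({"query": question})
    return jsonify({
        "answer": result["result"],
        "sources": [doc.metadata for doc in result["source_documents"]],
    })

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```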

Security is an important consideration. Before deployment, the Docker image is scanned for known vulnerabilities using Aqua Trivy. This tool checks the software dependencies for security issues. This step helps to identify and fix potential problems early. The process of building, testing, and deploying the application is automated using Jenkins. This CI/CD pipeline ensures that updates can be delivered reliably and efficiently.

Finally, the application is deployed on AWS App Runner. This service automatically runs the containerized application. It manages scaling, load balancing, and other infrastructure concerns. This allows the project to be available on the internet without the complexity of manual server management.

The outcome is a specialized tool for medical information. A user can ask a question about a symptom, condition, or treatment. The system retrieves the most relevant information from the Gale Encyclopedia of Medicine. It then formulates a concise and direct answer based on that content. The response includes a reference to the source material. This approach provides users with a clear path to verify the information. It is made clear that the tool is for informational purposes only. It explicitly states that it is not a substitute for professional medical advice, diagnosis, or treatment. Users are always directed to consult a healthcare provider for any personal health concerns.
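An illustrative client call against the endpoint sketched earlier shows this flow end to end; the URL and payload shape are assumptions, not the deployed service's actual contract.

```python
import requests

# Ask the running service a health question and read back the grounded answer.
response = requests.post(
    "http://localhost:8080/ask",  # hypothetical local URL
    json={"question": "What are the common treatments for migraine?"},
    timeout=30,
)
payload = response.json()
print(payload["answer"])   # concise answer based on the encyclopedia excerpts
print(payload["sources"])  # metadata pointing back to the source passages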

This project demonstrates a practical application of AI for a specific domain. It addresses the known weaknesses of large language models by tethering them to a verified knowledge base. The focus is on building a reliable and useful assistant, not an autonomous medical agent. The technology stack is chosen for effectiveness and practicality. Each component, from Hugging Face models to AWS deployment, serves a clear purpose in creating a stable and functional system. The result is a focused application that prioritizes accuracy and user safety above all else.


DEMO




© Kimchann Chon. 2025