How to build your own AI bot to answer questions about your documents

A local AI for your own documents can be really useful: Your own chatbot reads all important documents once and then provides the right answers to questions such as:

"What is the excess for my car insurance?"

"Does my supplementary dental insurance also cover inlays?"

If you are a fan of board games, you can hand over all the game instructions to the AI and ask the chatbot questions such as:

"Where can I place tiles in Qwirkle?"

We have tested how well this works on typical home PCs.

See also: 4 free AI chatbots you can run directly on your PC

Requirements

To be able to query your own documents with a completely local artificial intelligence, you essentially need three things: a local AI model, a database containing your documents, and a chatbot.

These three elements are provided by AI tools such as Anything LLM and Msty. Both programs are free of charge.

Install the tools on a PC with at least 8GB of RAM and a CPU that is as up-to-date as possible. There should be 5GB or more free space on the SSD.

Ideally, you should have a powerful graphics card from Nvidia or AMD. This overview of compatible models can help you.

By installing Anything LLM or Msty, you get a chatbot on your computer. After installation, the tools load an AI language model, the Large Language Model (LLM), into the program.

Which AI model runs in the chatbot depends on the performance of your PC. Operating the chatbot is not difficult if you know the basics. Only the extensive setting options of the tools require expert knowledge.

But even with the standard settings, the chat tools are easy to use. In addition to the AI model and the chatbot, Anything LLM and Msty also offer the embedding model, which reads in your document and prepares it in a local database so that the language model can access it.

More is better: Small AI models are hardly any good

There are AI language models that also run on weak hardware. For the local AI, weak means a PC with only 8GB RAM and a CPU that is already a few years old and does not have a good Nvidia or AMD graphics card.

AI models that still run on such PCs usually have 2 to 3 billion parameters (2B or 3B) and have been simplified by quantization.

This reduces memory requirements and computing power, but also worsens the results. Examples of such variants are Gemma 2 2B or Llama 3.2 3B.

Although these language models are comparatively small, they provide surprisingly good answers to a large number of questions or generate usable texts according to your specifications — completely locally and in an acceptable amount of time.

IDG

However, when it comes to the local language model taking your documents into account, these small models deliver results that are somewhere between “unusable” and “just acceptable.” How good or bad the answers are depends on many factors, including the type of documents.

In our initial tests with local AI and local documents, the results were initially so poor that we suspected something had gone wrong with the embedding of the documents.

Further reading: Beyond Copilot: 13 helpful AI tools for PC users

Only when we used a model with 7 billion parameters did the responses improve, and when we used the online model ChatGPT 4o on a trial basis, we were able to see how good the responses can be. So it wasn’t the embedding.

In fact, the biggest lever for local AI and own documents is the AI model. And the bigger it is, i.e. the more parameters it has, the better. The other levers such as the embedding model or the chatbot (Anything LLM or Msty) and the vector database play a much, much smaller role.

Embedding & retrieval augmented generation

Ionos

Your own data is connected to the AI using a method called embedding and retrieval augmented generation (RAG).

For tools such as Anything LLM and Msty, it works like this: Your local documents are analyzed using an embedding model. This model breaks down the content of the documents into its meaning and stores it in the form of vectors.

Instead of a document, the embedding model can also process information from a database or other knowledge sources.

However, the result is always a vector database that contains the essence of your documents or other sources. The form of the vector database enables the AI language model to find objects in it.

This process is fundamentally different from a word search and a word index. The latter stores the position of an important word in a document. A vector database for RAG, on the other hand, stores which statements are contained in a text.

This means: The question:

"What is on page 15 of my car insurance document?"

does not usually work with RAG. This is because the information “page 15” is usually not contained in the vector database. In most cases, such a question causes the AI model to hallucinate. Since it does not know the answer, it invents something.

Creating the vector database, i.e. embedding your own documents, is the first step. The information retrieval is the second step and is referred to as RAG.

In retrieval augmented generation, the user asks the chatbot a question. This question is converted into a vector representation and compared with the data in the vector database of the user’s own documents (retrieve).

The results from the vector database are now transferred to the chatbot’s AI model together with the original question (augment).

The AI model now generates an answer (generate), which is made up of the information from the AI model and the information from the user’s vector database.

Comparison: Anything LLM or Msty?

We have tested the two chatbots Anything LLM and Msty. Both programs are similar. However, they differ significantly in the speed with which they embed local documents, i.e. make them available to the AI. This process is generally time-consuming.

Anything LLM took 10 to 15 minutes to embed a PDF file with around 150 pages in the test. Msty, on the other hand, often took three to four times as long.

We tested both tools with their preset AI models for embedding. For Msty this is “Mixed Bread Embed Large,” for Anything LLM it’s “All Mini LM L6 v2.”

Although Msty requires considerably more time for embedding, it may be worth choosing this tool. It offers good user guidance and provides exact source information when citing. We recommend Msty for fast computers.

Further reading: Does your next laptop really need to be an AI PC?

If you don’t have this, you should first try Anything LLM and check whether you can achieve satisfactory results with this chatbot. The decisive factor is the AI language model in the chatbot anyway. And both tools offer the same range of AI language models.

By the way: Both Anything LLM and Msty allow you to select alternative embedding models. In some cases, however, the configuration becomes more complicated. You can also select online embedding models, for example from Open AI.

You don’t have to worry about accidentally selecting an online embedding model. This is because you need an API key to be able to use it.

Anything LLM: Simple and fast

Install the Anything LLM chatbot. Microsoft Defender Smartscreen may display a warning that the installation file is not secure. You can ignore this by clicking on “Run anyway.”

After installation, select an AI language model in Anything LLM. We recommend Gemma 2 2B to start with. You can replace the selected model with a different one at any time later (see “Change AI language model” below).

Now create an area in the configuration wizard or later by clicking on “New workspace” in which you can import your own documents. Give the workspace a name of your choice and then click on “Save.”

The new workspace now appears in the left bar of Anything LLM. Click on the icon to the left of the cogwheel symbol to import your document for the AI. In the new window, click on “Click to upload or drag & drop” and select your documents.

After a few seconds, they will appear in the list above the button. Click on your document again and select “Move to Workspace,” which will move the documents to the right.

A final click on “Save and Embed” starts the embedding process, which may take some time depending on the size of the documents and the speed of your PC.

Tip: Don’t try to read the last 30 years of PCWorld as a PDF right away. Start with a simple text document and see how long it takes your PC. This way you can quickly assess whether a more extensive scan is worthwhile.

Once the process is complete, close the window and ask the chatbot your first question. In order for the chatbot to take your documents into account, you must select the workspace you have created on the left and then enter your question in the main window at the bottom under “Send a message.”

Change the AI language model: If you would like to select a different language model in Anything LLM, click on the key symbol (“Open Settings”) at the bottom left and then on “LLM.” Under “LLM provider,” select one of the suggested AI models.

The new models from Deepseek are also offered. Clicking on “Import model from Ollama or Hugging Face” gives you access to almost all current, free AI models.

Downloading one of the models can take some time, as they are several GB in size and the download server does not always deliver quickly. If you would like to use an online AI model, select it from the drop-down menu below “LLM provider.”

Tips for using Anything: Some Anything LLM menus are a little tricky. Changes to the settings usually have to be confirmed by clicking on “Save.”

However, the button for this quickly disappears from view on longer configuration pages. This is the case, for example, when changing the AI models under “Open Settings > LLM.”

Anyone who forgets to click the button will probably be surprised that the settings are not applied. It is therefore important to look out for a “Save” button every time you change the configuration.

In addition, the user interface in Anything LLM can be at least partially switched to another language under “Open settings > Customize > Display Language.”

IDG

Msty: Versatile chatbot for fast hardware

The Msty chatbot is somewhat more flexible than Anything LLM in terms of its possible uses. It can also be used as a local AI chatbot without integrating its own files.

With Msty, several AI models can be loaded and used simultaneously. Installation and configuration are similar to Anything LLM.

IDG

What Anything LLM calls “Workspace” is called “Knowledge Stack” in Msty and is configured under the menu item of the same name at the bottom left.

Once you have created a new knowledge stack and selected your own documents, you start the embedding process via “Compose.”

It may take some time for this to be completed. Back in the main window of Msty, enter your question in the input field below.

In order for the chatbot to take your local documents into account, you must click on the knowledge stack symbol below the input field and place a tick in front of the desired knowledge stack.

Solving problems with incorrect or missing answers

If the answers to your documents are not satisfactory, we recommend that you first select a more powerful AI model. For example, if you started with Gemma 2 2B with 2 billion parameters, try Gemma 2 9B. Or load Llama 3.1 with 8 billion parameters.

If this does not bring sufficient improvement or your PC takes too long to respond, you can consider switching to an online language model. This would not see your local files or the vector database of your local files.

However, it will receive the parts of your vector database that are relevant to the given question. With Anything LLM, you make the change separately for each workspace. To do this, click on the cogwheel icon for a workspace and select the provider “Open AI” under “Chat settings > Workspace LLM provider” to be able to use a model from ChatGPT.

You will need to enter a paid API key from Open AI. This costs 12 dollars. The number of responses you receive depends on the language model used. You can find an overview at openai.com/api/pricing.

If it is not possible to switch to an online language model for data protection reasons, the troubleshooting guide from Anything LLM can help. On the one hand, it explains the basic possibilities of embedding and RAG and, on the other, shows the small configuration wheels that you can turn to get better answers.

This articles is written by : Nermeen Nabil Khear Abdelmalak

You can Enjoy surfing our website categories and read more content in many fields you may like .

Why USAGoldMines ?

USAGoldMines is a comprehensive website offering the latest in financial, crypto, and technical news. With specialized sections for each category, it provides readers with up-to-date market insights, investment trends, and technological advancements, making it a valuable resource for investors and enthusiasts in the fast-paced financial world.

Recent:

Best PC computer deals: Top picks from desktops to all-in-ones | usagoldmines.com

iOS 26 Streamlines Apple Music Replay Joe Rossignol | usagoldmines.com

AutoMix in iOS 26 Adds DJ-Like Song Transitions to Apple Music Juli Clover | usagoldmines.com

iPadOS 26 Gets New 3D Graphing Feature for Math Notes Juli Clover | usagoldmines.com

A system inspired by the human brain has quietly been activated at a US nuclear lab, and it has no o...

After a series of tumors, woman’s odd-looking tongue explains everything Beth Mole | usagoldmines.co...

Tested! The best Chromebooks you can buy in 2025 — from budget to premium | usagoldmines.com

How to build your own AI bot to answer questions about your documents | usagoldmines.com

Requirements

More is better: Small AI models are hardly any good

Embedding & retrieval augmented generation

Comparison: Anything LLM or Msty?

Anything LLM: Simple and fast

Msty: Versatile chatbot for fast hardware

Solving problems with incorrect or missing answers

Recent:

Best PC computer deals: Top picks from desktops to all-in-ones | usagoldmines.com

iOS 26 Streamlines Apple Music Replay Joe Rossignol | usagoldmines.com

AutoMix in iOS 26 Adds DJ-Like Song Transitions to Apple Music Juli Clover | usagoldmines.com

iPadOS 26 Gets New 3D Graphing Feature for Math Notes Juli Clover | usagoldmines.com

A system inspired by the human brain has quietly been activated at a US nuclear lab, and it has no o...

After a series of tumors, woman’s odd-looking tongue explains everything Beth Mole | usagoldmines.co...

Tested! The best Chromebooks you can buy in 2025 — from budget to premium | usagoldmines.com

Report: Pixel 10 Series to Feature Improved Speakers Tim | usagoldmines.com

Apple Is Giving the iPhone Its Own ‘Emoji Kitchen’ Jake Peterson | usagoldmines.com

Apple Plans to Release Delayed Siri Apple Intelligence Features in Spring 2026 Juli Clover | usagold...

Isaacman’s bold plan for NASA: Nuclear ships, seven-crew Dragons, accelerated Artemis Eric Berger | ...

The Shokz Open-Run Pro Bone Conduction Headphones Are on Sale for $125 Naima Karp | usagoldmines.com

It's Not Just You, a Lot of Sites and Services Are Down Jake Peterson | usagoldmines.com

These Are the Only Two Ways to Actually Keep Mosquitoes Away Beth Skwarecki | usagoldmines.com

Apple Quietly Fixed Zero-Day Exploit Used in Paragon Spyware Attack Juli Clover | usagoldmines.com

No, those amazing deals on Facebook aren't real - it's a scam, and here's how to spot it | usagoldm...

This Android AirTags rival finally got the one big feature it's been missing hamish.hector@futurenet...

Can't access Spotify or a part of Google? Everything we know about this outage impacting major servi...

This German startup wants to build portable quantum computers using diamonds - and says its QPU will...

Apple previews new import/export feature to make passkeys more interoperable Dan Goodin | usagoldmin...

Galaxy Watch 8 Lineup Images Show Off Ultra-Inspired Design Tim | usagoldmines.com

Here's How to Use Each Head on Your Massage Gun Most Effectively Meredith Dietz | usagoldmines.com

Garmin Just Announced Its Answer to the Apple Watch Ultra Beth Skwarecki | usagoldmines.com

Take a Break From WWDC 2025 With Apple's Chill Coffee Shop Playlist Joe Rossignol | usagoldmines.com

Holidaymakers under threat from devious new cyber threat - here's how to stay safe | usagoldmines.c...

Live from WWDC 2025 – TechRadar podcast unpacks that massive iPadOS update and looks through Liquid ...

Edifier’s new retro-style wireless speaker range looks very cool, and has the features to take on JB...

PCIe 7.0 has been announced, offering superfast speeds for the components inside your PC – but don’t...

Engineer creates first custom motherboard for 1990s PlayStation console Benj Edwards | usagoldmines....

AMD’s powerful AI chips can finally be unleashed on Windows PCs | usagoldmines.com

I Like iOS 26's New Back Gesture Better Than Android’s (When It Works) Pranay Parab | usagoldmines.c...

Apple Watch Ultra 2 With Black Titanium is Now Available Refurbished Joe Rossignol | usagoldmines.co...

This GPU-like internal card combines 28 M.2 SSDs to offer up to 109GB/s read speed and 224TB storage...

Microsoft makes fun of macOS Tahoe’s Liquid Glass redesign for ripping off Windows Vista – but Apple...

“Two years of work in two months”: States cope with Trump broadband overhaul Jon Brodkin | usagoldmi...

Google AI mistakenly says fatal Air India crash involved Airbus instead of Boeing Ryan Whitwam | usa...

Microsoft’s AI helper, Copilot Vision, is now live | usagoldmines.com

Imilab C30 Dual review: 2 lenses, 1 smart monitoring solution | usagoldmines.com

Nothing Confirms US Availability for Phone (3), Will Work on T-Mobile and AT&T Tim | usagoldmine...

Nine Tips to Grill More Safely This Summer Allie Chanthorn Reinmann | usagoldmines.com

33 of the Gayest Straight Movies Ever Made Ross Johnson | usagoldmines.com

Instagram Will Soon Let You Edit Your Grid Jake Peterson | usagoldmines.com

Apple Begins Selling Refurbished Mac Studio With M4 Max and M3 Ultra Chips at a Discount Joe Rossign...

Google left months-old dark mode bug in Android 16, fix planned for next Pixel Drop Ryan Whitwam | u...

This tiny ChatGPT feature helps me tackle my days more productively | usagoldmines.com

9 reasons why you should buy a Chromebook | usagoldmines.com

Microsoft throws shade at macOS Tahoe’s familiar new vista | usagoldmines.com

Pixel Buds Pro 2 Gets Updated, But It’s Totally Minor Tim | usagoldmines.com

Pixel Will Now Let VIP Contacts Bypass Your 'Do Not Disturb' Mode David Nield | usagoldmines.com

17 Reasons to Wait for the iPhone 17 Tim Hardwick | usagoldmines.com

Steve Jobs' Iconic Speech at Stanford Now Available in Higher Quality Joe Rossignol | usagoldmines.c...

The Konami Press Start livestream wasn't groundbreaking, but the Silent Hill remake tease has got me...

This cheap Hi-Res Audio music player is like a modern iPod mini, with the funky colors to match | u...

Got ChatGPT Plus? You can now get 3 months for 50% off with this simple trick john-anthony.disotto@f...

Figma unveils big new updates for design and dev - but I'm mostly excited about the rollout of this ...

Microsoft Copilot targeted in first “zero-click” attack on an AI agent - what you need to know | us...

This iPhone Bluetooth audio issue frustrates me every day, but iOS 26 is finally going to fix it mar...

Metal Gear Solid Delta: Snake Eater will have several game modes, but I'm most excited about the new...

Smart tires will report on the health of roads in new pilot program Jonathan M. Gitlin | usagoldmine...

Watch: Micro Center’s opening day was a glorious celebration of PC geekery | usagoldmines.com

Without these $20 noise-canceling headphones, I’d lose my mind in noisy offices | usagoldmines.com

This free all-in-one tool fixes common Windows problems | usagoldmines.com

This $16 USB cooling pad keeps my laptop fast and quiet, even in summer heat | usagoldmines.com

Scientists snap photos of the Sun’s south pole for the first time ever | usagoldmines.com

Vocal backlash forces Wikipedia to pause AI-generated summaries | usagoldmines.com

PNY’s new dual USB-A/C flash drive hits an extreme 1,000 MB/s | usagoldmines.com

The Pixel Watch 2 Is at Its Lowest Price Ever Right Now Pradershika Sharma | usagoldmines.com

Amazon Is Having a Great Father’s Day Sale on Power Tools Becca Lewis | usagoldmines.com

iOS 26 and macOS Tahoe Expand AutoFill Feature for One-Time Codes Joe Rossignol | usagoldmines.com

Watch out - that DeepSeek installer could be damaging malware | usagoldmines.com

I’ve been writing about digital media since the 90s and Prime Video’s latest ads onslaught is the fi...