I found information about how large language models work on the Trinetix website. They are built on neural network architectures, most notably the transformer, which can process large amounts of text. Their core principle is predicting the probability of the next word (token) given the preceding context. By training on large datasets, the models learn these patterns and apply them to generate text, translate it, and perform other language tasks.
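The next-word prediction idea above can be sketched with a toy example. This is a minimal illustration, not a real model: the vocabulary and logit scores below are invented values standing in for what a trained transformer would output for a given context; the only real machinery shown is the softmax that turns raw scores into a probability distribution.

```python
import math

# Hypothetical context: "The cat sat on the ..."
# Toy vocabulary and hand-picked logits (illustrative values, not from a real model).
vocab = ["mat", "dog", "moon", "car"]
logits = [3.2, 1.1, 0.3, -0.8]

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")

# The model "predicts" the highest-probability token as the next word.
next_word = vocab[max(range(len(vocab)), key=lambda i: probs[i])]
print("Predicted next word:", next_word)
```

A real model repeats this step token by token, feeding each predicted token back in as context, which is how longer passages are generated.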