
Ollama v0.19
Optimized for running large language models on Apple devices, delivering fast and efficient performance.
Ollama now runs on Apple silicon with MLX, delivering higher performance for request processing and coding tasks. NVFP4 support helps preserve response quality while lowering memory requirements.

More about Ollama v0.19
Ollama is a tool that runs large language models locally on your Mac. In v0.19 it is built on Apple's MLX framework on Apple silicon, which speeds up workloads such as personal assistants and coding agents.
- Fastest performance on Apple silicon: Built on Apple’s MLX framework, Ollama leverages GPU Neural Accelerators on M5, M5 Pro, and M5 Max chips, resulting in faster time to first token (TTFT) and improved generation speed (tokens per second).
- NVFP4 support for accuracy: Ollama utilizes NVIDIA’s NVFP4 format to maintain model accuracy while reducing memory bandwidth and storage requirements, ensuring results comparable to production environments.
- Improved caching for efficiency: Ollama reuses its cache across conversations, reducing memory utilization and increasing cache hits. It also stores snapshots at intelligent locations in the prompt, resulting in faster responses.
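To check the time-to-first-token and tokens-per-second claims on your own machine, one option is to read the timing fields that Ollama's `/api/generate` endpoint returns with a non-streaming response. The sketch below is a minimal illustration, not an official benchmark: the helper names `fetch_generate` and `generation_stats` are hypothetical, the host is assumed to be the default local address, and the TTFT figure is only an approximation built from the reported load and prompt-evaluation durations.

```python
import json
import urllib.request

NS_PER_SECOND = 1_000_000_000  # Ollama reports durations in nanoseconds


def fetch_generate(model: str, prompt: str,
                   host: str = "http://localhost:11434") -> dict:
    """POST a non-streaming request to /api/generate and return the JSON body.

    Assumes an Ollama server is listening on `host` (default local address).
    """
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def generation_stats(resp: dict) -> dict:
    """Derive rough speed figures from the timing fields of a response."""
    eval_count = resp.get("eval_count", 0)   # tokens generated
    eval_ns = resp.get("eval_duration", 0)   # time spent generating them
    # Approximation (an assumption of this sketch): time before the first
    # token is taken as model load time plus prompt evaluation time.
    wait_ns = resp.get("load_duration", 0) + resp.get("prompt_eval_duration", 0)
    tokens_per_second = eval_count * NS_PER_SECOND / eval_ns if eval_ns else 0.0
    return {
        "approx_ttft_seconds": wait_ns / NS_PER_SECOND,
        "tokens_per_second": tokens_per_second,
    }
```

With a server running, `generation_stats(fetch_generate("your-model", "Why is the sky blue?"))` returns both figures, where `"your-model"` is a placeholder for a model you have pulled. Sending the same long prompt prefix twice and comparing the approximate TTFT values is a simple way to observe the effect of cache reuse.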
With v0.19, Ollama delivers faster responses and better efficiency on Apple silicon, which is most noticeable in coding and agentic tasks.










