Open Source Go Large Language Models (LLM) for Mac

Browse free open source Go Large Language Models (LLM) for Mac and projects below. Use the toggles on the left to filter open source Go Large Language Models (LLM) for Mac by OS, license, language, programming language, and project status.

  • 1
    Anyquery

    Query anything (GitHub, Notion, +40 more) with SQL and let LLMs

    Anyquery is an open-source SQL query engine designed to allow users to query data from almost any source using a unified SQL interface. The system enables developers and analysts to run SQL queries on files, APIs, applications, and databases without needing separate connectors or query languages for each platform. Built on top of SQLite, the engine uses a plugin architecture that allows it to extend support to dozens of external services and data sources. Users can query structured files such as CSV, JSON, and Parquet as well as remote data sources like SaaS APIs, cloud storage services, and local applications. The platform also supports querying multiple data sources simultaneously and joining them together within a single SQL query, enabling powerful cross-system analysis. In addition to operating as a local query engine, the system can run as a MySQL-compatible server so that traditional database tools can connect to it.
    Downloads: 15 This Week
  • 2
    Casibase

    Open-source enterprise-level AI knowledge base and MCP

    Casibase is an open-source AI cloud platform designed to function as an enterprise knowledge base, container management system, and collaboration environment for AI-driven applications. The project combines knowledge management, messaging, and forum features with large language model integration to create an interactive platform for storing and querying domain-specific knowledge. Built with a separated frontend and backend architecture, Casibase provides a web-based administrative interface and supports high concurrency for enterprise environments. The platform integrates embedding techniques and prompt engineering to enable semantic knowledge retrieval and conversational interactions with stored data. It also supports integration with existing systems through database synchronization, allowing organizations to migrate data into the platform without major infrastructure changes.
    Downloads: 14 This Week
  • 3
    Beelzebub

    A secure low code honeypot framework

    Beelzebub is an open-source cybersecurity framework designed to create intelligent honeypot environments for detecting and studying cyber attacks. Honeypots are systems intentionally exposed to attackers in order to capture malicious behavior, and Beelzebub enhances this concept by incorporating artificial intelligence and virtualization techniques. The platform allows organizations and researchers to deploy decoy services that mimic real infrastructure while recording attacker interactions. By using AI models to simulate realistic system behavior, the honeypot becomes harder for attackers to identify, increasing the likelihood that malicious activity can be observed and analyzed. The framework is designed with a low-code configuration approach so security teams can easily deploy honeypots for multiple services and ports.
    Downloads: 12 This Week
  • 4
    LocalAI

    Self-hosted, community-driven, local OpenAI compatible API

LocalAI is a drop-in replacement REST API compatible with the OpenAI API specification for local inferencing. It is a free, open source OpenAI alternative that lets you run LLMs (and more) locally or on-prem on consumer-grade hardware, with no GPU required. It runs ggml-, GPTQ-, onnx-, and TF-compatible models: llama, gpt4all, rwkv, whisper, vicuna, koala, gpt4all-j, cerebras, falcon, dolly, starcoder, and many others.
    Downloads: 10 This Week
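Because LocalAI mirrors the OpenAI chat-completions API, any plain HTTP client can drive it. A minimal Go sketch of building the request body (the model name, prompt, and default port 8080 below are illustrative assumptions; substitute whatever model you have loaded):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatMessage and chatRequest mirror the OpenAI chat-completions schema
// that LocalAI accepts on POST /v1/chat/completions.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// chatRequestJSON builds the JSON body for a single-turn user prompt.
func chatRequestJSON(model, prompt string) ([]byte, error) {
	return json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
}

func main() {
	body, _ := chatRequestJSON("ggml-gpt4all-j", "Say hello")
	fmt.Println(string(body))
	// With a LocalAI instance running (default address assumed):
	//   http.Post("http://localhost:8080/v1/chat/completions",
	//       "application/json", bytes.NewReader(body))
}
```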
  • 5
    Slack MCP Server

    The most powerful MCP Slack Server with no permission requirements

    Slack MCP Server is an open-source server implementation that connects Slack workspaces to AI systems through the Model Context Protocol (MCP). MCP is a standardized protocol that allows large language models and AI agents to securely interact with external tools and data sources such as messaging platforms, databases, or file systems. The slack-mcp-server acts as an intermediary layer that exposes Slack data and messaging functionality to AI clients while enforcing access rules and communication standards. Through this architecture, AI assistants can read message histories, interact with channels, and retrieve contextual information from Slack conversations in order to perform tasks such as automated analysis, collaboration assistance, or contextual code review. The server supports multiple communication transports, including standard input/output streams, HTTP, and Server-Sent Events, allowing flexible integration with different AI client environments.
    Downloads: 9 This Week
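The Model Context Protocol frames its messages as JSON-RPC 2.0, so a client request to an MCP server such as this one can be sketched in a few lines of Go. The `tools/list` method name comes from the MCP specification; the request ID is arbitrary:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a minimal JSON-RPC 2.0 envelope, the wire format MCP
// uses over its stdio, HTTP, and SSE transports.
type rpcRequest struct {
	JSONRPC string      `json:"jsonrpc"`
	ID      int         `json:"id"`
	Method  string      `json:"method"`
	Params  interface{} `json:"params,omitempty"`
}

// newToolsList builds the standard MCP request asking the server to
// enumerate the tools it exposes (e.g. channel and history access).
func newToolsList(id int) ([]byte, error) {
	return json.Marshal(rpcRequest{JSONRPC: "2.0", ID: id, Method: "tools/list"})
}

func main() {
	msg, _ := newToolsList(1)
	fmt.Println(string(msg))
	// An MCP client writes this frame to the server's stdin (stdio
	// transport) or POSTs it over HTTP, then reads the JSON-RPC response.
}
```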
  • 6
    Zep

    Zep: A long-term memory store for LLM / Chatbot applications

Easily add relevant documents, chat history memory & rich user data to your LLM app's prompts. Zep understands chat messages, roles, and user metadata, not just texts and embeddings. Zep Memory and VectorStore implementations ship with your favorite frameworks: LangChain, LangChain.js, LlamaIndex, and more. Automatically embed texts and messages using state-of-the-art open source models or OpenAI, or bring your own vectors. Zep's local embedding models and async enrichment ensure a snappy user experience.
    Downloads: 7 This Week
  • 7
    Envoy AI Gateway

    Manages Unified Access to Generative AI Services

    Envoy AI Gateway is an open-source gateway system designed to manage network traffic between applications and generative AI services using the Envoy proxy ecosystem. The project extends Envoy Gateway to support AI-specific workloads, enabling organizations to route, secure, and scale requests to large language models and other generative AI services. In a typical deployment, the architecture uses a two-tier gateway model where an outer gateway handles authentication, routing, and global rate limiting while a second gateway manages traffic to self-hosted model serving clusters. This design allows organizations to centralize control over AI service access while maintaining flexibility in how backend model infrastructure is deployed. The gateway provides policy enforcement, observability, and routing capabilities that are specifically designed for AI inference workloads, including intelligent endpoint selection and request optimization.
    Downloads: 6 This Week
  • 8
    LLocalSearch

    LLocalSearch is a completely locally running search aggregator

    LLocalSearch is an open-source search engine framework designed to run entirely on local infrastructure using large language model agents to gather and synthesize information from the web. The system allows users to submit natural language questions, after which a chain of LLM-driven agents recursively searches for relevant information and compiles a response. Unlike many AI search tools, LLocalSearch operates without requiring external cloud APIs or proprietary services, making it suitable for privacy-focused or offline environments. The architecture integrates local language models with external tools such as search engines, enabling the system to gather up-to-date information while keeping model execution on local hardware. The tool also exposes the internal reasoning process of its agents so users can observe how queries are expanded and how results are retrieved during the search process.
    Downloads: 6 This Week
  • 9
    tlm

    Local CLI Copilot, powered by Ollama

tlm is an open-source command-line AI assistant designed to provide intelligent terminal support using locally running large language models. The project functions as a CLI copilot that helps developers generate commands, explain shell instructions, and answer technical questions directly from the terminal. Instead of relying on cloud APIs or paid AI services, tlm runs entirely on the user's workstation and integrates with local models managed through the Ollama runtime. This approach allows developers to use powerful open-source models such as Llama, Phi, DeepSeek, and Qwen while maintaining privacy and avoiding external service dependencies. The system supports contextual queries where the AI analyzes files within a directory and generates answers based on project documentation or source code. It also detects the user's shell environment automatically, allowing it to generate commands tailored to shells such as Bash, Zsh, or PowerShell.
    Downloads: 6 This Week
  • 10
    ChatWiki

ChatWiki: AI knowledge base and workflow agent for WeChat official accounts

    ChatWiki is an open-source AI knowledge base and workflow automation platform designed to help organizations build intelligent question-answering systems using large language models and retrieval-augmented generation techniques. The system enables companies to transform internal documents and data into searchable knowledge bases that can power AI assistants capable of answering domain-specific questions. It provides a complete pipeline for ingesting documents, preprocessing and segmenting content, generating vector embeddings, and retrieving relevant information during conversations. The platform supports multiple large language models and allows developers to easily connect cloud-based or local models to power the chatbot. ChatWiki also integrates workflow automation features that allow AI responses, messaging triggers, and customer interaction flows to be configured visually.
    Downloads: 5 This Week
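The retrieval step of such a RAG pipeline boils down to nearest-neighbor search over embeddings. A toy Go sketch with hand-made 3-dimensional vectors (real systems use model-generated embeddings of hundreds of dimensions, and usually a vector index rather than a linear scan):

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// bestChunk returns the index of the stored chunk embedding closest to
// the query embedding; this is the "retrieve relevant information" step.
func bestChunk(query []float64, chunks [][]float64) int {
	best, bestScore := -1, math.Inf(-1)
	for i, c := range chunks {
		if s := cosine(query, c); s > bestScore {
			best, bestScore = i, s
		}
	}
	return best
}

func main() {
	chunks := [][]float64{
		{1, 0, 0}, // e.g. an embedded chunk about pricing
		{0, 1, 0}, // e.g. an embedded chunk about onboarding
	}
	query := []float64{0.9, 0.1, 0}
	fmt.Println(bestChunk(query, chunks)) // prints 0: the pricing chunk
}
```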
  • 11
    Gollama

    Go manage your Ollama models

    Gollama is a macOS and Linux tool for managing Ollama models through an interactive terminal-based interface. It provides a TUI that lets users list, inspect, sort, filter, edit, run, unload, copy, rename, delete, and push models from one place rather than relying entirely on manual command-line workflows. The project is aimed at developers and local AI users who frequently work with multiple Ollama models and want a more efficient operational layer for everyday maintenance. Beyond standard model management, Gollama can display metadata such as size, quantization level, model family, and modification date, which helps users compare models quickly. One of its more distinctive capabilities is a VRAM estimation system that can calculate memory requirements, estimate context limits, and help users choose quantization settings that fit available hardware.
    Downloads: 4 This Week
  • 12
    csghub-server

    csghub-server is the backend server for CSGHub

    csghub-server is the backend component of the CSGHub platform, an open-source infrastructure designed to manage and operate large language models, datasets, and AI development workflows within a private deployment environment. The server acts as a centralized management layer that allows teams to store, organize, and operate AI assets such as models, datasets, and machine learning applications in a manner similar to artifact repositories used in software engineering. Built primarily in the Go programming language, the system enables organizations to run model inference, training, and fine-tuning tasks within a unified platform. It integrates capabilities similar to model repositories like Hugging Face while allowing enterprises to host and manage their AI assets internally for security and compliance purposes.
    Downloads: 4 This Week
  • 13
    ClaraVerse

ClaraVerse is an open-source, privacy-focused ecosystem to replace ChatGPT

    ClaraVerse is an open-source private AI workspace designed to give users a unified environment for interacting with large language models, building automations, and managing AI-driven tasks in a self-hosted environment. The platform combines chat interfaces, workflow automation, and long-running task management into a single application that can connect to both local and cloud-based AI models. Users can integrate models from multiple providers such as OpenAI, Anthropic, Google, or locally hosted systems like Ollama and LM Studio, enabling flexibility in how AI capabilities are deployed and managed. The system includes a visual workflow builder that allows users to create automation pipelines where AI tools interact with external services, APIs, or datasets. ClaraVerse also includes task-tracking capabilities that allow complex research, coding, or analysis jobs to run in the background while users monitor their progress through a dashboard.
    Downloads: 3 This Week
  • 14
    aqueduct LLM

    Aqueduct allows you to run LLM and ML workloads on any infrastructure

Aqueduct is an open-source MLOps framework that lets you define and deploy machine learning and LLM workloads on any cloud infrastructure. You write code in vanilla Python, run it on whatever cloud infrastructure you'd like, and gain visibility into the execution and performance of your models and predictions. Aqueduct's Python-native API lets you define ML tasks in regular Python code. You can connect Aqueduct to your existing cloud infrastructure, and it will seamlessly move your code from your laptop to the cloud or between different cloud infrastructure layers. Aqueduct provides a single interface for running machine learning tasks on your existing infrastructure (Kubernetes, Spark, Lambda, etc.); from the same Python API, you can run code across any or all of these systems seamlessly and gain visibility into how your code is performing.
    Downloads: 3 This Week
  • 15
    Qwen2.5-Coder

Qwen2.5-Coder is the code version of the Qwen2.5 large language model

    Qwen2.5-Coder, developed by QwenLM, is an advanced open-source code generation model designed for developers seeking powerful and diverse coding capabilities. It includes multiple model sizes—ranging from 0.5B to 32B parameters—providing solutions for a wide array of coding needs. The model supports over 92 programming languages and offers exceptional performance in generating code, debugging, and mathematical problem-solving. Qwen2.5-Coder, with its long context length of 128K tokens, is ideal for a variety of use cases, from simple code assistants to complex programming scenarios, matching the capabilities of models like GPT-4o.
    Downloads: 34 This Week
  • 16
    AxonHub

    Use any SDK to call 100+ LLMs

    AxonHub is an open-source AI gateway platform designed to simplify the process of integrating and switching between different large language model providers. The system acts as a compatibility layer that allows developers to use the same SDK interface while routing requests to various AI services behind the scenes. Instead of rewriting code when switching providers such as OpenAI or Anthropic, developers can simply change configuration settings within the gateway. AxonHub translates requests from one provider’s API format into another, enabling seamless interoperability across different AI platforms. The system also provides infrastructure features such as request routing, failover mechanisms, load balancing, and cost management for AI applications. This architecture makes it easier to experiment with multiple models and manage production deployments that rely on several providers simultaneously.
    Downloads: 2 This Week
  • 17
    KubeAI

    Private Open AI on Kubernetes

    Get inferencing running on Kubernetes: LLMs, Embeddings, Speech-to-Text. KubeAI serves an OpenAI compatible HTTP API. Admins can configure ML models by using the Model Kubernetes Custom Resources. KubeAI can be thought of as a Model Operator (See Operator Pattern) that manages vLLM and Ollama servers.
    Downloads: 2 This Week
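A Model custom resource for KubeAI might look roughly like the following. The field names and values here are illustrative assumptions patterned on KubeAI's documented `kubeai.org` API, so verify them against the CRD version you actually install:

```yaml
# Hypothetical Model resource: check field names against your KubeAI CRD.
apiVersion: kubeai.org/v1
kind: Model
metadata:
  name: llama-3.1-8b-instruct
spec:
  features: [TextGeneration]   # what the OpenAI-compatible API will expose
  engine: OLlama               # KubeAI manages the vLLM/Ollama server pods
  url: ollama://llama3.1       # where the engine pulls the weights from
  resourceProfile: cpu:4       # assumed profile name; site-specific
```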
  • 18
    vLLM Semantic Router

System-Level Intelligent Router for Mixture-of-Models in the Cloud

    Semantic Router is an open-source system designed to intelligently route requests across multiple large language models based on the semantic meaning and complexity of user queries. Instead of sending every prompt to the same model, the system analyzes the intent and reasoning requirements of the request and dynamically selects the most appropriate model to process it. This approach allows developers to combine multiple models with different strengths, such as lightweight models for simple queries and more advanced reasoning models for complex tasks. The router operates as an intelligent layer between users and model infrastructure, capturing signals from prompts, responses, and contextual data to improve decision-making. It can also integrate safety and monitoring mechanisms that detect issues such as jailbreak attempts, hallucinations, or sensitive information exposure.
    Downloads: 1 This Week
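The dispatch idea can be caricatured in a few lines of Go. This stand-in uses a keyword-and-length heuristic with invented model names, whereas the real router classifies prompts semantically with ML models, but it shows the shape of the decision:

```go
package main

import (
	"fmt"
	"strings"
)

// route picks a backend model for a prompt. A real semantic router
// classifies intent and complexity with a trained model; this toy
// version just looks for reasoning-flavored keywords and long prompts.
func route(prompt string) string {
	hard := []string{"prove", "derive", "step by step", "debug", "optimize"}
	lower := strings.ToLower(prompt)
	for _, kw := range hard {
		if strings.Contains(lower, kw) {
			return "reasoning-model" // hypothetical heavyweight backend
		}
	}
	if len(strings.Fields(prompt)) > 100 {
		return "reasoning-model"
	}
	return "light-model" // hypothetical cheap backend for simple queries
}

func main() {
	fmt.Println(route("What is the capital of France?"))
	fmt.Println(route("Prove that the sum of two even numbers is even, step by step."))
}
```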
  • 19
    LLaMA.go

    llama.go is like llama.cpp in pure Golang

llama.go is like llama.cpp in pure Golang. The project's code is based on the legendary ggml framework of Georgi Gerganov, written in C++ with the same attitude to performance and elegance. Both models store FP32 weights, so you'll need at least 32 GB of RAM (not VRAM or GPU RAM) for LLaMA-7B. Double that to 64 GB for LLaMA-13B.
    Downloads: 0 This Week
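Those RAM figures follow directly from the storage format: FP32 uses 4 bytes per parameter, so 7B parameters need about 28 GB for the weights alone (hence the 32 GB requirement, leaving headroom for activations and the OS) and 13B about 52 GB. A quick check:

```go
package main

import "fmt"

// fp32GB returns the approximate weight storage for a model in decimal
// gigabytes, at 4 bytes per FP32 parameter (buffers and activations extra).
func fp32GB(params float64) float64 {
	return params * 4 / 1e9
}

func main() {
	fmt.Printf("LLaMA-7B:  %.0f GB\n", fp32GB(7e9))  // 28 GB, fits in 32 GB RAM
	fmt.Printf("LLaMA-13B: %.0f GB\n", fp32GB(13e9)) // 52 GB, needs ~64 GB RAM
}
```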