§Open Source Ecosystem

A complementary open-source chain.

Large language models tokenize most languages inefficiently and often fail on domain-specific questions. At Magibu, we address this at every link of the chain with open source: from tokenization that is faithful to a language's morphology, to a methodology for building embeddings in your own language; from high-quality fine-tuning data, to tools that connect the model to the outside world.

Language base

Morphological Tokenizer

Split text into morphology-faithful units and recombine them.
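
To make "split and recombine" concrete, here is a minimal sketch of a reversible suffix-based splitter. It is not Magibu's tokenizer: the suffix list, the `##` continuation marker, and the function names `split_word`, `tokenize`, and `detokenize` are all assumptions chosen for illustration, and a real morphological tokenizer would also need a lexicon, vowel harmony, and consonant alternations.

```python
# Hypothetical, tiny suffix inventory for illustration only.
SUFFIXES = ("ler", "lar", "imiz", "de", "da")
MARK = "##"  # assumed marker: this unit attaches to the previous one


def split_word(word: str, min_stem: int = 2) -> list[str]:
    """Peel known suffixes off the end of a word, longest match first."""
    peeled = []
    stem = word
    changed = True
    while changed:
        changed = False
        for suffix in sorted(SUFFIXES, key=len, reverse=True):
            # Keep at least `min_stem` characters so short words stay whole.
            if stem.endswith(suffix) and len(stem) - len(suffix) >= min_stem:
                peeled.append(MARK + suffix)
                stem = stem[: -len(suffix)]
                changed = True
                break
    # Suffixes were collected right to left, so reverse them.
    return [stem] + peeled[::-1]


def tokenize(text: str) -> list[str]:
    """Split whitespace-separated words into stem and suffix units."""
    return [unit for word in text.split() for unit in split_word(word)]


def detokenize(units: list[str]) -> str:
    """Recombine units: marked units are glued onto the preceding word."""
    words: list[str] = []
    for unit in units:
        if unit.startswith(MARK) and words:
            words[-1] += unit[len(MARK):]
        else:
            words.append(unit)
    return " ".join(words)


units = tokenize("evlerimizde kitaplar")
print(units)
print(detokenize(units))
```

With this toy inventory, `evlerimizde` ("in our houses") splits into `ev`, `##ler`, `##imiz`, `##de`, and `detokenize` restores the original text. The round trip holds for single-space-separated input whose words do not themselves begin with the marker.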

Methodology

Language-Native Embeddings

An open method for building tokenizers + embeddings in your own language.

Data

Fine-tune Datasets

High-quality open datasets tailored to task and persona.

Inference tools

LLM Tools

The ability to call the right tool at the right time.
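
As a rough illustration of what calling a tool involves on the application side, the sketch below registers plain Python functions and dispatches a model-emitted JSON request to one of them. The request shape (`name` plus `arguments`), the `TOOLS` registry, and the example tools are assumptions made for this sketch, not the interface of Magibu's LLM Tools.

```python
import json
from typing import Callable

# Hypothetical registry mapping tool names to Python callables.
TOOLS: dict[str, Callable[..., object]] = {}


def tool(fn: Callable[..., object]) -> Callable[..., object]:
    """Register a function under its own name so the model can call it."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: float, b: float) -> float:
    return a + b


@tool
def word_count(text: str) -> int:
    return len(text.split())


def dispatch(model_output: str) -> dict:
    """Parse an assumed {"name": ..., "arguments": {...}} request and run it."""
    try:
        call = json.loads(model_output)
        name = call["name"]
        args = call.get("arguments", {})
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
        return {"ok": False, "error": "malformed tool call"}

    fn = TOOLS.get(name)
    if fn is None:
        return {"ok": False, "error": f"unknown tool: {name}"}

    try:
        return {"ok": True, "result": fn(**args)}
    except TypeError as exc:  # wrong or missing arguments
        return {"ok": False, "error": str(exc)}


print(dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}'))
print(dispatch('{"name": "translate", "arguments": {}}'))
```

Returning a structured error instead of raising lets the calling loop feed the failure back to the model so it can retry with a valid tool name or corrected arguments.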

How do I contribute?

  1. Go to the repo you're interested in on GitHub and review the open issues
  2. Comment on an issue or open a new one
  3. Fork the repo and create a new branch
  4. Make your change, test it, and document it
  5. Open a pull request and explain what you changed and why
  6. After review, your change is merged and you join the contributor list
§Open Source Projects

Turkey-focused open R&D. Pick up issues on GitHub or apply to join the team. Contributors are given priority in future hiring.

§Community

Open science, benchmarks, and community contributions.

Magibu AI Weekly

Open-source weekly digest: AI news, papers, models, benchmarks, and underrepresented language updates.

View archive →

GitHub

Benchmark code, eval harness, datasets, and community contributions.

github.com/magibu-ai →