UAE’s Falcon-H1 Arabic tops Open Arabic LLM rankings
Abu Dhabi's Technology Innovation Institute has unveiled Falcon-H1 Arabic, a new large language model that now sits at the top of the Open Arabic LLM Leaderboard across model sizes.
TII, which operates as the applied research arm of Abu Dhabi's Advanced Technology Research Council, said the model uses a hybrid Mamba-Transformer architecture. This marks a shift from the purely transformer-based designs that underpin many existing systems.
According to the institute, Falcon-H1 Arabic outperforms rivals that contain several times more parameters, targeting accuracy, context handling and faithful representation of the Arabic language at a lower computational cost.
Falcon-H1 Arabic forms part of a wider strategy in the UAE that emphasises sovereign AI tuned to Arabic language and culture rather than systems adapted from English-first models. The release extends the Falcon family of models that TII has developed over recent years.
His Excellency Faisal al Bannai, Adviser to the UAE President and Secretary General of the Advanced Technology Research Council, said the launch aligns with national objectives in advanced technology and AI.
"Falcon-H1 Arabic reflects our ongoing commitment to strengthening the UAE's position as a global hub for advanced technology and responsible AI. By delivering models that support the linguistic and cultural needs of the region, we enable innovation that is accessible, relevant, and impactful across our societies. This achievement is a testament to the depth of talent and research expertise within TII," said Al Bannai.
The Falcon-H1 Arabic family is available in 3 billion, 7 billion and 34 billion parameter sizes. TII said this range targets different infrastructure constraints and application types across organisations and developers in the region.
The institute said the new models incorporate changes in training data and design. These changes focus on data quality, coverage of Arabic dialects, long-context behaviour and mathematical reasoning. The group said the result is more accurate and context-aware handling of Arabic content across real-world tasks.
Falcon-H1 Arabic builds on earlier Falcon-Arabic releases that gained traction among developers searching for dedicated Arabic large language models. Those models demonstrated demand for systems that natively handle Arabic rather than relying on translation.
Najwa Aaraj, Chief Executive of TII, said Falcon-H1 Arabic follows years of research on Arabic AI inside the organisation.
"The development of Falcon-H1 Arabic builds on years of foundational work in Arabic AI and responds directly to the needs of our communities, including developers and businesses. By advancing architecture, data quality, and long-context reasoning, we are creating enablers that unlock new possibilities in education, healthcare, governance, enterprise, and more, all in Arabic. This model represents an important step in our mission to deliver world-class AI that serves the region and contributes to global progress," said Aaraj.
Benchmark scores
On the Open Arabic LLM Leaderboard, TII said Falcon-H1 Arabic leads across several model-size bands. The 3B model records an average score of 61.87 per cent, placing it about ten percentage points ahead of leading 4B-class competitors, including Microsoft's Phi-4 Mini, according to the institute.
The 7B model records an average score of 71.47 per cent. TII said this surpasses all systems of roughly 10 billion parameters on the leaderboard, including Qatar's Fanar-1-9B and Saudi Arabia's HUMAIN ALLaM 7B.
The largest Falcon-H1 Arabic model, with 34 billion parameters, posts a score of 75.36 per cent. The institute said this exceeds results from 70B-plus parameter systems such as China's Qwen2.5 72B and Meta's Llama-3.3 70B in the benchmark comparison.
Falcon-H1 Arabic also records high scores on more focused Arabic benchmarks. These include 3LM for STEM reasoning, ArabCulture for cultural and contextual questions, and AraDice for dialect understanding.
TII said the combination of general and specialised benchmarks indicates a step change in the performance of Arabic language models. The group highlighted the balance of linguistic depth, reasoning and computational efficiency relative to model size.
Longer context
The models also introduce longer context windows. TII said Falcon-H1 Arabic supports context lengths up to 256,000 tokens.
This length enables processing of large volumes of material within a single interaction. TII pointed to use cases such as legal documents, medical notes, academic works and internal corporate knowledge bases, which can be analysed without losing the thread or structure of lengthy text.
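As a rough illustration of what a 256,000-token window can accommodate, the sketch below estimates whether a long document fits in the window before sending it in a single interaction. The four-characters-per-token ratio is a common rule of thumb, not a figure from TII (real tokenizers vary, especially for Arabic script), and the helper names here are hypothetical.

```python
# Back-of-envelope check: will a document fit in a 256K-token context window?
# Assumes a rough average of 4 characters per token; actual tokenization
# differs by model and language, so treat this as an estimate only.

def estimated_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Estimate the token count of `text` from a fixed chars-per-token ratio."""
    return int(len(text) / chars_per_token)

def fits_in_context(text: str, max_tokens: int = 256_000) -> bool:
    """Return True if the estimated token count fits in the context window."""
    return estimated_tokens(text) <= max_tokens

# Example: a 500-page legal document at ~1,800 characters per page
document = "x" * (500 * 1800)       # ~900,000 characters
print(estimated_tokens(document))   # 225000
print(fits_in_context(document))    # True — fits in one pass, no chunking
```

Under this heuristic, a 500-page contract fits comfortably in one pass, whereas a shorter window would force the document to be split into chunks and summarised piecewise.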
Hakim Hacid, Chief Researcher at TII's Artificial Intelligence and Digital Research Centre, said the design of Falcon-H1 Arabic reflects practical deployment needs in the region.
"This model reflects our focus on building Arabic AI that is not only more advanced, but genuinely useful in real-world settings. By improving efficiency, depth of understanding and language coverage, we are enabling AI systems that can better support institutions, developers, and communities across the region," said Hacid.
TII said its Falcon AI models have held top positions on regional and global benchmarks since 2023, a run that Falcon-H1 Arabic now extends with its leaderboard results.
The institute said the results show the UAE's ability to field sovereign AI systems that stand alongside international offerings from the US, Europe and Asia in specialised domains. The new Falcon-H1 Arabic models are accessible through TII's public Falcon interface for organisations and developers that want to experiment with Arabic-focused AI systems.