
Making LLMs Agentic: How Tool Use and Synthetic Data Drive Real-World Intelligence
This talk presents recent advances in agentic AI, focusing on datasets and models that enable Large Language Models (LLMs) to interact effectively with external tools in real-world applications.
Overview
While Large Language Models (LLMs) have made impressive strides, their capabilities remain constrained without the ability to interact with real-time data or external tools (typically exposed as APIs), an essential requirement for dynamic, real-world tasks. This talk explores the emerging field of agentic AI, where LLMs are empowered to autonomously call tools, make decisions, and act beyond their pre-trained knowledge.

I will begin by introducing API-BLEND, a comprehensive corpus for training and benchmarking tool-augmented LLMs. By curating and transforming existing datasets, API-BLEND supports tasks such as API detection, slot filling (identifying and populating API arguments), and call sequencing (determining the correct order of API calls), improving models' ability to understand and use external tools effectively. Building on API-BLEND, we generalize its principles into a versatile synthetic data generation framework capable of producing diverse, domain-agnostic datasets, which supports scalable and robust training pipelines across tool-use scenarios.

Leveraging these resources, I will introduce Granite-20B-FunctionCalling, our 20B-parameter model trained via multi-task learning on seven granular tool-use tasks. The model ranks highly on the Berkeley Function Calling Leaderboard v1 and generalizes strongly to multiple out-of-domain tool-use benchmarks.

The talk concludes by connecting the dots: how agentic AI, synthetic data generation, and specialized tool-use models converge to create next-generation LLMs, models that not only excel on benchmarks but also operate effectively in real-world, tool-rich environments.
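To make the three tasks concrete, here is a minimal sketch of how a model's proposed tool calls might be checked for API detection (does each call name a known tool?) and slot filling (is every required argument populated?) across a call sequence. The tool names, schemas, and JSON call format below are hypothetical illustrations, not taken from API-BLEND or Granite-20B-FunctionCalling.

```python
import json

# Hypothetical tool catalog; names and required slots are illustrative only.
TOOLS = {
    "get_weather": {"required": ["city"]},
    "convert_temperature": {"required": ["value", "unit"]},
}

def validate_call_sequence(calls):
    """Check a model-proposed sequence of tool calls:
    - API detection: each call must name a known tool
    - slot filling: every required argument must be populated
    Returns (ok, errors)."""
    errors = []
    for i, call in enumerate(calls):
        name, args = call["name"], call.get("arguments", {})
        if name not in TOOLS:  # API detection failure
            errors.append(f"call {i}: unknown tool '{name}'")
            continue
        for slot in TOOLS[name]["required"]:  # slot-filling check
            if slot not in args:
                errors.append(f"call {i}: missing slot '{slot}' for '{name}'")
    return (not errors, errors)

# A mock model response for "What's the weather in Paris, in Fahrenheit?",
# serialized as a JSON list of tool calls (a common function-calling convention).
model_output = json.loads(
    '[{"name": "get_weather", "arguments": {"city": "Paris"}},'
    ' {"name": "convert_temperature", "arguments": {"value": 21, "unit": "F"}}]'
)
ok, errs = validate_call_sequence(model_output)
```

Call sequencing adds a further check not shown here: whether the outputs of earlier calls actually supply the inputs of later ones, which is what makes the ordering of the two calls above meaningful.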
Presenter
Brief Biography
Ibrahim Abdelaziz is a Senior Research Scientist at IBM Research, where he focuses on enhancing the capabilities of Large Language Models (LLMs), particularly on improving their agentic and reasoning abilities. Prior to this, Ibrahim led several projects in the areas of knowledge graphs, question answering, knowledge representation, and reasoning. He earned his Ph.D. from KAUST, where his research focused on graph analytics and distributed computing, specifically on building distributed systems to efficiently manage, query, and mine large-scale graphs.