Lightweight Safety Classification Using Pruned Language Models

This paper introduces Layer Enhanced Classification (LEC), a technique that outperforms GPT-4o and specialized models at content safety classification and prompt injection detection while training on fewer than 100 examples and requiring far less compute. LEC pairs a computationally cheap penalized logistic regression (PLR) classifier with the language understanding of an LLM. The results show that even very small transformer models (0.5B parameters) serve as robust feature extractors for classification tasks.
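To make the classifier half of this pairing concrete, here is a minimal sketch of an L2-penalized logistic regression trained by gradient descent on fixed-length feature vectors. It is not the paper's implementation. It assumes the LLM contributes one feature vector per input (for example, a hidden state), and because no model is loaded here, the hypothetical `toy_features` helper substitutes random vectors for real LLM features. The function names, the penalty strength `l2`, the learning rate, and the toy data are all illustrative assumptions.

```python
import math
import random


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_penalized_logreg(X, y, l2=0.1, lr=0.1, epochs=300):
    """Fit weights and bias by batch gradient descent on the L2-penalized log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Average the data gradient, then add the penalty gradient (bias is unpenalized).
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    # Probability that x belongs to the positive ("unsafe") class.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def toy_features(label, rng, dim=8):
    # HYPOTHETICAL stand-in for LLM-derived features: a noisy vector whose
    # mean depends on the label. A real pipeline would extract these from a model.
    center = 1.0 if label == 1 else -1.0
    return [rng.gauss(center, 1.0) for _ in range(dim)]


rng = random.Random(0)

# Fewer than 100 training examples, mirroring the low-data setting described above.
train_labels = [i % 2 for i in range(80)]
train_X = [toy_features(lbl, rng) for lbl in train_labels]
w, b = train_penalized_logreg(train_X, train_labels)

test_labels = [i % 2 for i in range(200)]
test_X = [toy_features(lbl, rng) for lbl in test_labels]
preds = [1 if predict_proba(w, b, x) >= 0.5 else 0 for x in test_X]
accuracy = sum(p == t for p, t in zip(preds, test_labels)) / len(test_labels)
print(f"toy held-out accuracy: {accuracy:.3f}")
```

The classifier has only one weight per feature dimension plus a bias, so training and inference cost very little compared with running the model that produces the features. The accuracy printed here reflects only the synthetic toy data and says nothing about the paper's reported results.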

View on arXiv →

The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey

This survey examines recent AI agent implementations, focusing on their capabilities for reasoning, planning, and tool execution. It maps out single-agent and multi-agent architectures, identifies key design patterns, and offers insights for building effective agent systems.

View on arXiv →