Agent S2
Agent S2 is an open, modular, and scalable framework for computer-use agents developed by Simular. These autonomous AI agents interact directly with graphical user interfaces (GUIs) on desktops, mobile devices, browsers, and various software applications, mimicking human-like control via mouse and keyboard. Building upon the initial Agent S framework, Agent S2 enhances performance and modularity by integrating both frontier foundation models and specialized models. It achieves state-of-the-art results, notably surpassing previous benchmarks on OSWorld and AndroidWorld evaluations. Key design principles include proactive hierarchical planning, where the agent dynamically updates its plans after each subtask; visual grounding for precise GUI interaction using raw screenshots; an improved Agent-Computer Interface (ACI) that delegates complex tasks to specialized modules; and an agentic memory mechanism that enables continual learning from experience.
Learn more
Surf.new
Surf.new is a free, open-source playground for testing and using AI agents that can browse the web. These agents surf the web and interact with webpages similarly to how a human would, making tasks like automation and web research easy and intuitive.
Whether you're a developer evaluating web agents for production use or someone looking to automate repetitive tasks like checking flights, scraping product information, or booking reservations, Surf.new provides an accessible environment to quickly experiment and see how web agents perform.
Key Features:
Swap between AI Agent Frameworks with a button: Supports Browser-use, an experimental Claude Computer-use-based agent, and integrates smoothly with LangChain—allowing easy experimentation with different approaches.
Diverse AI Model Compatibility: Compatible with popular models including Claude 3.7, DeepSeek R1, OpenAI models, Gemini 2.0 Flash, and others—giving you the flexibility to choose what works best.
Learn more
II-Agent
II-Agent is an open source intelligent assistant developed by Intelligent Internet, designed to enhance productivity across various domains such as research, content creation, data analysis, coding, automation, and problem-solving. It operates through a robust function-calling paradigm, driven by a powerful large language model (LLM), specifically Anthropic's Claude 3.7 Sonnet, and is supported by advanced planning, comprehensive execution capabilities, and intelligent context management. The agent's architecture includes a central reasoning and orchestration component that interfaces directly with the LLM, utilizing system prompting, interaction history management, and intelligent context management to maintain a coherent and efficient workflow. II-Agent's capabilities encompass multistep web search, source triangulation, structured note-taking, rapid summarization, blog and article drafting, lesson plan creation, creative prose, technical manuals, website creation, etc.
Learn more
AskUI
AskUI is an innovative platform that enables AI agents to visually perceive and interact with any computer interface, facilitating seamless automation across various operating systems and applications. Leveraging advanced vision models, AskUI's PTA-1 prompt-to-action model allows users to execute AI-driven actions on Windows, macOS, Linux, and mobile devices without the need for jailbreaking. This technology is particularly beneficial for tasks such as desktop and mobile automation, visual testing, and document or data processing. By integrating with tools like Jira, Jenkins, GitLab, and Docker, AskUI enhances workflow efficiency and reduces the burden on developers. Companies like Deutsche Bahn have reported significant improvements in internal processes, citing over a 90% increase in efficiency through the use of AskUI's test automation capabilities.
Learn more