📚 Submission Guidelines
Submission Rules
- Teams should devise LLM agentic workflows within the predefined modular design space, including:
- Module Recombination: Designing novel agents by recombining the classic modules (reasoning, planning, tool use, and memory). Tutorial
- Module Design: Developing or modifying modules to implement new functionalities while ensuring compatibility with the standardized framework.
- Workflow Construction: Designing the overall structure and interaction flow of the agentic system to effectively integrate the modules and achieve the desired outcomes.
- We will use Qwen2.5-72B-Instruct to evaluate the submitted agents. Using external models, training external models, or using external tools is strictly prohibited.
- During the evaluation of submitted agents, all data access must go through the interaction tools provided by the textual web simulator.
- External data access is prohibited.
- Submission limits:
- Development Phase: Each team may submit up to one submission per day.
- Evaluation Phase: Each team may submit up to three submissions per day.
- Evaluation will be time- and token-limited.
- Any form of dishonest conduct is strictly prohibited and will result in immediate disqualification if detected.
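To make the module recombination and workflow construction items above concrete, here is a minimal, self-contained sketch of an agent that wires together memory, planning, and reasoning modules. Every name in it (`Memory`, `Planner`, `Reasoner`, `Agent`, `echo_llm`) is hypothetical and chosen for illustration only; the competition framework defines its own base classes and LLM client, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns text.
# The real framework supplies its own client; this type is an assumption.
LLM = Callable[[str], str]


@dataclass
class Memory:
    """Stores past outputs and returns the most recent ones."""
    items: List[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.items.append(text)

    def recall(self, k: int = 3) -> List[str]:
        return self.items[-k:]


class Planner:
    """Splits a task into ordered sub-tasks (here, a fixed two-step plan)."""

    def plan(self, task: str) -> List[str]:
        return [f"gather context for: {task}", f"answer: {task}"]


class Reasoner:
    """Builds a prompt from recalled memory plus the current step."""

    def __init__(self, llm: LLM) -> None:
        self.llm = llm

    def reason(self, step: str, context: List[str]) -> str:
        prompt = "\n".join(context + [step])
        return self.llm(prompt)


class Agent:
    """Workflow: plan the task, reason over each step, remember each result."""

    def __init__(self, llm: LLM) -> None:
        self.memory = Memory()
        self.planner = Planner()
        self.reasoner = Reasoner(llm)

    def run(self, task: str) -> str:
        result = ""
        for step in self.planner.plan(task):
            result = self.reasoner.reason(step, self.memory.recall())
            self.memory.add(result)
        return result


def echo_llm(prompt: str) -> str:
    """Deterministic fake model for local testing; not a real LLM."""
    lines = prompt.splitlines()
    return f"[{len(lines)} lines] {lines[-1]}"


agent = Agent(echo_llm)
output = agent.run("rate item 42")
```

Swapping in a different `Planner` or `Memory` without touching `Agent.run` is the kind of recombination the rules describe; changing `run` itself corresponds to workflow construction.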
Submit Your Agent
Submit your solution through the submission button in the top right corner.
- Please register your team first. Registration
- Registration must be done using your real name, which must match the name used when claiming any prizes.
- Each individual may only join one team per track.
- Please carefully SELECT the TRACK you want to submit to.
- Your submission should be a single .py file containing your agent (only one {your_team}.py file, without evaluation code).
- Example submissions:
- For Track 1: submission_1
- For Track 2: submission_2
- Evaluation Schedule: Agents submitted before 10 AM (UTC+8) each day will be evaluated together, and results will be posted to the leaderboard by 10 PM (UTC+8). You will be notified by email if your agent encounters any errors or timeouts. Thank you for your cooperation.
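For teams working outside UTC+8, the daily cutoff can be checked with a small helper like the one below. This is a sketch under two assumptions that the guidelines do not state explicitly: a submission made at or after 10 AM (UTC+8) rolls into the next day's batch, and results for a batch are posted by 10 PM (UTC+8) on the same day as its cutoff. The function name `evaluation_batch` is hypothetical.

```python
from datetime import datetime, time, timedelta, timezone
from typing import Tuple

UTC8 = timezone(timedelta(hours=8))
CUTOFF = time(10, 0)      # daily submission cutoff, 10 AM (UTC+8)
RESULTS_BY = time(22, 0)  # leaderboard update deadline, 10 PM (UTC+8)


def evaluation_batch(submitted_at: datetime) -> Tuple[datetime, datetime]:
    """Return (cutoff, results_by) in UTC+8 for the batch a submission joins.

    Assumes submissions at or after the cutoff move to the next day's batch.
    """
    if submitted_at.tzinfo is None:
        raise ValueError("submitted_at must be timezone-aware")
    local = submitted_at.astimezone(UTC8)
    batch_date = local.date()
    if local.time() >= CUTOFF:
        batch_date += timedelta(days=1)
    cutoff = datetime.combine(batch_date, CUTOFF, tzinfo=UTC8)
    results_by = datetime.combine(batch_date, RESULTS_BY, tzinfo=UTC8)
    return cutoff, results_by


# 01:30 UTC is 09:30 in UTC+8, so this submission makes the same-day batch.
early = evaluation_batch(datetime(2025, 1, 1, 1, 30, tzinfo=timezone.utc))
# 02:00 UTC is exactly 10:00 in UTC+8, so it falls into the next day's batch.
late = evaluation_batch(datetime(2025, 1, 1, 2, 0, tzinfo=timezone.utc))
```

Passing a timezone-aware timestamp in any zone works, since the helper converts to UTC+8 before comparing against the cutoff.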
Fairness Mechanisms
- Public Leaderboard: We will provide a public leaderboard for all teams to monitor and review performance.
- Rigorous Review: Submissions will undergo a thorough review to ensure compliance with competition rules and the integrity of results.
- Equal Access to Resources: We will ensure all teams have access to sufficient and fair computational resources.